Commit a1ea2a3

btrfs-progs: docs: add an extra note to btrfs data checksum and directIO
In v6.14 kernel release, btrfs will force a direct IO to fall back to a
buffered one if the inode requires a data checksum.

This will cause a small performance drop, to solve the false data checksum
mismatch problem caused by direct IOs.

Although such a change is small to most end users, for those requiring such
a zero-copy direct IO this will be a behavior change, and this requires a
proper documentation update.

Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>

1 parent 55137da commit a1ea2a3

1 file changed, +18 -0 lines changed

Documentation/ch-checksumming.rst

+18
@@ -3,6 +3,24 @@ writing and verified after reading the blocks from devices. The whole metadata
 block has an inline checksum stored in the b-tree node header. Each data block
 has a detached checksum stored in the checksum tree.
 
+.. note::
+   Since a data checksum is calculated just before submitting to the block
+   device, btrfs has a strong requirement that the corresponding data block must
+   not be modified until the writeback is finished.
+
+   This requirement is met for a buffered write as btrfs has the full control on
+   its page caches, but a direct write (``O_DIRECT``) bypasses page caches, and
+   btrfs can not control the direct IO buffer (as it can be in user space memory),
+   thus it's possible that a user space program modifies its direct write buffer
+   before the buffer is fully written back, and this can lead to a data checksum mismatch.
+
+   To avoid such a checksum mismatch, since v6.14 btrfs will force a direct
+   write to fall back to a buffered one, if the inode requires a data checksum.
+   This will bring a small performance penalty, and if the end user requires true
+   zero-copy direct writes, they should set the ``NODATASUM`` flag for the inode
+   and make sure the direct IO buffer is fully aligned to btrfs block size.
+
+
 There are several checksum algorithms supported. The default and backward
 compatible algorithm is *crc32c*. Since kernel 5.5 there are three more with different
 characteristics and trade-offs regarding speed and strength. The following list
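
The alignment requirement in the added note can be illustrated with a minimal Python sketch. It is not part of the commit. It assumes a 4096-byte btrfs block size (check the actual value on your filesystem), and the helper names ``alloc_aligned``, ``buffer_address``, ``is_block_aligned`` and ``direct_write`` are hypothetical. It only covers buffer alignment; disabling data checksums for the inode (for example with ``chattr +C`` on a newly created empty file, or the ``nodatasum`` mount option) has to be done separately.

```python
import ctypes
import mmap
import os

# Assumption: 4096-byte btrfs block size; verify on the target filesystem.
BLOCK_SIZE = 4096


def alloc_aligned(size: int) -> mmap.mmap:
    """Return an anonymous mapping; mmap memory starts on a page boundary."""
    return mmap.mmap(-1, size)


def buffer_address(buf: mmap.mmap) -> int:
    """Return the memory address of the first byte of the buffer."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def is_block_aligned(addr: int, length: int, offset: int,
                     block_size: int = BLOCK_SIZE) -> bool:
    """True if buffer address, length and file offset are all block multiples."""
    return all(value % block_size == 0 for value in (addr, length, offset))


def direct_write(path: str, buf: mmap.mmap, offset: int = 0) -> int:
    """Write buf at offset using O_DIRECT (Linux only, hypothetical helper)."""
    if not is_block_aligned(buffer_address(buf), len(buf), offset):
        raise ValueError("buffer, length and offset must be block aligned")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        # The buffer must not be modified until this call has returned.
        return os.pwrite(fd, buf, offset)
    finally:
        os.close(fd)
```

A caller would fill a buffer from ``alloc_aligned(2 * BLOCK_SIZE)`` and pass it to ``direct_write``. Whether the write is truly zero-copy still depends on the kernel version and on the inode not requiring a data checksum, as the note describes.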
