linux-stable/fs/btrfs
Filipe Manana dae93f4168 btrfs: send: avoid unaligned encoded writes when attempting to clone range
[ Upstream commit a11452a370 ]

When trying to see if we can clone a file range, there are cases where we
end up sending two write operations in case the inode from the source root
has an i_size that is not sector size aligned and the length from the
current offset to its i_size is less than the remaining length we are
trying to clone.

Issuing two write operations when we could instead issue a single write
operation is not incorrect. However, it is not optimal, especially if the
extents are compressed and the flag BTRFS_SEND_FLAG_COMPRESSED was passed
to the send ioctl. In that case we can end up sending an encoded write
with an offset that is not sector size aligned, which makes the receiver
fall back to decompressing the data and writing it using regular buffered
IO (so re-compressing the data in case the fs is mounted with compression
enabled), because encoded writes fail with -EINVAL when an offset is not
sector size aligned.
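
The alignment rule described above can be sketched in a few lines of shell.
This is only an illustration, not the kernel's check: the helper name
is_sector_aligned is hypothetical, the offsets are arbitrary, and a sector
size of 4096 bytes is assumed (the changelog does not state it).

```shell
#!/bin/sh
# Assumption: 4096-byte sectors. The real value depends on the filesystem.
SECTORSIZE=4096

# Hypothetical helper: print "aligned" if the byte offset given as $1 is a
# multiple of the sector size, otherwise print "unaligned".
is_sector_aligned() {
    if [ $(( $1 % SECTORSIZE )) -eq 0 ]; then
        echo "aligned"
    else
        echo "unaligned"
    fi
}

# Illustrative offsets: 4096 sits on a sector boundary, 5120 is 1K past one.
for offset in 4096 5120; do
    echo "offset $offset: $(is_sector_aligned "$offset")"
done
```

An encoded write starting at an offset reported as "unaligned" here is the
kind of write that the receiver has to decompress and rewrite through
buffered IO.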

The following example triggered a bug in the receiver code for the
fallback logic of decompressing + regular buffered IO, which is fixed by the
patchset referred to in the Link at the bottom of this changelog. It shows
the non-optimal behaviour caused by an unaligned encoded write:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV > /dev/null
   mount -o compress $DEV $MNT

   # File foo has a size of 33K, not aligned to the sector size.
   xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo

   xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar

   # Now clone the first 32K of file bar into foo at offset 0.
   xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo

   # Snapshot the default subvolume and create a full send stream (v2).
   btrfs subvolume snapshot -r $MNT $MNT/snap

   btrfs send --compressed-data -f /tmp/test.send $MNT/snap

   echo -e "\nFile bar in the original filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT
   mkfs.btrfs -f $DEV > /dev/null
   mount $DEV $MNT

   echo -e "\nReceiving stream in a new filesystem..."
   btrfs receive -f /tmp/test.send $MNT

   echo -e "\nFile bar in the new filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT

Before this patch, the send stream included one regular write and one
encoded write for file 'bar', with the latter not being sector size aligned
and causing the receiver to fall back to decompression + buffered writes.
The output of the btrfs receive command in verbose mode (-vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   write bar - offset=32768 length=1024
   encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0
   encoded_write bar - falling back to decompress and write due to errno 22 ("Invalid argument")
   (...)

This patch avoids the regular write followed by an unaligned encoded write
so that we end up sending a single encoded write that is aligned. So after
this patch the stream content is (output of btrfs receive -vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   encoded_write bar - offset=32768, len=4096, unencoded_offset=32768, unencoded_file_len=32768, unencoded_len=65536, compression=1, encryption=0
   (...)
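
The numbers in the two receive outputs can be checked with simple arithmetic.
The sketch below assumes a 4096-byte sector size (not stated in the
changelog) and uses a hypothetical helper, range_end, to compute the
exclusive end of each byte range taken from the -vvv output.

```shell
#!/bin/sh
# Assumption: 4096-byte sectors.
SECTORSIZE=4096

# Hypothetical helper: exclusive end of a range given its offset and length.
range_end() {
    echo $(( $1 + $2 ))
}

# Before the patch: write at 32768 (len 1024), then an encoded write at
# 33792 with unencoded_file_len 31744.
before_write_end=$(range_end 32768 1024)
before_encoded_end=$(range_end 33792 31744)
before_remainder=$(( 33792 % SECTORSIZE ))

# After the patch: a single encoded write at 32768 with
# unencoded_file_len 32768.
after_encoded_end=$(range_end 32768 32768)
after_remainder=$(( 32768 % SECTORSIZE ))

echo "before: write [32768, $before_write_end), encoded write [33792, $before_encoded_end), offset remainder $before_remainder"
echo "after:  encoded write [32768, $after_encoded_end), offset remainder $after_remainder"
```

Both variants cover the same bytes of 'bar', from 32768 up to 65536. The
difference is that the old encoded write starts 1024 bytes past a sector
boundary, while the new one starts exactly on a boundary.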

So we get more optimal behaviour and avoid the silent data loss bug in
versions of btrfs-progs affected by the bug referred to by the Link tag
below (btrfs-progs v5.19, v5.19.1, v6.0 and v6.0.1).

Link: https://lore.kernel.org/linux-btrfs/cover.1668529099.git.fdmanana@suse.com/
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-14 11:37:16 +01:00
tests btrfs: remove pointless and double ulist frees in error paths of qgroup tests 2022-11-26 09:24:32 +01:00
acl.c overlayfs update for 5.15 2021-09-02 09:21:27 -07:00
async-thread.c btrfs: fix memory ordering between normal and ordered work functions 2021-11-25 09:48:46 +01:00
async-thread.h
backref.c btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-12-08 11:28:38 +01:00
backref.h btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-12-08 11:28:38 +01:00
block-group.c btrfs: enhance unsupported compat RO flags handling 2022-10-29 10:12:53 +02:00
block-group.h btrfs: fix space cache corruption and potential double allocations 2022-09-05 10:30:12 +02:00
block-rsv.c
block-rsv.h
btrfs_inode.h btrfs: put initial index value of a directory in a constant 2022-08-31 17:16:35 +02:00
check-integrity.c btrfs: rename btrfs_bio to btrfs_io_context 2022-07-21 21:24:32 +02:00
check-integrity.h
compression.c btrfs: remove unused parameter nr_pages in add_ra_bio_pages() 2022-04-20 09:34:04 +02:00
compression.h btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00
ctree.c btrfs: fix lockdep splat with reloc root extent buffers 2022-09-05 10:30:12 +02:00
ctree.h btrfs: fix space cache corruption and potential double allocations 2022-09-05 10:30:12 +02:00
delalloc-space.c btrfs: convert count_max_extents() to use fs_info->max_extent_size 2022-08-31 17:16:34 +02:00
delalloc-space.h
delayed-inode.c btrfs: add ro compat flags to inodes 2021-08-23 13:19:09 +02:00
delayed-inode.h
delayed-ref.c btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
delayed-ref.h btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref 2022-07-12 16:34:50 +02:00
dev-replace.c btrfs: add info when mount fails due to stale replace target 2022-08-31 17:16:46 +02:00
dev-replace.h
dir-item.c btrfs: unify lookup return value when dir entry is missing 2021-10-07 22:06:32 +02:00
discard.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
discard.h
disk-io.c btrfs: zoned: initialize device's zone info for seeding 2022-11-16 09:58:27 +01:00
disk-io.h btrfs: move lockdep class helpers to locking.c 2022-09-05 10:30:12 +02:00
export.c btrfs: fix type of parameter generation in btrfs_get_dentry 2022-11-10 18:15:38 +01:00
export.h btrfs: fix type of parameter generation in btrfs_get_dentry 2022-11-10 18:15:38 +01:00
extent-io-tree.h
extent-tree.c btrfs: fix tree mod log mishandling of reallocated nodes 2022-11-10 18:15:37 +01:00
extent_io.c btrfs: fix lockdep splat with reloc root extent buffers 2022-09-05 10:30:12 +02:00
extent_io.h btrfs: fix qgroup reserve overflow the qgroup limit 2022-04-13 20:59:23 +02:00
extent_map.c btrfs: rename btrfs_bio to btrfs_io_context 2022-07-21 21:24:32 +02:00
extent_map.h
file-item.c btrfs: make search_csum_tree return 0 if we get -EFBIG 2022-04-08 14:23:58 +02:00
file.c btrfs: fix lost file sync on direct IO write with nowait and dsync iocb 2022-11-10 18:15:37 +01:00
free-space-cache.c btrfs: dump extra info if one free space cache has more bitmaps than it should 2022-10-26 12:35:44 +02:00
free-space-cache.h
free-space-tree.c btrfs: fix invalid delayed ref after subvolume creation failure 2022-07-12 16:34:50 +02:00
free-space-tree.h
inode-item.c
inode.c btrfs: remove root argument from btrfs_unlink_inode() 2022-09-05 10:30:09 +02:00
ioctl.c btrfs: free btrfs_path before copying inodes to userspace 2022-12-08 11:28:38 +01:00
Kconfig btrfs: disable build on platforms having page size 256K 2021-06-22 14:11:57 +02:00
locking.c btrfs: fix lockdep splat with reloc root extent buffers 2022-09-05 10:30:12 +02:00
locking.h btrfs: fix lockdep splat with reloc root extent buffers 2022-09-05 10:30:12 +02:00
lzo.c btrfs: prevent copying too big compressed lzo segment 2022-03-02 11:48:07 +01:00
Makefile btrfs: initial fsverity support 2021-08-23 13:19:09 +02:00
misc.h btrfs: use correct header for div_u64 in misc.h 2021-09-07 14:29:50 +02:00
ordered-data.c btrfs: zoned: fix double counting of split ordered extent 2021-09-07 14:30:41 +02:00
ordered-data.h btrfs: remove uptodate parameter from btrfs_dec_test_first_ordered_pending 2021-08-23 13:19:02 +02:00
orphan.c
print-tree.c
print-tree.h
props.c btrfs: props: change how empty value is interpreted 2021-06-22 14:11:58 +02:00
props.h
qgroup.c btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit() 2022-12-08 11:28:38 +01:00
qgroup.h btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
raid56.c btrfs: raid56: properly handle the error when unable to find the missing stripe 2022-11-26 09:24:31 +01:00
raid56.h btrfs: rename btrfs_bio to btrfs_io_context 2022-07-21 21:24:32 +02:00
rcu-string.h
reada.c btrfs: rename btrfs_bio to btrfs_io_context 2022-07-21 21:24:32 +02:00
ref-verify.c btrfs: stop doing GFP_KERNEL memory allocations in the ref verify tool 2021-08-23 13:19:00 +02:00
ref-verify.h
reflink.c btrfs: fix unexpected error path when reflinking an inline extent 2022-04-08 14:23:11 +02:00
reflink.h
relocation.c btrfs: fix lockdep splat with reloc root extent buffers 2022-09-05 10:30:12 +02:00
root-tree.c btrfs: fix silent failure when deleting root reference 2022-08-31 17:16:46 +02:00
scrub.c btrfs: scrub: try to fix super block errors 2022-10-26 12:35:44 +02:00
send.c btrfs: send: avoid unaligned encoded writes when attempting to clone range 2022-12-14 11:37:16 +01:00
send.h
space-info.c btrfs: extend locking to all space_info members accesses 2022-04-08 14:23:02 +02:00
space-info.h btrfs: rip out btrfs_space_info::total_bytes_pinned 2021-06-22 14:55:25 +02:00
struct-funcs.c btrfs: add special case to setget helpers for 64k pages 2021-08-23 13:18:58 +02:00
subpage.c btrfs: subpage: fix a potential use-after-free in writeback helper 2021-08-23 13:19:05 +02:00
subpage.h btrfs: subpage: fix writeback which does not have ordered extent 2021-08-23 13:19:04 +02:00
super.c btrfs: enhance unsupported compat RO flags handling 2022-10-29 10:12:53 +02:00
sysfs.c btrfs: sysfs: normalize the error handling branch in btrfs_init_sysfs() 2022-12-02 17:41:12 +01:00
sysfs.h
transaction.c btrfs: make send work with concurrent block group relocation 2022-03-16 14:23:46 +01:00
transaction.h btrfs: do not start relocation until in progress drops are done 2022-03-08 19:12:54 +01:00
tree-checker.c btrfs: tree-checker: check for overlapping extent items 2022-09-05 10:30:12 +02:00
tree-checker.h
tree-defrag.c
tree-log.c btrfs: fix warning during log replay when bumping inode link count 2022-09-05 10:30:09 +02:00
tree-log.h btrfs: pass the dentry to btrfs_log_new_name() instead of the inode 2022-08-31 17:16:36 +02:00
tree-mod-log.c btrfs: fix race when picking most recent mod log operation for an old root 2021-04-20 19:27:17 +02:00
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c
verity.c btrfs: fix transaction handle leak after verity rollback failure 2021-09-17 19:29:41 +02:00
volumes.c btrfs: zoned: initialize device's zone info for seeding 2022-11-16 09:58:27 +01:00
volumes.h btrfs: zoned: initialize device's zone info for seeding 2022-11-16 09:58:27 +01:00
xattr.c btrfs: check if root is readonly while setting security xattr 2022-08-31 17:16:46 +02:00
xattr.h
zlib.c Revert "btrfs: compression: drop kmap/kunmap from zlib" 2021-10-29 13:03:05 +02:00
zoned.c btrfs: use kvcalloc in btrfs_get_dev_zone_info 2022-12-02 17:41:12 +01:00
zoned.h btrfs: zoned: revive max_zone_append_bytes 2022-08-31 17:16:34 +02:00
zstd.c Revert "btrfs: compression: drop kmap/kunmap from zstd" 2021-10-29 13:02:50 +02:00