linux-stable/fs
Filipe Manana 3b2c064a8e btrfs: send: avoid unaligned encoded writes when attempting to clone range
[ Upstream commit a11452a370 ]

When trying to see if we can clone a file range, there are cases where we
end up sending two write operations in case the inode from the source root
has an i_size that is not sector size aligned and the length from the
current offset to its i_size is less than the remaining length we are
trying to clone.

Issuing two write operations when we could instead issue a single write
operation is not incorrect. However it is not optimal, specially if the
extents are compressed and the flag BTRFS_SEND_FLAG_COMPRESSED was passed
to the send ioctl. In that case we can end up sending an encoded write
with an offset that is not sector size aligned, which makes the receiver
fallback to decompressing the data and writing it using regular buffered
IO (so re-compressing the data in case the fs is mounted with compression
enabled), because encoded writes fail with -EINVAL when an offset is not
sector size aligned.

The following example, which triggered a bug in the receiver code for the
fallback logic of decompressing + regular buffer IO and is fixed by the
patchset referred in a Link at the bottom of this changelog, is an example
where we have the non-optimal behaviour due to an unaligned encoded write:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV > /dev/null
   mount -o compress $DEV $MNT

   # File foo has a size of 33K, not aligned to the sector size.
   xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo

   xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar

   # Now clone the first 32K of file bar into foo at offset 0.
   xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo

   # Snapshot the default subvolume and create a full send stream (v2).
   btrfs subvolume snapshot -r $MNT $MNT/snap

   btrfs send --compressed-data -f /tmp/test.send $MNT/snap

   echo -e "\nFile bar in the original filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT
   mkfs.btrfs -f $DEV > /dev/null
   mount $DEV $MNT

   echo -e "\nReceiving stream in a new filesystem..."
   btrfs receive -f /tmp/test.send $MNT

   echo -e "\nFile bar in the new filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT

Before this patch, the send stream included one regular write and one
encoded write for file 'bar', with the later being not sector size aligned
and causing the receiver to fallback to decompression + buffered writes.
The output of the btrfs receive command in verbose mode (-vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   write bar - offset=32768 length=1024
   encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0
   encoded_write bar - falling back to decompress and write due to errno 22 ("Invalid argument")
   (...)

This patch avoids the regular write followed by an unaligned encoded write
so that we end up sending a single encoded write that is aligned. So after
this patch the stream content is (output of btrfs receive -vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   encoded_write bar - offset=32768, len=4096, unencoded_offset=32768, unencoded_file_len=32768, unencoded_len=65536, compression=1, encryption=0
   (...)

So we get more optimal behaviour and avoid the silent data loss bug in
versions of btrfs-progs affected by the bug referred by the Link tag
below (btrfs-progs v5.19, v5.19.1, v6.0 and v6.0.1).

Link: https://lore.kernel.org/linux-btrfs/cover.1668529099.git.fdmanana@suse.com/
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-14 11:30:41 +01:00
..
9p 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" 2022-06-22 14:11:02 +02:00
adfs
affs fs/affs: release old buffer head on error path 2021-03-04 10:26:48 +01:00
afs afs: Fix fileserver probe RTT handling 2022-12-08 11:23:03 +01:00
autofs
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:48:39 +01:00
btrfs btrfs: send: avoid unaligned encoded writes when attempting to clone range 2022-12-14 11:30:41 +01:00
cachefiles cachefiles: Handle readpage error correctly 2020-11-05 11:43:36 +01:00
ceph ceph: avoid putting the realm twice when decoding snaps fails 2022-12-08 11:23:00 +01:00
cifs cifs: add check for returning value of SMB2_set_info_init 2022-11-25 17:42:16 +01:00
coda
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:41:10 +01:00
cramfs
crypto fscrypt: add fscrypt_symlink_getattr() for computing st_size 2021-09-12 08:56:38 +02:00
debugfs debugfs: add debugfs_lookup_and_remove() 2022-09-15 12:04:54 +02:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:24:34 +01:00
dlm fs: dlm: handle -EBUSY first in lock arg validation 2022-10-26 13:22:14 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:05:19 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-12-02 08:49:53 +01:00
efs
erofs erofs: avoid consecutive detection for Highmem memory 2022-08-25 11:17:36 +02:00
exportfs
ext2 ext2: Add more validity checks for inode counts 2022-08-25 11:17:28 +02:00
ext4 ext4: fix BUG_ON() when directory entry has invalid rec_len 2022-11-10 17:57:56 +01:00
f2fs f2fs: fix race condition on setting FI_NO_EXTENT flag 2022-10-26 13:22:46 +02:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-14 18:11:30 +02:00
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-22 12:26:25 +02:00
fuse fuse: lock inode unconditionally in fuse_fallocate() 2022-12-08 11:23:01 +01:00
gfs2 gfs2: Switch from strlcpy to strscpy 2022-11-25 17:42:22 +01:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:19:38 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:08:29 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:24:14 +02:00
hpfs
hugetlbfs mm, hugetlb: allow for "high" userspace addresses 2022-05-09 09:03:28 +02:00
iomap iomap: iomap_write_failed fix 2022-06-14 18:11:36 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:43:03 +01:00
jbd2 jbd2: wake up journal waiters in FIFO order, not LIFO 2022-10-26 13:22:17 +02:00
jffs2 jffs2: fix memory leak in jffs2_do_fill_super 2022-06-14 18:11:55 +02:00
jfs fs: jfs: fix possible NULL pointer dereference in dbFree() 2022-06-14 18:11:29 +02:00
kernfs kernfs: fix use-after-free in __kernfs_remove 2022-11-03 23:56:54 +09:00
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-22 12:26:34 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-15 14:18:35 +02:00
nfs NFSv4: Retry LOCK on OLD_STATEID during delegation return 2022-11-25 17:42:12 +01:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:51:22 +01:00
nfsd NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data 2022-10-26 13:22:47 +02:00
nilfs2 nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() 2022-12-08 11:23:04 +01:00
nls
notify fsnotify: fix wrong lockdep annotations 2022-06-14 18:11:34 +02:00
ntfs ntfs: check overflow when iterating ATTR_RECORDs 2022-11-25 17:42:22 +01:00
ocfs2 ocfs2: fix BUG when iput after ocfs2_mknod fails 2022-10-29 10:20:34 +02:00
omfs
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-20 09:19:17 +01:00
overlayfs ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh() 2022-08-25 11:17:23 +02:00
proc mm: /proc/pid/smaps_rollup: fix no vma's null-deref 2022-10-29 10:20:36 +02:00
pstore pstore: Fix typo in compression option name 2021-03-04 10:26:45 +01:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:09:26 +02:00
qnx6
quota quota: Check next/prev free block number after reading from quota file 2022-10-26 13:22:14 +02:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-29 09:57:53 +01:00
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:21:05 +02:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-26 10:40:51 +02:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:08:29 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2021-03-07 12:20:48 +01:00
sysv
tracefs tracefs: Only clobber mode/uid/gid on remount if asked 2022-09-20 12:28:00 +02:00
ubifs ubifs: Rectify space amount budget for mkdir/tmpfile operations 2022-04-15 14:18:31 +02:00
udf udf: Fix a slab-out-of-bounds write bug in udf_find_entry() 2022-11-25 17:42:09 +01:00
ufs fs/ufs: avoid potential u32 multiplication overflow 2020-08-21 13:05:37 +02:00
unicode
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-10-06 15:42:30 +02:00
xfs xfs: drain the buf delwri queue before xfsaild idles 2022-11-25 17:42:03 +01:00
aio.c aio: fix use-after-free due to missing POLLFREE handling 2021-12-14 14:49:02 +01:00
anon_inodes.c
attr.c vfs: Check the truncate maximum size in inode_newsize_ok() 2022-08-25 11:17:21 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings 2021-10-06 15:42:35 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-14 18:11:23 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:03:57 +01:00
binfmt_script.c
block_dev.c block: reexpand iov_iter after read/write 2021-05-22 11:38:29 +02:00
buffer.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-25 17:42:22 +01:00
char_dev.c
compat.c
compat_binfmt_elf.c
compat_ioctl.c compat_ioctl: remove /dev/random commands 2022-06-22 14:11:03 +02:00
coredump.c coredump: fix core_pattern parse error 2020-12-11 13:23:30 +01:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-29 09:57:45 +01:00
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-14 18:11:41 +02:00
dcache.c fix dget_parent() fastpath race 2020-10-01 13:17:19 +02:00
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:24:11 +02:00
drop_caches.c
eventfd.c
eventpoll.c epoll: check for events when removing a timed out thread from the wait queue 2022-12-08 11:23:05 +01:00
exec.c exec: Force single empty string when argv is empty 2022-06-06 08:33:50 +02:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-15 09:47:28 +02:00
fhandle.c
file.c fget: clarify and improve __fget_files() implementation 2022-03-02 11:41:18 +01:00
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-05-25 09:14:34 +02:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-17 10:50:21 +02:00
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-06-14 18:11:44 +02:00
fs_context.c memcg: charge fs_context and legacy_fs_context 2022-02-08 18:24:29 +01:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
inode.c fs: fix UAF/GPF bug in nilfs_mdt_destroy 2022-10-15 07:54:36 +02:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:19:37 +02:00
io_uring.c io_uring/af_unix: defer registered files gc to io_uring release 2022-10-26 13:22:59 +02:00
ioctl.c
Kconfig
Kconfig.binfmt
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-24 13:29:19 +01:00
locks.c
Makefile
mbcache.c
mount.h
mpage.c
namei.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-25 17:42:22 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:36:22 -04:00
no-block.c
nsfs.c
open.c
pipe.c pipe: increase minimum default pipe size to 2 pages 2021-08-12 13:21:02 +02:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-05-02 08:48:44 +02:00
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:03:33 +01:00
posix_acl.c
proc_namespace.c
read_write.c
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 12:56:16 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:25:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:10:54 +02:00
signalfd.c io_uring: disable polling pollfree files 2022-09-05 10:27:47 +02:00
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-17 17:24:32 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:50:48 +02:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-02-23 11:59:55 +01:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: open userfaultfds with O_RDONLY 2022-10-26 13:22:21 +02:00
utimes.c
xattr.c xattr: break delegations in {set,remove}xattr 2020-08-11 15:33:39 +02:00