linux-stable/fs
Qu Wenruo 56596a9fdd btrfs: zstd: fix and simplify the inline extent decompression (v2)
Note: this is a fixed version that was previously reverted as
e01a83e126 ("Revert "btrfs: zstd: fix and simplify the inline extent
decompression""), with fixed parameters to memzero_page().

[BUG]
If we have a filesystem with 4k sectorsize, and an inlined compressed
extent created like this:

	item 4 key (257 INODE_ITEM 0) itemoff 15863 itemsize 160
		generation 8 transid 8 size 4096 nbytes 4096
		block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 5 key (257 INODE_REF 256) itemoff 15839 itemsize 24
		index 2 namelen 14 name: source_inlined
	item 6 key (257 EXTENT_DATA 0) itemoff 15770 itemsize 69
		generation 8 type 0 (inline)
		inline extent data size 48 ram_bytes 4096 compression 3 (zstd)

Then trying to reflink that extent in an aarch64 system with 64K page
size, the reflink would just fail:

  # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest
  XFS_IOC_CLONE_RANGE: Input/output error

[CAUSE]
In zstd_decompress(), we didn't treat @start_byte as just a page offset,
but also use it as an indicator on whether we should error out, without
any proper explanation (this is copied from other decompression code).

In reality, for subpage cases, although @start_byte can be non-zero,
we should never switch input/output buffer nor error out, since the whole
input/output buffer should never exceed one sector, thus we should not
need to do any buffer switch.

Thus the current code using @start_byte as a condition to switch
input/output buffer or finish the decompression is completely incorrect.

[FIX]
The fix involves several modification:

- Rename @start_byte to @dest_pgoff to properly express its meaning

- Use @sectorsize other than PAGE_SIZE to properly initialize the
  output buffer size

- Use correct destination offset inside the destination page

- Simplify the main loop
  Since the input/output buffer should never switch, we only need one
  zstd_decompress_stream() call.

- Consider early end as an error

After the fix, even on 64K page sized aarch64, above reflink now
works as expected:

  # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest
  linked 4096/4096 bytes at offset 61440

And results the correct file layout:

	item 9 key (258 INODE_ITEM 0) itemoff 15542 itemsize 160
		generation 10 transid 10 size 65536 nbytes 4096
		block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 10 key (258 INODE_REF 256) itemoff 15528 itemsize 14
		index 3 namelen 4 name: dest
	item 11 key (258 XATTR_ITEM 3817753667) itemoff 15445 itemsize 83
		location key (0 UNKNOWN.0 0) type XATTR
		transid 10 data_len 37 name_len 16
		name: security.selinux
		data unconfined_u:object_r:unlabeled_t:s0
	item 12 key (258 EXTENT_DATA 61440) itemoff 15392 itemsize 53
		generation 10 type 1 (regular)
		extent data disk byte 13631488 nr 4096
		extent data offset 0 nr 4096 ram 4096
		extent compression 0 (none)

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04 16:24:46 +01:00
..
9p 9p: Use length of data written to the server in preference to error 2024-01-04 13:15:31 +00:00
adfs adfs: remove writepage implementation 2023-12-29 11:58:33 -08:00
affs affs: free affs_sb_info with kfree_rcu() 2024-02-25 02:10:31 -05:00
afs afs: Fix endless loop in directory parsing 2024-02-27 11:20:43 +01:00
autofs dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
bcachefs bcachefs: fix bch2_save_backtrace() 2024-02-25 15:45:36 -05:00
befs befs: d_obtain_alias(ERR_PTR(...)) will do the right thing 2023-12-21 12:51:02 -05:00
bfs misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
btrfs btrfs: zstd: fix and simplify the inline extent decompression (v2) 2024-03-04 16:24:46 +01:00
cachefiles cachefiles: fix memory leak in cachefiles_add_cache() 2024-02-20 09:46:07 +01:00
ceph ceph: switch to corrected encoding of max_xattr_size in mdsmap 2024-02-26 19:20:30 +01:00
coda dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
configfs
cramfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
crypto fscrypt: document that CephFS supports fscrypt now 2023-12-26 22:55:42 -06:00
debugfs Merge branches 'acpi-pm', 'acpi-video', 'acpi-apei' and 'acpi-extlog' 2024-01-04 13:19:40 +01:00
devpts fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
dlm dlm: update format header reflect current format 2023-12-20 15:36:48 -06:00
ecryptfs fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
efivarfs efivarfs: Drop 'duplicates' bool parameter on efivar_init() 2024-02-25 09:43:39 +01:00
efs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
erofs Change since last update: 2024-02-25 09:53:13 -08:00
exfat Description for this pull request: 2024-03-01 12:22:30 -08:00
exportfs fs: fix build error with CONFIG_EXPORTFS=m or not defined 2023-10-28 16:16:19 +02:00
ext2 fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
ext4 We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
f2fs f2fs: fix double free of f2fs_sb_info 2024-01-12 18:55:09 -08:00
fat vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
freevxfs freevxfs: lookup: fix function params kernel-doc 2023-12-20 15:02:58 -08:00
fuse fuse: fix UAF in rcu pathwalks 2024-02-25 02:10:32 -05:00
gfs2 Revert "gfs2: Use GL_NOBLOCK flag for non-blocking lookups" 2024-02-02 17:21:44 +01:00
hfs hfs: really remove hfs_writepage 2023-12-29 11:58:34 -08:00
hfsplus hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info 2024-02-25 02:10:31 -05:00
hostfs hostfs: use d_splice_alias() calling conventions to simplify failure exits 2023-12-21 12:51:00 -05:00
hpfs
hugetlbfs fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super 2024-02-07 21:20:36 -08:00
iomap mm: add folio_fill_tail() and use it in iomap 2023-12-10 16:51:36 -08:00
isofs
jbd2 jbd2: abort journal when detecting metadata writeback error of fs dev 2024-01-04 23:42:21 -05:00
jffs2 jffs2: mark __jffs2_dbg_superblock_counts() static 2023-12-10 17:21:43 -08:00
jfs Revert "jfs: fix shift-out-of-bounds in dbJoin" 2024-01-29 08:45:10 -06:00
kernfs Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock" 2024-01-11 11:51:27 +01:00
lockd sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
minix minixfs kmap_local_page() switchover and related fixes - very similar to sysv series. 2024-01-11 19:54:18 -08:00
netfs netfs: Fix missing zero-length check in unbuffered write 2024-01-29 14:53:21 +01:00
nfs nfs: fix UAF on pathwalk running into umount 2024-02-25 02:10:32 -05:00
nfs_common
nfsd nfsd-6.8 fixes: 2024-02-07 17:48:15 +00:00
nilfs2 nilfs2: fix potential bug in end_buffer_async_write 2024-02-07 21:20:37 -08:00
nls
notify dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
ntfs sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
ntfs3 fs/ntfs3: fix build without CONFIG_NTFS3_LZX_XPRESS 2024-02-26 09:32:23 -08:00
ocfs2 misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
omfs
openpromfs
orangefs orangefs: saner arguments passing in readdir guts 2023-12-21 12:53:36 -05:00
overlayfs vfs-6.8-rc5.fixes 2024-02-12 07:15:45 -08:00
proc We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
pstore pstore: inode: Use cleanup.h for struct pstore_private 2023-12-08 14:15:44 -08:00
qnx4 qnx4: Use get_directory_fname() in qnx4_match() 2023-12-13 11:19:18 -08:00
qnx6
quota sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
ramfs mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
reiserfs misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
romfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
smb We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
squashfs Squashfs: fix variable overflow triggered by sysbot 2023-12-10 17:21:26 -08:00
sysfs fs/sysfs/dir.c : Fix typo in comment 2023-12-07 11:35:23 +09:00
sysv sysv: remove writepage implementation 2023-12-29 11:58:35 -08:00
tracefs eventfs: Keep all directory links at 1 2024-02-01 11:53:53 -05:00
ubifs ubifs: fix kernel-doc warnings 2024-01-06 23:49:50 +01:00
udf misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
ufs Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
unicode
vboxsf fs: vboxsf: fix a kernel-doc warning 2023-12-08 15:32:31 -07:00
verity Networking changes for 6.8. 2024-01-11 10:07:29 -08:00
xfs xfs: drop experimental warning for FSDAX 2024-02-27 09:53:30 +05:30
zonefs zonefs: Improve error handling 2024-02-16 10:20:35 +09:00
aio.c fs/aio: Make io_cancel() generate completions again 2024-02-27 11:20:44 +01:00
anon_inodes.c Merge branch 'kvm-guestmemfd' into HEAD 2023-11-14 08:31:31 -05:00
attr.c fs: fix doc comment typo fs tree wide 2023-12-21 13:17:54 +01:00
backing-file.c fs: factor out backing_file_mmap() helper 2023-12-23 16:35:09 +02:00
bad_inode.c
binfmt_elf.c
binfmt_elf_fdpic.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_elf_test.c
binfmt_flat.c
binfmt_misc.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_script.c
buffer.c Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
char_dev.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
compat_binfmt_elf.c
coredump.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
d_path.c
dax.c fs : Fix warning using plain integer as NULL 2023-11-18 15:00:01 +01:00
dcache.c Revert "get rid of DCACHE_GENOCIDE" 2024-02-09 23:31:16 -05:00
direct-io.c fs : Fix warning using plain integer as NULL 2023-11-18 15:00:01 +01:00
drop_caches.c
eventfd.c eventfd: Remove usage of the deprecated ida_simple_xx() API 2023-12-12 14:24:55 +01:00
eventpoll.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
exec.c execve fixes for v6.8-rc2 2024-01-24 13:32:29 -08:00
fcntl.c
fhandle.c exportfs: add helpers to check if filesystem can encode/decode file handles 2023-10-24 17:57:45 +02:00
file.c file: remove __receive_fd() 2023-12-12 14:24:14 +01:00
file_table.c dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
filesystems.c
fs-writeback.c netfs: Move pinning-for-writeback from fscache to netfs 2023-12-24 15:08:49 +00:00
fs_context.c
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
init.c
inode.c fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
internal.h dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
ioctl.c lsm: new security_file_ioctl_compat() hook 2023-12-24 15:48:03 -05:00
Kconfig vfs-6.8.netfs 2024-01-19 09:10:23 -08:00
Kconfig.binfmt
kernel_read_file.c
libfs.c dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
locks.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
Makefile vfs-6.8.netfs 2024-01-19 09:10:23 -08:00
mbcache.c
mnt_idmapping.c mnt_idmapping: decouple from namespaces 2023-11-28 14:08:47 +01:00
mount.h mounts: keep list of mounts in an rbtree 2023-11-18 14:56:16 +01:00
mpage.c fs: convert block_write_full_page to block_write_full_folio 2023-12-29 11:58:35 -08:00
namei.c rcu pathwalk: prevent bogus hard errors from may_lookup() 2024-02-25 02:10:31 -05:00
namespace.c fs: relax mount_setattr() permission checks 2024-02-07 21:16:29 +01:00
nsfs.c nsfs: use d_make_root() 2023-11-25 02:49:43 -05:00
open.c vfs-6.8.rw 2024-01-08 11:11:51 -08:00
pipe.c sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
pnode.c mounts: keep list of mounts in an rbtree 2023-11-18 14:56:16 +01:00
pnode.h
posix_acl.c fs: fix doc comment typo fs tree wide 2023-12-21 13:17:54 +01:00
proc_namespace.c namespace: extract show_path() helper 2023-11-18 14:56:16 +01:00
read_write.c fsnotify: optionally pass access range in file permission hooks 2023-12-12 16:20:02 +01:00
readdir.c fsnotify: optionally pass access range in file permission hooks 2023-12-12 16:20:02 +01:00
remap_range.c remap_range: merge do_clone_file_range() into vfs_clone_file_range() 2024-02-06 17:07:21 +01:00
select.c
seq_file.c
signalfd.c
splice.c fs: use splice_copy_file_range() inline helper 2023-12-12 16:20:02 +01:00
stack.c
stat.c vfs-6.8.mount 2024-01-08 10:57:34 -08:00
statfs.c
super.c fs/super.c: don't drop ->s_user_ns until we free struct super_block itself 2024-02-25 02:10:31 -05:00
sync.c
sysctls.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
timerfd.c
userfaultfd.c Generic: 2024-01-17 13:03:37 -08:00
utimes.c
xattr.c