linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-11-01 00:48:50 +00:00

Filipe Manana 939b656bc8 btrfs: fix corruption after buffer fault in during direct IO append write

During an append (O_APPEND write flag) direct IO write if the input buffer was not previously faulted in, we can corrupt the file in a way that the final size is unexpected and it includes an unexpected hole.

The problem happens like this:

1) We have an empty file, with size 0, for example;

2) We do an O_APPEND direct IO with a length of 4096 bytes and the input buffer is not currently faulted in;

3) We enter btrfs_direct_write(), lock the inode and call generic_write_checks(), which calls generic_write_checks_count(), and that function sets the iocb position to 0 with the following code:

    if (iocb->ki_flags & IOCB_APPEND)
            iocb->ki_pos = i_size_read(inode);

4) We call btrfs_dio_write() and enter into iomap, which will end up calling btrfs_dio_iomap_begin() and that calls btrfs_get_blocks_direct_write(), where we update the i_size of the inode to 4096 bytes;

5) After btrfs_dio_iomap_begin() returns, iomap will attempt to access the page of the write input buffer (at iomap_dio_bio_iter(), with a call to bio_iov_iter_get_pages()) and fail with -EFAULT, which gets returned to btrfs at btrfs_direct_write() via btrfs_dio_write();

6) At btrfs_direct_write() we get the -EFAULT error, unlock the inode, fault in the write buffer and then goto to the label 'relock';

7) We lock again the inode, do all the necessary checks again and call again generic_write_checks(), which calls generic_write_checks_count() again, and there we set the iocb's position to 4K, which is the current i_size of the inode, with the following code pointed above:

    if (iocb->ki_flags & IOCB_APPEND)
            iocb->ki_pos = i_size_read(inode);

8) Then we go again to btrfs_dio_write() and enter iomap and the write succeeds, but it wrote to the file range [4K, 8K), leaving a hole in the [0, 4K) range and an i_size of 8K, which goes against the expectations of having the data written to the range [0, 4K) and get an i_size of 4K.

Fix this by not unlocking the inode before faulting in the input buffer, in case we get -EFAULT or an incomplete write, and not jumping to the 'relock' label after faulting in the buffer - instead jump to a location immediately before calling iomap, skipping all the write checks and relocking. This solves this problem and it's fine even in case the input buffer is memory mapped to the same file range, since only holding the range locked in the inode's io tree can cause a deadlock, it's safe to keep the inode lock (VFS lock), as was fixed and described in commit `51bd9563b6` ("btrfs: fix deadlock due to page faults during direct IO reads and writes").

A sample reproducer provided by a reporter is the following:

    $ cat test.c
    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        if (argc < 2) {
            fprintf(stderr, "Usage: %s <test file>\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC |
                      O_DIRECT | O_APPEND, 0644);
        if (fd < 0) {
            perror("creating test file");
            return 1;
        }

        char *buf = mmap(NULL, 4096, PROT_READ,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        ssize_t ret = write(fd, buf, 4096);
        if (ret < 0) {
            perror("pwritev2");
            return 1;
        }

        struct stat stbuf;
        ret = fstat(fd, &stbuf);
        if (ret < 0) {
            perror("stat");
            return 1;
        }

        printf("size: %llu\n", (unsigned long long)stbuf.st_size);
        return stbuf.st_size == 4096 ? 0 : 1;
    }

A test case for fstests will be sent soon.

Reported-by: Hanna Czenczek <hreitz@redhat.com>
Link: https://lore.kernel.org/linux-btrfs/0b841d46-12fe-4e64-9abb-871d8d0de271@redhat.com/
Fixes: `8184620ae2` ("btrfs: fix lost file sync on direct IO write with nowait and dsync iocb")
CC: stable@vger.kernel.org # 6.1+
Tested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2024-07-29 19:21:22 +02:00
..
9p	Two fixes headed to stable trees:	2024-05-29 09:25:15 -07:00
adfs	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
affs	affs: remove SLAB_MEM_SPREAD flag usage	2024-02-26 11:36:28 +01:00
afs	afs: Convert comma to semicolon	2024-07-02 21:23:00 +02:00
autofs
bcachefs	bcachefs: Fix kmalloc bug in __snapshot_t_mut	2024-06-25 20:51:14 -04:00
befs	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
bfs	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
btrfs	btrfs: fix corruption after buffer fault in during direct IO append write	2024-07-29 19:21:22 +02:00
cachefiles	cachefiles: remove unneeded include of <linux/fdtable.h>	2024-06-03 15:39:17 +02:00
ceph	We have a series from Xiubo that adds support for additional access	2024-05-25 14:23:58 -07:00
coda	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
configfs
cramfs	use ->bd_mapping instead of ->bd_inode->i_mapping	2024-05-03 02:36:51 -04:00
crypto	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
debugfs	debugfs: continue to ignore unknown mount options	2024-05-28 14:32:42 +02:00
devpts
dlm	dlm: return -ENOMEM if ls_recover_buf fails	2024-04-23 16:08:55 -05:00
ecryptfs	hardening updates for 6.10-rc1	2024-05-13 14:14:05 -07:00
efivarfs	efi: Clear up misconceptions about a maximum variable name size	2024-04-13 10:33:02 +02:00
efs	efs: remove SLAB_MEM_SPREAD flag usage	2024-02-27 11:21:33 +01:00
erofs	erofs: ensure m_llen is reset to 0 if metadata is invalid	2024-06-30 10:54:28 +08:00
exfat	exfat: zero the reserved fields of file and stream extension dentries	2024-04-25 21:59:59 +09:00
exportfs
ext2	ext2: Remove LEGACY_DIRECT_IO dependency	2024-05-03 11:50:28 +02:00
ext4	bd_inode series	2024-05-21 09:51:42 -07:00
f2fs	f2fs update for 6.10-rc1	2024-05-20 13:23:43 -07:00
fat	fs: add kernel-doc comments to fat_parse_long()	2024-04-25 21:07:02 -07:00
freevxfs	freevxfs: Convert freevxfs to the new mount API.	2024-03-26 09:04:53 +01:00
fuse	virtio: features, fixes, cleanups	2024-05-23 12:04:36 -07:00
gfs2	bd_inode series	2024-05-21 09:51:42 -07:00
hfs
hfsplus	hfsplus: refactor copy_name to not use strncpy	2024-04-24 16:55:28 -07:00
hostfs
hpfs	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
hugetlbfs	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
iomap	iomap: Fix iomap_adjust_read_range for plen calculation	2024-06-05 17:27:03 +02:00
isofs	isofs: Use -y instead of -objs in Makefile	2024-05-09 18:09:57 +02:00
jbd2	bd_inode series	2024-05-21 09:51:42 -07:00
jffs2	This pull request contains the following changes for JFFS2:	2024-05-25 13:23:42 -07:00
jfs	jfs: xattr: fix buffer overflow for invalid xattr	2024-06-04 18:09:03 +02:00
kernfs	kernfs: mount: Remove unnecessary ‘NULL’ values from knparent	2024-05-04 19:02:39 +02:00
lockd	lockd: host: Remove unnecessary statements＇host = NULL;＇	2024-05-06 09:07:20 -04:00
minix	minix: convert minix to use the new mount api	2024-03-26 09:04:55 +01:00
netfs	netfs: Fix netfs_page_mkwrite() to flush conflicting data, not wait	2024-06-26 14:19:08 +02:00
nfs	nfs: drop the incorrect assertion in nfs_swap_rw()	2024-06-24 20:52:11 -07:00
nfs_common
nfsd	nfsd-6.10 fixes:	2024-06-28 09:32:33 -07:00
nilfs2	nilfs2: fix incorrect inode allocation from reserved inodes	2024-07-03 12:29:25 -07:00
nls
notify	Revert "fanotify: remove unneeded sub-zero check for unsigned value"	2024-05-20 12:43:58 -07:00
ntfs3	driver ntfs3 for linux 6.10	2024-05-25 14:19:01 -07:00
ocfs2	ocfs2: fix DIO failure due to insufficient transaction credits	2024-06-24 20:52:10 -07:00
omfs
openpromfs	openpromfs: finish conversion to the new mount API	2024-03-26 09:04:54 +01:00
orangefs	orangefs: fix out-of-bounds fsid access	2024-05-14 17:44:14 -07:00
overlayfs	ovl: fix encoding fid for lower only root	2024-06-14 10:30:40 +02:00
proc	/proc/pid/smaps: add mseal info for vma	2024-06-24 20:52:09 -07:00
pstore	pstore/zone: Don't clear memory twice	2024-03-09 12:33:22 -08:00
qnx4	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
qnx6	qnx6: convert qnx6 to use the new mount api	2024-03-26 09:04:53 +01:00
quota	quota: fix to propagate error of mark_dquot_dirty() to caller	2024-04-12 14:52:29 +02:00
ramfs	mm: switch mm->get_unmapped_area() to a flag	2024-04-25 20:56:25 -07:00
reiserfs	getting rid of bogus set_blocksize() uses, switching it	2024-05-21 08:34:51 -07:00
romfs	fs,block: yield devices early	2024-03-27 13:17:15 +01:00
smb	cifs: Fix read-performance regression by dropping readahead expansion	2024-07-02 21:23:41 -05:00
squashfs	Mainly singleton patches, documented in their respective changelogs.	2024-05-19 14:02:03 -07:00
sysfs	Merge 6.9-rc5 into driver-core-next	2024-04-23 13:27:43 +02:00
sysv	sysv: remove SLAB_MEM_SPREAD flag usage	2024-02-27 11:21:31 +01:00
tracefs	eventfs: Do not use attributes for events directory	2024-05-23 09:31:50 -04:00
ubifs	This pull request contains updates for UBI and UBIFS:	2024-03-21 15:09:29 -07:00
udf	udf: Use a folio in udf_write_end()	2024-04-23 15:37:02 +02:00
ufs	mm, slab: remove last vestiges of SLAB_MEM_SPREAD	2024-03-12 20:32:19 -07:00
unicode	kbuild: use $(src) instead of $(srctree)/$(src) for source directory	2024-05-10 04:34:52 +09:00
vboxsf	vboxsf: explicitly deny setlease attempts	2024-04-03 16:06:39 +02:00
verity	fsverity: use register_sysctl_init() to avoid kmemleak warning	2024-05-03 08:30:58 -07:00
xfs	xfs: honor init_xattrs in xfs_init_new_inode for !ATTR fs	2024-06-26 14:29:25 +05:30
zonefs	zonefs: Use str_plural() to fix Coccinelle warning	2024-04-10 07:23:47 +09:00
aio.c	Assorted commits that had missed the last merge window...	2024-05-21 13:11:44 -07:00
anon_inodes.c	fs: Create anon_inode_getfile_fmode()	2024-04-26 10:33:05 +02:00
attr.c	lsm/stable-6.9 PR 20240312	2024-03-12 20:03:34 -07:00
backing-file.c	ovl: implement tmpfile	2024-05-02 20:35:57 +02:00
bad_inode.c
binfmt_elf.c	Mainly singleton patches, documented in their respective changelogs.	2024-05-19 14:02:03 -07:00
binfmt_elf_fdpic.c	binfmt_elf_fdpic: fix /proc/<pid>/auxv	2024-04-24 15:55:28 -07:00
binfmt_elf_test.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
buffer.c	bd_inode series	2024-05-21 09:51:42 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c	virtio: features, fixes, cleanups	2024-05-23 12:04:36 -07:00
d_path.c
dax.c	dax: use huge_zero_folio	2024-04-25 20:56:20 -07:00
dcache.c	fs: better handle deep ancestor chains in is_subdir()	2024-07-02 21:18:32 +02:00
direct-io.c	fs/direct-io: remove redundant assignment to variable retval	2024-04-11 10:21:24 +02:00
drop_caches.c
eventfd.c	eventfd: strictly check the count parameter of eventfd_write to avoid inputting illegal strings	2024-02-08 10:12:26 +01:00
eventpoll.c	epoll: be better about file lifetimes	2024-05-05 14:00:48 -07:00
exec.c	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
fcntl.c	fcntl: add F_DUPFD_QUERY fcntl()	2024-05-10 08:26:31 +02:00
fhandle.c	fs: Annotate struct file_handle with __counted_by() and use struct_size()	2024-04-05 15:53:47 +02:00
file.c	fs/file: fix the check in find_next_fd()	2024-05-30 09:11:47 +02:00
file_table.c	lsm/stable-6.9 PR 20240312	2024-03-12 20:03:34 -07:00
filesystems.c
fs-writeback.c	fs/writeback: remove unnecessary return in writeback_inodes_sb	2024-04-05 15:53:45 +02:00
fs_context.c
fs_parser.c	__fs_parse: Correct a documentation comment	2024-02-02 13:11:50 +01:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
init.c
inode.c	bcachefs updates for 6.9	2024-03-15 09:00:09 -07:00
internal.h	ovl: implement tmpfile	2024-05-02 20:35:57 +02:00
ioctl.c	fs/ioctl: Add a comment to keep the logic in sync with LSM policies	2024-05-13 06:58:35 +02:00
Kconfig	- Sumanth Korikkar has taught s390 to allocate hotplug-time page frames	2024-03-14 17:43:30 -07:00
Kconfig.binfmt
kernel_read_file.c
libfs.c	shmem: Fix shmem_rename2()	2024-04-17 13:49:44 +02:00
locks.c	filelock: Remove locks reliably when fcntl/close race is detected	2024-07-02 20:48:14 +02:00
Makefile	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
mbcache.c	vfs: remove SLAB_MEM_SPREAD flag usage	2024-02-27 11:21:31 +01:00
mnt_idmapping.c	fs/mnt_idmapping.c: Return -EINVAL when no map is written	2024-02-08 10:12:37 +01:00
mount.h
mpage.c	block, fs: Restore the per-bio/request data lifetime fields	2024-02-06 14:31:05 +01:00
namei.c	vfs: generate FS_CREATE before FS_OPEN when ->atomic_open used.	2024-06-18 16:26:09 +02:00
namespace.c	fs: relax mount_setattr() permission checks	2024-02-07 21:16:29 +01:00
nsfs.c	pidfs: remove config option	2024-03-13 12:53:53 -07:00
open.c	vfs-6.10-rc7.fixes	2024-07-01 09:22:08 -07:00
pidfs.c	fs/pidfs: make 'lsof' happy with our inode changes	2024-05-21 08:08:00 -07:00
pipe.c	fs/pipe: Convert to lockdep_cmp_fn	2024-02-02 13:11:49 +01:00
pnode.c
pnode.h
posix_acl.c	lsm/stable-6.9 PR 20240312	2024-03-12 20:03:34 -07:00
proc_namespace.c
read_write.c	Assorted commits that had missed the last merge window...	2024-05-21 13:11:44 -07:00
readdir.c
remap_range.c	vfs: export remap and write check helpers	2024-04-15 14:54:13 -07:00
select.c	fs/select: rework stack allocation hack for clang	2024-02-20 09:23:52 +01:00
seq_file.c	seq_file: Simplify __seq_puts()	2024-05-02 16:28:20 +02:00
signalfd.c	signalfd: drop an obsolete comment	2024-05-24 13:34:07 +02:00
splice.c	remove call_{read,write}_iter() functions	2024-04-15 16:03:25 -04:00
stack.c
stat.c	statx: stx_subvol	2024-03-26 09:01:18 +01:00
statfs.c
super.c	fs: don't misleadingly warn during thaw operations	2024-06-18 16:20:47 +02:00
sync.c
sysctls.c
timerfd.c	timerfd: convert to ->read_iter()	2024-04-10 16:23:02 -06:00
userfaultfd.c	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
utimes.c
xattr.c	evm: Move to LSM infrastructure	2024-02-15 23:43:47 -05:00