linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-23 17:30:02 +00:00

Author	SHA1	Message	Date
Haimin Zhang	33bd243566	jfs: prevent NULL deref in diFree [ Upstream commit `a530462910` ] Add validation check for JFS_IP(ipimap)->i_imap to prevent a NULL deref in diFree since diFree uses it without do any validations. When function jfs_mount calls diMount to initialize fileset inode allocation map, it can fail and JFS_IP(ipimap)->i_imap won't be initialized. Then it calls diFreeSpecial to close fileset inode allocation map inode and it will flow into jfs_evict_inode. Function jfs_evict_inode just validates JFS_SBI(inode->i_sb)->ipimap, then calls diFree. diFree use JFS_IP(ipimap)->i_imap directly, then it will cause a NULL deref. Reported-by: TCS Robot <tcs_robot@tencent.com> Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:28 +02:00
NeilBrown	7e907c022e	NFS: swap-out must always use STABLE writes. [ Upstream commit `c265de257f` ] The commit handling code is not safe against memory-pressure deadlocks when writing to swap. In particular, nfs_commitdata_alloc() blocks indefinitely waiting for memory, and this can consume all available workqueue threads. swap-out most likely uses STABLE writes anyway as COND_STABLE indicates that a stable write should be used if the write fits in a single request, and it normally does. However if we ever swap with a small wsize, or gather unusually large numbers of pages for a single write, this might change. For safety, make it explicit in the code that direct writes used for swap must always use FLUSH_STABLE. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:28 +02:00
NeilBrown	4e70b56438	NFS: swap IO handling is slightly different for O_DIRECT IO [ Upstream commit `64158668ac` ] 1/ Taking the i_rwsem for swap IO triggers lockdep warnings regarding possible deadlocks with "fs_reclaim". These deadlocks could, I believe, eventuate if a buffered read on the swapfile was attempted. We don't need coherence with the page cache for a swap file, and buffered writes are forbidden anyway. There is no other need for i_rwsem during direct IO. So never take it for swap_rw() 2/ generic_write_checks() explicitly forbids writes to swap, and performs checks that are not needed for swap. So bypass it for swap_rw(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:28 +02:00
Zhihao Cheng	f7e252d7e4	ubifs: Rectify space amount budget for mkdir/tmpfile operations [ Upstream commit `a6dab6607d` ] UBIFS should make sure the flash has enough space to store dirty (Data that is newer than disk) data (in memory), space budget is exactly designed to do that. If space budget calculates less data than we need, 'make_reservation()' will do more work(return -ENOSPC if no free space lelf, sometimes we can see "cannot reserve xxx bytes in jhead xxx, error -28" in ubifs error messages) with ubifs inodes locked, which may effect other syscalls. A simple way to decide how much space do we need when make a budget: See how much space is needed by 'make_reservation()' in ubifs_jnl_xxx() function according to corresponding operation. It's better to report ENOSPC in ubifs_budget_space(), as early as we can. Fixes: `474b93704f` ("ubifs: Implement O_TMPFILE") Fixes: `1e51764a3c` ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:25 +02:00
Andrew Price	a598bd99b0	gfs2: Make sure FITRIM minlen is rounded up to fs block size commit `27ca8273fd` upstream. Per fstrim(8) we must round up the minlen argument to the fs block size. The current calculation doesn't take into account devices that have a discard granularity and requested minlen less than 1 fs block, so the value can get shifted away to zero in the translation to fs blocks. The zero minlen passed to gfs2_rgrp_send_discards() then allows sb_issue_discard() to be called with nr_sects == 0 which returns -EINVAL and results in gfs2_rgrp_send_discards() returning -EIO. Make sure minlen is never < 1 fs block by taking the max of the requested minlen and the fs block size before comparing to the device's discard granularity and shifting to fs blocks. Fixes: `076f0faa76` ("GFS2: Fix FITRIM argument handling") Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:24 +02:00
Baokun Li	d6a8cbe916	ubifs: rename_whiteout: correct old_dir size computing commit `7057572745` upstream. When renaming the whiteout file, the old whiteout file is not deleted. Therefore, we add the old dentry size to the old dir like XFS. Otherwise, an error may be reported due to `fscki->calc_sz != fscki->size` in check_indes. Fixes: `9e0a1fff8d` ("ubifs: Implement RENAME_WHITEOUT") Reported-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:24 +02:00
Zhihao Cheng	defaaeaae3	ubifs: setflags: Make dirtied_ino_d 8 bytes aligned commit `1b83ec057d` upstream. Make 'ui->data_len' aligned with 8 bytes before it is assigned to dirtied_ino_d. Since 8871d84c8f8b0c6b("ubifs: convert to fileattr") applied, 'setflags()' only affects regular files and directories, only xattr inode, symlink inode and special inode(pipe/char_dev/block_dev) have none- zero 'ui->data_len' field, so assertion '!(req->dirtied_ino_d & 7)' cannot fail in ubifs_budget_space(). To avoid assertion fails in future evolution(eg. setflags can operate special inodes), it's better to make dirtied_ino_d 8 bytes aligned, after all aligned size is still zero for regular files. Fixes: `1e51764a3c` ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:24 +02:00
Zhihao Cheng	a94ab3ad70	ubifs: Add missing iput if do_tmpfile() failed in rename whiteout commit `716b457302` upstream. whiteout inode should be put when do_tmpfile() failed if inode has been initialized. Otherwise we will get following warning during umount: UBIFS error (ubi0:0 pid 1494): ubifs_assert_failed [ubifs]: UBIFS assert failed: c->bi.dd_growth == 0, in fs/ubifs/super.c:1930 VFS: Busy inodes after unmount of ubifs. Self-destruct in 5 seconds. Fixes: `9e0a1fff8d` ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:24 +02:00
Zhihao Cheng	8b3c7be16f	ubifs: rename_whiteout: Fix double free for whiteout_ui->data commit `40a8f0d5e7` upstream. 'whiteout_ui->data' will be freed twice if space budget fail for rename whiteout operation as following process: rename_whiteout dev = kmalloc whiteout_ui->data = dev kfree(whiteout_ui->data) // Free first time iput(whiteout) ubifs_free_inode kfree(ui->data) // Double free! KASAN reports: ================================================================== BUG: KASAN: double-free or invalid-free in ubifs_free_inode+0x4f/0x70 Call Trace: kfree+0x117/0x490 ubifs_free_inode+0x4f/0x70 [ubifs] i_callback+0x30/0x60 rcu_do_batch+0x366/0xac0 __do_softirq+0x133/0x57f Allocated by task 1506: kmem_cache_alloc_trace+0x3c2/0x7a0 do_rename+0x9b7/0x1150 [ubifs] ubifs_rename+0x106/0x1f0 [ubifs] do_syscall_64+0x35/0x80 Freed by task 1506: kfree+0x117/0x490 do_rename.cold+0x53/0x8a [ubifs] ubifs_rename+0x106/0x1f0 [ubifs] do_syscall_64+0x35/0x80 The buggy address belongs to the object at ffff88810238bed8 which belongs to the cache kmalloc-8 of size 8 ================================================================== Let ubifs_free_inode() free 'whiteout_ui->data'. BTW, delete unused assignment 'whiteout_ui->data_len = 0', process 'ubifs_evict_inode() -> ubifs_jnl_delete_inode() -> ubifs_jnl_write_inode()' doesn't need it (because 'inc_nlink(whiteout)' won't be excuted by 'goto out_release', and the nlink of whiteout inode is 0). Fixes: `9e0a1fff8d` ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:24 +02:00
Dongliang Mu	21d490232f	ntfs: add sanity check on allocation size [ Upstream commit `714fbf2647` ] ntfs_read_inode_mount invokes ntfs_malloc_nofs with zero allocation size. It triggers one BUG in the __ntfs_malloc function. Fix this by adding sanity check on ni->attr_list_size. Link: https://lkml.kernel.org/r/20220120094914.47736-1-dzm91@hust.edu.cn Reported-by: syzbot+3c765c5248797356edaa@syzkaller.appspotmail.com Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:22 +02:00
Theodore Ts'o	d666dfaa57	ext4: don't BUG if someone dirty pages without asking ext4 first [ Upstream commit `cc5095747e` ] [un]pin_user_pages_remote is dirtying pages without properly warning the file system in advance. A related race was noted by Jan Kara in 2018[1]; however, more recently instead of it being a very hard-to-hit race, it could be reliably triggered by process_vm_writev(2) which was discovered by Syzbot[2]. This is technically a bug in mm/gup.c, but arguably ext4 is fragile in that if some other kernel subsystem dirty pages without properly notifying the file system using page_mkwrite(), ext4 will BUG, while other file systems will not BUG (although data will still be lost). So instead of crashing with a BUG, issue a warning (since there may be potential data loss) and just mark the page as clean to avoid unprivileged denial of service attacks until the problem can be properly fixed. More discussion and background can be found in the thread starting at [2]. [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz [2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db@syzkaller.appspotmail.com Reported-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:22 +02:00
Trond Myklebust	9b54b4d062	NFSv4/pNFS: Fix another issue with a list iterator pointing to the head [ Upstream commit `7c9d845f06` ] In nfs4_callback_devicenotify(), if we don't find a matching entry for the deviceid, we're left with a pointer to 'struct nfs_server' that actually points to the list of super blocks associated with our struct nfs_client. Furthermore, even if we have a valid pointer, nothing pins the super block, and so the struct nfs_server could end up getting freed while we're using it. Since all we want is a pointer to the struct pnfs_layoutdriver_type, let's skip all the iteration over super blocks, and just use APIs to find the layout driver directly. Reported-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Fixes: `1be5683b03` ("pnfs: CB_NOTIFY_DEVICEID") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:21 +02:00
Pavel Skripkin	c73c5c5632	jfs: fix divide error in dbNextAG [ Upstream commit `2cc7cc01c1` ] Syzbot reported divide error in dbNextAG(). The problem was in missing validation check for malicious image. Syzbot crafted an image with bmp->db_numag equal to 0. There wasn't any validation checks, but dbNextAG() blindly use bmp->db_numag in divide expression Fix it by validating bmp->db_numag in dbMount() and return an error if image is malicious Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+46f5c25af73eb8330eb6@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:20 +02:00
Alexey Khoroshilov	a5ddd85bad	NFS: remove unneeded check in decode_devicenotify_args() [ Upstream commit `cb8fac6d27` ] [You don't often get email from khoroshilov@ispras.ru. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.] Overflow check in not needed anymore after we switch to kmalloc_array(). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Fixes: `a4f743a6bb` ("NFSv4.1: Convert open-coded array allocation calls to kmalloc_array()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:20 +02:00
Zhang Yi	94fe9ee8d6	ext2: correct max file size computing [ Upstream commit `50b3a81899` ] We need to calculate the max file size accurately if the total blocks that can address by block tree exceed the upper_limit. But this check is not correct now, it only compute the total data blocks but missing metadata blocks are needed. So in the case of "data blocks < upper_limit && total blocks > upper_limit", we will get wrong result. Fortunately, this case could not happen in reality, but it's confused and better to correct the computing. bits data blocks metadatablocks upper_limit 10 16843020 66051 2147483647 11 134480396 263171 1073741823 12 1074791436 1050627 536870911 () 13 8594130956 4198403 268435455 () 14 68736258060 16785411 134217727 () 15 549822930956 67125251 67108863 () 16 4398314962956 268468227 33554431 () [] Need to calculate in depth. Fixes: `1c2d14212b` ("ext2: Fix underflow in ext2_max_size()") Link: https://lore.kernel.org/r/20220212050532.179055-1-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:08:17 +02:00
Baokun Li	b36bccb04e	jffs2: fix memory leak in jffs2_scan_medium commit `9cdd312887` upstream. If an error is returned in jffs2_scan_eraseblock() and some memory has been added to the jffs2_summary *s, we can observe the following kmemleak report: -------------------------------------------- unreferenced object 0xffff88812b889c40 (size 64): comm "mount", pid 692, jiffies 4294838325 (age 34.288s) hex dump (first 32 bytes): 40 48 b5 14 81 88 ff ff 01 e0 31 00 00 00 50 00 @H........1...P. 00 00 01 00 00 00 01 00 00 00 02 00 00 00 09 08 ................ backtrace: [<ffffffffae93a3a3>] __kmalloc+0x613/0x910 [<ffffffffaf423b9c>] jffs2_sum_add_dirent_mem+0x5c/0xa0 [<ffffffffb0f3afa8>] jffs2_scan_medium.cold+0x36e5/0x4794 [<ffffffffb0f3dbe1>] jffs2_do_mount_fs.cold+0xa7/0x2267 [<ffffffffaf40acf3>] jffs2_do_fill_super+0x383/0xc30 [<ffffffffaf40c00a>] jffs2_fill_super+0x2ea/0x4c0 [<ffffffffb0315d64>] mtd_get_sb+0x254/0x400 [<ffffffffb0315f5f>] mtd_get_sb_by_nr+0x4f/0xd0 [<ffffffffb0316478>] get_tree_mtd+0x498/0x840 [<ffffffffaf40bd15>] jffs2_get_tree+0x25/0x30 [<ffffffffae9f358d>] vfs_get_tree+0x8d/0x2e0 [<ffffffffaea7a98f>] path_mount+0x50f/0x1e50 [<ffffffffaea7c3d7>] do_mount+0x107/0x130 [<ffffffffaea7c5c5>] __se_sys_mount+0x1c5/0x2f0 [<ffffffffaea7c917>] __x64_sys_mount+0xc7/0x160 [<ffffffffb10142f5>] do_syscall_64+0x45/0x70 unreferenced object 0xffff888114b54840 (size 32): comm "mount", pid 692, jiffies 4294838325 (age 34.288s) hex dump (first 32 bytes): c0 75 b5 14 81 88 ff ff 02 e0 02 00 00 00 02 00 .u.............. 00 00 84 00 00 00 44 00 00 00 6b 6b 6b 6b 6b a5 ......D...kkkkk. backtrace: [<ffffffffae93be24>] kmem_cache_alloc_trace+0x584/0x880 [<ffffffffaf423b04>] jffs2_sum_add_inode_mem+0x54/0x90 [<ffffffffb0f3bd44>] jffs2_scan_medium.cold+0x4481/0x4794 [...] unreferenced object 0xffff888114b57280 (size 32): comm "mount", pid 692, jiffies 4294838393 (age 34.357s) hex dump (first 32 bytes): 10 d5 6c 11 81 88 ff ff 08 e0 05 00 00 00 01 00 ..l............. 00 00 38 02 00 00 28 00 00 00 6b 6b 6b 6b 6b a5 ..8...(...kkkkk. backtrace: [<ffffffffae93be24>] kmem_cache_alloc_trace+0x584/0x880 [<ffffffffaf423c34>] jffs2_sum_add_xattr_mem+0x54/0x90 [<ffffffffb0f3a24f>] jffs2_scan_medium.cold+0x298c/0x4794 [...] unreferenced object 0xffff8881116cd510 (size 16): comm "mount", pid 692, jiffies 4294838395 (age 34.355s) hex dump (first 16 bytes): 00 00 00 00 00 00 00 00 09 e0 60 02 00 00 6b a5 ..........`...k. backtrace: [<ffffffffae93be24>] kmem_cache_alloc_trace+0x584/0x880 [<ffffffffaf423cc4>] jffs2_sum_add_xref_mem+0x54/0x90 [<ffffffffb0f3b2e3>] jffs2_scan_medium.cold+0x3a20/0x4794 [...] -------------------------------------------- Therefore, we should call jffs2_sum_reset_collected(s) on exit to release the memory added in s. In addition, a new tag "out_buf" is added to prevent the NULL pointer reference caused by s being NULL. (thanks to Zhang Yi for this analysis) Fixes: `e631ddba58` ("[JFFS2] Add erase block summary support (mount time improvement)") Cc: stable@vger.kernel.org Co-developed-with: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:10 +02:00
Baokun Li	9a0f6610c7	jffs2: fix memory leak in jffs2_do_mount_fs commit `d051cef784` upstream. If jffs2_build_filesystem() in jffs2_do_mount_fs() returns an error, we can observe the following kmemleak report: -------------------------------------------- unreferenced object 0xffff88811b25a640 (size 64): comm "mount", pid 691, jiffies 4294957728 (age 71.952s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffa493be24>] kmem_cache_alloc_trace+0x584/0x880 [<ffffffffa5423a06>] jffs2_sum_init+0x86/0x130 [<ffffffffa5400e58>] jffs2_do_mount_fs+0x798/0xac0 [<ffffffffa540acf3>] jffs2_do_fill_super+0x383/0xc30 [<ffffffffa540c00a>] jffs2_fill_super+0x2ea/0x4c0 [...] unreferenced object 0xffff88812c760000 (size 65536): comm "mount", pid 691, jiffies 4294957728 (age 71.952s) hex dump (first 32 bytes): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ backtrace: [<ffffffffa493a449>] __kmalloc+0x6b9/0x910 [<ffffffffa5423a57>] jffs2_sum_init+0xd7/0x130 [<ffffffffa5400e58>] jffs2_do_mount_fs+0x798/0xac0 [<ffffffffa540acf3>] jffs2_do_fill_super+0x383/0xc30 [<ffffffffa540c00a>] jffs2_fill_super+0x2ea/0x4c0 [...] -------------------------------------------- This is because the resources allocated in jffs2_sum_init() are not released. Call jffs2_sum_exit() to release these resources to solve the problem. Fixes: `e631ddba58` ("[JFFS2] Add erase block summary support (mount time improvement)") Cc: stable@vger.kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:10 +02:00
Baokun Li	3bd2454162	jffs2: fix use-after-free in jffs2_clear_xattr_subsystem commit `4c7c44ee16` upstream. When we mount a jffs2 image, assume that the first few blocks of the image are normal and contain at least one xattr-related inode, but the next block is abnormal. As a result, an error is returned in jffs2_scan_eraseblock(). jffs2_clear_xattr_subsystem() is then called in jffs2_build_filesystem() and then again in jffs2_do_fill_super(). Finally we can observe the following report: ================================================================== BUG: KASAN: use-after-free in jffs2_clear_xattr_subsystem+0x95/0x6ac Read of size 8 at addr ffff8881243384e0 by task mount/719 Call Trace: dump_stack+0x115/0x16b jffs2_clear_xattr_subsystem+0x95/0x6ac jffs2_do_fill_super+0x84f/0xc30 jffs2_fill_super+0x2ea/0x4c0 mtd_get_sb+0x254/0x400 mtd_get_sb_by_nr+0x4f/0xd0 get_tree_mtd+0x498/0x840 jffs2_get_tree+0x25/0x30 vfs_get_tree+0x8d/0x2e0 path_mount+0x50f/0x1e50 do_mount+0x107/0x130 __se_sys_mount+0x1c5/0x2f0 __x64_sys_mount+0xc7/0x160 do_syscall_64+0x45/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 719: kasan_save_stack+0x23/0x60 __kasan_kmalloc.constprop.0+0x10b/0x120 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc+0x1c0/0x870 jffs2_alloc_xattr_ref+0x2f/0xa0 jffs2_scan_medium.cold+0x3713/0x4794 jffs2_do_mount_fs.cold+0xa7/0x2253 jffs2_do_fill_super+0x383/0xc30 jffs2_fill_super+0x2ea/0x4c0 [...] Freed by task 719: kmem_cache_free+0xcc/0x7b0 jffs2_free_xattr_ref+0x78/0x98 jffs2_clear_xattr_subsystem+0xa1/0x6ac jffs2_do_mount_fs.cold+0x5e6/0x2253 jffs2_do_fill_super+0x383/0xc30 jffs2_fill_super+0x2ea/0x4c0 [...] The buggy address belongs to the object at ffff8881243384b8 which belongs to the cache jffs2_xattr_ref of size 48 The buggy address is located 40 bytes inside of 48-byte region [ffff8881243384b8, ffff8881243384e8) [...] ================================================================== The triggering of the BUG is shown in the following stack: ----------------------------------------------------------- jffs2_fill_super jffs2_do_fill_super jffs2_do_mount_fs jffs2_build_filesystem jffs2_scan_medium jffs2_scan_eraseblock <--- ERROR jffs2_clear_xattr_subsystem <--- free jffs2_clear_xattr_subsystem <--- free again ----------------------------------------------------------- An error is returned in jffs2_do_mount_fs(). If the error is returned by jffs2_sum_init(), the jffs2_clear_xattr_subsystem() does not need to be executed. If the error is returned by jffs2_build_filesystem(), the jffs2_clear_xattr_subsystem() also does not need to be executed again. So move jffs2_clear_xattr_subsystem() from 'out_inohash' to 'out_root' to fix this UAF problem. Fixes: `aa98d7cf59` ("[JFFS2][XATTR] XATTR support on JFFS2 (version. 5)") Cc: stable@vger.kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:10 +02:00
Dan Carpenter	438068f491	NFSD: prevent underflow in nfssvc_decode_writeargs() commit `184416d4b9` upstream. Smatch complains: fs/nfsd/nfsxdr.c:341 nfssvc_decode_writeargs() warn: no lower bound on 'args->len' Change the type to unsigned to prevent this issue. Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:09 +02:00
Miklos Szeredi	0ab55e14cf	fuse: fix pipe buffer lifetime for direct_io commit `0c4bcfdecb` upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: `c3021629a0` ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:08:08 +02:00
Lucas Wei	d6ea756586	fs: sysfs_emit: Remove PAGE_SIZE alignment check For kernel releases older than 4.20, using the SLUB alloctor will cause this alignment check to fail as that allocator did NOT align kmalloc allocations on a PAGE_SIZE boundry. Remove the check for these older kernels as it is a false-positive and causes problems on many devices. Signed-off-by: Lucas Wei <lucaswei@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-23 09:01:35 +01:00
Qu Wenruo	e0956dd95d	btrfs: unlock newly allocated extent buffer after error commit `19ea40dddf` upstream. [BUG] There is a bug report that injected ENOMEM error could leave a tree block locked while we return to user-space: BTRFS info (device loop0): enabling ssd optimizations FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106 fail_dump lib/fault-inject.c:52 [inline] should_fail+0x13c/0x160 lib/fault-inject.c:146 should_failslab+0x5/0x10 mm/slab_common.c:1328 slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494 slab_alloc_node mm/slub.c:3120 [inline] slab_alloc mm/slub.c:3214 [inline] kmem_cache_alloc+0x44/0x280 mm/slub.c:3219 btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline] btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833 __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415 btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570 btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768 btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905 btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530 btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783 lookup_open+0x660/0x780 fs/namei.c:3282 open_last_lookups fs/namei.c:3352 [inline] path_openat+0x465/0xe20 fs/namei.c:3557 do_filp_open+0xe3/0x170 fs/namei.c:3588 do_sys_openat2+0x357/0x4a0 fs/open.c:1200 do_sys_open+0x87/0xd0 fs/open.c:1216 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x46ae99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99 RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800 RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017 R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0 ================================================ WARNING: lock held when returning to user space! 5.15.0-rc1 #16 Not tainted ------------------------------------------------ syz-executor/7579 is leaving the kernel with locks still held! 1 lock held by syz-executor/7579: #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112 [CAUSE] In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new extent buffer @buf is locked, but if later operations like adding delayed tree ref fail, we just free @buf without unlocking it, resulting above warning. [FIX] Unlock @buf in out_free_buf: label. Reported-by: Hao Sun <sunhao.th@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/ CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Denis Efremov <denis.e.efremov@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-16 12:57:09 +01:00
Josh Triplett	227610ef62	ext4: add check to prevent attempting to resize an fs with sparse_super2 commit `b1489186cc` upstream. The in-kernel ext4 resize code doesn't support filesystem with the sparse_super2 feature. It fails with errors like this and doesn't finish the resize: EXT4-fs (loop0): resizing filesystem from 16640 to 7864320 blocks EXT4-fs warning (device loop0): verify_reserved_gdb:760: reserved GDT 2 missing grp 1 (32770) EXT4-fs warning (device loop0): ext4_resize_fs:2111: error (-22) occurred during file system resize EXT4-fs (loop0): resized filesystem to 2097152 To reproduce: mkfs.ext4 -b 4096 -I 256 -J size=32 -E resize=$((25610241024)) -O sparse_super2 ext4.img 65M truncate -s 30G ext4.img mount ext4.img /mnt python3 -c 'import fcntl, os, struct ; fd = os.open("/mnt", os.O_RDONLY \| os.O_DIRECTORY) ; fcntl.ioctl(fd, 0x40086610, struct.pack("Q", 30 * 1024 * 1024 * 1024 // 4096), False) ; os.close(fd)' dmesg \| tail e2fsck ext4.img The userspace resize2fs tool has a check for this case: it checks if the filesystem has sparse_super2 set and if the kernel provides /sys/fs/ext4/features/sparse_super2. However, the former check requires manually reading and parsing the filesystem superblock. Detect this case in ext4_resize_begin and error out early with a clear error message. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: https://lore.kernel.org/r/74b8ae78405270211943cd7393e65586c5faeed1.1623093259.git.josh@joshtriplett.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-16 12:57:09 +01:00
Ronnie Sahlberg	147a0e71cc	cifs: fix double free race when mount fails in cifs_get_root() [ Upstream commit `3d6cc9898e` ] When cifs_get_root() fails during cifs_smb3_do_mount() we call deactivate_locked_super() which eventually will call delayed_free() which will free the context. In this situation we should not proceed to enter the out: section in cifs_smb3_do_mount() and free the same resources a second time. [Thu Feb 10 12:59:06 2022] BUG: KASAN: use-after-free in rcu_cblist_dequeue+0x32/0x60 [Thu Feb 10 12:59:06 2022] Read of size 8 at addr ffff888364f4d110 by task swapper/1/0 [Thu Feb 10 12:59:06 2022] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G OE 5.17.0-rc3+ #4 [Thu Feb 10 12:59:06 2022] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019 [Thu Feb 10 12:59:06 2022] Call Trace: [Thu Feb 10 12:59:06 2022] <IRQ> [Thu Feb 10 12:59:06 2022] dump_stack_lvl+0x5d/0x78 [Thu Feb 10 12:59:06 2022] print_address_description.constprop.0+0x24/0x150 [Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60 [Thu Feb 10 12:59:06 2022] kasan_report.cold+0x7d/0x117 [Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60 [Thu Feb 10 12:59:06 2022] __asan_load8+0x86/0xa0 [Thu Feb 10 12:59:06 2022] rcu_cblist_dequeue+0x32/0x60 [Thu Feb 10 12:59:06 2022] rcu_core+0x547/0xca0 [Thu Feb 10 12:59:06 2022] ? call_rcu+0x3c0/0x3c0 [Thu Feb 10 12:59:06 2022] ? __this_cpu_preempt_check+0x13/0x20 [Thu Feb 10 12:59:06 2022] ? lock_is_held_type+0xea/0x140 [Thu Feb 10 12:59:06 2022] rcu_core_si+0xe/0x10 [Thu Feb 10 12:59:06 2022] __do_softirq+0x1d4/0x67b [Thu Feb 10 12:59:06 2022] __irq_exit_rcu+0x100/0x150 [Thu Feb 10 12:59:06 2022] irq_exit_rcu+0xe/0x30 [Thu Feb 10 12:59:06 2022] sysvec_hyperv_stimer0+0x9d/0xc0 ... [Thu Feb 10 12:59:07 2022] Freed by task 58179: [Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50 [Thu Feb 10 12:59:07 2022] kasan_set_track+0x25/0x30 [Thu Feb 10 12:59:07 2022] kasan_set_free_info+0x24/0x40 [Thu Feb 10 12:59:07 2022] ____kasan_slab_free+0x137/0x170 [Thu Feb 10 12:59:07 2022] __kasan_slab_free+0x12/0x20 [Thu Feb 10 12:59:07 2022] slab_free_freelist_hook+0xb3/0x1d0 [Thu Feb 10 12:59:07 2022] kfree+0xcd/0x520 [Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0x149/0xbe0 [cifs] [Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs] [Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140 [Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0 [Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210 [Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0 [Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae [Thu Feb 10 12:59:07 2022] Last potentially related work creation: [Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50 [Thu Feb 10 12:59:07 2022] __kasan_record_aux_stack+0xb6/0xc0 [Thu Feb 10 12:59:07 2022] kasan_record_aux_stack_noalloc+0xb/0x10 [Thu Feb 10 12:59:07 2022] call_rcu+0x76/0x3c0 [Thu Feb 10 12:59:07 2022] cifs_umount+0xce/0xe0 [cifs] [Thu Feb 10 12:59:07 2022] cifs_kill_sb+0xc8/0xe0 [cifs] [Thu Feb 10 12:59:07 2022] deactivate_locked_super+0x5d/0xd0 [Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0xab9/0xbe0 [cifs] [Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs] [Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140 [Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0 [Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210 [Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0 [Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:01:55 +01:00
Linus Torvalds	53f2522816	fget: clarify and improve __fget_files() implementation commit `e386dfc56f` upstream. Commit `054aa8d439` ("fget: check that the fd still exists after getting a ref to it") fixed a race with getting a reference to a file just as it was being closed. It was a fairly minimal patch, and I didn't think re-checking the file pointer lookup would be a measurable overhead, since it was all right there and cached. But I was wrong, as pointed out by the kernel test robot. The 'poll2' case of the will-it-scale.per_thread_ops benchmark regressed quite noticeably. Admittedly it seems to be a very artificial test: doing "poll()" system calls on regular files in a very tight loop in multiple threads. That means that basically all the time is spent just looking up file descriptors without ever doing anything useful with them (not that doing 'poll()' on a regular file is useful to begin with). And as a result it shows the extra "re-check fd" cost as a sore thumb. Happily, the regression is fixable by just writing the code to loook up the fd to be better and clearer. There's still a cost to verify the file pointer, but now it's basically in the noise even for that benchmark that does nothing else - and the code is more understandable and has better comments too. [ Side note: this patch is also a classic case of one that looks very messy with the default greedy Myers diff - it's much more legible with either the patience of histogram diff algorithm ] Link: https://lore.kernel.org/lkml/20211210053743.GA36420@xsang-OptiPlex-9020/ Link: https://lore.kernel.org/lkml/20211213083154.GA20853@linux.intel.com/ Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Carel Si <beibei.si@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-02 11:34:00 +01:00
Steven Rostedt (Google)	8157b6ec86	tracefs: Set the group ownership in apply_options() not parse_options() commit `851e99ebee` upstream. Al Viro brought it to my attention that the dentries may not be filled when the parse_options() is called, causing the call to set_gid() to possibly crash. It should only be called if parse_options() succeeds totally anyway. He suggested the logical place to do the update is in apply_options(). Link: https://lore.kernel.org/all/20220225165219.737025658@goodmis.org/ Link: https://lkml.kernel.org/r/20220225153426.1c4cab6b@gandalf.local.home Cc: stable@vger.kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: `48b27b6b51` ("tracefs: Set all files to the same group ownership as the mount option") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-02 11:33:58 +01:00
ChenXiaoSong	d1654de19d	configfs: fix a race in configfs_{,un}register_subsystem() [ Upstream commit `84ec758fb2` ] When configfs_register_subsystem() or configfs_unregister_subsystem() is executing link_group() or unlink_group(), it is possible that two processes add or delete list concurrently. Some unfortunate interleavings of them can cause kernel panic. One of cases is: A --> B --> C --> D A <-- B <-- C <-- D delete list_head B \| delete list_head C --------------------------------\|----------------------------------- configfs_unregister_subsystem \| configfs_unregister_subsystem unlink_group \| unlink_group unlink_obj \| unlink_obj list_del_init \| list_del_init __list_del_entry \| __list_del_entry __list_del \| __list_del // next == C \| next->prev = prev \| \| next->prev = prev prev->next = next \| \| // prev == B \| prev->next = next Fix this by adding mutex when calling link_group() or unlink_group(), but parent configfs_subsystem is NULL when config_item is root. So I create a mutex configfs_subsystem_mutex. Fixes: `7063fbf226` ("[PATCH] configfs: User-driven configuration filesystem") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-02 11:33:56 +01:00
Trond Myklebust	428657e8d8	NFS: Do not report writeback errors in nfs_getattr() [ Upstream commit `d19e0183a8` ] The result of the writeback, whether it is an ENOSPC or an EIO, or anything else, does not inhibit the NFS client from reporting the correct file timestamps. Fixes: `79566ef018` ("NFS: Getattr doesn't require data sync semantics") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-23 11:57:36 +01:00
Trond Myklebust	2e4841a2e3	NFS: LOOKUP_DIRECTORY is also ok with symlinks commit `e0caaf75d4` upstream. Commit `ac795161c9` (NFSv4: Handle case where the lookup of a directory fails) [1], part of Linux since 5.17-rc2, introduced a regression, where a symbolic link on an NFS mount to a directory on another NFS does not resolve(?) the first time it is accessed: Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: `ac795161c9` ("NFSv4: Handle case where the lookup of a directory fails") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Donald Buczek <buczek@molgen.mpg.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-23 11:57:35 +01:00
Darrick J. Wong	36d90eced4	quota: make dquot_quota_sync return errors from ->sync_fs [ Upstream commit `dd5532a499` ] Strangely, dquot_quota_sync ignores the return code from the ->sync_fs call, which means that quotacalls like Q_SYNC never see the error. This doesn't seem right, so fix that. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-23 11:57:33 +01:00
Darrick J. Wong	ebd0a32848	vfs: make freeze_super abort when sync_filesystem returns error [ Upstream commit `2719c7160d` ] If we fail to synchronize the filesystem while preparing to freeze the fs, abort the freeze. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-23 11:57:33 +01:00
Dāvis Mosāns	97e4a8bd10	btrfs: send: in case of IO error log it commit `2e7be9db12` upstream. Currently if we get IO error while doing send then we abort without logging information about which file caused issue. So log it to help with debugging. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Dāvis Mosāns <davispuh@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-23 11:57:33 +01:00
Olga Kornievskaia	0a51b28124	NFSv4 expose nfs_parse_server_name function [ Upstream commit `f5b27cc676` ] Make nfs_parse_server_name available outside of nfs4namespace.c. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-16 12:44:49 +01:00
Olga Kornievskaia	d530d1a322	NFSv4 remove zero number of fs_locations entries error check [ Upstream commit `90e12a3191` ] Remove the check for the zero length fs_locations reply in the xdr decoding, and instead check for that in the migration code. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-16 12:44:49 +01:00
Trond Myklebust	a1a4f1704e	NFSv4.1: Fix uninitialised variable in devicenotify [ Upstream commit `b05bf5c63b` ] When decode_devicenotify_args() exits with no entries, we need to ensure that the struct cb_devicenotifyargs is initialised to { 0, NULL } in order to avoid problems in nfs4_callback_devicenotify(). Reported-by: <rtm@csail.mit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-16 12:44:49 +01:00
Xiaoke Wang	5582381be5	nfs: nfs4clinet: check the return value of kstrdup() [ Upstream commit `fbd2057e53` ] kstrdup() returns NULL when some internal memory errors happen, it is better to check the return value of it so to catch the memory error in time. Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-16 12:44:49 +01:00
Olga Kornievskaia	eb01a628b4	NFSv4 only print the label when its queried [ Upstream commit `2c52c8376d` ] When the bitmask of the attributes doesn't include the security label, don't bother printing it. Since the label might not be null terminated, adjust the printing format accordingly. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-16 12:44:49 +01:00
Chuck Lever	be711dea3b	NFSD: Clamp WRITE offsets commit `6260d9a56a` upstream. Ensure that a client cannot specify a WRITE range that falls in a byte range outside what the kernel's internal types (such as loff_t, which is signed) can represent. The kiocb iterators, invoked in nfsd_vfs_write(), should properly limit write operations to within the underlying file system's s_maxbytes. Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-16 12:44:49 +01:00
Trond Myklebust	4ecf2bc76a	NFS: Fix initialisation of nfs_client cl_flags field commit `468d126dab` upstream. For some long forgotten reason, the nfs_client cl_flags field is initialised in nfs_get_client() instead of being initialised at allocation time. This quirk was harmless until we moved the call to nfs_create_rpc_client(). Fixes: `dd99e9f98f` ("NFSv4: Initialise connection to the server in nfs4_alloc_client()") Cc: stable@vger.kernel.org # 4.8.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-16 12:44:49 +01:00
Ritesh Harjani	19cfb87724	ext4: fix error handling in ext4_restore_inline_data() commit `897026aaa7` upstream. While running "./check -I 200 generic/475" it sometimes gives below kernel BUG(). Ideally we should not call ext4_write_inline_data() if ext4_create_inline_data() has failed. <log snip> [73131.453234] kernel BUG at fs/ext4/inline.c:223! <code snip> 212 static void ext4_write_inline_data(struct inode inode, struct ext4_iloc iloc, 213 void *buffer, loff_t pos, unsigned int len) 214 { <...> 223 BUG_ON(!EXT4_I(inode)->i_inline_off); 224 BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); This patch handles the error and prints out a emergency msg saying potential data loss for the given inode (since we couldn't restore the original inline_data due to some previous error). [ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30) Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:30 +01:00
Dai Ngo	78778ffb7c	nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client. commit `ab451ea952` upstream. From RFC 7530 Section 16.34.5: o The server has not recorded an unconfirmed { v, x, c, , } and has recorded a confirmed { v, x, c, *, s }. If the principals of the record and of SETCLIENTID_CONFIRM do not match, the server returns NFS4ERR_CLID_INUSE without removing any relevant leased client state, and without changing recorded callback and callback_ident values for client { x }. The current code intends to do what the spec describes above but it forgot to set 'old' to NULL resulting to the confirmed client to be expired. Fixes: `2b63482185` ("nfsd: fix clid_inuse on mount with security change") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:29 +01:00
Trond Myklebust	db6cd55bb9	NFSv4: nfs_atomic_open() can race when looking up a non-regular file commit `1751fc1db3` upstream. If the file type changes back to being a regular file on the server between the failed OPEN and our LOOKUP, then we need to re-run the OPEN. Fixes: `0dd2b474d0` ("nfs: implement i_op->atomic_open()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:26 +01:00
Trond Myklebust	516f348b75	NFSv4: Handle case where the lookup of a directory fails commit `ac795161c9` upstream. If the application sets the O_DIRECTORY flag, and tries to open a regular file, nfs_atomic_open() will punt to doing a regular lookup. If the server then returns a regular file, we will happily return a file descriptor with uninitialised open state. The fix is to return the expected ENOTDIR error in these cases. Reported-by: Lyu Tao <tao.lyu@epfl.ch> Fixes: `0dd2b474d0` ("nfs: implement i_op->atomic_open()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:26 +01:00
Jan Kara	a312cbdb90	udf: Fix NULL ptr deref when converting from inline format commit `7fc3b7c298` upstream. udf_expand_file_adinicb() calls directly ->writepage to write data expanded into a page. This however misses to setup inode for writeback properly and so we can crash on inode->i_wb dereference when submitting page for IO like: BUG: kernel NULL pointer dereference, address: 0000000000000158 #PF: supervisor read access in kernel mode ... <TASK> __folio_start_writeback+0x2ac/0x350 __block_write_full_page+0x37d/0x490 udf_expand_file_adinicb+0x255/0x400 [udf] udf_file_write_iter+0xbe/0x1b0 [udf] new_sync_write+0x125/0x1c0 vfs_write+0x28e/0x400 Fix the problem by marking the page dirty and going through the standard writeback path to write the page. Strictly speaking we would not even have to write the page but we want to catch e.g. ENOSPC errors early. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: `52ebea749a` ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:24 +01:00
Jan Kara	3fdf975173	udf: Restore i_lenAlloc when inode expansion fails commit `ea8569194b` upstream. When we fail to expand inode from inline format to a normal format, we restore inode to contain the original inline formatting but we forgot to set i_lenAlloc back. The mismatch between i_lenAlloc and i_size was then causing further problems such as warnings and lost data down the line. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: `7e49b6f248` ("udf: Convert UDF to new truncate calling sequence") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-08 18:16:24 +01:00
Trond Myklebust	d5e6dff8c9	NFSv4: Initialise connection to the server in nfs4_alloc_client() commit `dd99e9f98f` upstream. Set up the connection to the NFSv4 server in nfs4_alloc_client(), before we've added the struct nfs_client to the net-namespace's nfs_client_list so that a downed server won't cause other mounts to hang in the trunking detection code. Reported-by: Michael Wakabayashi <mwakabayashi@vmware.com> Fixes: `5c6e5b60aa` ("NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 09:01:02 +01:00
Amir Goldstein	f78d626801	fuse: fix live lock in fuse_iget() commit `775c5033a0` upstream. Commit `5d069dbe8a` ("fuse: fix bad inode") replaced make_bad_inode() in fuse_iget() with a private implementation fuse_make_bad(). The private implementation fails to remove the bad inode from inode cache, so the retry loop with iget5_locked() finds the same bad inode and marks it bad forever. kmsg snip: [ ] rcu: INFO: rcu_sched self-detected stall on CPU ... [ ] ? bit_wait_io+0x50/0x50 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? find_inode.isra.32+0x60/0xb0 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5_nowait+0x65/0x90 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5.part.36+0x2e/0x80 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? fuse_inode_eq+0x20/0x20 [ ] iget5_locked+0x21/0x80 [ ] ? fuse_inode_eq+0x20/0x20 [ ] fuse_iget+0x96/0x1b0 Fixes: `5d069dbe8a` ("fuse: fix bad inode") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 09:01:02 +01:00
Miklos Szeredi	2cd45139c0	fuse: fix bad inode commit `5d069dbe8a` upstream. Jan Kara's analysis of the syzbot report (edited): The reproducer opens a directory on FUSE filesystem, it then attaches dnotify mark to the open directory. After that a fuse_do_getattr() call finds that attributes returned by the server are inconsistent, and calls make_bad_inode() which, among other things does: inode->i_mode = S_IFREG; This then confuses dnotify which doesn't tear down its structures properly and eventually crashes. Avoid calling make_bad_inode() on a live inode: switch to a private flag on the fuse inode. Also add the test to ops which the bad_inode_ops would have caught. This bug goes back to the initial merge of fuse in 2.6.14... Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> [bwh: Backported to 4.19: - Drop changes in fuse_dir_fsync(), fuse_readahead(), fuse_evict_inode() - In fuse_get_link(), return ERR_PTR(-EIO) for bad inodes - Convert some additional calls to is_bad_inode() - Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 09:01:01 +01:00
Theodore Ts'o	acd29971d6	ext4: don't use the orphan list when migrating an inode commit `6eeaf88fd5` upstream. We probably want to remove the indirect block to extents migration feature after a deprecation window, but until then, let's fix a potential data loss problem caused by the fact that we put the tmp_inode on the orphan list. In the unlikely case where we crash and do a journal recovery, the data blocks belonging to the inode being migrated are also represented in the tmp_inode on the orphan list --- and so its data blocks will get marked unallocated, and available for reuse. Instead, stop putting the tmp_inode on the oprhan list. So in the case where we crash while migrating the inode, we'll leak an inode, which is not a disaster. It will be easily fixed the next time we run fsck, and it's better than potentially having blocks getting claimed by two different files, and losing data as a result. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 09:00:59 +01:00
Ye Bin	26a9eb4068	ext4: Fix BUG_ON in ext4_bread when write quota data commit `380a0091ca` upstream. We got issue as follows when run syzkaller: [ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user [ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only [ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) \|\| handle != NULL \|\| create == 0' [ 167.983601] ------------[ cut here ]------------ [ 167.984245] kernel BUG at fs/ext4/inode.c:847! [ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 [ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 [ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 [ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 [ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 [ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 [ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 [ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 [ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 [ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 [ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 [ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 168.001899] Call Trace: [ 168.002235] <TASK> [ 168.007167] ext4_bread+0xd/0x53 [ 168.007612] ext4_quota_write+0x20c/0x5c0 [ 168.010457] write_blk+0x100/0x220 [ 168.010944] remove_free_dqentry+0x1c6/0x440 [ 168.011525] free_dqentry.isra.0+0x565/0x830 [ 168.012133] remove_tree+0x318/0x6d0 [ 168.014744] remove_tree+0x1eb/0x6d0 [ 168.017346] remove_tree+0x1eb/0x6d0 [ 168.019969] remove_tree+0x1eb/0x6d0 [ 168.022128] qtree_release_dquot+0x291/0x340 [ 168.023297] v2_release_dquot+0xce/0x120 [ 168.023847] dquot_release+0x197/0x3e0 [ 168.024358] ext4_release_dquot+0x22a/0x2d0 [ 168.024932] dqput.part.0+0x1c9/0x900 [ 168.025430] __dquot_drop+0x120/0x190 [ 168.025942] ext4_clear_inode+0x86/0x220 [ 168.026472] ext4_evict_inode+0x9e8/0xa22 [ 168.028200] evict+0x29e/0x4f0 [ 168.028625] dispose_list+0x102/0x1f0 [ 168.029148] evict_inodes+0x2c1/0x3e0 [ 168.030188] generic_shutdown_super+0xa4/0x3b0 [ 168.030817] kill_block_super+0x95/0xd0 [ 168.031360] deactivate_locked_super+0x85/0xd0 [ 168.031977] cleanup_mnt+0x2bc/0x480 [ 168.033062] task_work_run+0xd1/0x170 [ 168.033565] do_exit+0xa4f/0x2b50 [ 168.037155] do_group_exit+0xef/0x2d0 [ 168.037666] __x64_sys_exit_group+0x3a/0x50 [ 168.038237] do_syscall_64+0x3b/0x90 [ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae In order to reproduce this problem, the following conditions need to be met: 1. Ext4 filesystem with no journal; 2. Filesystem image with incorrect quota data; 3. Abort filesystem forced by user; 4. umount filesystem; As in ext4_quota_write: ... if (EXT4_SB(sb)->s_journal && !handle) { ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" " cancelled because transaction is not started", (unsigned long long)off, (unsigned long long)len); return -EIO; } ... We only check handle if NULL when filesystem has journal. There is need check handle if NULL even when filesystem has no journal. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 09:00:59 +01:00

1 2 3 4 5 ...

52725 commits