linux-stable/fs
Zhihao Cheng 440d345d02 fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
commit 68f4c6eba7 upstream.

Commit 505a666ee3 ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during wb_writeback, which
may cause a potential ABBA dead lock:

    wb_writeback		fat_file_fsync
blk_start_plug(&plug)
for (;;) {
  iter i-1: some reqs have been added into plug->mq_list  // LOCK A
  iter i:
    progress = __writeback_inodes_wb(wb, work)
    . writeback_sb_inodes // fat's bdev
    .   __writeback_single_inode
    .   . generic_writepages
    .   .   __block_write_full_page
    .   .   . . 	    __generic_file_fsync
    .   .   . . 	      sync_inode_metadata
    .   .   . . 	        writeback_single_inode
    .   .   . . 		  __writeback_single_inode
    .   .   . . 		    fat_write_inode
    .   .   . . 		      __fat_write_inode
    .   .   . . 		        sync_dirty_buffer	// fat's bdev
    .   .   . . 			  lock_buffer(bh)	// LOCK B
    .   .   . . 			    submit_bh
    .   .   . . 			      blk_mq_get_tag	// LOCK A
    .   .   . trylock_buffer(bh)  // LOCK B
    .   .   .   redirty_page_for_writepage
    .   .   .     wbc->pages_skipped++
    .   .   --wbc->nr_to_write
    .   wrote += write_chunk - wbc.nr_to_write  // wrote > 0
    .   requeue_inode
    .     redirty_tail_locked
    if (progress)    // progress > 0
      continue;
  iter i+1:
      queue_io
      // similar process with iter i, infinite for-loop !
}
blk_finish_plug(&plug)   // flush plug won't be called

Above process triggers a hungtask like:
[  399.044861] INFO: task bb:2607 blocked for more than 30 seconds.
[  399.046824]       Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty
[  399.051539] task:bb              state:D stack:    0 pid: 2607 ppid:
2426 flags:0x00004000
[  399.051556] Call Trace:
[  399.051570]  __schedule+0x480/0x1050
[  399.051592]  schedule+0x92/0x1a0
[  399.051602]  io_schedule+0x22/0x50
[  399.051613]  blk_mq_get_tag+0x1d3/0x3c0
[  399.051640]  __blk_mq_alloc_requests+0x21d/0x3f0
[  399.051657]  blk_mq_submit_bio+0x68d/0xca0
[  399.051674]  __submit_bio+0x1b5/0x2d0
[  399.051708]  submit_bio_noacct+0x34e/0x720
[  399.051718]  submit_bio+0x3b/0x150
[  399.051725]  submit_bh_wbc+0x161/0x230
[  399.051734]  __sync_dirty_buffer+0xd1/0x420
[  399.051744]  sync_dirty_buffer+0x17/0x20
[  399.051750]  __fat_write_inode+0x289/0x310
[  399.051766]  fat_write_inode+0x2a/0xa0
[  399.051783]  __writeback_single_inode+0x53c/0x6f0
[  399.051795]  writeback_single_inode+0x145/0x200
[  399.051803]  sync_inode_metadata+0x45/0x70
[  399.051856]  __generic_file_fsync+0xa3/0x150
[  399.051880]  fat_file_fsync+0x1d/0x80
[  399.051895]  vfs_fsync_range+0x40/0xb0
[  399.051929]  __x64_sys_fsync+0x18/0x30

In my test, 'need_resched()' (which is imported by 590dca3a71 "fs-writeback:
unplug before cond_resched in writeback_sb_inodes") in function
'writeback_sb_inodes()' seldom comes true, unless cond_resched() is deleted
from write_cache_pages().

Fix it by correcting wrote number according number of skipped pages
in writeback_sb_inodes().

Goto Link to find a reproducer.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215837
Cc: stable@vger.kernel.org # v4.3
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220510133805.1988292-1-chengzhihao1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-14 18:11:44 +02:00
..
9p 9P: Cast to loff_t before multiplying 2020-11-05 11:43:34 +01:00
adfs
affs fs/affs: release old buffer head on error path 2021-03-04 10:26:48 +01:00
afs afs: Fix afs_getattr() to refetch file status if callback break occurred 2022-05-25 09:14:39 +02:00
autofs
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:48:39 +01:00
btrfs btrfs: repair super block num_devices automatically 2022-06-14 18:11:24 +02:00
cachefiles cachefiles: Handle readpage error correctly 2020-11-05 11:43:36 +01:00
ceph ceph: fix handling of "meta" errors 2021-10-27 09:54:27 +02:00
cifs cifs: destage any unwritten data to the server before calling copychunk_write 2022-05-09 09:03:26 +02:00
coda
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:41:10 +01:00
cramfs
crypto fscrypt: add fscrypt_symlink_getattr() for computing st_size 2021-09-12 08:56:38 +02:00
debugfs debugfs: lockdown: Allow reading debugfs files that are not world readable 2022-01-27 09:19:36 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:24:34 +01:00
dlm fs: dlm: filter user dlm messages for kernel locks 2022-01-27 09:19:40 +01:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:05:19 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-12-02 08:49:53 +01:00
efs
erofs erofs: fix unsafe pagevec reuse of hooked pclusters 2021-11-21 13:38:51 +01:00
exportfs
ext2 ext2: correct max file size computing 2022-04-15 14:18:13 +02:00
ext4 ext4: reject the 'commit' option on ext2 filesystems 2022-06-14 18:11:38 +02:00
f2fs f2fs: fix fallocate to use file_modified to update permissions consistently 2022-06-14 18:11:44 +02:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-14 18:11:30 +02:00
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-22 12:26:25 +02:00
fuse fuse: fix pipe buffer lifetime for direct_io 2022-03-16 13:21:47 +01:00
gfs2 gfs2: Fix filesystem block deallocation for short writes 2022-05-18 09:47:26 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:19:38 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:08:29 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:24:14 +02:00
hpfs
hugetlbfs mm, hugetlb: allow for "high" userspace addresses 2022-05-09 09:03:28 +02:00
iomap iomap: iomap_write_failed fix 2022-06-14 18:11:36 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:43:03 +01:00
jbd2 jbd2: fix a potential race while discarding reserved buffers after an abort 2022-04-27 13:50:50 +02:00
jffs2 jffs2: fix memory leak in jffs2_scan_medium 2022-04-15 14:17:59 +02:00
jfs fs: jfs: fix possible NULL pointer dereference in dbFree() 2022-06-14 18:11:29 +02:00
kernfs kernfs: do not call fsnotify() with name without a parent 2020-08-19 08:16:12 +02:00
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-22 12:26:34 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-15 14:18:35 +02:00
nfs NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout 2022-06-14 18:11:43 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:51:22 +01:00
nfsd NFSD: Fix possible sleep during nfsd4_release_lockowner() 2022-06-06 08:33:51 +02:00
nilfs2 nilfs2: fix lockdep warnings during disk space reclamation 2022-05-25 09:14:33 +02:00
nls
notify fsnotify: fix wrong lockdep annotations 2022-06-14 18:11:34 +02:00
ntfs ntfs: add sanity check on allocation size 2022-04-15 14:18:23 +02:00
ocfs2 ocfs2: fix crash when initialize filecheck kobj fails 2022-03-23 09:12:06 +01:00
omfs
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-20 09:19:17 +01:00
overlayfs ovl: fix warning in ovl_create_real() 2021-12-22 09:29:40 +01:00
proc proc: fix dentry/inode overinstantiating under /proc/${pid}/net 2022-06-14 18:11:41 +02:00
pstore pstore: Fix typo in compression option name 2021-03-04 10:26:45 +01:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:09:26 +02:00
qnx6
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-02-23 11:59:55 +01:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-29 09:57:53 +01:00
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:21:05 +02:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-26 10:40:51 +02:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:08:29 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2021-03-07 12:20:48 +01:00
sysv
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-03-02 11:41:13 +01:00
ubifs ubifs: Rectify space amount budget for mkdir/tmpfile operations 2022-04-15 14:18:31 +02:00
udf udf: Fix NULL ptr deref when converting from inline format 2022-02-01 17:24:34 +01:00
ufs fs/ufs: avoid potential u32 multiplication overflow 2020-08-21 13:05:37 +02:00
unicode
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-10-06 15:42:30 +02:00
xfs xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate 2022-01-11 15:23:32 +01:00
Kconfig
Kconfig.binfmt
Makefile
aio.c aio: fix use-after-free due to missing POLLFREE handling 2021-12-14 14:49:02 +01:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings 2021-10-06 15:42:35 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-14 18:11:23 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:03:57 +01:00
binfmt_script.c
block_dev.c block: reexpand iov_iter after read/write 2021-05-22 11:38:29 +02:00
buffer.c fs: Don't invalidate page buffers in block_write_full_page() 2020-11-05 11:43:24 +01:00
char_dev.c
compat.c
compat_binfmt_elf.c
compat_ioctl.c
coredump.c coredump: fix core_pattern parse error 2020-12-11 13:23:30 +01:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-29 09:57:45 +01:00
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-14 18:11:41 +02:00
dcache.c fix dget_parent() fastpath race 2020-10-01 13:17:19 +02:00
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:24:11 +02:00
drop_caches.c
eventfd.c eventfd: track eventfd_signal() recursion depth 2020-02-11 04:35:37 -08:00
eventpoll.c ep_create_wakeup_source(): dentry name can change under you... 2020-10-07 08:01:31 +02:00
exec.c exec: Force single empty string when argv is empty 2022-06-06 08:33:50 +02:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-15 09:47:28 +02:00
fhandle.c
file.c fget: clarify and improve __fget_files() implementation 2022-03-02 11:41:18 +01:00
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-05-25 09:14:34 +02:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-17 10:50:21 +02:00
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-06-14 18:11:44 +02:00
fs_context.c memcg: charge fs_context and legacy_fs_context 2022-02-08 18:24:29 +01:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
inode.c futex: Fix inode life-time issue 2020-03-25 08:25:58 +01:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:19:37 +02:00
io_uring.c io_uring: fix fs->users overflow 2022-04-15 14:18:41 +02:00
ioctl.c
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-24 13:29:19 +01:00
locks.c locks: reinstate locks_delete_block optimization 2020-03-25 08:25:41 +01:00
mbcache.c
mount.h
mpage.c
namei.c fsnotify: invalidate dcache before IN_DELETE event 2022-02-01 17:24:39 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:36:22 -04:00
no-block.c
nsfs.c
open.c cifs_atomic_open(): fix double-put on late allocation failure 2020-03-18 07:17:51 +01:00
pipe.c pipe: increase minimum default pipe size to 2 pages 2021-08-12 13:21:02 +02:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-05-02 08:48:44 +02:00
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:03:33 +01:00
posix_acl.c
proc_namespace.c
read_write.c
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 12:56:16 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:25:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:10:54 +02:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-14 14:49:02 +01:00
splice.c
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:50:48 +02:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-02-23 11:59:55 +01:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: prevent concurrent API initialization 2021-09-22 12:26:26 +02:00
utimes.c
xattr.c xattr: break delegations in {set,remove}xattr 2020-08-11 15:33:39 +02:00