linux-stable/fs/xfs
Darrick J. Wong cb357bf3d1 xfs: implement per-inode writeback completion queues
When scheduling writeback of dirty file data in the page cache, XFS uses
IO completion workqueue items to ensure that filesystem metadata only
updates after the write completes successfully.  This is essential for
converting unwritten extents to real extents at the right time and
performing COW remappings.

Unfortunately, XFS queues each IO completion work item to an unbounded
workqueue, which means that the kernel can spawn dozens of threads to
try to handle the items quickly.  These threads need to take the ILOCK
to update file metadata, which results in heavy ILOCK contention if a
large number of the work items target a single file, which is
inefficient.

Worse yet, the writeback completion threads get stuck waiting for the
ILOCK while holding transaction reservations, which can use up all
available log reservation space.  When that happens, metadata updates to
other parts of the filesystem grind to a halt, even if the filesystem
could otherwise have handled it.

Even worse, if one of the things grinding to a halt happens to be a
thread in the middle of a defer-ops finish holding the same ILOCK and
trying to obtain more log reservation having exhausted the permanent
reservation, we now have an ABBA deadlock - writeback completion has a
transaction reserved and wants the ILOCK, and someone else has the ILOCK
and wants a transaction reservation.

Therefore, we create a per-inode writeback io completion queue + work
item.  When writeback finishes, it can add the ioend to the per-inode
queue and let the single worker item process that queue.  This
dramatically cuts down on the number of kworkers and ILOCK contention in
the system, and seems to have eliminated an occasional deadlock I was
seeing while running generic/476.

Testing with a program that simulates a heavy random-write workload to a
single file demonstrates that the number of kworkers drops from
approximately 120 threads per file to 1, without dramatically changing
write bandwidth or pagecache access latency.

Note that we leave the xfs-conv workqueue's max_active alone because we
still want to be able to run ioend processing for as many inodes as the
system can handle.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-04-16 10:01:57 -07:00
..
libxfs xfs: report inode health via bulkstat 2019-04-14 18:15:58 -07:00
scrub xfs: scrub should only cross-reference with healthy btrees 2019-04-16 10:01:57 -07:00
Kconfig
kmem.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
kmem.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
Makefile xfs: scrub/repair should update filesystem metadata health 2019-04-16 10:01:57 -07:00
mrlock.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs.h xfs: remove b_last_holder & associated macros 2018-08-12 08:37:31 -07:00
xfs_acl.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_acl.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_aops.c xfs: implement per-inode writeback completion queues 2019-04-16 10:01:57 -07:00
xfs_aops.h xfs: implement per-inode writeback completion queues 2019-04-16 10:01:57 -07:00
xfs_attr_inactive.c xfs: remove all boilerplate defer init/finish code 2018-07-26 10:15:15 -07:00
xfs_attr_list.c xfs: don't overflow xattr listent buffer 2019-02-14 09:36:52 -08:00
xfs_bmap_item.c xfs: pass transaction to xfs_defer_add() 2018-08-02 23:05:14 -07:00
xfs_bmap_item.h xfs: use transaction for intent recovery instead of raw dfops 2018-08-02 23:05:13 -07:00
xfs_bmap_util.c xfs: introduce an always_cow mode 2019-02-21 07:55:07 -08:00
xfs_bmap_util.h xfs: flush removing page cache in xfs_reflink_remap_prep 2018-11-21 10:10:53 -08:00
xfs_buf.c xfs: fix xfs_buf magic number endian checks 2019-02-18 09:38:41 -08:00
xfs_buf.h xfs: fix xfs_buf magic number endian checks 2019-02-18 09:38:41 -08:00
xfs_buf_item.c xfs: fix use after free in buf log item unlock assert 2019-04-14 18:15:56 -07:00
xfs_buf_item.h xfs: refactor xfs_buf_log_item reference count handling 2018-09-29 13:45:26 +10:00
xfs_dir2_readdir.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_discard.c xfs,fstrim: fix to return correct minlen 2019-04-14 18:15:57 -07:00
xfs_discard.h
xfs_dquot.c xfs: remove dead error handling code in xfs_dquot_disk_alloc() 2018-08-07 10:57:13 -07:00
xfs_dquot.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_dquot_item.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_dquot_item.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_error.c xfs: cache unlinked pointers in an rhashtable 2019-02-11 16:07:01 -08:00
xfs_error.h xfs: Introduce XFS_PTAG_VERIFIER_ERROR panic mask 2019-02-11 16:07:00 -08:00
xfs_export.c xfs: clean up IRELE/iput callsites 2018-07-26 10:15:16 -07:00
xfs_export.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_extent_busy.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_extent_busy.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_extfree_item.c xfs: remove xfs_rmap_ag_owner and friends 2018-12-12 08:47:16 -08:00
xfs_extfree_item.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_file.c xfs: serialize unaligned dio writes against all other dio writes 2019-03-26 08:37:55 -07:00
xfs_filestream.c xfs: replace dop_low with transaction flag 2018-08-02 23:05:13 -07:00
xfs_filestream.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_fsmap.c xfs: trivial xfs_btree_del_cursor cleanups 2018-07-23 09:08:00 -07:00
xfs_fsmap.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_fsops.c xfs: reserve blocks for ifree transaction during log recovery 2019-02-14 22:42:57 -08:00
xfs_fsops.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_globals.c xfs: Introduce XFS_PTAG_VERIFIER_ERROR panic mask 2019-02-11 16:07:00 -08:00
xfs_health.c xfs: report inode health via bulkstat 2019-04-14 18:15:58 -07:00
xfs_icache.c xfs: implement per-inode writeback completion queues 2019-04-16 10:01:57 -07:00
xfs_icache.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_icreate_item.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_icreate_item.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_inode.c xfs: shutdown after buf release in iflush cluster abort path 2019-04-14 18:15:56 -07:00
xfs_inode.h xfs: implement per-inode writeback completion queues 2019-04-16 10:01:57 -07:00
xfs_inode_item.c xfs: remove if_real_bytes 2018-07-30 07:57:48 -07:00
xfs_inode_item.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_ioctl.c xfs: report fs and rt health via geometry structure 2019-04-14 18:15:57 -07:00
xfs_ioctl.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_ioctl32.c xfs: add a new ioctl to describe allocation group geometry 2019-04-14 18:15:57 -07:00
xfs_ioctl32.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_iomap.c xfs: rework breaking of shared extents in xfs_file_iomap_begin 2019-02-25 09:26:18 -08:00
xfs_iomap.h xfs: fix SEEK_DATA for speculative COW fork preallocation 2019-02-21 07:55:07 -08:00
xfs_iops.c xfs: fix reporting supported extra file attributes for statx() 2019-03-01 08:57:25 -08:00
xfs_iops.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_itable.c xfs: report inode health via bulkstat 2019-04-14 18:15:58 -07:00
xfs_itable.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_linux.h xfs: replace do_mod with native operations 2018-06-08 10:07:52 -07:00
xfs_log.c xfs: replace the BAD_SUMMARY mount flag with the equivalent health code 2019-04-14 18:15:57 -07:00
xfs_log.h xfs: refactor log recovery check 2018-08-01 07:40:48 -07:00
xfs_log_cil.c xfs: wake commit waiters on CIL abort before log item abort 2019-04-14 18:15:56 -07:00
xfs_log_priv.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_log_recover.c xfs: fix xfs_buf magic number endian checks 2019-02-18 09:38:41 -08:00
xfs_message.c xfs: print buffer offsets when dumping corrupt buffers 2018-11-06 07:50:50 -08:00
xfs_message.h
xfs_mount.c xfs: clear BAD_SUMMARY if unmounting an unhealthy filesystem 2019-04-14 18:15:57 -07:00
xfs_mount.h xfs: replace the BAD_SUMMARY mount flag with the equivalent health code 2019-04-14 18:15:57 -07:00
xfs_mru_cache.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_mru_cache.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_ondisk.h xfs: compile time offset checks for common v4/v5 metadata 2019-02-11 16:07:01 -08:00
xfs_pnfs.c xfs: make xfs_bmbt_to_iomap more useful 2019-02-21 07:55:07 -08:00
xfs_pnfs.h
xfs_qm.c xfs: clean up IRELE/iput callsites 2018-07-26 10:15:16 -07:00
xfs_qm.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_qm_bhv.c fs/xfs: fix f_ffree value for statfs when project quota is set 2018-11-26 15:01:37 -08:00
xfs_qm_syscalls.c xfs: clean up IRELE/iput callsites 2018-07-26 10:15:16 -07:00
xfs_quota.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_quotaops.c xfs: clean up IRELE/iput callsites 2018-07-26 10:15:16 -07:00
xfs_refcount_item.c xfs: pass transaction to xfs_defer_add() 2018-08-02 23:05:14 -07:00
xfs_refcount_item.h xfs: use transaction for intent recovery instead of raw dfops 2018-08-02 23:05:13 -07:00
xfs_reflink.c xfs: fix uninitialized error variables 2019-02-25 10:16:41 -08:00
xfs_reflink.h xfs: don't pass iomap flags to xfs_reflink_allocate_cow 2019-02-25 09:04:31 -08:00
xfs_rmap_item.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_rmap_item.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_rtalloc.c xfs: reallocate realtime summary cache on growfs 2018-12-21 18:45:18 -08:00
xfs_rtalloc.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_stats.c xfs: use offsetof() in place of offset macros for __xfsstats 2018-10-18 17:21:39 +11:00
xfs_stats.h xfs: use offsetof() in place of offset macros for __xfsstats 2018-10-18 17:21:39 +11:00
xfs_super.c xfs: introduce an always_cow mode 2019-02-21 07:55:07 -08:00
xfs_super.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_symlink.c xfs: zero length symlinks are not valid 2018-12-12 08:47:15 -08:00
xfs_symlink.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_sysctl.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_sysctl.h xfs: introduce an always_cow mode 2019-02-21 07:55:07 -08:00
xfs_sysfs.c xfs: introduce an always_cow mode 2019-02-21 07:55:07 -08:00
xfs_sysfs.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_trace.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_trace.h xfs: clear BAD_SUMMARY if unmounting an unhealthy filesystem 2019-04-14 18:15:57 -07:00
xfs_trans.c xfs: avoid lockdep false positives in xfs_trans_alloc 2018-09-29 13:46:21 +10:00
xfs_trans.h xfs: const-ify xfs_owner_info arguments 2018-12-12 08:47:16 -08:00
xfs_trans_ail.c xfs: clear ail delwri queued bufs on unmount of shutdown fs 2018-10-18 17:21:49 +11:00
xfs_trans_bmap.c xfs: remove duplicated xfs_defer.h 2019-02-11 16:07:00 -08:00
xfs_trans_buf.c xfs: clarify documentation for the function to reverify buffers 2019-02-11 16:07:01 -08:00
xfs_trans_dquot.c xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_trans_extfree.c xfs: remove duplicated xfs_defer.h 2019-02-11 16:07:00 -08:00
xfs_trans_inode.c vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
xfs_trans_priv.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_trans_refcount.c xfs: remove duplicated xfs_defer.h 2019-02-11 16:07:00 -08:00
xfs_trans_rmap.c xfs: remove duplicated xfs_defer.h 2019-02-11 16:07:00 -08:00
xfs_xattr.c xfs: don't overflow xattr listent buffer 2019-02-14 09:36:52 -08:00