linux-stable/fs
John Stultz 9145880e8c pstore: Revert pmsg_lock back to a normal mutex
[ Upstream commit 5239a89b06 ]

This reverts commit 76d62f24db.

So while priority inversion on the pmsg_lock is an occasional
problem that an rt_mutex would help with, in uses where logging
is writing to pmsg heavily from multiple threads, the pmsg_lock
can be heavily contended.

After this change landed, it was reported that cases where the
mutex locking overhead was commonly adding on the order of 10s
of usecs delay had suddenly jumped to ~msec delay with rtmutex.

It seems the slight differences in the locks under this level
of contention cause the normal mutexes to utilize the spinning
optimizations, while the rtmutexes end up in the sleeping
slowpath (which allows additional threads to pile on trying
to take the lock).
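
The difference between a spinning acquisition and the sleeping
slowpath can be illustrated with a minimal userspace sketch. This
is not the kernel implementation: SPIN_LIMIT and
spin_then_sleep_lock() are hypothetical names, and a pthread mutex
stands in for the kernel lock. The kernel mutex spins only while
the lock owner is running on a CPU, which this sketch does not
model.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SPIN_LIMIT 1000	/* hypothetical spin budget, not a kernel constant */

/*
 * Poll the lock a bounded number of times before falling back to a
 * blocking acquire. Returns 1 if the polling phase got the lock and
 * 0 if the caller had to take the blocking path.
 */
static int spin_then_sleep_lock(pthread_mutex_t *lock)
{
	assert(lock != NULL);

	for (int i = 0; i < SPIN_LIMIT; i++) {
		if (pthread_mutex_trylock(lock) == 0)
			return 1;	/* fast path: acquired without blocking */
	}

	pthread_mutex_lock(lock);	/* slow path: the thread may be put to sleep */
	return 0;
}
```

When the critical section is only microseconds long, a waiter that
stays in the polling loop usually gets the lock without paying for
a sleep and wakeup, which is the behavior described above for the
normal mutex.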

In this case, it devolves to a worst-case scenario where the lock
acquisition and scheduling overhead dominates, and each thread
is waiting on the order of ~ms to do ~us of work.

Obviously, having tons of threads all contending on a single
lock for logging is non-optimal, so the proper fix is probably
reworking pstore pmsg to have per-cpu buffers so we don't have
contention.
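
As a rough illustration of that idea, here is a userspace sketch
that splits one log buffer into several independently locked
shards. It is not pstore code: NSHARDS, SHARD_CAP, struct log_shard
and the shard_* helpers are hypothetical, and the writer id stands
in for a CPU number.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define NSHARDS   4	/* hypothetical: stands in for the number of CPUs */
#define SHARD_CAP 256	/* hypothetical per-shard capacity in bytes */

struct log_shard {
	pthread_mutex_t lock;
	size_t used;
	char data[SHARD_CAP];
};

static struct log_shard shards[NSHARDS];

static void shards_init(void)
{
	for (size_t i = 0; i < NSHARDS; i++) {
		pthread_mutex_init(&shards[i].lock, NULL);
		shards[i].used = 0;
	}
}

/*
 * Append len bytes to the shard selected by writer_id. Only that
 * shard's lock is taken, so writers mapped to different shards do
 * not contend. Returns the number of bytes stored, or 0 if the
 * message does not fit.
 */
static size_t shard_write(unsigned int writer_id, const char *msg, size_t len)
{
	struct log_shard *s = &shards[writer_id % NSHARDS];
	size_t written = 0;

	assert(msg != NULL);

	pthread_mutex_lock(&s->lock);
	if (len <= SHARD_CAP - s->used) {
		memcpy(s->data + s->used, msg, len);
		s->used += len;
		written = len;
	}
	pthread_mutex_unlock(&s->lock);

	return written;
}

/* Sum the bytes stored across all shards, locking one shard at a time. */
static size_t shards_total(void)
{
	size_t total = 0;

	for (size_t i = 0; i < NSHARDS; i++) {
		pthread_mutex_lock(&shards[i].lock);
		total += shards[i].used;
		pthread_mutex_unlock(&shards[i].lock);
	}
	return total;
}
```

The trade-off is that a reader has to visit every shard and merge
the results, and ordering between messages in different shards is
no longer implied by their position in a single buffer.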

Additionally, Steven Rostedt has provided some further
optimizations for rtmutexes that improve the rtmutex spinning
path, but at least in my testing, I still see the test tripping
into the sleeping path on rtmutexes while utilizing the spinning
path with mutexes.

But in the short term, let's revert the change to the rt_mutex
and go back to normal mutexes to avoid a potentially major
performance regression. And we can work on optimizations to both
rtmutexes and finer-grained locking for pstore pmsg in the
future.

Cc: Wei Wang <wvw@google.com>
Cc: Midas Chien <midaschieh@google.com>
Cc: "Chunhui Li (李春辉)" <chunhui.li@mediatek.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kernel-team@android.com
Fixes: 76d62f24db ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion")
Reported-by: "Chunhui Li (李春辉)" <chunhui.li@mediatek.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230308204043.2061631-1-jstultz@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11 23:03:27 +09:00
9p use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
adfs
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:34:08 +01:00
afs use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
autofs
befs
bfs
btrfs btrfs: fix uninitialized variable warnings 2023-05-01 08:26:27 +09:00
cachefiles cachefiles: use vfs_tmpfile_open() helper 2022-09-24 07:00:00 +02:00
ceph ceph: fix potential use-after-free bug when trimming caps 2023-05-11 23:03:05 +09:00
cifs cifs: fix negotiate context parsing 2023-04-20 12:35:14 +02:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-10 09:33:52 +01:00
configfs configfs: fix possible memory leak in configfs_create_dir() 2022-12-31 13:32:22 +01:00
cramfs fs/cramfs/inode.c: initialize file_ra_state 2023-03-10 09:34:09 +01:00
crypto blk-crypto: add a blk_crypto_config_supported_natively helper 2023-05-11 23:03:00 +09:00
debugfs debugfs: fix error when writing negative value to atomic_t debugfs file 2022-12-31 13:31:58 +01:00
devpts
dlm fs: dlm: fix race setting stop tx flag 2023-03-17 08:50:19 +01:00
ecryptfs whack-a-mole: constifying struct path * 2022-10-06 17:31:02 -07:00
efivarfs efi: efivars: Fix variable writes without query_variable_store() 2022-10-21 11:09:40 +02:00
efs
erofs erofs: fix potential overflow calculating xattr_isize 2023-05-11 23:03:07 +09:00
exfat exfat: fix inode->i_blocks for non-512 byte sector size device 2023-03-10 09:34:08 +01:00
exportfs
ext2 ext2: unbugger ext2_empty_dir() 2023-01-07 11:11:40 +01:00
ext4 ext4: fix possible double unlock when moving a directory 2023-03-22 13:33:55 +01:00
f2fs f2fs: fix to check return value of inc_valid_block_count() 2023-05-11 23:03:23 +09:00
fat treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
freevxfs
fscache fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() 2023-02-22 12:59:43 +01:00
fuse fuse: always revalidate rename target dentry 2023-04-26 14:28:42 +02:00
gfs2 gfs2: Improve gfs2_make_fs_rw error handling 2023-03-10 09:33:59 +01:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-10 09:34:07 +01:00
hfsplus fs: hfsplus: fix UAF issue in hfsplus_put_super 2023-03-10 09:34:07 +01:00
hostfs hostfs: move from strlcpy with unused retval to strscpy 2022-09-19 22:46:25 +02:00
hpfs
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2022-12-31 13:33:05 +01:00
iomap iomap: add a tracepoint for mappings returned by map_blocks 2022-10-02 11:42:19 -07:00
isofs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
jbd2 jdb2: Don't refuse invalidation of already invalidated buffers 2023-05-11 23:03:23 +09:00
jffs2 jffs2: correct logic when creating a hole in jffs2_write_begin 2023-03-22 13:33:53 +01:00
jfs fs/jfs: fix shift exponent db_agl2size negative 2023-03-11 13:55:16 +01:00
kernfs kernfs: Fix spurious lockdep warning in kernfs_find_and_get_node_by_id() 2022-11-10 19:03:42 +01:00
ksmbd ksmbd: fix deadlock in ksmbd_find_crypto_ctx() 2023-05-11 23:03:04 +09:00
lockd lockd: set file_lock start and end when decoding nlm4 testargs 2023-03-30 12:49:23 +02:00
minix vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
netfs use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
nfs NFSv4: Fix hangs when recovering open state after a server reboot 2023-04-06 12:10:54 +02:00
nfs_common
nfsd NFSD: callback request does not use correct credential for AUTH_SYS 2023-04-13 16:55:23 +02:00
nilfs2 nilfs2: initialize unused bytes in segment summary blocks 2023-04-26 14:28:39 +02:00
nls
notify Merge tag 'fsnotify-for_v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2022-10-07 08:28:50 -07:00
ntfs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
ntfs3 fs/ntfs3: Validate attribute data and valid sizes 2023-02-09 11:28:26 +01:00
ocfs2 ocfs2: fix data corruption after failed write 2023-03-22 13:34:02 +01:00
omfs
openpromfs
orangefs use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
overlayfs ovl: Use "buf" flexible array for memcpy() destination 2023-02-09 11:28:26 +01:00
proc mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps 2023-02-09 11:28:20 +01:00
pstore pstore: Revert pmsg_lock back to a normal mutex 2023-05-11 23:03:27 +09:00
qnx4
qnx6 fs/qnx6: delete unnecessary checks before brelse() 2022-09-11 21:55:07 -07:00
quota ext4: fix bug_on in __es_tree_search caused by bad quota inode 2023-01-07 11:11:59 +01:00
ramfs tmpfile API change 2022-10-10 19:45:17 -07:00
reiserfs reiserfs: Add security prefix to xattr name in reiserfs_security_write() 2023-05-11 23:03:02 +09:00
romfs
smbfs_common smb3: define missing create contexts 2022-10-05 01:55:27 -05:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:59:50 +01:00
sysfs
sysv fs: sysv: Fix sysv_nblocks() returns wrong value 2022-12-31 13:32:00 +01:00
tracefs tracefs: Only clobber mode/uid/gid on remount if asked 2022-09-08 17:10:54 -04:00
ubifs ubifs: Fix memory leak in do_rename 2023-05-11 23:03:05 +09:00
udf udf: Fix off-by-one error when discarding preallocation 2023-03-17 08:50:19 +01:00
ufs ufs: replace ll_rw_block() 2022-09-11 20:26:07 -07:00
unicode
vboxsf
verity fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY 2023-04-06 12:10:34 +02:00
xfs xfs: don't consider future format versions valid 2023-05-11 23:03:05 +09:00
zonefs zonefs: Always invalidate last cached page on append write 2023-04-06 12:10:52 +02:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:59:46 +01:00
anon_inodes.c
attr.c attr: use consistent sgid stripping checks 2023-03-03 11:52:25 +01:00
bad_inode.c vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
binfmt_elf.c elfcore: Add a cprm parameter to elf_core_extra_{phdrs,data_size} 2023-01-18 11:58:12 +01:00
binfmt_elf_fdpic.c elfcore: Add a cprm parameter to elf_core_extra_{phdrs,data_size} 2023-01-18 11:58:12 +01:00
binfmt_elf_test.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-31 13:32:57 +01:00
binfmt_script.c
buffer.c - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-31 13:32:41 +01:00
compat_binfmt_elf.c
coredump.c coredump: Move dump_emit_page() to kill unused warning 2023-02-22 12:59:50 +01:00
d_path.c d_path.c: typo fix... 2022-08-20 11:34:33 -04:00
dax.c Merge branch 'for-6.0/dax' into libnvdimm-fixes 2022-09-24 18:14:12 -07:00
dcache.c tmpfile API change 2022-10-10 19:45:17 -07:00
direct-io.c block: remove PSI accounting from the bio layer 2022-09-20 08:24:38 -06:00
drop_caches.c
eventfd.c eventfd: provide a eventfd_signal_mask() helper 2023-01-04 11:28:48 +01:00
eventpoll.c eventpoll: add EPOLL_URING_WAKE poll wakeup flag 2023-01-04 11:28:47 +01:00
exec.c 23 hotfixes. 2022-10-29 17:49:33 -07:00
fcntl.c
fhandle.c do_sys_name_to_handle(): constify path 2022-09-01 17:36:39 -04:00
file.c fs: prevent out-of-bounds array speculation when closing a file descriptor 2023-03-17 08:50:13 +01:00
file_table.c
filesystems.c
fs-writeback.c writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs 2023-04-26 14:28:39 +02:00
fs_context.c
fs_parser.c ext4: journal_path mount options should follow links 2023-01-07 11:11:59 +01:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
init.c
inode.c attr: use consistent sgid stripping checks 2023-03-03 11:52:25 +01:00
internal.h attr: use consistent sgid stripping checks 2023-03-03 11:52:25 +01:00
ioctl.c
Kconfig hugetlb: make hugetlb depends on SYSFS or SYSCTL 2022-09-11 20:26:10 -07:00
Kconfig.binfmt Xtensa updates for v6.1 2022-10-10 14:21:11 -07:00
kernel_read_file.c
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2022-12-31 13:31:58 +01:00
locks.c filelocks: use mount idmapping for setlease permission check 2023-03-17 08:50:32 +01:00
Makefile fs: fix sysctls.c built 2023-05-11 23:03:01 +09:00
mbcache.c ext4: fix deadlock due to mbcache entry corruption 2023-01-07 11:12:02 +01:00
mount.h
mpage.c
namei.c vfs: vfs_tmpfile: ensure O_EXCL flag is enforced 2022-11-19 02:22:11 -05:00
namespace.c fs: drop peer group ids under namespace lock 2023-04-13 16:55:33 +02:00
no-block.c
nsfs.c
open.c fs: Use CHECK_DATA_CORRUPTION() when kernel bugs are detected 2023-03-10 09:33:46 +01:00
pipe.c
pnode.c pnode: terminate at peers of source 2023-01-04 11:29:01 +01:00
pnode.h
posix_acl.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
proc_namespace.c
read_write.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
readdir.c
remap_range.c
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
signalfd.c
splice.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
stack.c
stat.c vfs: support STATX_DIOALIGN on block devices 2022-09-11 19:47:12 -05:00
statfs.c
super.c fscrypt: destroy keyring after security_sb_delete() 2023-03-30 12:49:23 +02:00
sync.c
sysctls.c
timerfd.c
userfaultfd.c Revert "userfaultfd: don't fail on unrecognized features" 2023-04-26 14:28:37 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2022-12-31 13:31:55 +01:00