linux-stable/fs
David Howells e0cda159c8 afs: Fix refcount underflow from error handling race
[ Upstream commit 52bf9f6c09 ]

If an AFS cell has an unreachable (e.g. ENETUNREACH) server listed (VL
server or fileserver), an asynchronous probe to one of its addresses may
fail immediately because sendmsg() returns an error.  When this happens, a
refcount underflow can occur if certain events hit a very small window.

The way this occurs is:

 (1) There are two levels of "call" object, the afs_call and the
     rxrpc_call.  Each of them can be transitioned to a "completed" state
     in the event of success or failure.

 (2) Asynchronous afs_calls are self-referential whilst they are active to
     prevent them from evaporating when they're not being processed.  This
     reference is disposed of when the afs_call is completed.

     Note that an afs_call may only be completed once; once completed,
     completing it again will do nothing.

 (3) When a call transmission is made, the app-side rxrpc code queues a Tx
     buffer for the rxrpc I/O thread to transmit.  The I/O thread invokes
     sendmsg() to transmit it - and in the case of failure, it transitions
     the rxrpc_call to the completed state.

 (4) When an rxrpc_call is completed, the app layer is notified.  In this
     case, the app is kafs and it schedules a work item to process events
     pertaining to an afs_call.

 (5) When the afs_call event processor is run, it goes down through the
     RPC-specific handler to afs_extract_data() to retrieve data from rxrpc
     - and, in this case, it picks up the error from the rxrpc_call and
     returns it.

     The error is then propagated to the afs_call and that is completed
     too.  At this point the self-reference is released.

 (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the
     window between rxrpc_send_data() queuing the request packet and
     checking for call completion on the way out, then
     rxrpc_kernel_send_data() will return the error from sendmsg() to the
     app.

 (7) Then afs_make_call() will see an error and will jump to the error
     handling path which will attempt to clean up the afs_call.

 (8) The problem comes when the error handling path in afs_make_call()
     tries to unconditionally drop an async afs_call's self-reference.
     This self-reference, however, may already have been dropped by
     afs_extract_data() completing the afs_call.

 (9) The refcount underflows when we return to afs_do_probe_vlserver() and
     that tries to drop its reference on the afs_call.

Fix this by making afs_make_call() attempt to complete the afs_call rather
than unconditionally putting it.  That way, if afs_extract_data() manages
to complete the call first, afs_make_call() won't do anything.
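
The "complete once" idea behind the fix can be illustrated with a small
userspace sketch.  This is not the kafs code: struct toy_call, toy_put() and
toy_complete() are hypothetical names invented for illustration, and C11
atomics stand in for whatever locking the kernel actually uses.  The sketch
assumes one reference held by the caller plus one self-reference, and shows
that only the first completion drops the self-reference.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an async call object (not struct afs_call). */
struct toy_call {
	atomic_int  ref;        /* reference count */
	atomic_bool completed;  /* flipped exactly once, by the first completer */
	int         error;      /* error recorded by the first completer */
};

/* Start with two refs: the caller's ref and the call's self-reference. */
static void toy_call_init(struct toy_call *call)
{
	atomic_init(&call->ref, 2);
	atomic_init(&call->completed, false);
	call->error = 0;
}

/* Drop one reference and return the new count (negative means underflow). */
static int toy_put(struct toy_call *call)
{
	return atomic_fetch_sub(&call->ref, 1) - 1;
}

/*
 * Complete the call at most once.  Only the caller that wins the
 * compare-and-swap records the error and drops the self-reference;
 * any later caller does nothing and gets false back.
 */
static bool toy_complete(struct toy_call *call, int error)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&call->completed, &expected, true))
		return false;
	call->error = error;
	toy_put(call);
	return true;
}
```

In this model, an error path that calls toy_complete() instead of toy_put()
cannot drop the self-reference a second time: if the event processor has
already completed the call, the second toy_complete() is a no-op and the
caller's final toy_put() brings the count to zero rather than below it.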

The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and
sticking an msleep() in rxrpc_send_data() after the 'success:' label to
widen the race window.

The error message looks something like:

    refcount_t: underflow; use-after-free.
    WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
    ...
    RIP: 0010:refcount_warn_saturate+0xba/0x110
    ...
    afs_put_call+0x1dc/0x1f0 [kafs]
    afs_fs_get_capabilities+0x8b/0xe0 [kafs]
    afs_fs_probe_fileserver+0x188/0x1e0 [kafs]
    afs_lookup_server+0x3bf/0x3f0 [kafs]
    afs_alloc_server_list+0x130/0x2e0 [kafs]
    afs_create_volume+0x162/0x400 [kafs]
    afs_get_tree+0x266/0x410 [kafs]
    vfs_get_tree+0x25/0xc0
    fc_mount+0xe/0x40
    afs_d_automount+0x1b3/0x390 [kafs]
    __traverse_mounts+0x8f/0x210
    step_into+0x340/0x760
    path_openat+0x13a/0x1260
    do_filp_open+0xaf/0x160
    do_sys_openat2+0xaf/0x170

or something like:

    refcount_t: underflow; use-after-free.
    ...
    RIP: 0010:refcount_warn_saturate+0x99/0xda
    ...
    afs_put_call+0x4a/0x175
    afs_send_vl_probes+0x108/0x172
    afs_select_vlserver+0xd6/0x311
    afs_do_cell_detect_alias+0x5e/0x1e9
    afs_cell_detect_alias+0x44/0x92
    afs_validate_fc+0x9d/0x134
    afs_get_tree+0x20/0x2e6
    vfs_get_tree+0x1d/0xc9
    fc_mount+0xe/0x33
    afs_d_automount+0x48/0x9d
    __traverse_mounts+0xe0/0x166
    step_into+0x140/0x274
    open_last_lookups+0x1c1/0x1df
    path_openat+0x138/0x1c3
    do_filp_open+0x55/0xb4
    do_sys_openat2+0x6c/0xb6

Fixes: 34fa47612b ("afs: Fix race in async call refcounting")
Reported-by: Bill MacAllister <bill@ca-zephyr.org>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304
Suggested-by: Jeffrey E Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-20 17:00:15 +01:00
9p 9p: v9fs_listxattr: fix %s null argument warning 2023-11-28 17:07:01 +00:00
adfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:34:08 +01:00
afs afs: Fix refcount underflow from error handling race 2023-12-20 17:00:15 +01:00
autofs autofs: fix memory leak of waitqueues in autofs_catatonic_mode 2023-09-23 11:10:59 +02:00
befs befs: Convert befs_symlink_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
bfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
btrfs btrfs: fix 64bit compat send ioctl arguments not initializing version member 2023-12-08 08:51:16 +01:00
cachefiles cachefiles: use vfs_tmpfile_open() helper 2022-09-24 07:00:00 +02:00
ceph ceph_wait_on_conflict_unlink(): grab reference before dropping ->d_lock 2023-11-08 14:11:02 +01:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-10 09:33:52 +01:00
configfs configfs: fix possible memory leak in configfs_create_dir() 2022-12-31 13:32:22 +01:00
cramfs fs/cramfs/inode.c: initialize file_ra_state 2023-03-10 09:34:09 +01:00
crypto blk-crypto: add a blk_crypto_config_supported_natively helper 2023-05-11 23:03:00 +09:00
debugfs debugfs: fix error when writing negative value to atomic_t debugfs file 2022-12-31 13:31:58 +01:00
devpts
dlm dlm: fix plock lookup when using multiple lockspaces 2023-09-13 09:43:02 +02:00
ecryptfs whack-a-mole: constifying struct path * 2022-10-06 17:31:02 -07:00
efivarfs efi: efivars: Fix variable writes without query_variable_store() 2022-10-21 11:09:40 +02:00
efs efs: Convert efs symlinks to read_folio 2022-05-09 16:21:45 -04:00
erofs erofs: fix memory leak of LZMA global compressed deduplication 2023-10-10 22:00:39 +02:00
exfat exfat: support handle zero-size directory 2023-11-28 17:07:00 +00:00
exportfs Change calling conventions for filldir_t 2022-08-17 17:25:04 -04:00
ext2 ext2: fix datatype of block number in ext2_xattr_set2() 2023-09-23 11:11:05 +02:00
ext4 ext4: fix warning in ext4_dio_write_end_io() 2023-12-20 17:00:14 +01:00
f2fs f2fs: avoid format-overflow warning 2023-11-28 17:07:19 +00:00
fat treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
freevxfs freevxfs: Convert vxfs_immed_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
fscache fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() 2023-02-22 12:59:43 +01:00
fuse fuse: nlookup missing decrement in fuse_direntplus_link 2023-09-19 12:28:05 +02:00
gfs2 gfs2: Silence "suspicious RCU usage in gfs2_permission" warning 2023-11-28 17:07:04 +00:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-10 09:34:07 +01:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-05-24 17:32:34 +01:00
hostfs hostfs: move from strlcpy with unused retval to strscpy 2022-09-19 22:46:25 +02:00
hpfs hpfs: Convert symlinks to read_folio 2022-05-09 16:21:45 -04:00
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2022-12-31 13:33:05 +01:00
iomap iomap: update ki_pos a little later in iomap_dio_complete 2023-12-08 08:51:20 +01:00
isofs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
jbd2 jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev 2023-11-28 17:07:13 +00:00
jffs2 jffs2: reduce stack usage in jffs2_build_xattr_subsystem() 2023-07-19 16:22:11 +02:00
jfs jfs: fix array-index-out-of-bounds in diAlloc 2023-11-28 17:06:59 +00:00
kernfs kernfs: fix missing kernfs_idr_lock to remove an ID from the IDR 2023-07-19 16:21:53 +02:00
lockd fs: lockd: avoid possible wrong NULL parameter 2023-09-13 09:42:49 +02:00
minix vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-10-06 14:56:32 +02:00
nfs NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO 2023-11-28 17:07:04 +00:00
nfs_common
nfsd NFSD: Fix checksum mismatches in the duplicate reply cache 2023-12-03 07:32:10 +01:00
nilfs2 nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() 2023-12-13 18:39:19 +01:00
nls fs/nls: make load_nls() take a const parameter 2023-09-13 09:42:22 +02:00
notify fanotify: disallow mount/sb marks on kernel internal pseudo fs 2023-07-19 16:22:05 +02:00
ntfs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
ntfs3 fs/ntfs3: Avoid possible memory leak 2023-11-08 14:10:59 +01:00
ocfs2 fs: ocfs2: namei: check return value of ocfs2_add_entry() 2023-09-13 09:42:33 +02:00
omfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
openpromfs
orangefs use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
overlayfs ima: detect changes to the backing overlay file 2023-11-28 17:07:12 +00:00
proc watchdog: move softlockup_panic back to early_param 2023-11-28 17:07:09 +00:00
pstore pstore/platform: Add check for kstrdup 2023-11-20 11:51:50 +01:00
qnx4 fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
qnx6 fs/qnx6: delete unnecessary checks before brelse() 2022-09-11 21:55:07 -07:00
quota quota: explicitly forbid quota files from being encrypted 2023-11-28 17:07:13 +00:00
ramfs shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs 2023-07-19 16:22:11 +02:00
reiserfs reiserfs: Check the return value from __getblk() 2023-09-13 09:42:27 +02:00
romfs romfs: Convert romfs to read_folio 2022-05-09 16:21:46 -04:00
smb ksmbd: fix memory leak in smb2_lock() 2023-12-20 17:00:14 +01:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:59:50 +01:00
sysfs
sysv fs/sysv: Null check to prevent null-ptr-deref bug 2023-08-11 12:08:23 +02:00
tracefs tracefs: Add missing lockdown check to tracefs_create_dir() 2023-09-23 11:11:12 +02:00
ubifs ubifs: Fix memory leak in do_rename 2023-05-11 23:03:05 +09:00
udf udf: initialize newblock to 0 2023-09-13 09:43:05 +02:00
ufs ufs: replace ll_rw_block() 2022-09-11 20:26:07 -07:00
unicode
vboxsf vboxsf: Convert vboxsf to read_folio 2022-05-09 16:21:46 -04:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-09-13 09:43:03 +02:00
xfs xfs: recovery should not clear di_flushiter unconditionally 2023-11-28 17:07:15 +00:00
zonefs zonefs: Always invalidate last cached page on append write 2023-04-06 12:10:52 +02:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:59:46 +01:00
anon_inodes.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
attr.c attr: block mode changes of symlinks 2023-09-23 11:11:10 +02:00
bad_inode.c vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
binfmt_elf.c mm: always expand the stack with the mmap write lock held 2023-07-01 13:16:25 +02:00
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-10-06 14:57:06 +02:00
binfmt_elf_test.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-31 13:32:57 +01:00
binfmt_script.c
buffer.c - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-31 13:32:41 +01:00
compat_binfmt_elf.c
coredump.c coredump: Move dump_emit_page() to kill unused warning 2023-02-22 12:59:50 +01:00
d_path.c d_path.c: typo fix... 2022-08-20 11:34:33 -04:00
dax.c Merge branch 'for-6.0/dax' into libnvdimm-fixes 2022-09-24 18:14:12 -07:00
dcache.c tmpfile API change 2022-10-10 19:45:17 -07:00
direct-io.c block: remove PSI accounting from the bio layer 2022-09-20 08:24:38 -06:00
drop_caches.c
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-09-13 09:42:27 +02:00
eventpoll.c epoll: ep_autoremove_wake_function should use list_del_init_careful 2023-06-21 16:00:54 +02:00
exec.c mm: always expand the stack with the mmap write lock held 2023-07-01 13:16:25 +02:00
fcntl.c keep iocb_flags() result cached in struct file 2022-06-10 16:10:23 -04:00
fhandle.c do_sys_name_to_handle(): constify path 2022-09-01 17:36:39 -04:00
file.c file: reinstate f_pos locking optimization for regular files 2023-08-11 12:08:23 +02:00
file_table.c locks: fix TOCTOU race when granting write lease 2022-08-16 10:59:54 -04:00
filesystems.c
fs-writeback.c writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs 2023-11-20 11:51:50 +01:00
fs_context.c vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing 2023-09-13 09:42:28 +02:00
fs_parser.c ext4: journal_path mount options should follow links 2023-01-07 11:11:59 +01:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c uninline may_mount() and don't opencode it in fspick(2)/fsopen(2) 2022-05-19 23:25:10 -04:00
init.c
inode.c fs: add ctime accessors infrastructure 2023-11-28 17:07:15 +00:00
internal.h nfs: use vfs setgid helper 2023-08-30 16:11:10 +02:00
ioctl.c
Kconfig smb: move client and server files to common directory fs/smb 2023-06-28 11:12:40 +02:00
Kconfig.binfmt Xtensa updates for v6.1 2022-10-10 14:21:11 -07:00
kernel_read_file.c fs/kernel_read_file: allow to read files up-to ssize_t 2022-06-16 19:58:21 -07:00
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2022-12-31 13:31:58 +01:00
locks.c locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock 2023-09-23 11:11:00 +02:00
Makefile smb: move client and server files to common directory fs/smb 2023-06-28 11:12:40 +02:00
mbcache.c ext4: fix deadlock due to mbcache entry corruption 2023-01-07 11:12:02 +01:00
mount.h switch try_to_unlazy_next() to __legitimize_mnt() 2022-07-05 16:18:21 -04:00
mpage.c Folio changes for 6.0 2022-08-03 10:35:43 -07:00
namei.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-25 12:03:04 +02:00
namespace.c fs: drop peer group ids under namespace lock 2023-04-13 16:55:33 +02:00
no-block.c
nsfs.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
open.c open: make RESOLVE_CACHED correctly test for O_TMPFILE 2023-08-11 12:08:22 +02:00
pipe.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
pnode.c pnode: terminate at peers of source 2023-01-04 11:29:01 +01:00
pnode.h
posix_acl.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
proc_namespace.c vfs: escape hash as well 2022-06-28 13:58:05 -04:00
read_write.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
readdir.c Change calling conventions for filldir_t 2022-08-17 17:25:04 -04:00
remap_range.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
signalfd.c
splice.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
stack.c
stat.c vfs: support STATX_DIOALIGN on block devices 2022-09-11 19:47:12 -05:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-24 17:32:51 +01:00
super.c fs: Protect reconfiguration of sb read-write from racing writes 2023-08-11 12:08:24 +02:00
sync.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
sysctls.c
timerfd.c
userfaultfd.c Revert "userfaultfd: don't fail on unrecognized features" 2023-04-26 14:28:37 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2022-12-31 13:31:55 +01:00