linux-stable/fs
David Howells f76e0829bb afs: Fix speculative status fetches
[ Upstream commit 22650f1481 ]

The generic/464 xfstest causes kAFS to emit occasional warnings of the
form:

        kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015)

This indicates that the data version received back from the server did not
match the expected value (the DV should be incremented monotonically for
each individual modification op committed to a vnode).

What is happening is that a lookup call is doing a bulk status fetch
speculatively on a bunch of vnodes in a directory besides getting the
status of the vnode it's actually interested in.  This is racing with a
StoreData operation (though it could also occur with, say, a MakeDir op).

On the client, a modification operation locks the vnode, but the bulk
status fetch only locks the parent directory, so no ordering is imposed
there (thereby avoiding an avenue to deadlock).

On the server, the StoreData op handler doesn't lock the vnode until it's
received all the request data, and downgrades the lock after committing the
data until it has finished sending change notifications to other clients -
which allows the status fetch to occur before it has finished.

This means that:

 - a status fetch can access the target vnode either side of the exclusive
   section of the modification

 - the status fetch could start before the modification, yet finish after,
   and vice-versa.

 - the status fetch and the modification RPCs can complete in either order.

 - the status fetch can return either the before or the after DV from the
   modification.

 - the status fetch might regress the locally cached DV.

Some of these are handled by the previous fix[1], but that's not sufficient
because it checks the DV it received against the DV it cached at the start
of the op, but the DV might've been updated in the meantime by a locally
generated modification op.

Fix this by the following means:

 (1) Keep track of when we're performing a modification operation on a
     vnode.  This is done by marking vnode parameters with a 'modification'
     note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode
     for the duration.

 (2) Alter the speculation race detection to ignore speculative status
     fetches if either the vnode is marked as being modified or the data
     version number is not what we expected.

Note that whilst the "vnode modified" warning does get recovered from as it
causes the client to refetch the status at the next opportunity, it will
also invalidate the pagecache, so changes might get lost.

Fixes: a9e5c87ca7 ("afs: Fix speculative status fetch going out of order wrt to modifications")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:45 +02:00
..
9p fs: 9p: add generic splice_write file operation 2020-12-01 21:40:47 +01:00
adfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix speculative status fetches 2021-05-14 09:50:45 +02:00
autofs autofs: harden ioctl table 2020-10-16 11:11:22 -07:00
befs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: fix race when picking most recent mod log operation for an old root 2021-05-11 14:47:33 +02:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: fix flush_snap logic after putting caps 2021-03-04 11:38:08 +01:00
cifs smb3: do not attempt multichannel to server which does not support it 2021-05-11 14:47:37 +02:00
coda
configfs configfs: fix a use-after-free in __configfs_open_file 2021-03-17 17:06:34 +01:00
cramfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
crypto fscrypt: add fscrypt_is_nokey_name() 2020-12-26 16:02:43 +01:00
debugfs debugfs: do not attempt to create a new file before the filesystem is initalized 2021-03-04 11:37:17 +01:00
devpts
dlm networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
ecryptfs ecryptfs: fix kernel panic with null dev_name 2021-05-11 14:47:12 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-11-25 16:55:02 +01:00
efs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
erofs erofs: add unsupported inode i_format check 2021-05-11 14:47:13 +02:00
exfat exfat: fix erroneous discard when clear cluster bit 2021-05-11 14:47:36 +02:00
exportfs
ext2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
ext4 ext4: Fix occasional generic/418 failure 2021-05-11 14:47:38 +02:00
f2fs f2fs: fix to avoid out-of-bounds memory access 2021-05-11 14:47:34 +02:00
fat [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
freevxfs
fscache
fuse fuse: fix write deadlock 2021-05-11 14:47:36 +02:00
gfs2 gfs2: report "already frozen/thawed" errors 2021-04-16 11:43:20 +02:00
hfs fs: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
hfsplus fs: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
hugetlbfs mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page 2021-02-10 09:29:20 +01:00
iomap iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate 2021-04-07 15:00:04 +02:00
isofs isofs: release buffer head before return 2021-03-04 11:38:00 +01:00
jbd2 ext4: annotate data race in jbd2_journal_dirty_metadata() 2021-05-11 14:47:37 +02:00
jffs2 jffs2: check the validity of dstlen in jffs2_zlib_compress() 2021-05-11 14:47:36 +02:00
jfs JFS: more checks for invalid superblock 2021-03-07 12:34:04 +01:00
kernfs kernfs: wire up ->splice_read and ->splice_write 2021-01-27 11:55:29 +01:00
lockd lockd: don't use interval-based rebinding over TCP 2020-12-30 11:53:30 +01:00
minix [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
nfs NFSv4: Don't discard segments marked for return in _pnfs_return_layout() 2021-05-11 14:47:34 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:53:45 +01:00
nfsd NFSv4.2: fix copy stateid copying for the async copy 2021-05-14 09:50:13 +02:00
nilfs2 nilfs2: make splice write available again 2021-02-13 13:55:16 +01:00
nls
notify fanotify: Fix sys_fanotify_mark() on native x86-32 2021-01-17 14:16:59 +01:00
ntfs ntfs: check for valid standard information attribute 2021-02-26 10:13:00 +01:00
ocfs2 ocfs2: fix deadlock between setattr and dio_end_io_write 2021-04-14 08:41:58 +02:00
omfs fs: omfs: use kmemdup() rather than kmalloc+memcpy 2020-09-22 23:39:45 -04:00
openpromfs
orangefs orangefs: remove unnecessary assignment to variable ret 2020-08-04 15:01:58 -04:00
overlayfs ovl: invalidate readdir cache on changes to dir with origin 2021-05-14 09:50:35 +02:00
proc seccomp: Fix CONFIG tests for Seccomp_filters 2021-05-14 09:50:24 +02:00
pstore pstore: Fix warning in pstore_kill_sb() 2021-03-25 09:04:08 +01:00
qnx4 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
qnx6 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
quota quota: Fix memory leak when handling corrupted quota file 2021-03-04 11:37:53 +01:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-16 11:11:22 -07:00
reiserfs reiserfs: update reiserfs_xattrs_initialized() condition 2021-04-07 15:00:10 +02:00
romfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
squashfs squashfs: fix xattr id and id lookup sanity checks 2021-03-30 14:31:54 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2020-10-02 12:02:30 +02:00
sysv [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
tracefs
ubifs ubifs: Only check replay with inode type to judge if inode linked 2021-05-11 14:47:33 +02:00
udf udf: fix silent AED tagLocation corruption 2021-03-17 17:06:23 +01:00
ufs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
unicode unicode: Add utf8_casefold_hash 2020-09-10 14:03:31 -07:00
vboxsf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2020-10-15 15:11:56 -07:00
verity
xfs xfs: fix return of uninitialized value in variable error 2021-05-14 09:50:34 +02:00
zonefs zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() 2021-03-25 09:04:05 +01:00
aio.c vfs: separate __sb_start_write into blocking and non-blocking helpers 2020-11-10 16:53:07 -08:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c fs: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
binfmt_elf_fdpic.c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot 2020-10-16 11:11:21 -07:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c
block_dev.c block: don't ignore REQ_NOWAIT for direct IO 2021-04-16 11:43:21 +02:00
buffer.c mm, memcg: rework remote charging API to support nesting 2020-10-18 09:27:09 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: fix core_pattern parse error 2020-12-06 10:19:07 -08:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c mm: provide a saner PTE walking API for modules 2021-02-26 10:13:01 +01:00
dcache.c
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-11 14:47:12 +02:00
exec.c exec: Transform exec_update_mutex into a rw_semaphore 2021-01-09 13:46:24 +01:00
fcntl.c fcntl: Fix potential deadlock in send_sig{io, urg}() 2021-01-06 14:56:53 +01:00
fhandle.c
file.c kernel/io_uring: cancel io_uring before task works 2021-01-30 13:55:18 +01:00
file_table.c task_work: cleanup notification modes 2020-10-17 15:05:30 -06:00
filesystems.c
fs-writeback.c fs: fix lazytime expiration handling in __writeback_single_inode() 2021-01-27 11:54:53 +01:00
fs_context.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode() 2020-12-30 11:53:49 +01:00
internal.h fs: remove compat_sys_mount 2020-09-22 23:45:57 -04:00
io-wq.c io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
io_uring.c io_uring: fix overflows checks in provide buffers 2021-05-14 09:50:28 +02:00
ioctl.c
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt
kernel_read_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-22 10:48:22 -08:00
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
mbcache.c
mount.h
mpage.c
namei.c LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late 2021-04-14 08:41:58 +02:00
namespace.c umount(2): move the flag validity checks first 2021-01-19 18:27:32 +01:00
no-block.c
nsfs.c
open.c openat2: reject RESOLVE_BENEATH|RESOLVE_IN_ROOT 2020-12-30 11:54:24 +01:00
pipe.c fs/pipe: allow sendfile() to pipe again 2021-01-27 11:55:29 +01:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c proc mountinfo: make splice available again 2020-12-30 11:54:02 +01:00
read_write.c Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c vfs: move the remap range helpers to remap_range.c 2020-10-15 09:48:49 -07:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
seq_file.c fix return values of seq_read_iter() 2020-11-15 22:12:53 -05:00
signalfd.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
splice.c io_uring-5.10-2020-10-24 2020-10-24 12:40:18 -07:00
stack.c
stat.c fs: fix reporting supported extra file attributes for statx() 2021-05-11 14:47:33 +02:00
statfs.c Add a "nosymfollow" mount option. 2020-08-27 16:06:47 -04:00
super.c vfs: move __sb_{start,end}_write* to fs.h 2020-11-10 16:53:11 -08:00
sync.c
timerfd.c
userfaultfd.c mm: remove the now-unnecessary mmget_still_valid() hack 2020-10-16 11:11:22 -07:00
utimes.c
xattr.c fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr 2020-10-13 18:38:27 -07:00