linux-stable/fs
Filipe Manana 061dde8245 btrfs: fix race between transaction aborts and fsyncs leading to use-after-free
There is a race between a task aborting a transaction during a commit,
a task doing an fsync and the transaction kthread, which leads to an
use-after-free of the log root tree. When this happens, it results in a
stack trace like the following:

  BTRFS info (device dm-0): forced readonly
  BTRFS warning (device dm-0): Skipping commit of aborted transaction.
  BTRFS: error (device dm-0) in cleanup_transaction:1958: errno=-5 IO failure
  BTRFS warning (device dm-0): lost page write due to IO error on /dev/mapper/error-test (-5)
  BTRFS warning (device dm-0): Skipping commit of aborted transaction.
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0xa4e8 len 4096 err no 10
  BTRFS error (device dm-0): error writing primary super block to device 1
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e000 len 4096 err no 10
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e008 len 4096 err no 10
  BTRFS warning (device dm-0): direct IO failed ino 261 rw 0,0 sector 0x12e010 len 4096 err no 10
  BTRFS: error (device dm-0) in write_all_supers:4110: errno=-5 IO failure (1 errors while writing supers)
  BTRFS: error (device dm-0) in btrfs_sync_log:3308: errno=-5 IO failure
  general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b68: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  CPU: 2 PID: 2458471 Comm: fsstress Not tainted 5.12.0-rc5-btrfs-next-84 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  RIP: 0010:__mutex_lock+0x139/0xa40
  Code: c0 74 19 (...)
  RSP: 0018:ffff9f18830d7b00 EFLAGS: 00010202
  RAX: 6b6b6b6b6b6b6b68 RBX: 0000000000000001 RCX: 0000000000000002
  RDX: ffffffffb9c54d13 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff9f18830d7bc0 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff9f18830d7be0 R11: 0000000000000001 R12: ffff8c6cd199c040
  R13: ffff8c6c95821358 R14: 00000000fffffffb R15: ffff8c6cbcf01358
  FS:  00007fa9140c2b80(0000) GS:ffff8c6fac600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fa913d52000 CR3: 000000013d2b4003 CR4: 0000000000370ee0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   ? __btrfs_handle_fs_error+0xde/0x146 [btrfs]
   ? btrfs_sync_log+0x7c1/0xf20 [btrfs]
   ? btrfs_sync_log+0x7c1/0xf20 [btrfs]
   btrfs_sync_log+0x7c1/0xf20 [btrfs]
   btrfs_sync_file+0x40c/0x580 [btrfs]
   do_fsync+0x38/0x70
   __x64_sys_fsync+0x10/0x20
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x7fa9142a55c3
  Code: 8b 15 09 (...)
  RSP: 002b:00007fff26278d48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
  RAX: ffffffffffffffda RBX: 0000563c83cb4560 RCX: 00007fa9142a55c3
  RDX: 00007fff26278cb0 RSI: 00007fff26278cb0 RDI: 0000000000000005
  RBP: 0000000000000005 R08: 0000000000000001 R09: 00007fff26278d5c
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000340
  R13: 00007fff26278de0 R14: 00007fff26278d96 R15: 0000563c83ca57c0
  Modules linked in: btrfs dm_zero dm_snapshot dm_thin_pool (...)
  ---[ end trace ee2f1b19327d791d ]---

The steps that lead to this crash are the following:

1) We are at transaction N;

2) We have two tasks with a transaction handle attached to transaction N.
   Task A and Task B. Task B is doing an fsync;

3) Task B is at btrfs_sync_log(), and has saved fs_info->log_root_tree
   into a local variable named 'log_root_tree' at the top of
   btrfs_sync_log(). Task B is about to call write_all_supers(), but
   before that...

4) Task A calls btrfs_commit_transaction(), and after it sets the
   transaction state to TRANS_STATE_COMMIT_START, an error happens before
   it waits for the transaction's 'num_writers' counter to reach a value
   of 1 (no one else attached to the transaction), so it jumps to the
   label "cleanup_transaction";

5) Task A then calls cleanup_transaction(), where it aborts the
   transaction, setting BTRFS_FS_STATE_TRANS_ABORTED on fs_info->fs_state,
   setting the ->aborted field of the transaction and the handle to an
   errno value and also setting BTRFS_FS_STATE_ERROR on fs_info->fs_state.

   After that, at cleanup_transaction(), it deletes the transaction from
   the list of transactions (fs_info->trans_list), sets the transaction
   to the state TRANS_STATE_COMMIT_DOING and then waits for the number
   of writers to go down to 1, as it's currently 2 (1 for task A and 1
   for task B);

6) The transaction kthread is running and sees that BTRFS_FS_STATE_ERROR
   is set in fs_info->fs_state, so it calls btrfs_cleanup_transaction().

   There it sees the list fs_info->trans_list is empty, and then proceeds
   into calling btrfs_drop_all_logs(), which frees the log root tree with
   a call to btrfs_free_log_root_tree();

7) Task B calls write_all_supers() and, shortly after, under the label
   'out_wake_log_root', it deferences the pointer stored in
   'log_root_tree', which was already freed in the previous step by the
   transaction kthread. This results in a use-after-free leading to a
   crash.

Fix this by deleting the transaction from the list of transactions at
cleanup_transaction() only after setting the transaction state to
TRANS_STATE_COMMIT_DOING and waiting for all existing tasks that are
attached to the transaction to release their transaction handles.
This makes the transaction kthread wait for all the tasks attached to
the transaction to be done with the transaction before dropping the
log roots and doing other cleanups.

Fixes: ef67963dac ("btrfs: drop logs when we've aborted a transaction")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-19 17:25:22 +02:00
..
9p Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-27 08:07:12 -08:00
adfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
affs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
afs afs: Use wait_on_page_writeback_killable 2021-03-23 20:54:37 +00:00
autofs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
befs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
bfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
btrfs btrfs: fix race between transaction aborts and fsyncs leading to use-after-free 2021-04-19 17:25:22 +02:00
cachefiles cachefiles, afs: mm wait fixes 2021-03-24 10:22:00 -07:00
ceph idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
cifs cifs: escape spaces in share names 2021-04-07 21:30:27 -05:00
coda fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
configfs configfs: fix a use-after-free in __configfs_open_file 2021-03-11 12:13:48 +01:00
cramfs cramfs: use %pD instead of messing with file_dentry()->d_name 2021-01-05 23:02:47 -05:00
crypto block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
debugfs Driver core / debugfs update for 5.12-rc1 2021-02-24 10:13:55 -08:00
devpts
dlm fs: dlm: check on existing node address 2020-11-10 12:14:20 -06:00
ecryptfs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
efivarfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
efs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
erofs Change since last update: 2021-03-13 12:26:22 -08:00
exfat idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
exportfs exportfs: Add a function to return the raw output from fh_to_dentry() 2020-12-09 09:39:38 -05:00
ext2 fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
ext4 Miscellaneous ext4 bug fixes for v5.12. 2021-03-21 14:06:10 -07:00
f2fs block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
fat idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
freevxfs
fscache
fuse fuse: 32-bit user space ioctl compat for fuse device 2021-03-16 15:20:16 +01:00
gfs2 Two more gfs2 fixes 2021-04-03 12:15:01 -07:00
hfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
hfsplus idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
hostfs hostfs: fix memory handling in follow_link() 2021-03-25 18:57:42 -04:00
hpfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
hugetlbfs hugetlbfs: remove unneeded return value of hugetlb_vmtruncate() 2021-02-24 13:38:35 -08:00
iomap Merge branch 'iomap-5.12-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux 2021-03-18 10:37:30 -07:00
isofs isofs: handle large user and group ID 2021-02-03 19:05:52 +01:00
jbd2 block: use an on-stack bio in blkdev_issue_flush 2021-01-27 09:51:48 -07:00
jffs2 idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
jfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-27 08:07:12 -08:00
kernfs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
lockd SUNRPC: Make trace_svc_process() display the RPC procedure symbolically 2021-01-25 09:36:23 -05:00
minix fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
nfs nfs: we don't support removing system.nfs4_acl 2021-03-11 13:17:42 -05:00
nfs_common NFSv4_2: SSC helper should use its own config. 2021-01-28 10:55:37 -05:00
nfsd NFSD: fix error handling in NFSv4.0 callbacks 2021-03-11 10:58:49 -05:00
nilfs2 block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
nls
notify idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
ntfs ntfs: check for valid standard information attribute 2021-02-24 13:38:26 -08:00
ocfs2 ocfs2: fix deadlock between setattr and dio_end_io_write 2021-04-09 14:54:23 -07:00
omfs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
openpromfs
orangefs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
overlayfs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
proc mm: use is_cow_mapping() across tree where proper 2021-03-13 11:27:30 -08:00
pstore pstore fixes for v5.12-rc2 2021-03-05 17:21:25 -08:00
qnx4 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
qnx6 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
quota quota: Fix memory leak when handling corrupted quota file 2021-01-05 14:42:18 +01:00
ramfs ramfs: support O_TMPFILE 2021-02-24 13:38:26 -08:00
reiserfs reiserfs: update reiserfs_xattrs_initialized() condition 2021-03-30 14:27:32 -07:00
romfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
squashfs squashfs: fix xattr id and id lookup sanity checks 2021-03-25 09:22:55 -07:00
sysfs sysfs: Support zapping of binary attr mmaps 2021-01-12 14:26:31 +01:00
sysv fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
tracefs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
ubifs idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
udf idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
ufs fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
unicode unicode: Add utf8_casefold_hash 2020-09-10 14:03:31 -07:00
vboxsf fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
verity idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
xfs xfs: also reject BULKSTAT_SINGLE in a mount user namespace 2021-03-15 08:50:41 -07:00
zonefs zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() 2021-03-17 08:56:50 +09:00
aio.c Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
anon_inodes.c fs: anon_inodes: rephrase to appropriate kernel-doc 2021-01-15 12:17:25 -05:00
attr.c ima: handle idmapped mounts 2021-01-24 14:27:20 +01:00
bad_inode.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
binfmt_aout.c
binfmt_elf.c Merge branch 'parisc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2021-02-21 13:20:41 -08:00
binfmt_elf_fdpic.c Merge branch 'parisc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2021-02-21 13:20:41 -08:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-13 11:27:30 -08:00
binfmt_script.c
block_dev.c block: don't ignore REQ_NOWAIT for direct IO 2021-04-02 08:34:30 -06:00
buffer.c fs: buffer: use raw page_memcg() on locked page 2021-02-24 13:38:30 -08:00
char_dev.c
compat_binfmt_elf.c get rid of COMPAT_ELF_EXEC_PAGESIZE 2021-01-06 08:42:51 -05:00
coredump.c fs/coredump: use kmap_local_page() 2021-02-26 09:41:05 -08:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c mm: provide a saner PTE walking API for modules 2021-02-09 07:05:44 -05:00
dcache.c fs: delete repeated words in comments 2021-02-24 13:38:26 -08:00
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-09 14:54:23 -07:00
drop_caches.c
eventfd.c eventfd: Export eventfd_ctx_do_read() 2020-11-15 09:49:10 -05:00
eventpoll.c kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE 2021-02-16 09:59:41 +01:00
exec.c fs: delete repeated words in comments 2021-02-24 13:38:26 -08:00
fcntl.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
fhandle.c fs: delete repeated words in comments 2021-02-24 13:38:26 -08:00
file.c file: fix close_range() for unshare+cloexec 2021-04-02 14:11:10 +02:00
file_table.c epoll: take epitem list out of struct file 2020-10-25 20:02:08 -04:00
filesystems.c
fs-writeback.c fs: improve comments for writeback_single_inode() 2021-01-13 17:26:50 +01:00
fs_context.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: handle idmapped mounts 2021-01-24 14:27:19 +01:00
inode.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-27 08:07:12 -08:00
internal.h idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
io-wq.c io-wq: cancel unbounded works on io-wq destroy 2021-04-08 13:33:17 -06:00
io-wq.h io_uring: remove structures from include/linux/io_uring.h 2021-03-18 09:44:35 -06:00
io_uring.c io_uring: fix early sqd_list removal sqpoll hangs 2021-04-14 13:07:27 -06:00
ioctl.c fs: remove ksys_ioctl 2020-07-31 08:16:01 +02:00
Kconfig s390,alpha: make TMPFS_INODE64 available again 2021-03-08 10:46:30 +01:00
Kconfig.binfmt Merge branch 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-21 09:29:23 -08:00
kernel_read_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
libfs.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-09 10:37:34 -05:00
Makefile fs: Remove dcookies support 2021-01-29 10:06:46 +05:30
mbcache.c
mount.h mount: make {lock,unlock}_mount_hash() static 2021-01-24 14:29:34 +01:00
mpage.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
namei.c LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late 2021-04-06 20:33:00 -04:00
namespace.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-27 08:07:12 -08:00
no-block.c
nsfs.c
open.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
pipe.c fs: delete repeated words in comments 2021-02-24 13:38:26 -08:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-08 15:18:43 +01:00
posix_acl.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
proc_namespace.c fs: introduce MOUNT_ATTR_IDMAP 2021-01-24 14:43:45 +01:00
read_write.c teach sendfile(2) to handle send-to-pipe directly 2021-01-25 23:29:36 -05:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-17 11:39:49 -07:00
remap_range.c ioctl: handle idmapped mounts 2021-01-24 14:27:19 +01:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-16 22:13:10 +01:00
seq_file.c fs: fix kernel-doc markups 2021-01-21 14:06:00 -07:00
signalfd.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
splice.c for-5.12/block-2021-02-17 2021-02-21 11:02:48 -08:00
stack.c
stat.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
statfs.c s390,alpha: switch to 64-bit ino_t 2021-02-13 17:17:53 +01:00
super.c It has been a relatively quiet cycle in docsland. 2021-02-22 10:57:46 -08:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: use secure anon inodes for userfaultfd 2021-01-14 17:40:57 -05:00
utimes.c utimes: handle idmapped mounts 2021-01-24 14:27:18 +01:00
xattr.c namei: handle idmapped mounts in may_*() helpers 2021-01-24 14:27:17 +01:00