linux-stable/fs/btrfs
Josef Bacik 591bff187a btrfs: take the cleaner_mutex earlier in qgroup disable
[ Upstream commit 0f2b8098d7 ]

One of my CI runs popped the following lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc4+ #1 Not tainted
------------------------------------------------------
btrfs/471533 is trying to acquire lock:
ffff92ba46980850 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_quota_disable+0x54/0x4c0

but task is already holding lock:
ffff92ba46980bd0 (&fs_info->subvol_sem){++++}-{3:3}, at: btrfs_ioctl+0x1c8f/0x2600

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&fs_info->subvol_sem){++++}-{3:3}:
       down_read+0x42/0x170
       btrfs_rename+0x607/0xb00
       btrfs_rename2+0x2e/0x70
       vfs_rename+0xaf8/0xfc0
       do_renameat2+0x586/0x600
       __x64_sys_rename+0x43/0x50
       do_syscall_64+0x95/0x180
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #1 (&sb->s_type->i_mutex_key#16){++++}-{3:3}:
       down_write+0x3f/0xc0
       btrfs_inode_lock+0x40/0x70
       prealloc_file_extent_cluster+0x1b0/0x370
       relocate_file_extent_cluster+0xb2/0x720
       relocate_data_extent+0x107/0x160
       relocate_block_group+0x442/0x550
       btrfs_relocate_block_group+0x2cb/0x4b0
       btrfs_relocate_chunk+0x50/0x1b0
       btrfs_balance+0x92f/0x13d0
       btrfs_ioctl+0x1abf/0x2600
       __x64_sys_ioctl+0x97/0xd0
       do_syscall_64+0x95/0x180
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #0 (&fs_info->cleaner_mutex){+.+.}-{3:3}:
       __lock_acquire+0x13e7/0x2180
       lock_acquire+0xcb/0x2e0
       __mutex_lock+0xbe/0xc00
       btrfs_quota_disable+0x54/0x4c0
       btrfs_ioctl+0x206b/0x2600
       __x64_sys_ioctl+0x97/0xd0
       do_syscall_64+0x95/0x180
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

other info that might help us debug this:

Chain exists of:
  &fs_info->cleaner_mutex --> &sb->s_type->i_mutex_key#16 --> &fs_info->subvol_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&fs_info->subvol_sem);
                               lock(&sb->s_type->i_mutex_key#16);
                               lock(&fs_info->subvol_sem);
  lock(&fs_info->cleaner_mutex);

 *** DEADLOCK ***

2 locks held by btrfs/471533:
 #0: ffff92ba4319e420 (sb_writers#14){.+.+}-{0:0}, at: btrfs_ioctl+0x3b5/0x2600
 #1: ffff92ba46980bd0 (&fs_info->subvol_sem){++++}-{3:3}, at: btrfs_ioctl+0x1c8f/0x2600

stack backtrace:
CPU: 1 PID: 471533 Comm: btrfs Kdump: loaded Not tainted 6.9.0-rc4+ #1
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 check_noncircular+0x148/0x160
 ? lock_acquire+0xcb/0x2e0
 __lock_acquire+0x13e7/0x2180
 lock_acquire+0xcb/0x2e0
 ? btrfs_quota_disable+0x54/0x4c0
 ? lock_is_held_type+0x9a/0x110
 __mutex_lock+0xbe/0xc00
 ? btrfs_quota_disable+0x54/0x4c0
 ? srso_return_thunk+0x5/0x5f
 ? lock_acquire+0xcb/0x2e0
 ? btrfs_quota_disable+0x54/0x4c0
 ? btrfs_quota_disable+0x54/0x4c0
 btrfs_quota_disable+0x54/0x4c0
 btrfs_ioctl+0x206b/0x2600
 ? srso_return_thunk+0x5/0x5f
 ? __do_sys_statfs+0x61/0x70
 __x64_sys_ioctl+0x97/0xd0
 do_syscall_64+0x95/0x180
 ? srso_return_thunk+0x5/0x5f
 ? reacquire_held_locks+0xd1/0x1f0
 ? do_user_addr_fault+0x307/0x8a0
 ? srso_return_thunk+0x5/0x5f
 ? lock_acquire+0xcb/0x2e0
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? find_held_lock+0x2b/0x80
 ? srso_return_thunk+0x5/0x5f
 ? lock_release+0xca/0x2a0
 ? srso_return_thunk+0x5/0x5f
 ? do_user_addr_fault+0x35c/0x8a0
 ? srso_return_thunk+0x5/0x5f
 ? trace_hardirqs_off+0x4b/0xc0
 ? srso_return_thunk+0x5/0x5f
 ? lockdep_hardirqs_on_prepare+0xde/0x190
 ? srso_return_thunk+0x5/0x5f

This happens because rename is called with the inode mutex already
held, and it then acquires the subvol_sem if the inode being renamed is
a subvolume.  This creates the dependency

inode lock -> subvol sem

When we're running data relocation we will preallocate space for the
data relocation inode, and we always run the relocation under the
->cleaner_mutex.  This now creates the dependency of

cleaner_mutex -> inode lock (from the prealloc) -> subvol_sem

Qgroup disable does this in the opposite order: it acquires the
subvol_sem and then the cleaner_mutex, which closes the cycle and
results in this lockdep splat.  This deadlock can't happen in reality,
because we won't ever rename the data reloc inode, nor is the data reloc
inode a subvolume.
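
The dependency chain described above can be modelled as a small directed
graph, which is roughly the kind of check lockdep performs.  The sketch
below is a userspace illustration only, not kernel code: the enum values,
struct lock_graph, add_dependency() and would_deadlock() are hypothetical
names chosen for this example.  An edge "A -> B" means "B was taken while
A was held"; taking a new lock is flagged when the lock already held is
reachable from the lock being taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks in the splat. */
enum lock_class { CLEANER_MUTEX, INODE_LOCK, SUBVOL_SEM, NR_LOCKS };

/* edge[a][b] is true when lock b has been taken while a was held. */
struct lock_graph {
	bool edge[NR_LOCKS][NR_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record the ordering "held -> taken". */
static void add_dependency(struct lock_graph *g, enum lock_class held,
			   enum lock_class taken)
{
	assert(held < NR_LOCKS && taken < NR_LOCKS);
	g->edge[held][taken] = true;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, enum lock_class from,
		      enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, (enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Taking 'taken' while holding 'held' would close a cycle if 'held' is
 * already reachable from 'taken' through existing dependencies.
 */
static bool would_deadlock(const struct lock_graph *g, enum lock_class held,
			   enum lock_class taken)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(g, taken, held, seen);
}
```

With the edges INODE_LOCK -> SUBVOL_SEM (rename) and CLEANER_MUTEX ->
INODE_LOCK (relocation prealloc) recorded, asking whether CLEANER_MUTEX
may be taken while SUBVOL_SEM is held reports a cycle, matching the
"Chain exists of" section of the splat.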

However, this is fairly easy to fix: when we are disabling qgroups,
simply take the cleaner_mutex before we take the subvol_sem.  This
resolves the lockdep splat.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30 09:48:52 +02:00
tests btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range() 2024-05-02 16:35:27 +02:00
Kconfig btrfs: check-integrity: remove CONFIG_BTRFS_FS_CHECK_INTEGRITY option 2023-10-12 16:44:05 +02:00
Makefile btrfs: add support for inserting raid stripe extents 2023-10-12 16:44:09 +02:00
accessors.c btrfs: migrate get_eb_page_index() and get_eb_offset_in_page() to folios 2023-12-15 23:03:58 +01:00
accessors.h btrfs: migrate extent_buffer::pages[] to folio 2023-12-15 23:01:04 +01:00
acl.c
acl.h
async-thread.c btrfs: merge ordered work callbacks in btrfs_work into one 2023-10-12 16:44:10 +02:00
async-thread.h btrfs: merge ordered work callbacks in btrfs_work into one 2023-10-12 16:44:10 +02:00
backref.c btrfs: fix information leak in btrfs_ioctl_logical_to_ino() 2024-05-02 16:35:27 +02:00
backref.h for-6.7-tag 2023-10-30 10:42:06 -10:00
bio.c btrfs: migrate btrfs_repair_io_failure() to folio interfaces 2023-12-15 23:03:58 +01:00
bio.h btrfs: migrate btrfs_repair_io_failure() to folio interfaces 2023-12-15 23:03:58 +01:00
block-group.c btrfs: zoned: don't skip block groups with 100% zone unusable 2024-04-03 15:32:38 +02:00
block-group.h btrfs: add and use helper to check if block group is used 2024-02-09 20:29:14 +01:00
block-rsv.c btrfs: fix data race at btrfs_use_block_rsv() when accessing block reserve 2024-02-22 12:15:12 +01:00
block-rsv.h btrfs: fix data race at btrfs_use_block_rsv() when accessing block reserve 2024-02-22 12:15:12 +01:00
btrfs_inode.h btrfs: fix mismatching parameter names for btrfs_get_extent() 2023-12-15 22:59:30 +01:00
compression.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
compression.h Revert "btrfs: zstd: fix and simplify the inline extent decompression" 2024-01-22 15:39:01 -08:00
ctree.c btrfs: migrate get_eb_page_index() and get_eb_offset_in_page() to folios 2023-12-15 23:03:58 +01:00
ctree.h btrfs: switch btrfs_root::delayed_nodes_tree to xarray from radix-tree 2023-12-15 23:01:03 +01:00
defrag.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
defrag.h btrfs: move btrfs_defrag_root() to defrag.{c,h} 2023-10-12 16:44:13 +02:00
delalloc-space.c btrfs: don't reserve space for checksums when writing to nocow files 2024-02-13 18:36:35 +01:00
delalloc-space.h
delayed-inode.c btrfs: record delayed inode root in transaction 2024-04-17 11:23:35 +02:00
delayed-inode.h btrfs: remove redundant root argument from btrfs_delayed_update_inode() 2023-10-12 16:44:12 +02:00
delayed-ref.c btrfs: fix qgroup record leaks when using simple quotas 2023-11-09 14:01:59 +01:00
delayed-ref.h btrfs: stop reserving excessive space for block group item insertions 2023-10-12 16:44:16 +02:00
dev-replace.c btrfs: dev-replace: properly validate device names 2024-02-22 12:14:21 +01:00
dev-replace.h
dir-item.c btrfs: abort transaction on generation mismatch when marking eb as dirty 2023-10-12 16:44:07 +02:00
dir-item.h btrfs: add fscrypt related dependencies to respective headers 2023-10-12 16:44:02 +02:00
discard.c
discard.h
disk-io.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
disk-io.h btrfs: fix double free of anonymous device after snapshot creation failure 2024-02-29 22:34:11 +01:00
export.c btrfs: export: handle invalid inode or root reference in btrfs_get_parent() 2024-04-13 13:10:02 +02:00
export.h
extent-io-tree.c btrfs: allocate btrfs_inode::file_extent_tree only without NO_HOLES 2023-12-15 22:59:01 +01:00
extent-io-tree.h btrfs: always set extent_io_tree::inode and drop fs_info 2023-12-15 20:27:02 +01:00
extent-tree.c btrfs: don't warn if discard range is not aligned to sector 2024-01-18 23:35:57 +01:00
extent-tree.h btrfs: get correct owning_root when dropping snapshot 2023-11-03 16:39:06 +01:00
extent_io.c btrfs: zoned: do not flag ZEROOUT on non-dirty extent buffer 2024-04-27 17:12:48 +02:00
extent_io.h btrfs: add set_folio_extent_mapped() helper 2024-04-03 15:32:29 +02:00
extent_map.c btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range() 2024-05-02 16:35:27 +02:00
extent_map.h btrfs: use the flags of an extent map to identify the compression type 2023-12-15 22:59:02 +01:00
file-item.c btrfs: use the flags of an extent map to identify the compression type 2023-12-15 22:59:02 +01:00
file-item.h btrfs: scrub: avoid unnecessary csum tree search preparing stripes 2023-08-21 14:54:48 +02:00
file.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
file.h
free-space-cache.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
free-space-cache.h
free-space-tree.c btrfs: abort transaction on generation mismatch when marking eb as dirty 2023-10-12 16:44:07 +02:00
free-space-tree.h
fs.c
fs.h btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
inode-item.c btrfs: track owning root in btrfs_ref 2023-10-12 16:44:11 +02:00
inode-item.h btrfs: add fscrypt related dependencies to respective headers 2023-10-12 16:44:02 +02:00
inode.c btrfs: make btrfs_clear_delalloc_extent() free delalloc reserve 2024-05-17 12:14:42 +02:00
ioctl.c btrfs: take the cleaner_mutex earlier in qgroup disable 2024-05-30 09:48:52 +02:00
ioctl.h
locking.c btrfs: add raid stripe tree definitions 2023-10-12 16:44:09 +02:00
locking.h btrfs: do not block starts waiting on previous transaction commit 2023-09-08 14:10:49 +02:00
lru_cache.c btrfs: fix typos found by codespell 2023-12-15 23:00:04 +01:00
lru_cache.h
lzo.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
messages.c btrfs: constify fs_info parameter in __btrfs_panic() 2023-12-15 20:27:02 +01:00
messages.h btrfs: constify fs_info parameter in __btrfs_panic() 2023-12-15 20:27:02 +01:00
misc.h minmax: add in_range() macro 2023-08-24 16:20:18 -07:00
ordered-data.c btrfs: set correct ram_bytes when splitting ordered extent 2024-05-17 12:15:01 +02:00
ordered-data.h btrfs: remove unused btrfs_ordered_extent::outstanding_isize 2023-12-15 20:27:01 +01:00
orphan.c
orphan.h
print-tree.c btrfs: new inline ref storing owning subvol of data extents 2023-10-12 16:44:11 +02:00
print-tree.h
props.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
props.h
qgroup.c btrfs: take the cleaner_mutex earlier in qgroup disable 2024-05-30 09:48:52 +02:00
qgroup.h btrfs: qgroup: validate btrfs_qgroup_inherit parameter 2024-04-03 15:32:30 +02:00
raid-stripe-tree.c btrfs: directly return 0 on no error code in btrfs_insert_raid_extent() 2023-11-03 16:38:51 +01:00
raid-stripe-tree.h btrfs: zoned: support RAID0/1/10 on top of raid stripe tree 2023-10-12 16:44:09 +02:00
raid56.c btrfs: refactor alloc_extent_buffer() to allocate-then-attach method 2023-12-15 23:01:04 +01:00
raid56.h btrfs: use a dedicated data structure for chunk maps 2023-12-15 20:27:02 +01:00
rcu-string.h
ref-verify.c btrfs: ref-verify: free ref cache before clearing mount opt 2024-01-12 01:59:49 +01:00
ref-verify.h
reflink.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
reflink.h
relocation.c btrfs: add helper to get fs_info from struct inode pointer 2024-04-03 15:32:30 +02:00
relocation.h btrfs: relocation: constify parameters where possible 2023-10-12 16:44:13 +02:00
root-tree.c btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations 2024-04-17 11:23:35 +02:00
root-tree.h btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations 2024-04-17 11:23:35 +02:00
scrub.c btrfs: scrub: run relocation repair when/only needed 2024-05-02 16:35:27 +02:00
scrub.h
send.c btrfs: send: handle path ref underflow in header iterate_inode_ref() 2024-04-13 13:10:02 +02:00
send.h
space-info.c btrfs: fix data races when accessing the reserved amount of block reserves 2024-02-22 12:15:06 +01:00
space-info.h btrfs: pass a space_info argument to btrfs_reserve_metadata_bytes() 2023-10-12 16:44:05 +02:00
subpage.c for-6.8-rc1-tag 2024-01-22 13:29:42 -08:00
subpage.h btrfs: migrate subpage code to folio interfaces 2023-12-15 23:03:58 +01:00
super.c btrfs: replace sb::s_blocksize by fs_info::sectorsize 2024-04-03 15:32:29 +02:00
super.h btrfs: remove old mount API code 2023-12-15 20:27:04 +01:00
sysfs.c btrfs: sysfs: validate scrub_speed_max value 2023-12-15 23:01:04 +01:00
sysfs.h
transaction.c btrfs: always clear PERTRANS metadata during commit 2024-05-17 12:14:43 +02:00
transaction.h btrfs: free qgroup pertrans reserve on transaction abort 2023-12-06 22:32:49 +01:00
tree-checker.c btrfs: make sure that WRITTEN is set on all metadata blocks 2024-05-17 12:15:01 +02:00
tree-checker.h btrfs: make sure that WRITTEN is set on all metadata blocks 2024-05-17 12:15:01 +02:00
tree-log.c btrfs: use the flags of an extent map to identify the compression type 2023-12-15 22:59:02 +01:00
tree-log.h
tree-mod-log.c
tree-mod-log.h
ulist.c btrfs: reformat remaining kdoc style comments 2023-10-12 16:44:04 +02:00
ulist.h
uuid-tree.c btrfs: abort transaction on generation mismatch when marking eb as dirty 2023-10-12 16:44:07 +02:00
uuid-tree.h
verity.c btrfs: remove redundant root argument from btrfs_update_inode() 2023-10-12 16:44:12 +02:00
verity.h
volumes.c btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks() 2024-05-17 12:15:01 +02:00
volumes.h btrfs: fix typos found by codespell 2023-12-15 23:00:04 +01:00
xattr.c btrfs: cache that we don't have security.capability set 2023-12-15 20:27:05 +01:00
xattr.h btrfs: move btrfs_xattr_handlers to .rodata 2023-10-09 16:24:17 +02:00
zlib.c btrfs: zlib: fix and simplify the inline extent decompression 2024-01-18 23:35:26 +01:00
zoned.c btrfs: zoned: fix use-after-free in do_zone_finish() 2024-04-03 15:32:38 +02:00
zoned.h for-6.8/block-2024-01-08 2024-01-11 13:58:04 -08:00
zstd.c Revert "btrfs: zstd: fix and simplify the inline extent decompression" 2024-01-22 15:39:01 -08:00