linux-stable/fs/btrfs
Filipe Manana 4d9380e0da btrfs: silence lockdep when reading chunk tree during mount
Often some test cases like btrfs/161 trigger lockdep splats that complain
about possible unsafe lock scenario due to the fact that during mount,
when reading the chunk tree we end up calling blkdev_get_by_path() while
holding a read lock on a leaf of the chunk tree. That produces a lockdep
splat like the following:

[ 3653.683975] ======================================================
[ 3653.685148] WARNING: possible circular locking dependency detected
[ 3653.686301] 5.15.0-rc7-btrfs-next-103 #1 Not tainted
[ 3653.687239] ------------------------------------------------------
[ 3653.688400] mount/447465 is trying to acquire lock:
[ 3653.689320] ffff8c6b0c76e528 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.691054]
               but task is already holding lock:
[ 3653.692155] ffff8c6b0a9f39e0 (btrfs-chunk-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 3653.693978]
               which lock already depends on the new lock.

[ 3653.695510]
               the existing dependency chain (in reverse order) is:
[ 3653.696915]
               -> #3 (btrfs-chunk-00){++++}-{3:3}:
[ 3653.698053]        down_read_nested+0x4b/0x140
[ 3653.698893]        __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 3653.699988]        btrfs_read_lock_root_node+0x31/0x40 [btrfs]
[ 3653.701205]        btrfs_search_slot+0x537/0xc00 [btrfs]
[ 3653.702234]        btrfs_insert_empty_items+0x32/0x70 [btrfs]
[ 3653.703332]        btrfs_init_new_device+0x563/0x15b0 [btrfs]
[ 3653.704439]        btrfs_ioctl+0x2110/0x3530 [btrfs]
[ 3653.705405]        __x64_sys_ioctl+0x83/0xb0
[ 3653.706215]        do_syscall_64+0x3b/0xc0
[ 3653.706990]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3653.708040]
               -> #2 (sb_internal#2){.+.+}-{0:0}:
[ 3653.708994]        lock_release+0x13d/0x4a0
[ 3653.709533]        up_write+0x18/0x160
[ 3653.710017]        btrfs_sync_file+0x3f3/0x5b0 [btrfs]
[ 3653.710699]        __loop_update_dio+0xbd/0x170 [loop]
[ 3653.711360]        lo_ioctl+0x3b1/0x8a0 [loop]
[ 3653.711929]        block_ioctl+0x48/0x50
[ 3653.712442]        __x64_sys_ioctl+0x83/0xb0
[ 3653.712991]        do_syscall_64+0x3b/0xc0
[ 3653.713519]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3653.714233]
               -> #1 (&lo->lo_mutex){+.+.}-{3:3}:
[ 3653.715026]        __mutex_lock+0x92/0x900
[ 3653.715648]        lo_open+0x28/0x60 [loop]
[ 3653.716275]        blkdev_get_whole+0x28/0x90
[ 3653.716867]        blkdev_get_by_dev.part.0+0x142/0x320
[ 3653.717537]        blkdev_open+0x5e/0xa0
[ 3653.718043]        do_dentry_open+0x163/0x390
[ 3653.718604]        path_openat+0x3f0/0xa80
[ 3653.719128]        do_filp_open+0xa9/0x150
[ 3653.719652]        do_sys_openat2+0x97/0x160
[ 3653.720197]        __x64_sys_openat+0x54/0x90
[ 3653.720766]        do_syscall_64+0x3b/0xc0
[ 3653.721285]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3653.721986]
               -> #0 (&disk->open_mutex){+.+.}-{3:3}:
[ 3653.722775]        __lock_acquire+0x130e/0x2210
[ 3653.723348]        lock_acquire+0xd7/0x310
[ 3653.723867]        __mutex_lock+0x92/0x900
[ 3653.724394]        blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.725041]        blkdev_get_by_path+0xb8/0xd0
[ 3653.725614]        btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs]
[ 3653.726332]        open_fs_devices+0xd7/0x2c0 [btrfs]
[ 3653.726999]        btrfs_read_chunk_tree+0x3ad/0x870 [btrfs]
[ 3653.727739]        open_ctree+0xb8e/0x17bf [btrfs]
[ 3653.728384]        btrfs_mount_root.cold+0x12/0xde [btrfs]
[ 3653.729130]        legacy_get_tree+0x30/0x50
[ 3653.729676]        vfs_get_tree+0x28/0xc0
[ 3653.730192]        vfs_kern_mount.part.0+0x71/0xb0
[ 3653.730800]        btrfs_mount+0x11d/0x3a0 [btrfs]
[ 3653.731427]        legacy_get_tree+0x30/0x50
[ 3653.731970]        vfs_get_tree+0x28/0xc0
[ 3653.732486]        path_mount+0x2d4/0xbe0
[ 3653.732997]        __x64_sys_mount+0x103/0x140
[ 3653.733560]        do_syscall_64+0x3b/0xc0
[ 3653.734080]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3653.734782]
               other info that might help us debug this:

[ 3653.735784] Chain exists of:
                 &disk->open_mutex --> sb_internal#2 --> btrfs-chunk-00

[ 3653.737123]  Possible unsafe locking scenario:

[ 3653.737865]        CPU0                    CPU1
[ 3653.738435]        ----                    ----
[ 3653.739007]   lock(btrfs-chunk-00);
[ 3653.739449]                                lock(sb_internal#2);
[ 3653.740193]                                lock(btrfs-chunk-00);
[ 3653.740955]   lock(&disk->open_mutex);
[ 3653.741431]
                *** DEADLOCK ***

[ 3653.742176] 3 locks held by mount/447465:
[ 3653.742739]  #0: ffff8c6acf85c0e8 (&type->s_umount_key#44/1){+.+.}-{3:3}, at: alloc_super+0xd5/0x3b0
[ 3653.744114]  #1: ffffffffc0b28f70 (uuid_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x59/0x870 [btrfs]
[ 3653.745563]  #2: ffff8c6b0a9f39e0 (btrfs-chunk-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 3653.747066]
               stack backtrace:
[ 3653.747723] CPU: 4 PID: 447465 Comm: mount Not tainted 5.15.0-rc7-btrfs-next-103 #1
[ 3653.748873] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 3653.750592] Call Trace:
[ 3653.750967]  dump_stack_lvl+0x57/0x72
[ 3653.751526]  check_noncircular+0xf3/0x110
[ 3653.752136]  ? stack_trace_save+0x4b/0x70
[ 3653.752748]  __lock_acquire+0x130e/0x2210
[ 3653.753356]  lock_acquire+0xd7/0x310
[ 3653.753898]  ? blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.754596]  ? lock_is_held_type+0xe8/0x140
[ 3653.755125]  ? blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.755729]  ? blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.756338]  __mutex_lock+0x92/0x900
[ 3653.756794]  ? blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.757400]  ? do_raw_spin_unlock+0x4b/0xa0
[ 3653.757930]  ? _raw_spin_unlock+0x29/0x40
[ 3653.758437]  ? bd_prepare_to_claim+0x129/0x150
[ 3653.758999]  ? trace_module_get+0x2b/0xd0
[ 3653.759508]  ? try_module_get.part.0+0x50/0x80
[ 3653.760072]  blkdev_get_by_dev.part.0+0xe7/0x320
[ 3653.760661]  ? devcgroup_check_permission+0xc1/0x1f0
[ 3653.761288]  blkdev_get_by_path+0xb8/0xd0
[ 3653.761797]  btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs]
[ 3653.762454]  open_fs_devices+0xd7/0x2c0 [btrfs]
[ 3653.763055]  ? clone_fs_devices+0x8f/0x170 [btrfs]
[ 3653.763689]  btrfs_read_chunk_tree+0x3ad/0x870 [btrfs]
[ 3653.764370]  ? kvm_sched_clock_read+0x14/0x40
[ 3653.764922]  open_ctree+0xb8e/0x17bf [btrfs]
[ 3653.765493]  ? super_setup_bdi_name+0x79/0xd0
[ 3653.766043]  btrfs_mount_root.cold+0x12/0xde [btrfs]
[ 3653.766780]  ? rcu_read_lock_sched_held+0x3f/0x80
[ 3653.767488]  ? kfree+0x1f2/0x3c0
[ 3653.767979]  legacy_get_tree+0x30/0x50
[ 3653.768548]  vfs_get_tree+0x28/0xc0
[ 3653.769076]  vfs_kern_mount.part.0+0x71/0xb0
[ 3653.769718]  btrfs_mount+0x11d/0x3a0 [btrfs]
[ 3653.770381]  ? rcu_read_lock_sched_held+0x3f/0x80
[ 3653.771086]  ? kfree+0x1f2/0x3c0
[ 3653.771574]  legacy_get_tree+0x30/0x50
[ 3653.772136]  vfs_get_tree+0x28/0xc0
[ 3653.772673]  path_mount+0x2d4/0xbe0
[ 3653.773201]  __x64_sys_mount+0x103/0x140
[ 3653.773793]  do_syscall_64+0x3b/0xc0
[ 3653.774333]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3653.775094] RIP: 0033:0x7f648bc45aaa

This happens because through btrfs_read_chunk_tree(), which is called only
during mount, ends up acquiring the mutex open_mutex of a block device
while holding a read lock on a leaf of the chunk tree while other paths
need to acquire other locks before locking extent buffers of the chunk
tree.

Since at mount time when we call btrfs_read_chunk_tree() we know that
we don't have other tasks running in parallel and modifying the chunk
tree, we can simply skip locking of chunk tree extent buffers. So do
that and move the assertion that checks the fs is not yet mounted to the
top block of btrfs_read_chunk_tree(), with a comment before doing it.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-11-16 16:50:47 +01:00
..
tests btrfs: subpage: avoid potential deadlock with compression and delalloc 2021-10-26 19:08:05 +02:00
acl.c overlayfs update for 5.15 2021-09-02 09:21:27 -07:00
async-thread.c btrfs: fix memory ordering between normal and ordered work functions 2021-11-16 16:50:23 +01:00
async-thread.h
backref.c btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
backref.h btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
block-group.c btrfs: update comments for chunk allocation -ENOSPC cases 2021-10-26 19:08:07 +02:00
block-group.h btrfs: fix deadlock between chunk allocation and chunk btree modifications 2021-10-26 19:08:07 +02:00
block-rsv.c
block-rsv.h
btrfs_inode.h btrfs: rename btrfs_dio_private::logical_offset to file_offset 2021-10-26 19:08:06 +02:00
check-integrity.c btrfs: check-integrity: stop storing the block device name in btrfsic_dev_state 2021-10-26 19:08:07 +02:00
check-integrity.h
compression.c btrfs: subpage: make end_compressed_bio_writeback() compatible 2021-10-26 19:08:04 +02:00
compression.h btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write 2021-10-26 19:08:04 +02:00
ctree.c btrfs: unexport setup_items_for_insert() 2021-10-26 19:08:03 +02:00
ctree.h btrfs: remove root argument from btrfs_unlink_inode() 2021-10-29 12:39:13 +02:00
delalloc-space.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
delalloc-space.h
delayed-inode.c btrfs: loop only once over data sizes array when inserting an item batch 2021-10-26 19:08:03 +02:00
delayed-inode.h
delayed-ref.c btrfs: pull up qgroup checks from delayed-ref core to init time 2021-10-26 19:08:06 +02:00
delayed-ref.h btrfs: make btrfs_ref::real_root optional 2021-10-26 19:08:06 +02:00
dev-replace.c btrfs: handle device lookup with btrfs_dev_lookup_args 2021-10-26 19:08:07 +02:00
dev-replace.h
dir-item.c btrfs: unify lookup return value when dir entry is missing 2021-10-07 22:06:32 +02:00
discard.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
discard.h
disk-io.c btrfs: call btrfs_check_rw_degradable only if there is a missing device 2021-10-29 12:39:13 +02:00
disk-io.h btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE 2021-10-26 19:08:07 +02:00
export.c
export.h
extent-io-tree.h
extent-tree.c btrfs: reduce btrfs_update_block_group alloc argument to bool 2021-10-26 19:08:06 +02:00
extent_io.c btrfs: remove btrfs_bio::logical member 2021-10-26 19:08:06 +02:00
extent_io.h btrfs: cleanup for extent_write_locked_range() 2021-10-26 19:08:04 +02:00
extent_map.c btrfs: rename btrfs_bio to btrfs_io_context 2021-10-26 19:08:02 +02:00
extent_map.h
file-item.c btrfs: use bvec_kmap_local in btrfs_csum_one_bio 2021-10-26 19:08:06 +02:00
file.c btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref 2021-10-26 19:08:06 +02:00
free-space-cache.c btrfs: subpage: add bitmap for PageChecked flag 2021-10-26 19:08:03 +02:00
free-space-cache.h
free-space-tree.c
free-space-tree.h
inode-item.c
inode.c btrfs: remove root argument from btrfs_unlink_inode() 2021-10-29 12:39:13 +02:00
ioctl.c btrfs: send: prepare for v2 protocol 2021-10-29 12:38:43 +02:00
Kconfig btrfs: disable build on platforms having page size 256K 2021-06-22 14:11:57 +02:00
locking.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
locking.h btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
lzo.c btrfs: fix a out-of-bound access in copy_compressed_data_to_page() 2021-11-16 16:46:40 +01:00
Makefile btrfs: initial fsverity support 2021-08-23 13:19:09 +02:00
misc.h btrfs: use correct header for div_u64 in misc.h 2021-09-07 14:29:50 +02:00
ordered-data.c btrfs: zoned: fix double counting of split ordered extent 2021-09-07 14:30:41 +02:00
ordered-data.h btrfs: remove uptodate parameter from btrfs_dec_test_first_ordered_pending 2021-08-23 13:19:02 +02:00
orphan.c
print-tree.c
print-tree.h
props.c btrfs: props: change how empty value is interpreted 2021-06-22 14:11:58 +02:00
props.h
qgroup.c btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
qgroup.h btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
raid56.c btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
raid56.h btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
rcu-string.h
reada.c btrfs: rename btrfs_bio to btrfs_io_context 2021-10-26 19:08:02 +02:00
ref-verify.c btrfs: rename root fields in delayed refs structs 2021-10-26 19:08:06 +02:00
ref-verify.h
reflink.c btrfs: subpage: add bitmap for PageChecked flag 2021-10-26 19:08:03 +02:00
reflink.h
relocation.c btrfs: fix deadlock between chunk allocation and chunk btree modifications 2021-10-26 19:08:07 +02:00
root-tree.c
scrub.c btrfs: handle device lookup with btrfs_dev_lookup_args 2021-10-26 19:08:07 +02:00
send.c btrfs: send: prepare for v2 protocol 2021-10-29 12:38:43 +02:00
send.h btrfs: send: prepare for v2 protocol 2021-10-29 12:38:43 +02:00
space-info.c btrfs: do not infinite loop in data reclaim if we aborted 2021-10-26 19:08:05 +02:00
space-info.h btrfs: rip out btrfs_space_info::total_bytes_pinned 2021-06-22 14:55:25 +02:00
struct-funcs.c btrfs: add special case to setget helpers for 64k pages 2021-08-23 13:18:58 +02:00
subpage.c btrfs: handle page locking in btrfs_page_end_writer_lock with no writers 2021-10-26 19:08:05 +02:00
subpage.h btrfs: rework page locking in __extent_writepage() 2021-10-26 19:08:05 +02:00
super.c btrfs: add a BTRFS_FS_ERROR helper 2021-10-26 19:08:05 +02:00
sysfs.c btrfs: sysfs: convert scnprintf and snprintf to sysfs_emit 2021-10-26 19:08:07 +02:00
sysfs.h
transaction.c btrfs: add a BTRFS_FS_ERROR helper 2021-10-26 19:08:05 +02:00
transaction.h btrfs: rework chunk allocation to avoid exhaustion of the system chunk array 2021-07-07 17:42:41 +02:00
tree-checker.c btrfs: add ro compat flags to inodes 2021-08-23 13:19:09 +02:00
tree-checker.h
tree-defrag.c
tree-log.c btrfs: remove root argument from check_item_in_log() 2021-10-29 12:39:13 +02:00
tree-log.h btrfs: change error handling for btrfs_delete_*_in_log 2021-10-26 19:08:05 +02:00
tree-mod-log.c
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c
verity.c btrfs: fix transaction handle leak after verity rollback failure 2021-09-17 19:29:41 +02:00
volumes.c btrfs: silence lockdep when reading chunk tree during mount 2021-11-16 16:50:47 +01:00
volumes.h btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls 2021-10-26 19:08:07 +02:00
xattr.c btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
xattr.h
zlib.c btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00
zoned.c btrfs: zoned: use kmemdup() to replace kmalloc + memcpy 2021-10-26 19:08:05 +02:00
zoned.h btrfs: zoned: add a dedicated data relocation block group 2021-10-26 19:08:01 +02:00
zstd.c btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00