linux-stable/fs/btrfs
Filipe Manana ea9c846f54 Btrfs: fix deadlock when writing out free space caches
commit 5ce555578e upstream.

When writing out a block group free space cache we can end deadlocking
with ourselves on an extent buffer lock resulting in a warning like the
following:

  [245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251 btrfs_tree_lock+0x1be/0x1d0 [btrfs]
  [245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
    W I      4.16.8 #1
  [245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
  [245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
  [245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX: 0000000000000001
  [245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff8807e20a3d20
  [245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09: ffff880000000000
  [245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12: ffff8807cb220630
  [245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15: ffff88041fbdf000
  [245043.404780] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
  [245043.406050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4: 00000000000206e0
  [245043.408670] Call Trace:
  [245043.409977]  btrfs_search_slot+0x761/0xa60 [btrfs]
  [245043.411278]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
  [245043.412572]  btrfs_insert_item+0x5b/0xc0 [btrfs]
  [245043.413922]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
  [245043.415216]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
  [245043.416487]  find_free_extent+0xcd0/0xf60 [btrfs]
  [245043.417813]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
  [245043.419105]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
  [245043.420378]  __btrfs_cow_block+0x127/0x550 [btrfs]
  [245043.421652]  btrfs_cow_block+0xee/0x190 [btrfs]
  [245043.422979]  btrfs_search_slot+0x227/0xa60 [btrfs]
  [245043.424279]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
  [245043.425538]  ? iput+0x72/0x1e0
  [245043.426798]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
  [245043.428131]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
  [245043.429419]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
  [245043.430712]  ? start_transaction+0x8e/0x410 [btrfs]
  [245043.432006]  transaction_kthread+0x184/0x1a0 [btrfs]
  [245043.433341]  kthread+0xf0/0x130
  [245043.434628]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
  [245043.435928]  ? kthread_create_worker_on_cpu+0x40/0x40
  [245043.437236]  ret_from_fork+0x1f/0x30
  [245043.441054] ---[ end trace 15abaa2aaf36827f ]---

This is because at write_one_cache_group() when we are COWing a leaf from
the extent tree we end up allocating a new block group (chunk) and,
because we have hit a threshold on the number of bytes reserved for system
chunks, we attempt to finalize the creation of new block groups from the
current transaction, by calling btrfs_create_pending_block_groups().
However here we also need to modify the extent tree in order to insert
a block group item, and if the location for this new block group item
happens to be in the same leaf that we were COWing earlier, we deadlock
since btrfs_search_slot() tries to write lock the extent buffer that we
locked before at write_one_cache_group().

We have already hit similar cases in the past and commit d9a0540a79
("Btrfs: fix deadlock when finalizing block group creation") fixed some
of those cases by delaying the creation of pending block groups at the
known specific spots that could lead to a deadlock. This change reworks
that commit to be more generic so that we don't have to add similar logic
to every possible path that can lead to a deadlock. This is done by
making __btrfs_cow_block() disallowing the creation of new block groups
(setting the transaction's can_flush_pending_bgs to false) before it
attempts to allocate a new extent buffer for either the extent, chunk or
device trees, since those are the trees that pending block creation
modifies. Once the new extent buffer is allocated, it allows creation of
pending block groups to happen again.

This change depends on a recent patch from Josef which is not yet in
Linus' tree, named "btrfs: make sure we create all new block groups" in
order to avoid occasional warnings at btrfs_trans_release_chunk_metadata().

Fixes: d9a0540a79 ("Btrfs: fix deadlock when finalizing block group creation")
CC: stable@vger.kernel.org # 4.4+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199753
Link: https://lore.kernel.org/linux-btrfs/CAJtFHUTHna09ST-_EEiyWmDH6gAqS6wa=zMNMBsifj8ABu99cw@mail.gmail.com/
Reported-by: E V <eliventer@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:58 -08:00
..
tests btrfs: qgroup: Drop fs_info parameter from btrfs_qgroup_account_extent 2018-08-06 13:12:52 +02:00
acl.c btrfs: remove unnecessary curly braces in btrfs_get_acl 2018-08-06 13:12:41 +02:00
async-thread.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
async-thread.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
backref.c btrfs: backref: Use ERR_CAST to return error code 2018-08-06 13:13:01 +02:00
backref.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
btrfs_inode.h btrfs: use timespec64 for i_otime 2018-08-06 13:12:39 +02:00
check-integrity.c btrfs: open-code bio_set_op_attrs 2018-08-06 13:12:44 +02:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: drop extent_io_ops::merge_bio_hook callback 2018-08-06 13:12:56 +02:00
compression.h btrfs: compression: Add linux/sizes.h for compression.h 2018-05-29 18:13:00 +02:00
ctree.c Btrfs: fix deadlock when writing out free space caches 2018-11-13 11:08:58 -08:00
ctree.h for-4.19-rc2-tag 2018-09-06 09:04:45 -07:00
dedupe.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
delayed-inode.c btrfs: Remove fs_info from btrfs_delete_delayed_dir_index 2018-08-06 13:13:00 +02:00
delayed-inode.h btrfs: Remove fs_info from btrfs_delete_delayed_dir_index 2018-08-06 13:13:00 +02:00
delayed-ref.c btrfs: Streamline memory allocation failure handling in btrfs_add_delayed_tree_ref 2018-08-06 13:12:39 +02:00
delayed-ref.h btrfs: Remove fs_info from btrfs_add_delayed_data_ref 2018-08-06 13:12:34 +02:00
dev-replace.c btrfs: fix error handling in btrfs_dev_replace_start 2018-11-13 11:08:56 -08:00
dev-replace.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
dir-item.c btrfs: Remove fs_info from btrfs_insert_delayed_dir_index 2018-08-06 13:13:00 +02:00
disk-io.c Btrfs: fix unexpected failure of nocow buffered writes after snapshotting when low on space 2018-08-17 18:35:43 +02:00
disk-io.h btrfs: unify end_io callbacks of async_submit_bio 2018-08-06 13:12:55 +02:00
export.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
export.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
extent-tree.c Btrfs: fix deadlock when writing out free space caches 2018-11-13 11:08:58 -08:00
extent_io.c btrfs: readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
extent_io.h btrfs: drop extent_io_ops::set_range_writeback callback 2018-08-06 13:12:56 +02:00
extent_map.c btrfs: use fs_info for btrfs_handle_em_exist tracepoint 2018-05-28 18:07:17 +02:00
extent_map.h btrfs: use fs_info for btrfs_handle_em_exist tracepoint 2018-05-28 18:07:17 +02:00
file-item.c btrfs: simplify pointer chasing of local fs_info variables 2018-08-06 13:12:43 +02:00
file.c Btrfs: don't clean dirty pages during buffered writes 2018-11-13 11:08:57 -08:00
free-space-cache.c btrfs: reset max_extent_size on clear in a bitmap 2018-11-13 11:08:57 -08:00
free-space-cache.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
free-space-tree.c btrfs: Remove fs_info from btrfs_del_root 2018-08-06 13:13:00 +02:00
free-space-tree.h btrfs: Remove fs_info argument from add_to_free_space_tree 2018-05-28 18:07:36 +02:00
inode-item.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
inode-map.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c Btrfs: fix null pointer dereference on compressed write path error 2018-11-13 11:08:58 -08:00
ioctl.c btrfs: Ensure btrfs_trim_fs can trim the whole filesystem 2018-11-13 11:08:56 -08:00
Kconfig btrfs: add SPDX header to Kconfig 2018-04-12 16:29:55 +02:00
locking.c btrfs: replace waitqueue_actvie with cond_wake_up 2018-05-28 18:23:09 +02:00
locking.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
lzo.c btrfs: lzo: Harden inline lzo compressed extent decompression 2018-05-30 16:46:43 +02:00
Makefile btrfs: Remove custom crc32c init code 2018-03-26 15:09:39 +02:00
math.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
ordered-data.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
ordered-data.h btrfs: remove remaing full_sync logic from btrfs_sync_file 2018-08-06 13:12:31 +02:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: annotate unlikely branches after V0 extent type removal 2018-08-06 13:12:41 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: property: Set incompat flag if lzo/zstd compression is set 2018-05-17 14:18:25 +02:00
props.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
qgroup.c btrfs: qgroup: Dirty all qgroups before rescan 2018-11-13 11:08:58 -08:00
qgroup.h btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled 2018-11-13 11:08:56 -08:00
raid56.c btrfs: raid56: catch errors from full_stripe_write 2018-08-06 13:12:45 +02:00
raid56.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
ref-verify.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ref-verify.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
relocation.c btrfs: Handle owner mismatch gracefully when walking up tree 2018-11-13 11:08:56 -08:00
root-tree.c btrfs: Remove fs_info from btrfs_add_root_ref 2018-08-06 13:13:00 +02:00
scrub.c btrfs: Use wrapper macro for rcu string to remove duplicate code 2018-08-06 13:13:02 +02:00
send.c Btrfs: send, fix incorrect file layout after hole punching beyond eof 2018-08-06 13:13:03 +02:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
struct-funcs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
super.c btrfs: Use wrapper macro for rcu string to remove duplicate code 2018-08-06 13:13:02 +02:00
sysfs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
sysfs.h btrfs: sysfs: Use enum/define value for feature array definitions 2018-05-28 18:23:39 +02:00
transaction.c btrfs: release metadata before running delayed refs 2018-11-13 11:08:57 -08:00
transaction.h btrfs: replace get_seconds with new 64bit time API 2018-08-06 13:12:29 +02:00
tree-checker.c btrfs: tree-checker: Detect invalid and empty essential trees 2018-08-06 13:12:42 +02:00
tree-checker.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
tree-defrag.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
tree-log.c Btrfs: fix assertion on fsync of regular file when using no-holes feature 2018-11-13 11:08:58 -08:00
tree-log.h Btrfs: sync log after logging new name 2018-08-23 17:37:26 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: Remove fs_info argument from btrfs_uuid_tree_rem 2018-05-30 16:46:53 +02:00
volumes.c btrfs: btrfs_shrink_device should call commit transaction at the end 2018-08-23 17:37:27 +02:00
volumes.h btrfs: Introduce mount time chunk <-> dev extent mapping check 2018-08-06 13:13:03 +02:00
xattr.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
xattr.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
zlib.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
zstd.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00