linux-stable/fs/btrfs
Filipe Manana c7a082fb63 Btrfs: fix infinite loop on inode eviction after deduplication of eof block
commit 11023d3f5f upstream.

If we attempt to deduplicate the last block of a file A into the middle of
a file B, and file A's size is not a multiple of the block size, we end
rounding the deduplication length to 0 bytes, to avoid the data corruption
issue fixed by commit de02b9f6bb ("Btrfs: fix data corruption when
deduplicating between different files"). However a length of zero will
cause the insertion of an extent state with a start value greater (by 1)
then the end value, leading to a corrupt extent state that will trigger a
warning and cause chaos such as an infinite loop during inode eviction.
Example trace:

 [96049.833585] ------------[ cut here ]------------
 [96049.833714] WARNING: CPU: 0 PID: 24448 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs]
 [96049.833767] CPU: 0 PID: 24448 Comm: xfs_io Not tainted 4.19.0-rc7-btrfs-next-39 #1
 [96049.833768] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
 [96049.833780] RIP: 0010:insert_state+0x101/0x120 [btrfs]
 [96049.833783] RSP: 0018:ffffafd2c3707af0 EFLAGS: 00010282
 [96049.833785] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006
 [96049.833786] RDX: 0000000000000007 RSI: ffff99045c143230 RDI: ffff99047b2168a0
 [96049.833787] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000
 [96049.833787] R10: ffffafd2c3707ab8 R11: 0000000000000000 R12: ffff9903b93b12c8
 [96049.833788] R13: 000000000004e000 R14: ffffafd2c3707b80 R15: ffffafd2c3707b78
 [96049.833790] FS:  00007f5c14e7d700(0000) GS:ffff99047b200000(0000) knlGS:0000000000000000
 [96049.833791] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [96049.833792] CR2: 00007f5c146abff8 CR3: 0000000115f4c004 CR4: 00000000003606f0
 [96049.833795] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [96049.833796] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [96049.833796] Call Trace:
 [96049.833809]  __set_extent_bit+0x46c/0x6a0 [btrfs]
 [96049.833823]  lock_extent_bits+0x6b/0x210 [btrfs]
 [96049.833831]  ? _raw_spin_unlock+0x24/0x30
 [96049.833841]  ? test_range_bit+0xdf/0x130 [btrfs]
 [96049.833853]  lock_extent_range+0x8e/0x150 [btrfs]
 [96049.833864]  btrfs_double_extent_lock+0x78/0xb0 [btrfs]
 [96049.833875]  btrfs_extent_same_range+0x14e/0x550 [btrfs]
 [96049.833885]  ? rcu_read_lock_sched_held+0x3f/0x70
 [96049.833890]  ? __kmalloc_node+0x2b0/0x2f0
 [96049.833899]  ? btrfs_dedupe_file_range+0x19a/0x280 [btrfs]
 [96049.833909]  btrfs_dedupe_file_range+0x270/0x280 [btrfs]
 [96049.833916]  vfs_dedupe_file_range_one+0xd9/0xe0
 [96049.833919]  vfs_dedupe_file_range+0x131/0x1b0
 [96049.833924]  do_vfs_ioctl+0x272/0x6e0
 [96049.833927]  ? __fget+0x113/0x200
 [96049.833931]  ksys_ioctl+0x70/0x80
 [96049.833933]  __x64_sys_ioctl+0x16/0x20
 [96049.833937]  do_syscall_64+0x60/0x1b0
 [96049.833939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
 [96049.833941] RIP: 0033:0x7f5c1478ddd7
 [96049.833943] RSP: 002b:00007ffe15b196a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
 [96049.833945] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c1478ddd7
 [96049.833946] RDX: 00005625ece322d0 RSI: 00000000c0189436 RDI: 0000000000000004
 [96049.833947] RBP: 0000000000000000 R08: 00007f5c14a46f48 R09: 0000000000000040
 [96049.833948] R10: 0000000000000541 R11: 0000000000000202 R12: 0000000000000000
 [96049.833949] R13: 0000000000000000 R14: 0000000000000004 R15: 00005625ece322d0
 [96049.833954] irq event stamp: 6196
 [96049.833956] hardirqs last  enabled at (6195): [<ffffffff91b00663>] console_unlock+0x503/0x640
 [96049.833958] hardirqs last disabled at (6196): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
 [96049.833959] softirqs last  enabled at (6114): [<ffffffff92600370>] __do_softirq+0x370/0x421
 [96049.833964] softirqs last disabled at (6095): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0
 [96049.833965] ---[ end trace db7b05f01b7fa10c ]---
 [96049.935816] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910
 [96049.935822] irq event stamp: 6584
 [96049.935823] hardirqs last  enabled at (6583): [<ffffffff91b00663>] console_unlock+0x503/0x640
 [96049.935825] hardirqs last disabled at (6584): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
 [96049.935827] softirqs last  enabled at (6328): [<ffffffff92600370>] __do_softirq+0x370/0x421
 [96049.935828] softirqs last disabled at (6313): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0
 [96049.935829] ---[ end trace db7b05f01b7fa123 ]---
 [96049.935840] ------------[ cut here ]------------
 [96049.936065] WARNING: CPU: 1 PID: 24463 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs]
 [96049.936107] CPU: 1 PID: 24463 Comm: umount Tainted: G        W         4.19.0-rc7-btrfs-next-39 #1
 [96049.936108] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
 [96049.936117] RIP: 0010:insert_state+0x101/0x120 [btrfs]
 [96049.936119] RSP: 0018:ffffafd2c3637bc0 EFLAGS: 00010282
 [96049.936120] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006
 [96049.936121] RDX: 0000000000000007 RSI: ffff990445cf88e0 RDI: ffff99047b2968a0
 [96049.936122] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000
 [96049.936123] R10: ffffafd2c3637b88 R11: 0000000000000000 R12: ffff9904574301e8
 [96049.936124] R13: 000000000004e000 R14: ffffafd2c3637c50 R15: ffffafd2c3637c48
 [96049.936125] FS:  00007fe4b87e72c0(0000) GS:ffff99047b280000(0000) knlGS:0000000000000000
 [96049.936126] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [96049.936128] CR2: 00005562e52618d8 CR3: 00000001151c8005 CR4: 00000000003606e0
 [96049.936129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [96049.936131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [96049.936131] Call Trace:
 [96049.936141]  __set_extent_bit+0x46c/0x6a0 [btrfs]
 [96049.936154]  lock_extent_bits+0x6b/0x210 [btrfs]
 [96049.936167]  btrfs_evict_inode+0x1e1/0x5a0 [btrfs]
 [96049.936172]  evict+0xbf/0x1c0
 [96049.936174]  dispose_list+0x51/0x80
 [96049.936176]  evict_inodes+0x193/0x1c0
 [96049.936180]  generic_shutdown_super+0x3f/0x110
 [96049.936182]  kill_anon_super+0xe/0x30
 [96049.936189]  btrfs_kill_super+0x13/0x100 [btrfs]
 [96049.936191]  deactivate_locked_super+0x3a/0x70
 [96049.936193]  cleanup_mnt+0x3b/0x80
 [96049.936195]  task_work_run+0x93/0xc0
 [96049.936198]  exit_to_usermode_loop+0xfa/0x100
 [96049.936201]  do_syscall_64+0x17f/0x1b0
 [96049.936202]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
 [96049.936204] RIP: 0033:0x7fe4b80cfb37
 [96049.936206] RSP: 002b:00007ffff092b688 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
 [96049.936207] RAX: 0000000000000000 RBX: 00005562e5259060 RCX: 00007fe4b80cfb37
 [96049.936208] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005562e525faa0
 [96049.936209] RBP: 00005562e525faa0 R08: 00005562e525f770 R09: 0000000000000015
 [96049.936210] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fe4b85d1e64
 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910
 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910
 [96049.936216] irq event stamp: 6616
 [96049.936219] hardirqs last  enabled at (6615): [<ffffffff91b00663>] console_unlock+0x503/0x640
 [96049.936219] hardirqs last disabled at (6616): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c
 [96049.936222] softirqs last  enabled at (6328): [<ffffffff92600370>] __do_softirq+0x370/0x421
 [96049.936222] softirqs last disabled at (6313): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0
 [96049.936223] ---[ end trace db7b05f01b7fa124 ]---

The second stack trace, from inode eviction, is repeated forever due to
the infinite loop during eviction.

This is the same type of problem fixed way back in 2015 by commit
113e828386 ("Btrfs: fix inode eviction infinite loop after extent_same
ioctl") and commit ccccf3d672 ("Btrfs: fix inode eviction infinite loop
after cloning into it").

So fix this by returning immediately if the deduplication range length
gets rounded down to 0 bytes, as there is nothing that needs to be done in
such case.

Example reproducer:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt

 $ xfs_io -f -c "pwrite -S 0xe6 0 100" /mnt/foo
 $ xfs_io -f -c "pwrite -S 0xe6 0 1M" /mnt/bar

 # Unmount the filesystem and mount it again so that we start without any
 # extent state records when we ask for the deduplication.
 $ umount /mnt
 $ mount /dev/sdb /mnt

 $ xfs_io -c "dedupe /mnt/foo 0 500K 100" /mnt/bar

 # This unmount triggers the infinite loop.
 $ umount /mnt

A test case for fstests will follow soon.

Fixes: de02b9f6bb ("Btrfs: fix data corruption when deduplicating between different files")
CC: <stable@vger.kernel.org> # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:24:11 +01:00
..
tests btrfs: tests/qgroup: Fix wrong tree backref level 2018-05-30 07:52:26 +02:00
acl.c btrfs: preserve i_mode if __btrfs_set_acl() fails 2017-08-21 17:47:42 +02:00
async-thread.c btrfs: constify tracepoint arguments 2017-08-16 14:19:53 +02:00
async-thread.h btrfs: constify tracepoint arguments 2017-08-16 14:19:53 +02:00
backref.c btrfs: remove spurious WARN_ON(ref->count < 0) in find_parent_nodes 2018-03-21 12:06:44 +01:00
backref.h btrfs: backref, add tracepoints for prelim_ref insertion and merging 2017-08-16 16:12:01 +02:00
btrfs_inode.h btrfs: separate defrag and property compression 2017-08-16 16:12:05 +02:00
check-integrity.c Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-09-09 13:27:51 -07:00
check-integrity.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
compression.c Merge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-09-29 12:57:35 -07:00
compression.h Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-09-14 17:30:49 -07:00
ctree.c btrfs: fix reading stale metadata blocks after degraded raid1 mounts 2018-05-22 18:54:01 +02:00
ctree.h btrfs: use kvzalloc to allocate btrfs_fs_info 2018-05-30 07:52:08 +02:00
dedupe.h btrfs: expand cow_file_range() to support in-band dedup and subpage-blocksize 2016-07-26 13:52:25 +02:00
delayed-inode.c Btrfs: fix stale entries in readdir 2018-01-31 14:03:42 +01:00
delayed-inode.h btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
delayed-ref.c Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
delayed-ref.h Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
dev-replace.c btrfs: replace: Reset on-disk dev stats value after replace 2018-09-15 09:45:32 +02:00
dev-replace.h btrfs: constify device path passed to relevant helpers 2017-02-28 14:26:07 +01:00
dir-item.c btrfs: fix validation of XATTR_ITEM dir items 2017-06-29 20:06:11 +02:00
disk-io.c btrfs: fix pinned underflow after transaction aborted 2018-11-21 09:24:10 +01:00
disk-io.h btrfs: use named constant for bdev blocksize 2017-08-16 16:12:04 +02:00
export.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
export.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent-tree.c btrfs: don't use ctl->free_space for max_extent_size 2018-11-13 11:15:15 -08:00
extent_io.c Btrfs: fix file data corruption after cloning a range and fsync 2018-08-09 12:16:39 +02:00
extent_io.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent_map.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent_map.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
file-item.c Merge branch 'for-4.13-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-07-05 16:41:23 -07:00
file.c btrfs: move the dio_sem higher up the callchain 2018-11-13 11:15:17 -08:00
free-space-cache.c Btrfs: fix use-after-free when dumping free space 2018-11-13 11:15:18 -08:00
free-space-cache.h btrfs: free-space-cache, clean up unnecessary root arguments 2017-02-17 12:03:56 +01:00
free-space-tree.c btrfs: pass fs_info to btrfs_del_root instead of tree_root 2017-08-21 17:49:54 +02:00
free-space-tree.h btrfs: expose internal free space tree routine only if sanity tests are enabled 2017-08-18 16:36:29 +02:00
hash.c crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
inode-map.c btrfs: qgroup: Introduce extent changeset for qgroup reserve functions 2017-06-29 20:17:02 +02:00
inode-map.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
inode.c Btrfs: fix cur_offset in the error case for nocow 2018-11-21 09:24:10 +01:00
ioctl.c Btrfs: fix infinite loop on inode eviction after deduplication of eof block 2018-11-21 09:24:11 +01:00
Kconfig btrfs: Add zstd support 2017-08-15 09:02:09 -07:00
locking.c
locking.h
lzo.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
math.h
ordered-data.c btrfs: fix integer overflow in calc_reclaim_items_nr 2017-06-29 20:17:02 +02:00
ordered-data.h btrfs: fix integer overflow in calc_reclaim_items_nr 2017-06-29 20:17:02 +02:00
orphan.c
print-tree.c Btrfs: add one more sanity check for shared ref type 2017-08-21 17:47:43 +02:00
print-tree.h btrfs: get fs_info from eb in btrfs_print_tree, remove argument 2017-08-16 16:12:03 +02:00
props.c btrfs: property: Set incompat flag if lzo/zstd compression is set 2018-05-22 18:54:01 +02:00
props.h
qgroup.c btrfs: qgroup: Dirty all qgroups before rescan 2018-11-13 11:15:14 -08:00
qgroup.h btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled 2018-11-13 11:15:13 -08:00
raid56.c Btrfs: make raid6 rebuild retry more 2018-06-21 04:03:02 +09:00
raid56.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
rcu-string.h
reada.c btrfs: remove unused member err from reada_extent 2017-06-19 18:25:59 +02:00
relocation.c btrfs: Handle owner mismatch gracefully when walking up tree 2018-11-13 11:15:13 -08:00
root-tree.c Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 18:54:01 -07:00
scrub.c btrfs: scrub: Don't use inode page cache in scrub_handle_errored_block() 2018-08-24 13:09:18 +02:00
send.c Btrfs: send, fix issuing write op when processing hole in no data mode 2018-05-30 07:52:09 +02:00
send.h
struct-funcs.c btrfs: struct-funcs, constify readers 2017-08-16 14:19:53 +02:00
super.c Btrfs: fix btrfs_write_inode vs delayed iput deadlock 2018-09-05 09:26:34 +02:00
sysfs.c Revert "btrfs: use proper endianness accessors for super_copy" 2018-03-19 08:42:47 +01:00
sysfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
transaction.c btrfs: don't run delayed_iputs in commit 2018-11-13 11:15:17 -08:00
transaction.h btrfs: remove unused qgroup members from btrfs_trans_handle 2017-04-18 14:07:25 +02:00
tree-defrag.c
tree-log.c Btrfs: fix fsync after hole punching when using no-holes feature 2018-11-13 11:15:18 -08:00
tree-log.h btrfs: Make btrfs_del_inode_ref take btrfs_inode 2017-02-14 15:50:54 +01:00
ulist.c btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
ulist.h btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
uuid-tree.c btrfs: return the actual error value from from btrfs_uuid_tree_iterate 2016-12-19 18:08:15 +01:00
volumes.c btrfs: btrfs_shrink_device should call commit transaction at the end 2018-10-10 08:54:19 +02:00
volumes.h btrfs: Fix memory barriers usage with device stats counters 2018-03-21 12:06:44 +01:00
xattr.c btrfs: Check name_len with boundary in verify dir_item 2017-06-21 19:16:04 +02:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00
zstd.c btrfs: Add zstd support 2017-08-15 09:02:09 -07:00