linux-stable/fs/btrfs
Liu Bo 671c9a69f4 btrfs: fix reading stale metadata blocks after degraded raid1 mounts
commit 02a3307aa9 upstream.

If a btree block, aka. extent buffer, is not available in the extent
buffer cache, it'll be read out from the disk instead, i.e.

btrfs_search_slot()
  read_block_for_search()  # hold parent and its lock, go to read child
    btrfs_release_path()
    read_tree_block()  # read child

Unfortunately, the parent lock got released before reading child, so
commit 5bdd3536cb ("Btrfs: Fix block generation verification race") had
used 0 as parent transid to read the child block.  It forces
read_tree_block() not to check if parent transid is different with the
generation id of the child that it reads out from disk.

A simple PoC is included in btrfs/124,

0. A two-disk raid1 btrfs,

1. Right after mkfs.btrfs, block A is allocated to be device tree's root.

2. Mount this filesystem and put it in use, after a while, device tree's
   root got COW but block A hasn't been allocated/overwritten yet.

3. Umount it and reload the btrfs module to remove both disks from the
   global @fs_devices list.

4. mount -odegraded dev1 and write some data, so now block A is allocated
   to be a leaf in checksum tree.  Note that only dev1 has the latest
   metadata of this filesystem.

5. Umount it and mount it again normally (with both disks), since raid1
   can pick up one disk by the writer task's pid, if btrfs_search_slot()
   needs to read block A, dev2 which does NOT have the latest metadata
   might be read for block A, then we got a stale block A.

6. As parent transid is not checked, block A is marked as uptodate and
   put into the extent buffer cache, so the future search won't bother
   to read disk again, which means it'll make changes on this stale
   one and make it dirty and flush it onto disk.

To avoid the problem, parent transid needs to be passed to
read_tree_block().

In order to get a valid parent transid, we need to hold the parent's
lock until finishing reading child.

This patch needs to be slightly adapted for stable kernels, the
&first_key parameter added to read_tree_block() is from 4.16+
(581c176041). The fix is to replace 0 by 'gen'.

Fixes: 5bdd3536cb ("Btrfs: Fix block generation verification race")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22 18:54:01 +02:00
..
tests btrfs: tests: Fix a memory leak in error handling path in 'run_test()' 2017-12-20 10:10:30 +01:00
acl.c btrfs: preserve i_mode if __btrfs_set_acl() fails 2017-08-21 17:47:42 +02:00
async-thread.c btrfs: constify tracepoint arguments 2017-08-16 14:19:53 +02:00
async-thread.h btrfs: constify tracepoint arguments 2017-08-16 14:19:53 +02:00
backref.c btrfs: remove spurious WARN_ON(ref->count < 0) in find_parent_nodes 2018-03-21 12:06:44 +01:00
backref.h btrfs: backref, add tracepoints for prelim_ref insertion and merging 2017-08-16 16:12:01 +02:00
btrfs_inode.h btrfs: separate defrag and property compression 2017-08-16 16:12:05 +02:00
check-integrity.c Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-09-09 13:27:51 -07:00
check-integrity.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
compression.c Merge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-09-29 12:57:35 -07:00
compression.h Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-09-14 17:30:49 -07:00
ctree.c btrfs: fix reading stale metadata blocks after degraded raid1 mounts 2018-05-22 18:54:01 +02:00
ctree.h btrfs: Split btrfs_del_delalloc_inode into 2 functions 2018-05-22 18:54:01 +02:00
dedupe.h btrfs: expand cow_file_range() to support in-band dedup and subpage-blocksize 2016-07-26 13:52:25 +02:00
delayed-inode.c Btrfs: fix stale entries in readdir 2018-01-31 14:03:42 +01:00
delayed-inode.h btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t 2017-04-18 14:07:23 +02:00
delayed-ref.c Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
delayed-ref.h Btrfs: return old and new total ref mods when adding delayed refs 2017-06-29 20:17:01 +02:00
dev-replace.c Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 18:54:01 -07:00
dev-replace.h btrfs: constify device path passed to relevant helpers 2017-02-28 14:26:07 +01:00
dir-item.c btrfs: fix validation of XATTR_ITEM dir items 2017-06-29 20:06:11 +02:00
disk-io.c btrfs: Fix delalloc inodes invalidation during transaction abort 2018-05-22 18:54:01 +02:00
disk-io.h btrfs: use named constant for bdev blocksize 2017-08-16 16:12:04 +02:00
export.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
export.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent-tree.c btrfs: Take trans lock before access running trans in check_delayed_ref 2018-05-19 10:20:27 +02:00
extent_io.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent_io.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent_map.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extent_map.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
file-item.c Merge branch 'for-4.13-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-07-05 16:41:23 -07:00
file.c Btrfs: set plug for fsync 2018-04-26 11:02:09 +02:00
free-space-cache.c btrfs: fix deadlock when writing out space cache 2018-02-03 17:39:04 +01:00
free-space-cache.h btrfs: free-space-cache, clean up unnecessary root arguments 2017-02-17 12:03:56 +01:00
free-space-tree.c btrfs: pass fs_info to btrfs_del_root instead of tree_root 2017-08-21 17:49:54 +02:00
free-space-tree.h btrfs: expose internal free space tree routine only if sanity tests are enabled 2017-08-18 16:36:29 +02:00
hash.c crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
inode-map.c btrfs: qgroup: Introduce extent changeset for qgroup reserve functions 2017-06-29 20:17:02 +02:00
inode-map.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
inode.c btrfs: Split btrfs_del_delalloc_inode into 2 functions 2018-05-22 18:54:01 +02:00
ioctl.c btrfs: Fix possible off-by-one in btrfs_search_path_in_tree 2018-02-25 11:08:00 +01:00
Kconfig btrfs: Add zstd support 2017-08-15 09:02:09 -07:00
locking.c
locking.h
lzo.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
math.h
ordered-data.c btrfs: fix integer overflow in calc_reclaim_items_nr 2017-06-29 20:17:02 +02:00
ordered-data.h btrfs: fix integer overflow in calc_reclaim_items_nr 2017-06-29 20:17:02 +02:00
orphan.c
print-tree.c Btrfs: add one more sanity check for shared ref type 2017-08-21 17:47:43 +02:00
print-tree.h btrfs: get fs_info from eb in btrfs_print_tree, remove argument 2017-08-16 16:12:03 +02:00
props.c btrfs: property: Set incompat flag if lzo/zstd compression is set 2018-05-22 18:54:01 +02:00
props.h
qgroup.c btrfs: Report error on removing qgroup if del_qgroup_item fails 2017-09-26 14:54:01 +02:00
qgroup.h btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges 2017-06-29 20:17:02 +02:00
raid56.c Btrfs: raid56: fix race between merge_bio and rbio_orig_end_io 2018-04-26 11:02:09 +02:00
raid56.h btrfs: take an fs_info directly when the root is not used otherwise 2016-12-06 16:06:59 +01:00
rcu-string.h
reada.c btrfs: remove unused member err from reada_extent 2017-06-19 18:25:59 +02:00
relocation.c btrfs: fix NULL pointer dereference from free_reloc_roots() 2017-09-26 14:51:49 +02:00
root-tree.c Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 18:54:01 -07:00
scrub.c Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux 2017-09-09 13:27:51 -07:00
send.c Btrfs: incremental send, fix wrong unlink path after renaming file 2018-02-03 17:39:10 +01:00
send.h
struct-funcs.c btrfs: struct-funcs, constify readers 2017-08-16 14:19:53 +02:00
super.c btrfs: avoid null pointer dereference on fs_info when calling btrfs_crit 2017-12-20 10:10:30 +01:00
sysfs.c Revert "btrfs: use proper endianness accessors for super_copy" 2018-03-19 08:42:47 +01:00
sysfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
transaction.c Revert "btrfs: use proper endianness accessors for super_copy" 2018-03-19 08:42:47 +01:00
transaction.h btrfs: remove unused qgroup members from btrfs_trans_handle 2017-04-18 14:07:25 +02:00
tree-defrag.c
tree-log.c Btrfs: fix xattr loss after power failure 2018-05-22 18:54:00 +02:00
tree-log.h btrfs: Make btrfs_del_inode_ref take btrfs_inode 2017-02-14 15:50:54 +01:00
ulist.c btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
ulist.h btrfs: ulist: rename ulist_fini to ulist_release 2017-02-17 12:03:50 +01:00
uuid-tree.c btrfs: return the actual error value from from btrfs_uuid_tree_iterate 2016-12-19 18:08:15 +01:00
volumes.c btrfs: fix crash when trying to resume balance without the resume flag 2018-05-22 18:54:01 +02:00
volumes.h btrfs: Fix memory barriers usage with device stats counters 2018-03-21 12:06:44 +01:00
xattr.c btrfs: Check name_len with boundary in verify dir_item 2017-06-21 19:16:04 +02:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace 2017-06-19 18:26:02 +02:00
zstd.c btrfs: Add zstd support 2017-08-15 09:02:09 -07:00