linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-11-01 17:08:10 +00:00

Qu Wenruo 28d70e237d	btrfs: scrub: Fix RAID56 recovery race condition	2017-04-18 14:07:27 +02:00

When scrubbing a RAID5 which has recoverable data corruption (only one
data stripe is corrupted), sometimes scrub will report more csum errors
than expected. Sometimes even unrecoverable error will be reported.

The problem can be easily reproduced by the following steps:
1) Create a btrfs with RAID5 data profile with 3 devs
2) Mount it with nospace_cache or space_cache=v2
   To avoid extra data space usage.
3) Create a 128K file and sync the fs, unmount it
   Now the 128K file lies at the beginning of the data chunk
4) Locate the physical bytenr of data chunk on dev3
   Dev3 is the 1st data stripe.
5) Corrupt the first 64K of the data chunk stripe on dev3
6) Mount the fs and scrub it

The correct csum error number should be 16 (assuming using x86_64).
Larger csum error number can be reported in a 1/3 chance.
And unrecoverable error can also be reported in a 1/10 chance.

The root cause of the problem is RAID5/6 recover code has race
condition, due to the fact that full scrub is initiated per device.

While for other mirror based profiles, each mirror is independent with
each other, so race won't cause any big problem.

For example:
        Corrupted       |       Correct          |      Correct        |
|   Scrub dev3 (D1)     |    Scrub dev2 (D2)     |    Scrub dev1(P)    |
------------------------------------------------------------------------
Read out D1             |Read out D2             |Read full stripe     |
Check csum              |Check csum              |Check parity         |
Csum mismatch           |Csum match, continue    |Parity mismatch      |
handle_errored_block    |                        |handle_errored_block |
 Read out full stripe   |                        | Read out full stripe|
 D1 csum error(err++)   |                        | D1 csum error(err++)|
 Recover D1             |                        | Recover D1          |

So D1's csum error is accounted twice, just because
handle_errored_block() doesn't have enough protection, and race can happen.

On even worse case, for example D1's recovery code is re-writing
D1/D2/P, and P's recovery code is just reading out full stripe, then we
can cause unrecoverable error.

This patch will use previously introduced lock_full_stripe() and
unlock_full_stripe() to protect the whole scrub_handle_errored_block()
function for RAID56 recovery.
So no extra csum error nor unrecoverable error.

Reported-by: Goffredo Baroncelli <kreijack@libero.it>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

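
The race described in the commit message can be illustrated with a small toy model. The sketch below is not kernel code: the class FullStripe, its per-stripe lock, and the re-check performed under that lock are all assumptions made for illustration, and the 4 KiB checksum granularity is the assumed x86_64 sector size that makes a 64K corruption come out as 16 csum errors. Two scrubber threads (standing in for the dev3 and dev1 scrubs) both detect the problem before either repairs it, then both enter the error handler.

```python
import threading

SECTOR = 4 * 1024                # assumed csum granularity on x86_64
STRIPE_LEN = 64 * 1024           # corrupted range from the commit message
SECTORS = STRIPE_LEN // SECTOR   # 64K / 4K = 16 expected csum errors


def xor_bytes(a, b):
    """Byte-wise XOR, the RAID5 parity operation."""
    return bytes(x ^ y for x, y in zip(a, b))


class FullStripe:
    """Toy RAID5 full stripe: data D1, D2 and parity P = D1 ^ D2."""

    def __init__(self):
        self.good_d1 = bytes([0xAA]) * STRIPE_LEN  # stands in for stored csums
        self.d1 = bytes(STRIPE_LEN)                # corrupted copy of D1
        self.d2 = bytes([0x55]) * STRIPE_LEN
        self.parity = xor_bytes(self.good_d1, self.d2)
        self.lock = threading.Lock()               # hypothetical full-stripe lock
        self.csum_errors = 0

    def bad_sectors(self):
        """Count D1 sectors that do not match the expected content."""
        return sum(
            self.d1[i:i + SECTOR] != self.good_d1[i:i + SECTOR]
            for i in range(0, STRIPE_LEN, SECTOR)
        )

    def handle_errored_block(self):
        # Check, account and repair as one unit under the stripe lock.
        with self.lock:
            bad = self.bad_sectors()
            if bad == 0:
                return                             # already repaired by the other scrubber
            self.csum_errors += bad
            self.d1 = xor_bytes(self.d2, self.parity)  # rebuild D1 from D2 and P


def scrub(stripe):
    """Run two scrubbers that both see the error before either repairs it."""
    barrier = threading.Barrier(2)

    def scrubber():
        detected = stripe.bad_sectors() > 0        # csum or parity mismatch
        barrier.wait()
        if detected:
            stripe.handle_errored_block()

    threads = [threading.Thread(target=scrubber) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stripe.csum_errors
```

Calling scrub(FullStripe()) in this model accounts the errors once, because the second handler finds nothing left to repair when it gets the lock. Without the lock, both handlers could count the same 16 sectors, which matches the doubled accounting shown in the table.
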
arch	ARM: SoC fixes	2017-04-16 12:38:17 -07:00
block	blk-mq: Restart a single queue if tag sets are shared	2017-04-07 12:40:09 -06:00
certs	certs: Add a secondary system keyring that can be added to dynamically	2016-04-11 22:48:09 +01:00
crypto	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	2017-03-31 12:11:32 -07:00
Documentation	Driver core fixes for 4.11-rc6	2017-04-09 09:03:51 -07:00
drivers	ARM: SoC fixes	2017-04-16 12:38:17 -07:00
firmware	WHENCE: use https://linuxtv.org for LinuxTV URLs	2015-12-04 10:35:11 -02:00
fs	btrfs: scrub: Fix RAID56 recovery race condition	2017-04-18 14:07:27 +02:00
include	btrfs: qgroup: Add trace point for qgroup reserved space	2017-04-18 14:07:26 +02:00
init	mm: move mm_percpu_wq initialization earlier	2017-03-31 17:13:30 -07:00
ipc	Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-03-03 10:16:38 -08:00
kernel	Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2017-04-16 11:48:10 -07:00
lib	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2017-04-06 11:57:04 -07:00
mm	zsmalloc: expand class bit	2017-04-13 18:24:21 -07:00
net	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	2017-04-14 10:47:13 -04:00
samples	statx: Include a mask for stx_attributes in struct statx	2017-04-03 01:06:00 -04:00
scripts	Kbuild fixes for v4.11	2017-04-05 08:37:28 -07:00
security	Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-03-03 10:16:38 -08:00
sound	ALSA: hda - fix a problem for lineout on a Dell AIO machine	2017-03-31 10:58:26 +02:00
tools	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-04-14 16:58:38 -07:00
usr	kbuild: initramfs cleanup, set target from Kconfig	2017-01-05 09:40:16 -08:00
virt	KVM/ARM Fixes for v4.11-rc6	2017-04-05 16:27:47 +02:00
.cocciconfig	scripts: add Linux .cocciconfig for coccinelle	2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes	.gitattributes: set git diff driver for C source code files	2016-10-07 18:46:30 -07:00
.gitignore	Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild	2016-08-02 16:48:52 -04:00
.mailmap	mailmap: add Martin Kepplinger's email	2017-04-13 18:24:21 -07:00
COPYING
CREDITS	MAINTAINERS: Remove old e-mail address	2017-02-13 12:24:56 -05:00
Kbuild	scripts/gdb: provide linux constants	2016-05-23 17:04:14 -07:00
Kconfig
MAINTAINERS	MAINTAINERS: add btrfs file entries for include directories	2017-04-18 14:07:23 +02:00
Makefile	Linux 4.11-rc7	2017-04-16 13:00:18 -07:00
README	README: add a new README file, pointing to the Documentation/	2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please note that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.