No description
Find a file
Brian Foster c85cbb0b21 xfs: hold buffer across unpin and potential shutdown processing
commit 84d8949e77 upstream.

The special processing used to simulate a buffer I/O failure on fs
shutdown has a difficult to reproduce race that can result in a use
after free of the associated buffer. Consider a buffer that has been
committed to the on-disk log and thus is AIL resident. The buffer
lands on the writeback delwri queue, but is subsequently locked,
committed and pinned by another transaction before submitted for
I/O. At this point, the buffer is stuck on the delwri queue as it
cannot be submitted for I/O until it is unpinned. A log checkpoint
I/O failure occurs sometime later, which aborts the bli. The unpin
handler is called with the aborted log item, drops the bli reference
count, the pin count, and falls into the I/O failure simulation
path.

The potential problem here is that once the pin count falls to zero
in ->iop_unpin(), xfsaild is free to retry delwri submission of the
buffer at any time, before the unpin handler even completes. If
delwri queue submission wins the race to the buffer lock, it
observes the shutdown state and simulates the I/O failure itself.
This releases both the bli and delwri queue holds and frees the
buffer while xfs_buf_item_unpin() sits on xfs_buf_lock() waiting to
run through the same failure sequence. This problem is rare and
requires many iterations of fstest generic/019 (which simulates disk
I/O failures) to reproduce.

To avoid this problem, grab a hold on the buffer before the log item
is unpinned if the associated item has been aborted and will require
a simulated I/O failure. The hold is already required for the
simulated I/O failure, so the ordering simply guarantees the unpin
handler access to the buffer before it is unpinned and thus
processed by the AIL. This particular ordering is required so long
as the AIL does not acquire a reference on the bli, which is the
long term solution to this problem.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:00:51 +02:00
arch ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow 2022-08-03 12:00:50 +02:00
block block: Fix handling of offline queues in blk_mq_alloc_request_hctx() 2022-06-22 14:13:17 +02:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:13:17 +02:00
crypto crypto: memneq - move into lib/ 2022-06-22 14:13:18 +02:00
Documentation docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed 2022-08-03 12:00:50 +02:00
drivers EDAC/ghes: Set the DIMM label unconditionally 2022-08-03 12:00:50 +02:00
fs xfs: hold buffer across unpin and potential shutdown processing 2022-08-03 12:00:51 +02:00
include ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr 2022-08-03 12:00:46 +02:00
init Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug 2022-06-09 10:21:25 +02:00
ipc ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() 2022-06-09 10:21:17 +02:00
kernel watch_queue: Fix missing locking in add_watch_to_object() 2022-08-03 12:00:44 +02:00
lib ida: don't use BUG_ON() for debugging 2022-07-12 16:32:23 +02:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm page_alloc: fix invalid watermark check on a negative value 2022-08-03 12:00:50 +02:00
net sctp: leave the err path free in sctp_stream_init to sctp_stream_free 2022-08-03 12:00:49 +02:00
samples x86: Prepare inline-asm for straight-line-speculation 2022-07-25 11:26:29 +02:00
scripts x86/retbleed: Add fine grained Kconfig knobs 2022-07-25 11:26:50 +02:00
security lockdown: Fix kexec lockdown bypass with ima policy 2022-07-29 17:19:06 +02:00
sound ALSA: memalloc: Align buffer allocations in page size 2022-07-29 17:19:25 +02:00
tools perf symbol: Correct address for bss symbols 2022-08-03 12:00:49 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:25:48 +01:00
virt KVM: Don't null dereference ops->destroy 2022-07-29 17:19:23 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add Amir as xfs maintainer for 5.10.y 2022-07-02 16:39:22 +02:00
Makefile Linux 5.10.134 2022-07-29 17:19:29 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.