No description
Find a file
Brian Foster 385a82f62a bcachefs: serialize on cached key in early bucket allocator
bcachefs had a transient bug where freespace_initialized was not
properly being set, which lead to unexpected use of the early bucket
allocator at runtime. This issue has been fixed, but the existence
of it uncovered a coherency issue in the early bucket allocation
code that is somewhat related to how uncached iterators deal with
the key cache.

The problem itself manifests as occasional failure of generic/113
due to corruption, often seen as a duplicate backpointer or multiple
data types per-bucket error. The immediate cause of the error is a
racing bucket allocation along the lines of the following sequence:

- Task 1 selects key A in bch2_bucket_alloc_early() and schedules.
- Task 2 selects the same key A, but proceeds to complete the
  allocation and associated I/O, after which it releases the
  open_bucket.
- Task 1 resumes with key A, but does not recognize the bucket is
  now allocated because the open_bucket has been removed
  from the hash when it was released in the previous step.

This generally shouldn't happen because the allocating task updates
the alloc btree key before releasing the bucket. This is not
sufficient in this particular instance, however, because an uncached
iterator for a cached btree doesn't actually lock the key cache slot
when no key exists for a given slot in the cache. Thus the fact that
the allocation side updates the cached key means that multiple
uncached iters can stumble across the same alloc key and duplicate
the bucket allocation as described above.

This is something that probably needs a longer term fix in the
iterator code. As a short term fix, close the race through explicit
use of a cached iterator for likely allocation candidates. We don't
want to scan the btree with a cached iterator because that would
unnecessarily pollute the cache. This mitigates cache pollution by
primarily scanning the tree with an uncached iterator, but closes
the race by creating a key cache entry for any prospective slot
prior to the bucket allocation attempt (also similar to how
_alloc_freelist() works via try_alloc_bucket()). This survives many
iterations of generic/113 on a kernel hacked to always use the early
bucket allocator.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04 22:19:13 -04:00
arch Misc fixes and cleanups: 2023-10-30 13:20:02 -10:00
block vfs-6.7.super 2023-10-30 08:59:05 -10:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto KEYS: asymmetric: Fix sign/verify on pkcs1pad without a hash 2023-10-18 12:27:10 +08:00
Documentation Scheduler changes for v6.7 are: 2023-10-30 13:12:15 -10:00
drivers Scheduler changes for v6.7 are: 2023-10-30 13:12:15 -10:00
fs bcachefs: serialize on cached key in early bucket allocator 2023-11-04 22:19:13 -04:00
include closures: Fix race in closure_sync() 2023-10-30 21:48:22 -04:00
init - A bunch of improvements, cleanups and fixlets to the SRSO mitigation 2023-10-30 11:48:49 -10:00
io_uring vfs-6.7.misc 2023-10-30 09:14:19 -10:00
ipc ipc: convert to new timestamp accessors 2023-10-18 14:08:30 +02:00
kernel Scheduler changes for v6.7 are: 2023-10-30 13:12:15 -10:00
lib closures: Fix race in closure_sync() 2023-10-30 21:48:22 -04:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm Scheduler changes for v6.7 are: 2023-10-30 13:12:15 -10:00
net NFSD 6.7 Release Notes 2023-10-30 10:12:29 -10:00
rust rust: docs: fix logo replacement 2023-10-19 16:40:00 +02:00
samples VFIO updates for v6.6-rc1 2023-08-30 20:36:01 -07:00
scripts Misc fixes and cleanups: 2023-10-30 13:20:02 -10:00
security vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
sound vfs-6.7.iov_iter 2023-10-30 09:24:21 -10:00
tools Misc fixes and cleanups: 2023-10-30 13:20:02 -10:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: rename binkernel.spec to kernel.spec 2023-07-25 00:59:33 +09:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS - A new EDAC driver for Xilinx's Versal integrated memory controller 2023-10-30 11:45:36 -10:00
Makefile Linux 6.6 2023-10-29 16:31:08 -10:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.