No description
Find a file
Yosry Ahmed 3cd9992b93 memcg: replace stats_flush_lock with an atomic
As Johannes notes in [1], stats_flush_lock is currently used to:
(a) Protect updated to stats_flush_threshold.
(b) Protect updates to flush_next_time.
(c) Serializes calls to cgroup_rstat_flush() based on those ratelimits.

However:

1. stats_flush_threshold is already an atomic

2. flush_next_time is not atomic. The writer is locked, but the reader
   is lockless. If the reader races with a flush, you could see this:

                                        if (time_after(jiffies, flush_next_time))
        spin_trylock()
        flush_next_time = now + delay
        flush()
        spin_unlock()
                                        spin_trylock()
                                        flush_next_time = now + delay
                                        flush()
                                        spin_unlock()

   which means we already can get flushes at a higher frequency than
   FLUSH_TIME during races. But it isn't really a problem.

   The reader could also see garbled partial updates if the compiler
   decides to split the write, so it needs at least READ_ONCE and
   WRITE_ONCE protection.

3. Serializing cgroup_rstat_flush() calls against the ratelimit
   factors is currently broken because of the race in 2. But the race
   is actually harmless, all we might get is the occasional earlier
   flush. If there is no delta, the flush won't do much. And if there
   is, the flush is justified.

So the lock can be removed all together. However, the lock also served
the purpose of preventing a thundering herd problem for concurrent
flushers, see [2]. Use an atomic instead to serve the purpose of
unifying concurrent flushers.

[1]https://lore.kernel.org/lkml/20230323172732.GE739026@cmpxchg.org/
[2]https://lore.kernel.org/lkml/20210716212137.1391164-2-shakeelb@google.com/

Link: https://lkml.kernel.org/r/20230330191801.1967435-5-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vasily Averin <vasily.averin@linux.dev>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:29:49 -07:00
arch xtensa: reword ARCH_FORCE_MAX_ORDER prompt and help text 2023-04-18 16:29:46 -07:00
block block: remove obsolete config BLOCK_COMPAT 2023-03-16 09:35:44 -06:00
certs Kbuild updates for v6.3 2023-02-26 11:53:25 -08:00
crypto asymmetric_keys: log on fatal failures in PE/pkcs7 2023-03-21 16:23:56 +00:00
Documentation sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-16 12:31:58 -07:00
drivers drm/ttm: remove comment referencing now-removed vmf_insert_mixed_prot() 2023-04-05 19:42:56 -07:00
fs sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-18 14:53:49 -07:00
include memcg: rename mem_cgroup_flush_stats_"delayed" to "ratelimited" 2023-04-18 16:29:49 -07:00
init init,mm: fold late call to page_ext_init() to page_alloc_init_late() 2023-04-05 19:42:54 -07:00
io_uring block-6.3-2023-03-24 2023-03-24 14:10:39 -07:00
ipc Merge branch 'work.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:20:07 -08:00
kernel cgroup: rename cgroup_rstat_flush_"irqsafe" to "atomic" 2023-04-18 16:29:49 -07:00
lib lib/test_vmalloc.c: add vm_map_ram()/vm_unmap_ram() test case 2023-04-18 16:29:47 -07:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm memcg: replace stats_flush_lock with an atomic 2023-04-18 16:29:49 -07:00
net mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
rust Rust fixes for 6.3-rc1 2023-03-03 14:51:15 -08:00
samples kmemleak-test: fix kmemleak_test.c build logic 2023-04-18 16:29:47 -07:00
scripts kasan: remove hwasan-kernel-mem-intrinsic-prefix=1 for clang-14 2023-04-18 16:29:43 -07:00
security mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
sound ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set() 2023-03-14 17:04:53 +01:00
tools sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-18 14:53:49 -07:00
usr usr/gen_init_cpio.c: remove unnecessary -1 values from int file 2022-10-03 14:21:44 -07:00
virt KVM/riscv changes for 6.3 2023-02-15 12:33:28 -05:00
.clang-format cpumask: re-introduce constant-sized cpumask optimizations 2023-03-05 14:30:34 -08:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: use 'dts' diff driver for *.dtso files 2023-02-26 15:28:23 +09:00
.gitignore kbuild: rpm-pkg: move source components to rpmbuild/SOURCES 2023-03-16 22:45:56 +09:00
.mailmap mailmap: update jtoppins' entry to reference correct email 2023-04-16 10:41:25 -07:00
.rustfmt.toml
COPYING
CREDITS There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig
MAINTAINERS MAINTAINERS: extend memblock entry to include MM initialization 2023-04-05 19:42:55 -07:00
Makefile Linux 6.3-rc4 2023-03-26 14:40:20 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.