Go to file
Yutian Yang 4bbb6e6a1c memcg: charge fs_context and legacy_fs_context
commit bb902cb47c upstream.

This patch adds accounting flags to fs_context and legacy_fs_context
allocation sites so that kernel could correctly charge these objects.

We have written a PoC to demonstrate the effect of the missing-charging
bugs.  The PoC takes around 1,200MB unaccounted memory, while it is
charged for only 362MB memory usage.  We evaluate the PoC on QEMU x86_64
v5.2.90 + Linux kernel v5.10.19 + Debian buster.  All the limitations
including ulimits and sysctl variables are set as default.  Specifically,
the hard NOFILE limit and nr_open in sysctl are both 1,048,576.

/*------------------------- POC code ----------------------------*/

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/file.h>
#include <time.h>
#include <sys/wait.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <sched.h>
#include <fcntl.h>
#include <linux/mount.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define STACK_SIZE (8 * 1024)
#ifndef __NR_fsopen
#define __NR_fsopen 430
#endif
static inline int fsopen(const char *fs_name, unsigned int flags)
{
        return syscall(__NR_fsopen, fs_name, flags);
}

static char thread_stack[512][STACK_SIZE];

int thread_fn(void* arg)
{
  for (int i = 0; i< 800000; ++i) {
    int fsfd = fsopen("nfs", FSOPEN_CLOEXEC);
    if (fsfd == -1) {
      errExit("fsopen");
    }
  }
  while(1);
  return 0;
}

int main(int argc, char *argv[]) {
  int thread_pid;
  for (int i = 0; i < 1; ++i) {
    thread_pid = clone(thread_fn, thread_stack[i] + STACK_SIZE, \
      SIGCHLD, NULL);
  }
  while(1);
  return 0;
}

/*-------------------------- end --------------------------------*/

Link: https://lkml.kernel.org/r/1626517201-24086-1-git-send-email-nglaive@gmail.com
Signed-off-by: Yutian Yang <nglaive@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <shenwenbo@zju.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08 18:24:29 +01:00
Documentation psi: Fix uaf issue when psi trigger is destroyed while being polled 2022-02-05 12:35:36 +01:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
arch powerpc/32: Fix boot failure with GCC latent entropy plugin 2022-02-01 17:24:36 +01:00
block block: bio-integrity: Advance seed correctly for larger interval sizes 2022-02-08 18:24:28 +01:00
certs certs: Trigger creation of RSA module signing key if it's not an RSA key 2021-09-15 09:47:29 +02:00
crypto crypto: pcrypt - Delay write to padata->info 2021-11-17 09:48:40 +01:00
drivers Revert "ASoC: mediatek: Check for error clk pointer" 2022-02-08 18:24:28 +01:00
fs memcg: charge fs_context and legacy_fs_context 2022-02-08 18:24:29 +01:00
include psi: Fix uaf issue when psi trigger is destroyed while being polled 2022-02-05 12:35:36 +01:00
init kbuild: add CONFIG_LD_IS_LLD 2021-06-30 08:47:44 -04:00
ipc shm: extend forced shm destroy to support objects from several IPC nses 2021-12-01 09:23:35 +01:00
kernel audit: improve audit queue handling when "audit=1" on cmdline 2022-02-08 18:24:26 +01:00
lib lib/test_meminit: destroy cache in kmem_cache_alloc_bulk() test 2022-01-27 09:19:55 +01:00
mm mm/kmemleak: avoid scanning potential huge holes 2022-02-08 18:24:28 +01:00
net af_packet: fix data-race in packet_setsockopt / packet_setsockopt 2022-02-05 12:35:37 +01:00
samples samples/kretprobes: Fix return value if register_kretprobe() failed 2021-11-17 09:48:39 +01:00
scripts scripts/dtc: dtx_diff: remove broken example from help text 2022-01-27 09:19:55 +01:00
security selinux: fix potential memleak in selinux_add_opt() 2022-01-27 09:19:35 +01:00
sound ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows 2022-02-08 18:24:27 +01:00
tools perf script: Fix hex dump character output 2022-01-27 09:19:54 +01:00
usr initramfs: restore default compression behavior 2020-04-08 09:08:38 +02:00
virt KVM: do not shrink halt_poll_ns below grow_start 2021-10-09 14:39:50 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS Documentation/llvm: add documentation on building w/ Clang/LLVM 2020-08-26 10:40:46 +02:00
Makefile Linux 5.4.177 2022-02-05 12:35:37 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.