linux-stable/include/linux/sched
Alexey Gladkov 21d1c5e386 Reimplement RLIMIT_NPROC on top of ucounts
The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the limits set in the parent
user namespaces still cannot be exceeded.

To illustrate the impact of rlimits, let's say there is a program that
does not fork. Some service-A wants to run this program as user X in
multiple containers. Since the program never forks, the service wants to
set RLIMIT_NPROC=1.

service-A
 \- program (uid=1000, container1, rlimit_nproc=1)
 \- program (uid=1000, container2, rlimit_nproc=1)

Service-A sets RLIMIT_NPROC=1 and runs the program in container1.
When service-A then tries to run the program with RLIMIT_NPROC=1 in
container2, it fails because user X already has one running process.
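
The scenario above can be sketched as a small userspace model. This is
only an illustration under simplifying assumptions: struct proc_counter,
try_start() and the two launches_* helpers are hypothetical names, not
kernel APIs, and the model ignores concurrency. It contrasts one process
counter shared by uid 1000 across all containers with one counter per
user namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: the number of processes charged to one uid. */
struct proc_counter {
    long count;
};

/* Try to start one process under an RLIMIT_NPROC-style limit.
 * Bumps the counter and returns true only if the limit allows it. */
static bool try_start(struct proc_counter *c, long rlimit_nproc)
{
    assert(c != NULL);
    if (c->count + 1 > rlimit_nproc)
        return false;
    c->count++;
    return true;
}

/* One counter for uid 1000, shared by both containers.
 * Returns how many of the two launches succeed. */
static int launches_with_shared_counter(void)
{
    struct proc_counter uid1000 = { 0 };
    int ok = 0;

    ok += try_start(&uid1000, 1); /* container1: allowed */
    ok += try_start(&uid1000, 1); /* container2: refused, count is already 1 */
    return ok;
}

/* One counter for uid 1000 per user namespace (one per container).
 * Returns how many of the two launches succeed. */
static int launches_with_per_namespace_counters(void)
{
    struct proc_counter container1 = { 0 };
    struct proc_counter container2 = { 0 };
    int ok = 0;

    ok += try_start(&container1, 1); /* allowed */
    ok += try_start(&container2, 1); /* allowed: separate counter */
    return ok;
}
```

In this model the shared counter admits only one of the two launches,
while the per-namespace counters admit both.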

We cannot use existing inc_ucounts / dec_ucounts because they do not
allow us to exceed the maximum for the counter. Some rlimits can be
exceeded by root or by a user with the appropriate capability.
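
A rough userspace sketch of that idea follows: the increment is always
applied at every namespace level and only reports whether a maximum was
exceeded, so the caller can decide whether a privileged user may go over
the limit. All names here (struct ns_counter, sketch_inc_rlimit(),
sketch_dec_rlimit(), sketch_may_fork()) are hypothetical and chosen for
illustration; this is not the kernel implementation, and it leaves out
atomics and overflow handling.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-namespace counter; not the kernel's struct ucounts. */
struct ns_counter {
    struct ns_counter *parent; /* counter in the parent namespace, NULL at the root */
    long count;                /* current value at this level */
    long max;                  /* maximum configured for this level */
};

/* Add v at every level, from c up to the root. Never refuses.
 * Returns LONG_MAX if any level ends up above its max, otherwise the
 * new value at the level the caller passed in. */
static long sketch_inc_rlimit(struct ns_counter *c, long v)
{
    long ret = 0;

    assert(c != NULL);
    for (struct ns_counter *it = c; it != NULL; it = it->parent) {
        it->count += v;
        if (it->count > it->max)
            ret = LONG_MAX;
        else if (it == c)
            ret = it->count;
    }
    return ret;
}

/* Undo an earlier increment at every level. */
static void sketch_dec_rlimit(struct ns_counter *c, long v)
{
    assert(c != NULL);
    for (struct ns_counter *it = c; it != NULL; it = it->parent)
        it->count -= v;
}

/* Caller-side check: charge one process, then roll back unless the
 * result fits the rlimit or the caller is privileged. Returns 1 if the
 * process may be created, 0 otherwise. */
static int sketch_may_fork(struct ns_counter *c, long rlimit, int privileged)
{
    long now = sketch_inc_rlimit(c, 1);

    if (now > rlimit && !privileged) {
        sketch_dec_rlimit(c, 1);
        return 0;
    }
    return 1;
}
```

Because the increment itself never fails, the policy decision stays with
the caller: an unprivileged caller rolls the charge back, while a
privileged one keeps it even though a maximum has been exceeded.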

Changelog

v11:
* Change inc_rlimit_ucounts() so that it now returns the top value of ucounts.
* Drop inc_rlimit_ucounts_and_test() because the return code of
  inc_rlimit_ucounts() can be checked.

Signed-off-by: Alexey Gladkov <legion@kernel.org>
Link: https://lkml.kernel.org/r/c5286a8aa16d2d698c222f7532f3d735c82bc6bc.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-04-30 14:14:01 -05:00
autogroup.h
clock.h
coredump.h
cpufreq.h cpufreq: Add special-purpose fast-switching callback for drivers 2020-12-15 19:24:18 +01:00
cputime.h
deadline.h
debug.h treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
hotplug.h sched/hotplug: Consolidate task migration on CPU unplug 2020-11-10 18:38:58 +01:00
idle.h
init.h
isolation.h
jobctl.h signal: kill JOBCTL_TASK_WORK 2020-12-12 09:17:38 -07:00
loadavg.h
mm.h include/linux/sched/mm.h: use rcu_dereference in in_vfork() 2021-03-13 11:27:30 -08:00
nohz.h
numa_balancing.h
prio.h sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO 2021-02-17 14:08:17 +01:00
rt.h
sd_flags.h
signal.h tif-task_work.arch-2020-12-14 2020-12-16 12:33:35 -08:00
smt.h
stat.h
sysctl.h
task.h kernel: provide create_io_thread() helper 2021-03-04 15:45:03 -07:00
task_stack.h
topology.h sched/topology,schedutil: Wrap sched domains rebuild 2020-11-19 11:25:47 +01:00
types.h
user.h Reimplement RLIMIT_NPROC on top of ucounts 2021-04-30 14:14:01 -05:00
wake_q.h
xacct.h