Commit graph

7 commits

Author SHA1 Message Date
Tejun Heo
8cfd8147df cgroup: implement cgroup v2 thread support
This patch implements cgroup v2 thread support.  The goal of the
thread mode is supporting hierarchical accounting and control at
thread granularity while staying inside the resource domain model
which allows coordination across different resource controllers and
handling of anonymous resource consumptions.

A cgroup is always created as a domain and can be made threaded by
writing to the "cgroup.type" file.  When a cgroup becomes threaded, it
becomes a member of a threaded subtree which is anchored at the
closest ancestor which isn't threaded.

The threads of the processes which are in a threaded subtree can be
placed anywhere without being restricted by process granularity or
no-internal-process constraint.  Note that the threads aren't allowed
to escape to a different threaded subtree.  To be used inside a
threaded subtree, a controller should explicitly support threaded mode
and be able to handle internal competition in the way which is
appropriate for the resource.

The root of a threaded subtree, the nearest ancestor which isn't
threaded, is called the threaded domain and serves as the resource
domain for the whole subtree.  This is the last cgroup where domain
controllers are operational and where all the domain-level resource
consumptions in the subtree are accounted.  This allows threaded
controllers to operate at thread granularity when requested while
staying inside the scope of system-level resource distribution.

As the root cgroup is exempt from the no-internal-process constraint,
it can serve as both a threaded domain and a parent to normal cgroups,
so, unlike non-root cgroups, the root cgroup can have both domain and
threaded children.

Internally, in a threaded subtree, each css_set has its ->dom_cset
pointing to a matching css_set which belongs to the threaded domain.
This ensures that thread root level cgroup_subsys_state for all
threaded controllers are readily accessible for domain-level
operations.

This patch enables threaded mode for the pids and perf_events
controllers.  Neither has to worry about domain-level resource
consumptions and it's enough to simply set the flag.

For more details on the interface and behavior of the thread mode,
please refer to the section 2-2-2 in Documentation/cgroup-v2.txt added
by this patch.

v5: - Dropped silly no-op ->dom_cgrp init from cgroup_create().
      Spotted by Waiman.
    - Documentation updated as suggested by Waiman.
    - cgroup.type content slightly reformatted.
    - Mark the debug controller threaded.

v4: - Updated to the general idea of marking specific cgroups
      domain/threaded as suggested by PeterZ.

v3: - Dropped "join" and always make mixed children join the parent's
      threaded subtree.

v2: - After discussions with Waiman, support for mixed thread mode is
      added.  This should address the issue that Peter pointed out
      where any nesting should be avoided for thread subtrees while
      coexisting with other domain cgroups.
    - Enabling / disabling thread mode now piggy backs on the existing
      control mask update mechanism.
    - Bug fixes and cleanup.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2017-07-21 11:14:51 -04:00
Tejun Heo
b6053d40e3 cgroup: fix lockdep warning in debug controller
The debug controller grabs cgroup_mutex from interface file show
functions which can deadlock and triggers lockdep warnings.  Fix it by
using cgroup_kn_lock_live()/cgroup_kn_unlock() instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:41 -04:00
Tejun Heo
2866c0b4cf cgroup: refactor cgroup_masks_read() in the debug controller
Factor out cgroup_masks_read_one() out of cgroup_masks_read() for
simplicity.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:36 -04:00
Tejun Heo
8cc38fa7fa cgroup: make debug an implicit controller on cgroup2
Make debug an implicit controller on cgroup2 which is enabled by
"cgroup_debug" boot param.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:32 -04:00
Waiman Long
575313f40f cgroup: Make debug cgroup support v2 and thread mode
Besides supporting cgroup v2 and thread mode, the following changes
are also made:
 1) current_* cgroup files now resides only at the root as we don't
    need duplicated files of the same function all over the cgroup
    hierarchy.
 2) The cgroup_css_links_read() function is modified to report
    the number of tasks that are skipped because of overflow.
 3) The number of extra unaccounted references are displayed.
 4) The current_css_set_read() function now prints out the addresses of
    the css'es associated with the current css_set.
 5) A new cgroup_subsys_states file is added to display the css objects
    associated with a cgroup.
 6) A new cgroup_masks file is added to display the various controller
    bit masks in the cgroup.

tj: Dropped thread mode related information for now so that debug
    controller changes aren't blocked on the thread mode.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Waiman Long
23b0be480f cgroup: Make Kconfig prompt of debug cgroup more accurate
The Kconfig prompt and description of the debug cgroup controller
more accurate by saying that it is for debug purpose only and its
interfaces are unstable.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Waiman Long
a28f8f5e99 cgroup: Move debug cgroup to its own file
The debug cgroup currently resides within cgroup-v1.c and is enabled
only for v1 cgroup. To enable the debug cgroup also for v2, it makes
sense to put the code into its own file as it will no longer be v1
specific. There is no change to the debug cgroup specific code.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00