License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                              # files
   ----------------------------------------------------|-------
   GPL-2.0                                                11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was "GPL-2.0". The results of that were:
   SPDX license identifier                              # files
   ----------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                          930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                              # files
   ----------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                          270
   GPL-2.0+ WITH Linux-syscall-note                         169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
   LGPL-2.1+ WITH Linux-syscall-note                         15
   GPL-1.0+ WITH Linux-syscall-note                          14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
   LGPL-2.0+ WITH Linux-syscall-note                          4
   LGPL-2.1 WITH Linux-syscall-note                           3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
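As a rough illustration of the header/source distinction mentioned above, the sketch below picks an SPDX comment style from the file extension. The helper name `spdx_line_for` and the extension check are assumptions made for this example; this is not the script that was actually used. The two comment forms follow the kernel convention of a block comment in headers and a `//` comment in .c files.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the actual script): format the SPDX tag line
 * for a file, choosing the comment style from its extension.
 * Returns 0 on success, -1 if the file type is not handled or the
 * buffer is too small.
 */
static int spdx_line_for(const char *path, const char *id,
			 char *buf, size_t len)
{
	const char *ext = strrchr(path, '.');
	int n;

	if (!ext)
		return -1;

	if (strcmp(ext, ".h") == 0)		/* headers: block comment */
		n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	else if (strcmp(ext, ".c") == 0)	/* sources: line comment */
		n = snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it with a path ending in `.h` and the identifier `GPL-2.0` fills the buffer with the same tag line that opens this header.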
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _BLK_CGROUP_H
#define _BLK_CGROUP_H
/*
 * Common Block IO controller cgroup interface
 *
 * Based on ideas and code from CFQ, CFS and BFQ:
 * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
 *
 * Copyright (C) 2008 Fabio Checconi <fabio@gandalf.sssup.it>
 *                    Paolo Valente <paolo.valente@unimore.it>
 *
 * Copyright (C) 2009 Vivek Goyal <vgoyal@redhat.com>
 *                    Nauman Rafique <nauman@google.com>
 */

#include <linux/cgroup.h>
#include <linux/percpu.h>
#include <linux/percpu_counter.h>
#include <linux/u64_stats_sync.h>
#include <linux/seq_file.h>
#include <linux/radix-tree.h>
blkcg: implement per-blkg request allocation
Currently, request_queue has one request_list to allocate requests
from regardless of blkcg of the IO being issued. When the unified
request pool is used up, cfq proportional IO limits become meaningless
- whoever grabs the next request being freed wins the race regardless
of the configured weights.
This can be easily demonstrated by creating a blkio cgroup w/ very low
weight, putting a program which can issue a lot of random direct IOs there
and running a sequential IO from a different cgroup. As soon as the
request pool is used up, the sequential IO bandwidth crashes.
This patch implements per-blkg request_list. Each blkg has its own
request_list and any IO allocates its request from the matching blkg
making blkcgs completely isolated in terms of request allocation.
* Root blkcg uses the request_list embedded in each request_queue,
which was renamed to @q->root_rl from @q->rq. While making blkcg rl
handling a bit hairier, this enables avoiding most overhead for root
blkcg.
* Queue fullness is properly per request_list but bdi isn't blkcg
aware yet, so congestion state currently just follows the root
blkcg. As writeback isn't aware of blkcg yet, this works okay for
async congestion but readahead may get the wrong signals. It's
better than blkcg completely collapsing with shared request_list but
needs to be improved with future changes.
* After this change, each block cgroup gets a full request pool making
resource consumption of each cgroup higher. This makes allowing
non-root users to create cgroups less desirable; however, note that
allowing non-root users to directly manage cgroups is already
severely broken regardless of this patch - each block cgroup
consumes kernel memory and skews IO weight (IO weights are not
hierarchical).
v2: queue-sysfs.txt updated and patch description updated as suggested
by Vivek.
v3: blk_get_rl() wasn't checking error return from
blkg_lookup_create() and may cause oops on lookup failure. Fix it
by falling back to root_rl on blkg lookup failures. This problem
was spotted by Rakesh Iyer <rni@google.com>.
v4: Updated to accommodate 458f27a982 "block: Avoid missed wakeup in
request waitqueue". blk_drain_queue() now wakes up waiters on all
blkg->rl on the target queue.
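The fallback described in v3 can be sketched in a few lines. This is a simplified model under stated assumptions, not the kernel code: the `queue`, `group` and `pick_request_list` names are hypothetical, and a NULL group stands in for a failed blkg lookup.

```c
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures carry far more state. */
struct request_list {
	int nr_allocated;
};

struct group {			/* plays the role of a blkg */
	struct request_list rl;
};

struct queue {			/* plays the role of a request_queue */
	struct request_list root_rl;
};

/*
 * Hypothetical helper: choose the request list for an IO. A NULL group
 * models a failed group lookup, in which case the root list is used.
 */
static struct request_list *pick_request_list(struct queue *q, struct group *g)
{
	if (!g)
		return &q->root_rl;
	return &g->rl;
}
```

The point of the fallback is that allocation always has a valid list to draw from, so a lookup failure costs isolation for that one IO rather than causing an oops.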
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-26 22:05:44 +00:00
#include <linux/blkdev.h>
blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t
Hello,
So, this patch should do. Joe, Vivek, can one of you guys please
verify that the oops goes away with this patch?
Jens, the original thread can be read at
http://thread.gmane.org/gmane.linux.kernel/1720729
The fix converts blkg->refcnt from int to atomic_t. It does add some
overhead but it should be minute compared to everything else which is
going on and the involved cacheline bouncing, so I think it's highly
unlikely to cause any noticeable difference. Also, the refcnt in
question should be converted to a percpu_ref for blk-mq anyway, so the
atomic_t is likely to go away pretty soon.
Thanks.
------- 8< -------
__blkg_release_rcu() may be invoked after the associated request_queue
is released with an RCU grace period in between. As such, the function
and callbacks invoked from it must not dereference the associated
request_queue. This is clearly indicated in the comment above the
function.
Unfortunately, while trying to fix a different issue, 2a4fd070ee85
("blkcg: move bulk of blkcg_gq release operations to the RCU
callback") ignored this and added [un]locking of @blkg->q->queue_lock
to __blkg_release_rcu(). This of course can cause oops as the
request_queue may be long gone by the time this code gets executed.
general protection fault: 0000 [#1] SMP
CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1
Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013
task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000
RIP: 0010:[<ffffffff8162e9e5>] [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
RSP: 0018:ffff88085403fdf0 EFLAGS: 00010086
RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000
RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b
RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39
R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130
R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0
Stack:
ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000
ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0
ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30
Call Trace:
[<ffffffff812cbfc2>] __blkg_release_rcu+0x72/0x150
[<ffffffff810d1d28>] rcu_nocb_kthread+0x1e8/0x300
[<ffffffff81091d81>] kthread+0xe1/0x100
[<ffffffff8163813c>] ret_from_fork+0x7c/0xb0
Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5
+fa 66 66 90 66 66 90 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f
+b7
RIP [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
RSP <ffff88085403fdf0>
The request_queue locking was added because blkcg_gq->refcnt is an int
protected with the queue lock and __blkg_release_rcu() needs to put
the parent. Let's fix it by making blkcg_gq->refcnt an atomic_t and
dropping queue locking in the function.
Given the general heavy weight of the current request_queue and blkcg
operations, this is unlikely to cause any noticeable overhead.
Moreover, blkcg_gq->refcnt is likely to be converted to percpu_ref in
the near future, so whatever (most likely negligible) overhead it may
add is temporary.
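To make the idea concrete, here is a minimal userspace sketch of a reference count that can be dropped, and can in turn drop the parent's reference, without taking any lock. It uses C11 atomics rather than the kernel's atomic_t, and `struct node`, `node_alloc`, `node_get` and `node_put` are hypothetical names, so treat it as an illustration of the technique rather than the patch itself.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Simplified stand-in for blkcg_gq: only the refcount and parent link. */
struct node {
	atomic_int refcnt;
	struct node *parent;
};

/* Allocate a node holding one reference; a child pins its parent. */
static struct node *node_alloc(struct node *parent)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	atomic_init(&n->refcnt, 1);
	n->parent = parent;
	if (parent)
		atomic_fetch_add(&parent->refcnt, 1);
	return n;
}

static void node_get(struct node *n)
{
	atomic_fetch_add(&n->refcnt, 1);
}

/*
 * Drop one reference. When the count reaches zero the node is freed and
 * the reference it held on its parent is dropped as well, all without a
 * lock. Returns the number of nodes freed.
 */
static int node_put(struct node *n)
{
	int freed = 0;

	while (n && atomic_fetch_sub(&n->refcnt, 1) == 1) {
		struct node *parent = n->parent;

		free(n);
		freed++;
		n = parent;
	}
	return freed;
}
```

Because the decrement itself is atomic, the release path no longer needs to reach into a lock owned by another object that may already be gone.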
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/g/alpine.DEB.2.02.1406081816540.17948@jlaw-desktop.mno.stratus.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-19 21:42:57 +00:00
#include <linux/atomic.h>
#include <linux/kthread.h>
#include <linux/fs.h>

#define FC_APPID_LEN 129

#ifdef CONFIG_BLK_CGROUP

enum blkg_iostat_type {
	BLKG_IOSTAT_READ,
	BLKG_IOSTAT_WRITE,
	BLKG_IOSTAT_DISCARD,

	BLKG_IOSTAT_NR,
};

struct blkcg_gq;
struct blkg_policy_data;

struct blkcg {
	struct cgroup_subsys_state css;
	spinlock_t lock;
	refcount_t online_pin;

	struct radix_tree_root blkg_tree;
	struct blkcg_gq __rcu *blkg_hint;
	struct hlist_head blkg_list;

	struct blkcg_policy_data *cpd[BLKCG_MAX_POLS];

	struct list_head all_blkcgs_node;
#ifdef CONFIG_BLK_CGROUP_FC_APPID
	char fc_app_id[FC_APPID_LEN];
#endif
#ifdef CONFIG_CGROUP_WRITEBACK
	struct list_head cgwb_list;
#endif
};

struct blkg_iostat {
	u64 bytes[BLKG_IOSTAT_NR];
	u64 ios[BLKG_IOSTAT_NR];
};

struct blkg_iostat_set {
	struct u64_stats_sync sync;
	struct blkg_iostat cur;
	struct blkg_iostat last;
};

/* association between a blk cgroup and a request queue */
struct blkcg_gq {
	/* Pointer to the associated request_queue */
	struct request_queue *q;
	struct list_head q_node;
	struct hlist_node blkcg_node;
	struct blkcg *blkcg;

	/* all non-root blkcg_gq's are guaranteed to have access to parent */
	struct blkcg_gq *parent;

	/* reference count */
	struct percpu_ref refcnt;

	/* is this blkg online? protected by both blkcg and q locks */
	bool online;

	struct blkg_iostat_set __percpu *iostat_cpu;
	struct blkg_iostat_set iostat;

	struct blkg_policy_data *pd[BLKCG_MAX_POLS];

	spinlock_t async_bio_lock;
	struct bio_list async_bios;
block: avoid calling blkg_free() in atomic context
blkg_free() can currently be called in atomic context, either while a spin
lock is held or from an RCU callback. Meanwhile, either the request queue's
release handler or ->pd_free_fn can sleep.
Fix the issue by scheduling a work function for freeing the blkcg_gq
instance.
[ 148.553894] BUG: sleeping function called from invalid context at block/blk-sysfs.c:767
[ 148.557381] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/13
[ 148.560741] preempt_count: 101, expected: 0
[ 148.562577] RCU nest depth: 0, expected: 0
[ 148.564379] 1 lock held by swapper/13/0:
[ 148.566127] #0: ffffffff82615f80 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire+0x0/0x1b
[ 148.569640] Preemption disabled at:
[ 148.569642] [<ffffffff8123f9c3>] ___slab_alloc+0x554/0x661
[ 148.573559] CPU: 13 PID: 0 Comm: swapper/13 Kdump: loaded Not tainted 5.17.0_up+ #110
[ 148.576834] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[ 148.579768] Call Trace:
[ 148.580567] <IRQ>
[ 148.581262] dump_stack_lvl+0x56/0x7c
[ 148.582367] ? ___slab_alloc+0x554/0x661
[ 148.583526] __might_resched+0x1af/0x1c8
[ 148.584678] blk_release_queue+0x24/0x109
[ 148.585861] kobject_cleanup+0xc9/0xfe
[ 148.586979] blkg_free+0x46/0x63
[ 148.587962] rcu_do_batch+0x1c5/0x3db
[ 148.589057] rcu_core+0x14a/0x184
[ 148.590065] __do_softirq+0x14d/0x2c7
[ 148.591167] __irq_exit_rcu+0x7a/0xd4
[ 148.592264] sysvec_apic_timer_interrupt+0x82/0xa5
[ 148.593649] </IRQ>
[ 148.594354] <TASK>
[ 148.595058] asm_sysvec_apic_timer_interrupt+0x12/0x20
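The general pattern behind the fix, deferring work that may sleep to a context where sleeping is allowed, can be sketched in plain userspace C. This is a single-threaded model under stated assumptions: the kernel version schedules a work item, whereas here a simple pending list plays that role, and `struct obj`, `obj_free_deferred` and `run_deferred_frees` are hypothetical names. No locking is shown.

```c
#include <stddef.h>
#include <stdlib.h>

struct obj {
	struct obj *next_free;	/* link used only while waiting to be freed */
	int id;
};

/* Singly linked list of objects waiting to be freed. */
static struct obj *pending_free;

/* Safe to call where sleeping is not allowed: it only links the object. */
static void obj_free_deferred(struct obj *o)
{
	o->next_free = pending_free;
	pending_free = o;
}

/* Runs later in a context that may sleep; returns how many were freed. */
static int run_deferred_frees(void)
{
	int n = 0;

	while (pending_free) {
		struct obj *o = pending_free;

		pending_free = o->next_free;
		free(o);	/* stands in for release work that may sleep */
		n++;
	}
	return n;
}
```

The caller in atomic context pays only for a pointer update; everything that could block happens later, when the pending list is drained.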
Cc: Tejun Heo <tj@kernel.org>
Fixes: 0a9a25ca7843 ("block: let blkcg_gq grab request queue's refcnt")
Reported-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-block/20220322093322.GA27283@lst.de/
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220323011308.2010380-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-23 01:13:08 +00:00
	union {
		struct work_struct async_bio_work;
		struct work_struct free_work;
	};

	atomic_t use_delay;
	atomic64_t delay_nsec;
	atomic64_t delay_start;
	u64 last_delay;
	int last_use;

	struct rcu_head rcu_head;
};

extern struct cgroup_subsys_state * const blkcg_root_css;

void blkcg_destroy_blkgs(struct blkcg *blkcg);
void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay);
void blkcg_maybe_throttle_current(void);

static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
{
	return css ? container_of(css, struct blkcg, css) : NULL;
}

/**
 * bio_blkcg - grab the blkcg associated with a bio
 * @bio: target bio
 *
 * This returns the blkcg associated with a bio, %NULL if not associated.
 * Callers are expected to either handle %NULL or know association has been
 * done prior to calling this.
 */
static inline struct blkcg *bio_blkcg(struct bio *bio)
{
	if (bio && bio->bi_blkg)
		return bio->bi_blkg->blkcg;
	return NULL;
}

static inline bool blk_cgroup_congested(void)
{
	struct cgroup_subsys_state *css;
	bool ret = false;

	rcu_read_lock();
	css = kthread_blkcg();
	if (!css)
		css = task_css(current, io_cgrp_id);
	while (css) {
		if (atomic_read(&css->cgroup->congestion_count)) {
			ret = true;
			break;
		}
		css = css->parent;
	}
	rcu_read_unlock();
	return ret;
}

/**
 * blkcg_parent - get the parent of a blkcg
 * @blkcg: blkcg of interest
 *
 * Return the parent blkcg of @blkcg. Can be called anytime.
 */
static inline struct blkcg *blkcg_parent(struct blkcg *blkcg)
{
	return css_to_blkcg(blkcg->css.parent);
}

/**
 * blkcg_pin_online - pin online state
 * @blkcg: blkcg of interest
 *
 * While pinned, a blkcg is kept online. This is primarily used to
 * impedance-match blkg and cgwb lifetimes so that blkg doesn't go offline
 * while an associated cgwb is still active.
 */
static inline void blkcg_pin_online(struct blkcg *blkcg)
{
	refcount_inc(&blkcg->online_pin);
}

/**
 * blkcg_unpin_online - unpin online state
 * @blkcg: blkcg of interest
 *
 * This is primarily used to impedance-match blkg and cgwb lifetimes so
 * that blkg doesn't go offline while an associated cgwb is still active.
 * When this count goes to zero, all active cgwbs have finished so the
 * blkcg can continue destruction by calling blkcg_destroy_blkgs().
 */
static inline void blkcg_unpin_online(struct blkcg *blkcg)
{
	do {
		if (!refcount_dec_and_test(&blkcg->online_pin))
			break;
		blkcg_destroy_blkgs(blkcg);
		blkcg = blkcg_parent(blkcg);
	} while (blkcg);
}

#else /* CONFIG_BLK_CGROUP */

struct blkcg {
};

struct blkcg_gq {
};

#define blkcg_root_css ((struct cgroup_subsys_state *)ERR_PTR(-EINVAL))

static inline void blkcg_maybe_throttle_current(void) { }
static inline bool blk_cgroup_congested(void) { return false; }

#ifdef CONFIG_BLOCK
static inline void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay) { }
static inline struct blkcg *bio_blkcg(struct bio *bio) { return NULL; }
#endif /* CONFIG_BLOCK */

#endif /* CONFIG_BLK_CGROUP */

int blkcg_set_fc_appid(char *app_id, u64 cgrp_id, size_t app_id_len);
char *blkcg_get_fc_appid(struct bio *bio);

#endif /* _BLK_CGROUP_H */