License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart, and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file-by-file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to each file. She confirmed any determination that was
not immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source.
- Files that already had some variant of a license header in them were
included (even if <5 lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, the file was
considered to have no license information in it, and the top-level
COPYING file license was applied.
For non-*/uapi/* files, the summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If the file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The result of that was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL-family license was found in the file or it had no licensing in
it (per the prior point). Results summary:
SPDX license identifier                              # files
----------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                          270
GPL-2.0+ WITH Linux-syscall-note                         169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
LGPL-2.1+ WITH Linux-syscall-note                         15
GPL-1.0+ WITH Linux-syscall-note                          14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
LGPL-2.0+ WITH Linux-syscall-note                          4
LGPL-2.1 WITH Linux-syscall-note                           3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later.
In total, over 70 hours of logged manual review was done on the
spreadsheet by Kate, Philippe, and Thomas to determine the SPDX license
identifiers to apply to the source files, with confirmation in some cases
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors, and they have been fixed to
reflect the correct identifier.
Additionally, Philippe spent 10 hours doing a detailed manual inspection
and review of the 12,461 files patched in the initial patch version
earlier this week, with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the .csv files and add the proper SPDX tag to each file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types, as illustrated below). Finally, Greg ran the script using
the .csv files to generate the patches.
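For reference, the two comment styles in question (kernel convention: C
header files carry the SPDX tag in a classic block comment, while .c
source files use the C++-style line comment):

    /* SPDX-License-Identifier: GPL-2.0 */      (form used in .h headers)
    // SPDX-License-Identifier: GPL-2.0         (form used in .c sources)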
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Block data types and constants. Directly include this file only to
 * break include dependency loop.
 */
#ifndef __LINUX_BLK_TYPES_H
#define __LINUX_BLK_TYPES_H

#include <linux/types.h>
#include <linux/bvec.h>
#include <linux/device.h>
#include <linux/ktime.h>

struct bio_set;
struct bio;
struct bio_integrity_payload;
struct page;
struct io_context;
struct cgroup_subsys_state;
typedef void (bio_end_io_t) (struct bio *);
block: Inline encryption support for blk-mq
We must have some way of letting a storage device driver know what
encryption context it should use for en/decrypting a request. However,
it's the upper layers (like the filesystem/fscrypt) that know about and
manage encryption contexts. As such, when the upper layer submits a bio
to the block layer, and this bio eventually reaches a device driver with
support for inline encryption, the device driver will need to have been
told the encryption context for that bio.
We want to communicate the encryption context from the upper layer to the
storage device along with the bio, when the bio is submitted to the block
layer. To do this, we add a struct bio_crypt_ctx to struct bio, which can
represent an encryption context (note that we can't use the bi_private
field in struct bio to do this because that field does not function to pass
information across layers in the storage stack). We also introduce various
functions to manipulate the bio_crypt_ctx and make the bio/request merging
logic aware of the bio_crypt_ctx.
We also make changes to blk-mq to make it handle bios with encryption
contexts. blk-mq can merge many bios into the same request. These bios need
to have contiguous data unit numbers (the necessary changes to blk-merge
are also made to ensure this) - as such, it suffices to keep the data unit
number of just the first bio, since that's all a storage driver needs to
infer the data unit number to use for each data block in each bio in a
request. blk-mq keeps track of the encryption context to be used for all
the bios in a request with the request's rq_crypt_ctx. When the first bio
is added to an empty request, blk-mq will program the encryption context
of that bio into the request_queue's keyslot manager, and store the
returned keyslot in the request's rq_crypt_ctx. All the functions to
operate on encryption contexts are in blk-crypto.c.
Upper layers only need to call bio_crypt_set_ctx with the encryption key,
algorithm and data_unit_num; they don't have to worry about getting a
keyslot for each encryption context, as blk-mq/blk-crypto handles that.
Blk-crypto also makes it possible for request-based layered devices like
dm-rq to make use of inline encryption hardware by cloning the
rq_crypt_ctx and programming a keyslot in the new request_queue when
necessary.
Note that any user of the block layer can submit bios with an
encryption context, such as filesystems, device-mapper targets, etc.
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
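As a rough sketch of the upper-layer side (hedged: the exact signatures
live in <linux/blk-crypto.h> and have shifted across kernel versions; the
function below is illustrative, not part of this patch):

    #include <linux/bio.h>
    #include <linux/blk-crypto.h>

    /* Illustrative only: key initialization and error handling elided. */
    static void submit_encrypted_bio(struct bio *bio,
                                     const struct blk_crypto_key *key,
                                     u64 first_dun)
    {
            u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE] = { first_dun };

            /*
             * Attach the encryption context; keyslot programming is
             * handled later by blk-mq/blk-crypto, not by the submitter.
             */
            bio_crypt_set_ctx(bio, key, dun, GFP_NOIO);
            submit_bio(bio);
    }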

struct bio_crypt_ctx;

/*
 * The basic unit of block I/O is a sector. It is used in a number of contexts
 * in Linux (blk, bio, genhd). The size of one sector is 512 = 2**9
 * bytes. Variables of type sector_t represent an offset or size that is a
 * multiple of 512 bytes. Hence these two constants.
 */
#ifndef SECTOR_SHIFT
#define SECTOR_SHIFT 9
#endif
#ifndef SECTOR_SIZE
#define SECTOR_SIZE (1 << SECTOR_SHIFT)
#endif

#define PAGE_SECTORS_SHIFT      (PAGE_SHIFT - SECTOR_SHIFT)
#define PAGE_SECTORS            (1 << PAGE_SECTORS_SHIFT)
#define SECTOR_MASK             (PAGE_SECTORS - 1)
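A quick illustration of the convention (a hypothetical helper, not part
of this header):

    /* Hypothetical example: byte counts convert to sectors by shifting. */
    static inline sector_t bytes_to_sectors(u64 bytes)
    {
            return bytes >> SECTOR_SHIFT;   /* e.g. 4096 bytes -> 8 sectors */
    }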

struct block_device {
        sector_t                bd_start_sect;
        sector_t                bd_nr_sectors;
        struct gendisk          *bd_disk;
        struct request_queue    *bd_queue;
        struct disk_stats __percpu *bd_stats;
        unsigned long           bd_stamp;
        bool                    bd_read_only;   /* read-only policy */
        u8                      bd_partno;
        bool                    bd_write_holder;
        bool                    bd_has_submit_bio;
        dev_t                   bd_dev;
        atomic_t                bd_openers;
        spinlock_t              bd_size_lock;   /* for bd_inode->i_size updates */
        struct inode            *bd_inode;      /* will die */
        struct super_block      *bd_super;
        void                    *bd_claiming;
        void                    *bd_holder;
        /* The counter of freeze processes */
        int                     bd_fsfreeze_count;
        int                     bd_holders;
        struct kobject          *bd_holder_dir;

        /* Mutex for freeze */
        struct mutex            bd_fsfreeze_mutex;
        struct super_block      *bd_fsfreeze_sb;

        struct partition_meta_info *bd_meta_info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
        bool                    bd_make_it_fail;
#endif
        /*
         * keep this out-of-line as it's both big and not needed in the fast
         * path
         */
        struct device           bd_device;
} __randomize_layout;

#define bdev_whole(_bdev) \
        ((_bdev)->bd_disk->part0)

#define dev_to_bdev(device) \
        container_of((device), struct block_device, bd_device)
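For example (a sketch modeled on the partition sysfs helpers in
block/genhd.c; the attribute name here is illustrative), a device
attribute callback can recover the block_device from its embedded
struct device:

    static ssize_t nr_sectors_show(struct device *dev,
                                   struct device_attribute *attr, char *buf)
    {
            struct block_device *bdev = dev_to_bdev(dev);

            return sprintf(buf, "%llu\n",
                           (unsigned long long)bdev->bd_nr_sectors);
    }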

#define bdev_kobj(_bdev) \
        (&((_bdev)->bd_device.kobj))

/*
 * Block error status values. See block/blk-core:blk_errors for the details.
 * Alpha cannot write a byte atomically, so we need to use 32-bit value.
 */
#if defined(CONFIG_ALPHA) && !defined(__alpha_bwx__)
typedef u32 __bitwise blk_status_t;
typedef u32 blk_short_t;
#else
typedef u8 __bitwise blk_status_t;
typedef u16 blk_short_t;
#endif

#define BLK_STS_OK 0
#define BLK_STS_NOTSUPP         ((__force blk_status_t)1)
#define BLK_STS_TIMEOUT         ((__force blk_status_t)2)
#define BLK_STS_NOSPC           ((__force blk_status_t)3)
#define BLK_STS_TRANSPORT       ((__force blk_status_t)4)
#define BLK_STS_TARGET          ((__force blk_status_t)5)
#define BLK_STS_NEXUS           ((__force blk_status_t)6)
#define BLK_STS_MEDIUM          ((__force blk_status_t)7)
#define BLK_STS_PROTECTION      ((__force blk_status_t)8)
#define BLK_STS_RESOURCE        ((__force blk_status_t)9)
#define BLK_STS_IOERR           ((__force blk_status_t)10)

/* hack for device mapper, don't use elsewhere: */
#define BLK_STS_DM_REQUEUE      ((__force blk_status_t)11)

/*
 * BLK_STS_AGAIN should only be returned if RQF_NOWAIT is set
 * and the bio would block (cf bio_wouldblock_error())
 */
#define BLK_STS_AGAIN           ((__force blk_status_t)12)

/*
 * BLK_STS_DEV_RESOURCE is returned from the driver to the block layer if
 * device related resources are unavailable, but the driver can guarantee
 * that the queue will be rerun in the future once resources become
 * available again. This is typically the case for device specific
 * resources that are consumed for IO. If the driver fails allocating these
 * resources, we know that inflight (or pending) IO will free these
 * resources upon completion.
 *
 * This is different from BLK_STS_RESOURCE in that it explicitly references
 * a device specific resource. For resources of wider scope, allocation
 * failure can happen without having pending IO. This means that we can't
 * rely on request completions freeing these resources, as IO may not be in
 * flight. Examples of that are kernel memory allocations, DMA mappings, or
 * any other system wide resources.
 */
#define BLK_STS_DEV_RESOURCE    ((__force blk_status_t)13)

/*
 * BLK_STS_ZONE_RESOURCE is returned from the driver to the block layer if zone
 * related resources are unavailable, but the driver can guarantee the queue
 * will be rerun in the future once the resources become available again.
 *
 * This is different from BLK_STS_DEV_RESOURCE in that it explicitly references
 * a zone specific resource and IO to a different zone on the same device could
 * still be served. Examples of that are zones that are write-locked, but a read
 * to the same zone could be served.
 */
#define BLK_STS_ZONE_RESOURCE   ((__force blk_status_t)14)

/*
 * BLK_STS_ZONE_OPEN_RESOURCE is returned from the driver in the completion
 * path if the device returns a status indicating that too many zone resources
 * are currently open. The same command should be successful if resubmitted
 * after the number of open zones decreases below the device's limits, which is
 * reported in the request_queue's max_open_zones.
 */
#define BLK_STS_ZONE_OPEN_RESOURCE      ((__force blk_status_t)15)

/*
 * BLK_STS_ZONE_ACTIVE_RESOURCE is returned from the driver in the completion
 * path if the device returns a status indicating that too many zone resources
 * are currently active. The same command should be successful if resubmitted
 * after the number of active zones decreases below the device's limits, which
 * is reported in the request_queue's max_active_zones.
 */
#define BLK_STS_ZONE_ACTIVE_RESOURCE    ((__force blk_status_t)16)

/*
 * BLK_STS_OFFLINE is returned from the driver when the target device is offline
 * or is being taken offline. This could help differentiate the case where a
 * device is intentionally being shut down from a real I/O error.
 */
#define BLK_STS_OFFLINE         ((__force blk_status_t)17)

/**
 * blk_path_error - returns true if error may be path related
 * @error: status the request was completed with
 *
 * Description:
 *     This classifies block error status into non-retryable errors and ones
 *     that may be successful if retried on a failover path.
 *
 * Return:
 *     %false - retrying failover path will not help
 *     %true  - may succeed if retried
 */
static inline bool blk_path_error(blk_status_t error)
{
        switch (error) {
        case BLK_STS_NOTSUPP:
        case BLK_STS_NOSPC:
        case BLK_STS_TARGET:
        case BLK_STS_NEXUS:
        case BLK_STS_MEDIUM:
        case BLK_STS_PROTECTION:
                return false;
        }

        /* Anything else could be a path failure, so should be retried */
        return true;
}
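A hedged sketch of the intended caller, in the spirit of multipath
failover code such as nvme_failover_req() (requeue_on_other_path() is a
hypothetical helper, and the function below is illustrative):

    static void my_complete_rq(struct request *rq, blk_status_t status)
    {
            if (status && blk_path_error(status)) {
                    /* Error may be transient to this path: retry elsewhere. */
                    requeue_on_other_path(rq);      /* hypothetical helper */
                    return;
            }
            blk_mq_end_request(rq, status);
    }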

/*
 * From most significant bit:
 * 1 bit: reserved for other usage, see below
 * 12 bits: original size of bio
 * 51 bits: issue time of bio
 */
#define BIO_ISSUE_RES_BITS      1
#define BIO_ISSUE_SIZE_BITS     12
#define BIO_ISSUE_RES_SHIFT     (64 - BIO_ISSUE_RES_BITS)
#define BIO_ISSUE_SIZE_SHIFT    (BIO_ISSUE_RES_SHIFT - BIO_ISSUE_SIZE_BITS)
#define BIO_ISSUE_TIME_MASK     ((1ULL << BIO_ISSUE_SIZE_SHIFT) - 1)
#define BIO_ISSUE_SIZE_MASK     \
        (((1ULL << BIO_ISSUE_SIZE_BITS) - 1) << BIO_ISSUE_SIZE_SHIFT)
#define BIO_ISSUE_RES_MASK      (~((1ULL << BIO_ISSUE_RES_SHIFT) - 1))

/* Reserved bit for blk-throtl */
#define BIO_ISSUE_THROTL_SKIP_LATENCY (1ULL << 63)

struct bio_issue {
        u64 value;
};

static inline u64 __bio_issue_time(u64 time)
{
        return time & BIO_ISSUE_TIME_MASK;
}

static inline u64 bio_issue_time(struct bio_issue *issue)
{
        return __bio_issue_time(issue->value);
}

static inline sector_t bio_issue_size(struct bio_issue *issue)
{
        return ((issue->value & BIO_ISSUE_SIZE_MASK) >> BIO_ISSUE_SIZE_SHIFT);
}

static inline void bio_issue_init(struct bio_issue *issue,
                                  sector_t size)
{
        size &= (1ULL << BIO_ISSUE_SIZE_BITS) - 1;
        issue->value = ((issue->value & BIO_ISSUE_RES_MASK) |
                        (ktime_get_ns() & BIO_ISSUE_TIME_MASK) |
                        ((u64)size << BIO_ISSUE_SIZE_SHIFT));
}

typedef __u32 __bitwise blk_opf_t;

typedef unsigned int blk_qc_t;
#define BLK_QC_T_NONE           -1U

/*
 * main unit of I/O for the block layer and lower layers (ie drivers and
 * stacking drivers)
 */
struct bio {
        struct bio              *bi_next;       /* request queue link */
        struct block_device     *bi_bdev;
        blk_opf_t               bi_opf;         /* bottom bits REQ_OP, top bits
                                                 * req_flags.
                                                 */
        unsigned short          bi_flags;       /* BIO_* below */
        unsigned short          bi_ioprio;
        blk_status_t            bi_status;
block: reorder bio::__bi_remaining for better packing
Simple reordering of __bi_remaining can reduce bio size by 8 bytes that
are now wasted on padding (measured on x86_64):
struct bio {
struct bio * bi_next; /* 0 8 */
struct gendisk * bi_disk; /* 8 8 */
unsigned int bi_opf; /* 16 4 */
short unsigned int bi_flags; /* 20 2 */
short unsigned int bi_ioprio; /* 22 2 */
short unsigned int bi_write_hint; /* 24 2 */
blk_status_t bi_status; /* 26 1 */
u8 bi_partno; /* 27 1 */
/* XXX 4 bytes hole, try to pack */
struct bvec_iter bi_iter; /* 32 24 */
/* XXX last struct has 4 bytes of padding */
atomic_t __bi_remaining; /* 56 4 */
/* XXX 4 bytes hole, try to pack */
[...]
/* size: 104, cachelines: 2, members: 19 */
/* sum members: 96, holes: 2, sum holes: 8 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 40 bytes */
};
Now becomes:
struct bio {
struct bio * bi_next; /* 0 8 */
struct gendisk * bi_disk; /* 8 8 */
unsigned int bi_opf; /* 16 4 */
short unsigned int bi_flags; /* 20 2 */
short unsigned int bi_ioprio; /* 22 2 */
short unsigned int bi_write_hint; /* 24 2 */
blk_status_t bi_status; /* 26 1 */
u8 bi_partno; /* 27 1 */
atomic_t __bi_remaining; /* 28 4 */
struct bvec_iter bi_iter; /* 32 24 */
/* XXX last struct has 4 bytes of padding */
[...]
/* size: 96, cachelines: 2, members: 19 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 32 bytes */
};
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
        atomic_t                __bi_remaining;

        struct bvec_iter        bi_iter;

        blk_qc_t                bi_cookie;
        bio_end_io_t            *bi_end_io;
        void                    *bi_private;
#ifdef CONFIG_BLK_CGROUP
        /*
         * Represents the association of the css and request_queue for the bio.
         * If a bio goes direct to device, it will not have a blkg as it will
         * not have a request_queue associated with it. The reference is put
         * on release of the bio.
         */
        struct blkcg_gq         *bi_blkg;
        struct bio_issue        bi_issue;
#ifdef CONFIG_BLK_CGROUP_IOCOST
        u64                     bi_iocost_cost;
#endif
#endif
#ifdef CONFIG_BLK_INLINE_ENCRYPTION
        struct bio_crypt_ctx    *bi_crypt_context;
#endif

        union {
#if defined(CONFIG_BLK_DEV_INTEGRITY)
                struct bio_integrity_payload *bi_integrity; /* data integrity */
#endif
        };

        unsigned short          bi_vcnt;        /* how many bio_vec's */

        /*
         * Everything starting with bi_max_vecs will be preserved by bio_reset()
         */

        unsigned short          bi_max_vecs;    /* max bvl_vecs we can hold */

        atomic_t                __bi_cnt;       /* pin count */

        struct bio_vec          *bi_io_vec;     /* the actual vec list */

        struct bio_set          *bi_pool;

        /*
         * We can inline a number of vecs at the end of the bio, to avoid
         * double allocations for a small number of bio_vecs. This member
         * MUST obviously be kept at the very end of the bio.
         */
        struct bio_vec          bi_inline_vecs[];
};

#define BIO_RESET_BYTES         offsetof(struct bio, bi_max_vecs)
#define BIO_MAX_SECTORS         (UINT_MAX >> SECTOR_SHIFT)

/*
 * bio flags
 */
enum {
        BIO_NO_PAGE_REF,        /* don't put release vec pages */
        BIO_CLONED,             /* doesn't own data */
        BIO_BOUNCED,            /* bio is a bounce bio */
        BIO_QUIET,              /* Make BIO Quiet */
        BIO_CHAIN,              /* chained bio, ->bi_remaining in effect */
        BIO_REFFED,             /* bio has elevated ->bi_cnt */
blk-throttle: fix that io throttle can only work for single bio
Test scripts:
cd /sys/fs/cgroup/blkio/
echo "8:0 1024" > blkio.throttle.write_bps_device
echo $$ > cgroup.procs
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &
Test result:
10240 bytes (10 kB, 10 KiB) copied, 10.0134 s, 1.0 kB/s
10240 bytes (10 kB, 10 KiB) copied, 10.0135 s, 1.0 kB/s
The problem is that the second bio is finished after 10s instead of 20s.
Root cause:
1) second bio will be flagged:
__blk_throtl_bio
while (true) {
...
if (sq->nr_queued[rw]) -> some bio is throttled already
break
};
bio_set_flag(bio, BIO_THROTTLED); -> flag the bio
2) flagged bio will be dispatched without waiting:
throtl_dispatch_tg
tg_may_dispatch
tg_with_in_bps_limit
if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED))
*wait = 0; -> wait time is zero
return true;
commit 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
added support for counting split bios toward the iops limit, and thus
added flagged-bio checking in tg_with_in_bps_limit() so that split bios
only count once for the bps limit. However, it introduced a new problem:
io throttle won't work if multiple bios are throttled.
In order to fix the problem, handle iops/bps limit in different ways:
1) for iops limit, there is no flag to record if the bio is throttled,
and iops is always applied.
2) for bps limit, original bio will be flagged with BIO_BPS_THROTTLED,
and io throttle will ignore bio with the flag.
Note that this patch also removes the code that sets the flag in
__bio_clone(); it was introduced in commit 111be8839817 ("block-throttle:
avoid double charge") because the author thought a split bio could be
resubmitted and throttled again, which is wrong because a split bio will
continue to dispatch from the caller.
Fixes: 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
Cc: <stable@vger.kernel.org>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220829022240.3348319-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
        BIO_BPS_THROTTLED,      /* This bio has already been subjected to
                                 * throttling rules. Don't do it again. */
        BIO_TRACE_COMPLETION,   /* bio_endio() should trace the final
                                 * completion of this bio. */
block: trace completion of all bios.
Currently only dm and md/raid5 bios trigger
trace_block_bio_complete(). Now that we have bio_chain() and
bio_inc_remaining(), it is not possible, in general, for a driver to
know when the bio is really complete. Only bio_endio() knows that.
So move the trace_block_bio_complete() call to bio_endio().
Now trace_block_bio_complete() pairs with trace_block_bio_queue().
Any bio for which a 'queue' event is traced, will subsequently
generate a 'complete' event.
There are a few cases where completion tracing is not wanted.
1/ If blk_update_request() has already generated a completion
trace event at the 'request' level, there is no point generating
one at the bio level too. In this case the bi_sector and bi_size
will have changed, so the bio level event would be wrong
2/ If the bio hasn't actually been queued yet, but is being aborted
early, then a trace event could be confusing. Some filesystems
call bio_endio() but do not want tracing.
3/ The bio_integrity code interposes itself by replacing bi_end_io,
then restoring it and calling bio_endio() again. This would produce
two identical trace events if left like that.
To handle these, we introduce a flag BIO_TRACE_COMPLETION and only
produce the trace event when this is set.
We address point 1 above by clearing the flag in blk_update_request().
We address point 2 above by only setting the flag when
generic_make_request() is called.
We address point 3 above by clearing the flag after generating a
completion event.
When bio_split() is used on a bio, particularly in blk_queue_split(),
there is an extra complication. A new bio is split off the front, and
may be handled directly without going through generic_make_request().
The old bio, which has been advanced, is passed to
generic_make_request(), so it will trigger a trace event a second
time.
Probably the best result when a split happens is to see a single
'queue' event for the whole bio, then multiple 'complete' events - one
for each component. To achieve this we can:
- copy the BIO_TRACE_COMPLETION flag to the new bio in bio_split()
- avoid generating a 'queue' event if BIO_TRACE_COMPLETION is already set.
This way, the split-off bio won't create a queue event, and neither will
the original even if it is re-submitted to generic_make_request(),
but both will produce completion events, each for their own range.
So if generic_make_request() is called (which generates a QUEUED
event), then bio_endio() will create a single COMPLETE event for each
range that the bio is split into, unless the driver has explicitly
requested it not to.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
        BIO_CGROUP_ACCT,        /* has been accounted to a cgroup */
        BIO_QOS_THROTTLED,      /* bio went through rq_qos throttle path */
        BIO_QOS_MERGED,         /* but went through rq_qos merge path */
        BIO_REMAPPED,
        BIO_ZONE_WRITE_LOCKED,  /* Owns a zoned device zone write lock */
        BIO_FLAG_LAST
};

typedef __u32 __bitwise blk_mq_req_flags_t;

#define REQ_OP_BITS     8
#define REQ_OP_MASK     (__force blk_opf_t)((1 << REQ_OP_BITS) - 1)
#define REQ_FLAG_BITS   24

/**
 * enum req_op - Operations common to the bio and request structures.
 * We use 8 bits for encoding the operation, and the remaining 24 for flags.
 *
 * The least significant bit of the operation number indicates the data
 * transfer direction:
 *
 *   - if the least significant bit is set transfers are TO the device
 *   - if the least significant bit is not set transfers are FROM the device
 *
 * If an operation does not transfer data the least significant bit has no
 * meaning.
 */
enum req_op {
        /* read sectors from the device */
        REQ_OP_READ             = (__force blk_opf_t)0,
        /* write sectors to the device */
        REQ_OP_WRITE            = (__force blk_opf_t)1,
        /* flush the volatile write cache */
        REQ_OP_FLUSH            = (__force blk_opf_t)2,
        /* discard sectors */
        REQ_OP_DISCARD          = (__force blk_opf_t)3,
        /* securely erase sectors */
        REQ_OP_SECURE_ERASE     = (__force blk_opf_t)5,
        /* write the zero filled sector many times */
        REQ_OP_WRITE_ZEROES     = (__force blk_opf_t)9,
        /* Open a zone */
        REQ_OP_ZONE_OPEN        = (__force blk_opf_t)10,
        /* Close a zone */
        REQ_OP_ZONE_CLOSE       = (__force blk_opf_t)11,
        /* Transition a zone to full */
        REQ_OP_ZONE_FINISH      = (__force blk_opf_t)12,
        /* write data at the current zone write pointer */
        REQ_OP_ZONE_APPEND      = (__force blk_opf_t)13,
block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
Currently REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are defined as
even numbers 6 and 8, such zone reset bios are treated as READ bios by
bio_data_dir(), which is obviously misleading.
The macro bio_data_dir() is defined in include/linux/bio.h as,
55 #define bio_data_dir(bio) \
56 (op_is_write(bio_op(bio)) ? WRITE : READ)
And op_is_write() is defined in include/linux/blk_types.h as,
397 static inline bool op_is_write(unsigned int op)
398 {
399 return (op & 1);
400 }
The convention of op_is_write() is that when there is a data transfer the
op code should be an odd number, and it is treated as a write op.
bio_data_dir() treats a bio's direction as READ if op_is_write() reports
false, and WRITE if op_is_write() reports true.
Because REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are even numbers,
although they don't transfer data, reporting them as READ bios via
bio_data_dir() is misleading and might be wrong. These two commands reset
the write pointers of the affected zones, and all content after the reset
write pointer becomes invalid and inaccessible, so they are clearly not
READ bios by any means.
This patch changes REQ_OP_ZONE_RESET from 6 to 15, and changes
REQ_OP_ZONE_RESET_ALL from 8 to 17. Now bios with these two op codes
can be treated as WRITE by bio_data_dir(). Although they don't transfer
data, we now keep them consistent with REQ_OP_DISCARD and
REQ_OP_WRITE_ZEROES, with the intuition that they change on-media content
and should be WRITE requests.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@fb.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
        /* reset a zone write pointer */
        REQ_OP_ZONE_RESET       = (__force blk_opf_t)15,
        /* reset all the zones present on the device */
        REQ_OP_ZONE_RESET_ALL   = (__force blk_opf_t)17,

        /* Driver private requests */
        REQ_OP_DRV_IN           = (__force blk_opf_t)34,
        REQ_OP_DRV_OUT          = (__force blk_opf_t)35,

        REQ_OP_LAST             = (__force blk_opf_t)36,
};

enum req_flag_bits {
        __REQ_FAILFAST_DEV =    /* no driver retries of device errors */
                REQ_OP_BITS,
        __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
        __REQ_FAILFAST_DRIVER,  /* no driver retries of driver errors */
        __REQ_SYNC,             /* request is sync (sync write or read) */
        __REQ_META,             /* metadata io request */
        __REQ_PRIO,             /* boost priority in cfq */
        __REQ_NOMERGE,          /* don't touch this for merging */
        __REQ_IDLE,             /* anticipate more IO after this one */
        __REQ_INTEGRITY,        /* I/O includes block integrity payload */
        __REQ_FUA,              /* forced unit access */
        __REQ_PREFLUSH,         /* request for cache flush */
        __REQ_RAHEAD,           /* read ahead, can fail anytime */
        __REQ_BACKGROUND,       /* background IO */
        __REQ_NOWAIT,           /* Don't wait if request will block */
        __REQ_POLLED,           /* caller polls for completion using bio_poll */
        __REQ_ALLOC_CACHE,      /* allocate IO from cache if available */
        __REQ_SWAP,             /* swap I/O */
        __REQ_DRV,              /* for driver use */
        __REQ_FS_PRIVATE,       /* for file system (submitter) use */

        /*
         * Command specific flags, keep last:
         */
        /* for REQ_OP_WRITE_ZEROES: */
        __REQ_NOUNMAP,          /* do not free blocks when zeroing */

        __REQ_NR_BITS,          /* stops here */
};

#define REQ_FAILFAST_DEV        \
                        (__force blk_opf_t)(1ULL << __REQ_FAILFAST_DEV)
#define REQ_FAILFAST_TRANSPORT  \
                        (__force blk_opf_t)(1ULL << __REQ_FAILFAST_TRANSPORT)
#define REQ_FAILFAST_DRIVER     \
                        (__force blk_opf_t)(1ULL << __REQ_FAILFAST_DRIVER)
#define REQ_SYNC        (__force blk_opf_t)(1ULL << __REQ_SYNC)
#define REQ_META        (__force blk_opf_t)(1ULL << __REQ_META)
#define REQ_PRIO        (__force blk_opf_t)(1ULL << __REQ_PRIO)
#define REQ_NOMERGE     (__force blk_opf_t)(1ULL << __REQ_NOMERGE)
#define REQ_IDLE        (__force blk_opf_t)(1ULL << __REQ_IDLE)
#define REQ_INTEGRITY   (__force blk_opf_t)(1ULL << __REQ_INTEGRITY)
#define REQ_FUA         (__force blk_opf_t)(1ULL << __REQ_FUA)
#define REQ_PREFLUSH    (__force blk_opf_t)(1ULL << __REQ_PREFLUSH)
#define REQ_RAHEAD      (__force blk_opf_t)(1ULL << __REQ_RAHEAD)
#define REQ_BACKGROUND  (__force blk_opf_t)(1ULL << __REQ_BACKGROUND)
#define REQ_NOWAIT      (__force blk_opf_t)(1ULL << __REQ_NOWAIT)
#define REQ_POLLED      (__force blk_opf_t)(1ULL << __REQ_POLLED)
#define REQ_ALLOC_CACHE (__force blk_opf_t)(1ULL << __REQ_ALLOC_CACHE)
#define REQ_SWAP        (__force blk_opf_t)(1ULL << __REQ_SWAP)
#define REQ_DRV         (__force blk_opf_t)(1ULL << __REQ_DRV)
#define REQ_FS_PRIVATE  (__force blk_opf_t)(1ULL << __REQ_FS_PRIVATE)

#define REQ_NOUNMAP     (__force blk_opf_t)(1ULL << __REQ_NOUNMAP)

#define REQ_FAILFAST_MASK \
        (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)

#define REQ_NOMERGE_FLAGS \
        (REQ_NOMERGE | REQ_PREFLUSH | REQ_FUA)

enum stat_group {
        STAT_READ,
        STAT_WRITE,
        STAT_DISCARD,
        STAT_FLUSH,

        NR_STAT_GROUPS
};

static inline enum req_op bio_op(const struct bio *bio)
{
        return bio->bi_opf & REQ_OP_MASK;
}

static inline bool op_is_write(blk_opf_t op)
{
        return !!(op & (__force blk_opf_t)1);
}

/*
 * Check if the bio or request is one that needs special treatment in the
 * flush state machine.
 */
static inline bool op_is_flush(blk_opf_t op)
{
        return op & (REQ_FUA | REQ_PREFLUSH);
}

/*
 * Reads are always treated as synchronous, as are requests with the FUA or
 * PREFLUSH flag. Other operations may be marked as synchronous using the
 * REQ_SYNC flag.
 */
static inline bool op_is_sync(blk_opf_t op)
{
        return (op & REQ_OP_MASK) == REQ_OP_READ ||
                (op & (REQ_SYNC | REQ_FUA | REQ_PREFLUSH));
}

static inline bool op_is_discard(blk_opf_t op)
{
        return (op & REQ_OP_MASK) == REQ_OP_DISCARD;
}

/*
 * Check if a bio or request operation is a zone management operation, with
 * the exception of REQ_OP_ZONE_RESET_ALL which is treated as a special case
 * due to its different handling in the block layer and device response in
 * case of command failure.
 */
static inline bool op_is_zone_mgmt(enum req_op op)
{
        switch (op & REQ_OP_MASK) {
        case REQ_OP_ZONE_RESET:
        case REQ_OP_ZONE_OPEN:
        case REQ_OP_ZONE_CLOSE:
        case REQ_OP_ZONE_FINISH:
                return true;
        default:
                return false;
        }
}

static inline int op_stat_group(enum req_op op)
{
        if (op_is_discard(op))
                return STAT_DISCARD;
        /* op_is_write() yields 0 or 1, matching STAT_READ / STAT_WRITE */
        return op_is_write(op);
}
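In the accounting code the returned group index is used to index the
per-cpu disk stats; a sketch of that pattern (cf. block/blk-core.c,
assuming <linux/part_stat.h>):

    /* Illustrative helper, not part of this header. */
    static void account_bio(struct block_device *bdev, struct bio *bio)
    {
            const int sgrp = op_stat_group(bio_op(bio));

            part_stat_lock();
            part_stat_inc(bdev, ios[sgrp]);
            part_stat_add(bdev, sectors[sgrp], bio_sectors(bio));
            part_stat_unlock();
    }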

struct blk_rq_stat {
        u64 mean;
        u64 min;
        u64 max;
        u32 nr_samples;
        u64 batch;
};

#endif /* __LINUX_BLK_TYPES_H */