Commit graph

62 commits

Author SHA1 Message Date
Tejun Heo
a051661ca6 blkcg: implement per-blkg request allocation
Currently, request_queue has one request_list to allocate requests
from regardless of blkcg of the IO being issued.  When the unified
request pool is used up, cfq proportional IO limits become meaningless
- whoever grabs the next request being freed wins the race regardless
of the configured weights.

This can be easily demonstrated by creating a blkio cgroup w/ very low
weight, put a program which can issue a lot of random direct IOs there
and running a sequential IO from a different cgroup.  As soon as the
request pool is used up, the sequential IO bandwidth crashes.

This patch implements per-blkg request_list.  Each blkg has its own
request_list and any IO allocates its request from the matching blkg
making blkcgs completely isolated in terms of request allocation.

* Root blkcg uses the request_list embedded in each request_queue,
  which was renamed to @q->root_rl from @q->rq.  While making blkcg rl
  handling a bit harier, this enables avoiding most overhead for root
  blkcg.

* Queue fullness is properly per request_list but bdi isn't blkcg
  aware yet, so congestion state currently just follows the root
  blkcg.  As writeback isn't aware of blkcg yet, this works okay for
  async congestion but readahead may get the wrong signals.  It's
  better than blkcg completely collapsing with shared request_list but
  needs to be improved with future changes.

* After this change, each block cgroup gets a full request pool making
  resource consumption of each cgroup higher.  This makes allowing
  non-root users to create cgroups less desirable; however, note that
  allowing non-root users to directly manage cgroups is already
  severely broken regardless of this patch - each block cgroup
  consumes kernel memory and skews IO weight (IO weights are not
  hierarchical).

v2: queue-sysfs.txt updated and patch description udpated as suggested
    by Vivek.

v3: blk_get_rl() wasn't checking error return from
    blkg_lookup_create() and may cause oops on lookup failure.  Fix it
    by falling back to root_rl on blkg lookup failures.  This problem
    was spotted by Rakesh Iyer <rni@google.com>.

v4: Updated to accomodate 458f27a982 "block: Avoid missed wakeup in
    request waitqueue".  blk_drain_queue() now wakes up waiters on all
    blkg->rl on the target queue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-26 18:42:49 -04:00
Wang Sheng-Hui
b31966816d Documentation: drop as block elevator reference in switching-sched.txt
Remove 'as' for as is no longer supported, and we can not use
'elevator=as' any more.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-04 12:01:48 -07:00
Paul Bolle
395cf9691d doc: fix broken references
There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-27 18:08:04 +02:00
Vivek Goyal
4931402a9d cfq-iosched: Add documentation about idling
There are always questions about why CFQ is idling on various conditions.
Recent ones is Christoph asking again why to idle on REQ_NOIDLE. His
assertion is that XFS is relying more and more on workqueues and is
concerned that CFQ idling on IO from every workqueue will impact
XFS badly.

So he suggested that I add some more documentation about CFQ idling
and that can provide more clarity on the topic and also gives an
opprotunity to poke a hole in theory and lead to improvements.

So here is my attempt at that. Any comments are welcome.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-05 09:42:20 +02:00
Dan Williams
5757a6d76c block: strict rq_affinity
Some systems benefit from completions always being steered to the strict
requester cpu rather than the looser "per-socket" steering that
blk_cpu_to_group() attempts by default. This is because the first
CPU in the group mask ends up being completely overloaded with work,
while the others (including the original submitter) has power left
to spare.

Allow the strict mode to be set by writing '2' to the sysfs control
file. This is identical to the scheme used for the nomerges file,
where '2' is a more aggressive setting than just being turned on.

echo 2 > /sys/block/<bdev>/queue/rq_affinity

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roland Dreier <roland@purestorage.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-23 20:44:25 +02:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Jens Axboe
7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Randy Dunlap
17a9e7bbae Documentation: remove anticipatory scheduler info
Remove anticipatory block I/O scheduler info from Documentation/
since the code has been deleted.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: "Robert P. J. Day" <rpjday@crashcourse.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-11 12:09:59 +01:00
Jens Axboe
fa251f8990 Merge branch 'v2.6.36-rc8' into for-2.6.37/barrier
Conflicts:
	block/blk-core.c
	drivers/block/loop.c
	mm/swapfile.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-19 09:13:04 +02:00
Christoph Hellwig
04ccc65cd1 block: update documentation for REQ_FLUSH / REQ_FUA
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Vivek Goyal
6d6ac1c1a3 cfq-iosched: Documentation help for new tunables
Some documentation to provide help with tunables.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:25:29 +02:00
Nick Piggin
6e57559022 nick piggin: change email address
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-05 08:45:20 -07:00
FUJITA Tomonori
c2282adbde Documentation: fix block/biodoc.txt dma mapping description
- It looks incorrect to use blk_rq_map_sg with pci_map_page here about
  DMA mappings. dma_map_sg?

- better to use dma_map_page instead of pci_map_page.
  http://marc.info/?l=linux-kernel&m=126596737604808&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-08 09:11:07 +01:00
Jens Axboe
f11cbd74c5 Merge branch 'master' into for-2.6.34 2010-02-22 13:48:51 +01:00
Alan D. Brunelle
488991e28e block: Added in stricter no merge semantics for block I/O
Updated 'nomerges' tunable to accept a value of '2' - indicating that _no_
merges at all are to be attempted (not even the simple one-hit cache).

The following table illustrates the additional benefit - 5 minute runs of
a random I/O load were applied to a dozen devices on a 16-way x86_64 system.

nomerges        Throughput      %System         Improvement (tput / %sys)
--------        ------------    -----------     -------------------------
0               12.45 MB/sec    0.669365609
1               12.50 MB/sec    0.641519199     0.40% / 2.71%
2               12.52 MB/sec    0.639849750     0.56% / 2.96%

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-29 09:04:08 +01:00
Kusanagi Kouichi
4207a152bc Documentation: Rename Documentation/DMA-mapping.txt
It seems that Documentation/DMA-mapping.txt was supposed to be renamed
to Documentation/PCI/PCI-DMA-mapping.txt.

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-02 10:09:44 -08:00
FUJITA Tomonori
1de6129f38 block: remove Documentation/block/as-iosched.txt
Commit 492af6350a removed
the AS IO scheduler, so remove its documentation too.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-18 12:37:48 +01:00
Andre Noll
61fd21670d Trivial typo fixes in Documentation/block/data-integrity.txt.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-01 10:56:25 +02:00
Matt LaPlante
19f5946001 trivial: Miscellaneous documentation typo fixes
Fix various typos in documentation txts.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-06-12 18:01:47 +02:00
vibi sreenivasan
dbdc9dd342 Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
File Documentation/PCI/PCI-DMA-mapping.txt does not exist.
 Documentation/DMA-mapping.txt contains DMA Mapping details

Signed-off-by: vibi sreenivasan <vibi_sreenivasan@cms.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02 14:52:32 +02:00
Jens Axboe
329007ce25 block: update biodoc.txt on plugging
We do per-device plugging, get rid of any references to tq_disk as that
has been dead since 2.6.5 or so.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15 08:28:11 +02:00
Avishay Traeger
07e86f405a block: Repeated lines in switching-sched.txt
These lines appear in this file twice - removed one occurrence.

Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-26 11:01:28 +01:00
Jens Axboe
cbb5901b90 block: add text file detailing queue/ sysfs files
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-02 13:02:31 +01:00
Nikanth Karthikesan
7598909e3e Mark mandatory elevator functions in the biodoc.txt
biodoc.txt mentions that elevator functions marked with * are mandatory, but
no function is marked with *. Mark the 3 functions which should be
implemented by any io scheduler.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-01-30 12:34:37 +01:00
Randy Dunlap
5872fb94f8 Documentation: move DMA-mapping.txt to Doc/PCI/
Move DMA-mapping.txt to Documentation/PCI/.

DMA-mapping.txt was supposed to be moved from Documentation/ to
Documentation/PCI/.  The 00-INDEX files in those two directories
were updated, along with a few other text files, but the file
itself somehow escaped being moved, so move it and update more
text files and source files with its new location.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
cc:	Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-29 18:19:29 -08:00
Nikanth Karthikesan
4236469099 Documentation: remove reference to ll_rw_blk.c and moved drivers/block/elevator.c
The drivers/block/ll_rw_block.c has been split and organized in the block/
directory, and also drivers/block/elevator.c has been moved to the block/
directory. Update Documentation/block/biodoc.txt accordingly

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:43 +01:00
Linus Torvalds
d1b5726358 Merge branch 'docs' of git://git.lwn.net/linux-2.6
* 'docs' of git://git.lwn.net/linux-2.6:
  Document panic_on_unrecovered_nmi sysctl
  Add a reference to paper to SubmittingPatches
  Add kerneldoc documentation for new printk format extensions
  Remove videobook.tmpl
  doc: Test-by?
  Add the development process document
  Documentation/block/data-integrity.txt: Fix section numbers
2008-10-16 12:18:16 -07:00
Alberto Bertogli
d86f4bc4bc Documentation/block/data-integrity.txt: Fix section numbers
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-10-16 11:51:30 -06:00
Aaron Carroll
6a421c1dc9 block: update documentation for deadline fifo_batch tunable
Update the description of fifo_batch to match the current implementation,
and include a description of how to tune it.

Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:03 +02:00
Martin K. Petersen
c1c72b5994 block: Data integrity infrastructure documentation
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:13 +02:00
Robert P. J. Day
c0d1f29534 DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
2008-04-21 22:44:50 +00:00
Rob Landley
8b6800fbce Add Documentation/block/00-INDEX
Add Documentation/block/00-INDEX

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 10:11:28 +02:00
Alan D. Brunelle
23c76983e2 Some IO scheduler cleanup in Documentation/block
as-iosched.txt:
  o  Changed IO scheduler selection text to a reference to the
     switching-sched.txt file.

  o  Fixed typo: 'for up time...' -> 'for up to...'

  o  Added short description of the est_time file.

deadline-iosched.txt:
  o  Changed IO scheduler selection text to a reference to the
     switching-sched.txt file.

  o  Removed references to non-existent seek-cost and stream_unit.

  o  Fixed typo: 'write_starved' -> 'writes_starved'

switching-sched.txt:
  o  Added in boot-time argument to set the default IO scheduler. (From
     as-iosched.txt)

  o  Added in sysfs mount instructions. (From deadline-iosched.txt)

Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 09:59:55 +02:00
Rob Landley
26bbb29a2a Update Jens Axboe's email in Documentation/*
Jens Axboe's old email address bounces.

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 09:59:55 +02:00
Dhaval Giani
3317fedba9 Corrections in Documentation/block/ioprio.txt
The newer glibc does not allow system calls to be made via _syscallN()
wrapper. They have to be made through syscall(). The ionice code used
the older interface. Correcting it to use syscall.

Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:57 +02:00
NeilBrown
5705f70217 Introduce rq_for_each_segment replacing rq_for_each_bio
Every usage of rq_for_each_bio wraps a usage of
bio_for_each_segment, so these can be combined into
rq_for_each_segment.

We define "struct req_iterator" to hold the 'bio' and 'index' that
are needed for the double iteration.

Signed-off-by: Neil Brown <neilb@suse.de>

Various compile fixes by me...

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:56 +02:00
Jens Axboe
165125e1e4 [BLOCK] Get rid of request_queue_t typedef
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-24 09:28:11 +02:00
Geert Uytterhoeven
c0613c1c94 Documentation/block/barrier.txt is not in sync with the actual code: - blk_queue_ordered() no longer has a gfp_mask parameter - blk_queue_ordered_locked() no longer exists - sd_prepare_flush() looks slightly different
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 13:43:34 +02:00
Kristen Carlson Accardi
86ce18d7b7 genhd: expose AN to user space
Allow user space to determine if a disk supports Asynchronous Notification of
media changes.  This is done by adding a new sysfs file "capability_flags",
which is documented in (insert file name).  This sysfs file will export all
disk capabilities flags to user space.  We also define a new flag to define
the media change notification capability.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23 20:14:11 -07:00
Matt LaPlante
a982ac06b0 misc doc and kconfig typos
Fix various typos in kernel docs and Kconfigs, 2.6.21-rc4.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-09 08:58:15 +02:00
Jens Axboe
126ec9a676 [PATCH] block: document io scheduler allow_merge_fn hook
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-12-20 11:06:15 +01:00
Filipe
49033c8184 [PATCH] io/storage: Documentation update to as-iosched.txt
Documentation update, adding references to CFQ scheduler and to another
document about selecting IO Schedulers.

Signed-off-by: Filipe Lautert <filipe@icewall.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Matt LaPlante
5d3f083d8f Fix typos in /Documentation : Misc
This patch fixes typos in various Documentation txts. The patch addresses some
misc words.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-30 05:21:10 +01:00
Matt LaPlante
4ae0edc21b Fix typos in /Documentation : 'U-Z'
This patch fixes typos in various Documentation txts. The patch addresses some
+words starting with the letters 'U-Z'.

Looks like I made it through the alphabet...just in time to start over again
+too!  Maybe I can fit more profound fixes into the next round...?  Time will
+tell. :)

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-30 04:58:40 +01:00
Paolo Ornati
670e9f34ee Documentation: remove duplicated words
Remove many duplicated words under Documentation/ and do other small
cleanups.

Examples:
        "and and" --> "and"
        "in in" --> "in"
        "the the" --> "the"
        "the the" --> "to the"
        ...

Signed-off-by: Paolo Ornati <ornati@fastwebnet.it>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:57:56 +02:00
Matt LaPlante
53cb47268e Fix typos in Documentation/: 'S'
This patch fixes typos in various Documentation txts. The patch addresses
some words starting with the letter 'S'.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Alan Cox <alan@redhat.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:55:17 +02:00
Matt LaPlante
d6bc8ac9e1 Fix typos in Documentation/: 'Q'-'R'
This patch fixes typos in various Documentation txts. The patch addresses
some words starting with the letters 'Q'-'R'.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:54:15 +02:00
Matt LaPlante
992caacf11 Fix typos in Documentation/: 'N'-'P'
This patch fixes typos in various Documentation txts. The patch addresses
some words starting with the letters 'N'-'P'.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:52:05 +02:00
Matt LaPlante
2fe0ae78c6 Fix typos in Documentation/: 'H'-'M'
This patch fixes typos in various Documentation txts. The patch addresses
some words starting with the letters 'H'-'M'.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:50:39 +02:00
Matt LaPlante
a2ffd27516 Fix typos in Documentation/: 'F'-'G'
This patch fixes typos in various Documentation txts. The patch addresses
some words starting with the letters 'F'-'G'.

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 22:49:15 +02:00