Commit graph

4103 commits

Author SHA1 Message Date
Fabian Frederick
a3b25d9b77 drivers/block/Kconfig: update RAM block device module name
RAM block device support module name changed to brd.ko some years ago
with an "rd" alias to match previous module implementation.  This patch
updates its Kconfig definition.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:36:53 -08:00
Linus Torvalds
84621c9b18 Features:
- FIFO event channels. Key advantages: support for over 100,000 events (2^17),
    16 different event priorities, improved fairness in event latency through
    the use of FIFOs.
  - Xen PVH support. "It’s a fully PV kernel mode, running with paravirtualized
    disk and network, paravirtualized interrupts and timers, no emulated devices
    of any kind (and thus no qemu), no BIOS or legacy boot — but instead of
    requiring PV MMU, it uses the HVM hardware extensions to virtualize the
    pagetables, as well as system calls and other privileged operations."
    (from "The Paravirtualization Spectrum, Part 2: From poles to a spectrum")
 Bug-fixes:
  - Fixes in balloon driver (refactor and make it work under ARM)
  - Allow xenfb to be used in HVM guests.
  - Allow xen_platform_pci=0 to work properly.
  - Refactors in event channels.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS4BmLAAoJEFjIrFwIi8fJ4SAH/iNGESowgMhfW64vRA8pBWq+
 NRJpUjYjjwmbxpwoNl6NPwn15cIXFyc3sMtvvrDD3taRDyko2RFuT+NTjpO05xPh
 d/cRpRXpXERHoiFgPf/WTp7ONBDhvPtHG0+BzJKwgqEIOUYXdbhD+gEjaVlFJScS
 CAY68OLmk7XYMSZBNzPfKNbSCyhVgZF7wpaimK9lxZBKsFRCDIq6jIyrAsC8epIL
 6V/V4l2S6lk/uUeGB6ULphYeINjI2kkpbSfCd1vyenLfWpVscc2o8uWEYFcZMAxy
 V4HpsoseuqrfdDqgPfud3VgogdISvbkCvDfW85rzfDP4MWxei2mVHFtJ/gSBV+g=
 =ToNG
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen updates from Konrad Rzeszutek Wilk:
 "Two major features that Xen community is excited about:

  The first is event channel scalability by David Vrabel - we switch
  over from an two-level per-cpu bitmap of events (IRQs) - to an FIFO
  queue with priorities.  This lets us be able to handle more events,
  have lower latency, and better scalability.  Good stuff.

  The other is PVH by Mukesh Rathor.  In short, PV is a mode where the
  kernel lets the hypervisor program page-tables, segments, etc.  With
  EPT/NPT capabilities in current processors, the overhead of doing this
  in an HVM (Hardware Virtual Machine) container is much lower than the
  hypervisor doing it for us.

  In short we let a PV guest run without doing page-table, segment,
  syscall, etc updates through the hypervisor - instead it is all done
  within the guest container.  It is a "hybrid" PV - hence the 'PVH'
  name - a PV guest within an HVM container.

  The major benefits are less code to deal with - for example we only
  use one function from the the pv_mmu_ops (which has 39 function
  calls); faster performance for syscall (no context switches into the
  hypervisor); less traps on various operations; etc.

  It is still being baked - the ABI is not yet set in stone.  But it is
  pretty awesome and we are excited about it.

  Lastly, there are some changes to ARM code - you should get a simple
  conflict which has been resolved in #linux-next.

  In short, this pull has awesome features.

  Features:
   - FIFO event channels.  Key advantages: support for over 100,000
     events (2^17), 16 different event priorities, improved fairness in
     event latency through the use of FIFOs.
   - Xen PVH support.  "It’s a fully PV kernel mode, running with
     paravirtualized disk and network, paravirtualized interrupts and
     timers, no emulated devices of any kind (and thus no qemu), no BIOS
     or legacy boot — but instead of requiring PV MMU, it uses the HVM
     hardware extensions to virtualize the pagetables, as well as system
     calls and other privileged operations." (from "The
     Paravirtualization Spectrum, Part 2: From poles to a spectrum")

  Bug-fixes:
   - Fixes in balloon driver (refactor and make it work under ARM)
   - Allow xenfb to be used in HVM guests.
   - Allow xen_platform_pci=0 to work properly.
   - Refactors in event channels"

* tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (52 commits)
  xen/pvh: Set X86_CR0_WP and others in CR0 (v2)
  MAINTAINERS: add git repository for Xen
  xen/pvh: Use 'depend' instead of 'select'.
  xen: delete new instances of __cpuinit usage
  xen/fb: allow xenfb initialization for hvm guests
  xen/evtchn_fifo: fix error return code in evtchn_fifo_setup()
  xen-platform: fix error return code in platform_pci_init()
  xen/pvh: remove duplicated include from enlighten.c
  xen/pvh: Fix compile issues with xen_pvh_domain()
  xen: Use dev_is_pci() to check whether it is pci device
  xen/grant-table: Force to use v1 of grants.
  xen/pvh: Support ParaVirtualized Hardware extensions (v3).
  xen/pvh: Piggyback on PVHVM XenBus.
  xen/pvh: Piggyback on PVHVM for grant driver (v4)
  xen/grant: Implement an grant frame array struct (v3).
  xen/grant-table: Refactor gnttab_init
  xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init.
  xen/pvh: Piggyback on PVHVM for event channels (v2)
  xen/pvh: Update E820 to work with PVH (v2)
  xen/pvh: Secondary VCPU bringup (non-bootup CPUs)
  ...
2014-01-22 22:00:18 -08:00
Linus Torvalds
bb1281f2aa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science stuff from trivial.git"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  neighbour.h: fix comment
  sched: Fix warning on make htmldocs caused by wait.h
  slab: struct kmem_cache is protected by slab_mutex
  doc: Fix typo in USB Gadget Documentation
  of/Kconfig: Spelling s/one/once/
  mkregtable: Fix sscanf handling
  lp5523, lp8501: comment improvements
  thermal: rcar: comment spelling
  treewide: fix comments and printk msgs
  IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart()
  Documentation: update /proc/uptime field description
  Documentation: Fix size parameter for snprintf
  arm: fix comment header and macro name
  asm-generic: uaccess: Spelling s/a ny/any/
  mtd: onenand: fix comment header
  doc: driver-model/platform.txt: fix a typo
  drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text
  doc: Fix typo (acces_process_vm -> access_process_vm)
  treewide: Fix typos in printk
  drivers/gpu/drm/qxl/Kconfig: reformat the help text
  ...
2014-01-22 21:21:55 -08:00
Geert Uytterhoeven
14424be4db mg_disk: Spelling s/finised/finished/
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:34:58 -08:00
Raghavendra K T
9967d8ac5c null_blk: Null pointer deference problem in alloc_page_buffers
If we load the null_blk module with bs=8k we get following oops:
[ 3819.812190] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 3819.812387] IP: [<ffffffff81170aa5>] create_empty_buffers+0x28/0xaf
[ 3819.812527] PGD 219244067 PUD 215a06067 PMD 0
[ 3819.812640] Oops: 0000 [#1] SMP
[ 3819.812772] Modules linked in: null_blk(+)

 Fix that by resetting block size to PAGE_SIZE if it is greater than PAGE_SIZE

Reported-by: Sumanth <sumantk2@linux.vnet.ibm.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:22:40 -08:00
Sam Bradshaw
26d580575b mtip32xx: Correctly handle security locked condition
If power is removed during a secure erase, the drive will end up in a
security locked condition.  This patch causes the driver to identify,
log, and flag the security lock state.  IOs are prevented from
submission to the drive until the locked state is addressed with a
secure erase.

Bumped version number to reflect this capability.

Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:20:44 -08:00
Sam Bradshaw
188b9f49d4 mtip32xx: Make SGL container per-command to eliminate high order dma allocation
The mtip32xx driver makes a high order dma memory allocation to store a
command index table, some dedicated buffers, and a command header & SGL
blob.  This allocation can fail with a surprise insert under low &
fragmented memory conditions.

This patch breaks these regions up into separate low order allocations
and increases the maximum number of segments a single command SGL can
have.  We wanted to allow at least 256 segments for 1 MB direct IO.
Since the command header occupies the first 0x80 bytes of the SGL blob,
that meant we needed two 4k pages to contain the header and SGL.  The
two pages allow up to 504 SGL segments.

Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:20:13 -08:00
Jens Axboe
e1803a706f Merge branch 'for-jens' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/linux-block into for-3.14/drivers 2014-01-21 20:18:54 -08:00
Olaf Hering
12a64d2f5e drivers/block/loop.c: fix comment typo in loop_config_discard
Discard requests are ignored if the encryption is enabled for the given
loop device.  Update comment to match the code, and similar comments
elsewhere in the file.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:16:56 -08:00
Andrew Morton
4336548af4 drivers/block/cciss.c:cciss_init_one(): use proper errnos
pci_driver.probe should return a meaningful errno, not -1.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:16:56 -08:00
Dan Carpenter
3f7d758b1e drivers/block/paride/pg.c: underflow bug in pg_write()
The test here can underflow so we pass bogus lengths to the hardware.
It's a static checker fix and I don't know the impact.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:16:56 -08:00
Jingoo Han
f2fc8af41c drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release or on
probe failure.  Thus, it is not needed to manually clear the device driver
data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:16:56 -08:00
Jingoo Han
28e6c50069 drivers/block/sx8.c: use module_pci_driver()
Use module_pci_driver() macro which makes the code smaller and simpler.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2014-01-21 20:16:56 -08:00
Linus Torvalds
8cf7a16ee9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
 - Zorro bus cleanups and UAPI revival
 - Bootinfo cleanups and UAPI revival
 - Kexec support
 - Memory size reductions and bug fixes for multi-platform kernels
 - Polled interrupt support for Atari EtherNAT, EtherNEC and NetUSBee
 - Machine-specific random_get_entropy()
 - Defconfig updates and cleanups

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (46 commits)
  m68k/mac: Make SCC reset work more reliably
  m68k/irq - Use polled IRQ flag for MFP timer cascaded interrupts
  m68k: Update defconfigs for v3.13-rc1
  m68k/defconfig: Enable EARLY_PRINTK
  m68k/mm: kmap spelling/grammar fixes
  m68k: Convert arch/m68k/kernel/traps.c to pr_*()
  m68k: Convert arch/m68k/mm/fault.c to pr_*()
  m68k/mm: Check for mm != NULL in do_page_fault() debug code
  m68k/defconfig: Disable /sbin/hotplug fork-bomb by default
  m68k/atari: Hide RTC_PORT() macro from rtc-cmos
  m68k/amiga,atari: Fix specifying multiple debug= parameters
  m68k/defconfig: Use ext4 for ext2/ext3 file systems
  m68k: Add support to export bootinfo in procfs
  m68k: Add kexec support
  m68k/mac: Mark Mac IIsi ADB driver BROKEN
  m68k/amiga: Provide mach_random_get_entropy()
  m68k: Add infrastructure for machine-specific random_get_entropy()
  m68k/atari: Call paging_init() before nf_init()
  m68k: Remove superfluous inclusions of <asm/bootinfo.h>
  m68k/UAPI: Use proper types (endianness/size) in <asm/bootinfo*.h>
  ...
2014-01-20 09:24:31 -08:00
Jiri Kosina
7b7b68bba5 floppy: bail out in open() if drive is not responding to block0 read
In case reading of block 0 during open() fails, it is not the right thing
to let open() succeed.

Fix this by introducing FD_OPEN_SHOULD_FAIL_BIT flag, and setting it in
case the bio callback encounters an error while trying to read block 0.

As a bonus, this works around certain broken userspace (blkid), which is
not able to properly handle read()s returning IO errors. Hence be nice to
those, and bail out during open() already; if block 0 is not readable,
read()s are not going to provide any meaningful data anyway.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-01-17 11:12:06 +01:00
Ming Lei
518d00b749 block: null_blk: fix queue leak inside removing device
When queue_mode is NULL_Q_MQ and null_blk is being removed,
blk_cleanup_queue() isn't called to cleanup queue, so the queue
allocated won't be freed.

This patch calls blk_cleanup_queue() for MQ to drain all pending
requests first and release the reference counter of queue kobject, then
blk_mq_free_queue() will be called in queue kobject's release handler
when queue kobject's reference counter drops to zero.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-12 16:22:42 +07:00
Jens Axboe
54a387cb9e Merge branch 'for-3.14/core' into for-3.14/drivers
We need the updated code to make bcache easier to merge.
2014-01-08 09:32:45 -07:00
Konrad Rzeszutek Wilk
51c71a3bba xen/pvhvm: If xen_platform_pci=0 is set don't blow up (v4).
The user has the option of disabling the platform driver:
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)

which is used to unplug the emulated drivers (IDE, Realtek 8169, etc)
and allow the PV drivers to take over. If the user wishes
to disable that they can set:

  xen_platform_pci=0
  (in the guest config file)

or
  xen_emul_unplug=never
  (on the Linux command line)

except it does not work properly. The PV drivers still try to
load and since the Xen platform driver is not run - and it
has not initialized the grant tables, most of the PV drivers
stumble upon:

input: Xen Virtual Keyboard as /devices/virtual/input/input5
input: Xen Virtual Pointer as /devices/virtual/input/input6M
------------[ cut here ]------------
kernel BUG at /home/konrad/ssd/konrad/linux/drivers/xen/grant-table.c:1206!
invalid opcode: 0000 [#1] SMP
Modules linked in: xen_kbdfront(+) xenfs xen_privcmd
CPU: 6 PID: 1389 Comm: modprobe Not tainted 3.13.0-rc1upstream-00021-ga6c892b-dirty #1
Hardware name: Xen HVM domU, BIOS 4.4-unstable 11/26/2013
RIP: 0010:[<ffffffff813ddc40>]  [<ffffffff813ddc40>] get_free_entries+0x2e0/0x300
Call Trace:
 [<ffffffff8150d9a3>] ? evdev_connect+0x1e3/0x240
 [<ffffffff813ddd0e>] gnttab_grant_foreign_access+0x2e/0x70
 [<ffffffffa0010081>] xenkbd_connect_backend+0x41/0x290 [xen_kbdfront]
 [<ffffffffa0010a12>] xenkbd_probe+0x2f2/0x324 [xen_kbdfront]
 [<ffffffff813e5757>] xenbus_dev_probe+0x77/0x130
 [<ffffffff813e7217>] xenbus_frontend_dev_probe+0x47/0x50
 [<ffffffff8145e9a9>] driver_probe_device+0x89/0x230
 [<ffffffff8145ebeb>] __driver_attach+0x9b/0xa0
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145cf1c>] bus_for_each_dev+0x8c/0xb0
 [<ffffffff8145e7d9>] driver_attach+0x19/0x20
 [<ffffffff8145e260>] bus_add_driver+0x1a0/0x220
 [<ffffffff8145f1ff>] driver_register+0x5f/0xf0
 [<ffffffff813e55c5>] xenbus_register_driver_common+0x15/0x20
 [<ffffffff813e76b3>] xenbus_register_frontend+0x23/0x40
 [<ffffffffa0015000>] ? 0xffffffffa0014fff
 [<ffffffffa001502b>] xenkbd_init+0x2b/0x1000 [xen_kbdfront]
 [<ffffffff81002049>] do_one_initcall+0x49/0x170

.. snip..

which is hardly nice. This patch fixes this by having each
PV driver check for:
 - if running in PV, then it is fine to execute (as that is their
   native environment).
 - if running in HVM, check if user wanted 'xen_emul_unplug=never',
   in which case bail out and don't load any PV drivers.
 - if running in HVM, and if PCI device 5853:0001 (xen_platform_pci)
   does not exist, then bail out and not load PV drivers.
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=ide-disks',
   then bail out for all PV devices _except_ the block one.
   Ditto for the network one ('nics').
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=unnecessary'
   then load block PV driver, and also setup the legacy IDE paths.
   In (v3) make it actually load PV drivers.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it
Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Reported-and-Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Add extra logic to handle the myrid ways 'xen_emul_unplug'
can be used per Ian and Stefano suggestion]
[v3: Make the unnecessary case work properly]
[v4: s/disks/ide-disks/ spotted by Fabio]
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> [for PCI parts]
CC: stable@vger.kernel.org
2014-01-03 14:54:18 -05:00
Julia Lawall
8586ea96b4 pktcdvd: fix error return code
Set the return variable to an error code as done elsewhere in the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-01-03 10:05:34 +01:00
Ilya Dryomov
e37180c0f2 rbd: tear down watch request if rbd_dev_device_setup() fails
Tear down watch request if rbd_dev_device_setup() fails.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:32:07 +02:00
Ilya Dryomov
fca2706539 rbd: introduce rbd_dev_header_unwatch_sync() and switch to it
Rename rbd_dev_header_watch_sync() to __rbd_dev_header_watch_sync() and
introduce two helpers: rbd_dev_header_{,un}watch_sync() to make it more
clear what is going on.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:32:06 +02:00
Ilya Dryomov
7e513d4366 rbd: enable extended devt in single-major mode
If single-major device number allocation scheme is turned on, instead
of reserving 256 minors per device, which imposes a limit of 4096
images mapped at once, reserve 16 minors per device and enable extended
devt feature.  This results in a theoretical limit of 65536 images
mapped at once, and still allows to have more than 15 partititions:
partitions starting with 16th are mapped under major 259 (Block
Extended Major):

$ rbd showmapped
id pool image snap device
0  rbd  b5    -    /dev/rbd0    # no partitions
1  rbd  b2    -    /dev/rbd1    # 40 partitions
2  rbd  b3    -    /dev/rbd2    #  2 partitions

$ cat /proc/partitions
 251        0       1024 rbd0
 251       16       1024 rbd1
 251       17          0 rbd1p1
 251       18          0 rbd1p2
 ...
 251       30          0 rbd1p14
 251       31          0 rbd1p15
 259        0          0 rbd1p16
 259        1          0 rbd1p17
 ...
 259       23          0 rbd1p39
 259       24          0 rbd1p40
 251       32       1024 rbd2
 251       33          0 rbd2p1
 251       34          0 rbd2p2

(major 251 was assigned dynamically at module load time)

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:32:04 +02:00
Ilya Dryomov
9b60e70b3b rbd: add support for single-major device number allocation scheme
Currently each rbd device is allocated its own major number, which
leads to a hard limit of 230-250 images mapped at once.  This commit
adds support for a new single-major device number allocation scheme,
which is hidden behind a new single_major boolean module parameter and
is disabled by default for backwards compatibility reasons.  (Old
userspace cannot correctly unmap images mapped under single-major
scheme and would essentially just unmap a random image, if that.)

$ rbd showmapped
id pool image snap device
0  rbd  b100  -    /dev/rbd0
1  rbd  b101  -    /dev/rbd1
2  rbd  b102  -    /dev/rbd2
3  rbd  b103  -    /dev/rbd3

Old scheme (modprobe rbd):

$ ls -l /dev/rbd*
brw-rw---- 1 root disk 253, 0 Dec 10 12:24 /dev/rbd0
brw-rw---- 1 root disk 252, 0 Dec 10 12:28 /dev/rbd1
brw-rw---- 1 root disk 252, 1 Dec 10 12:28 /dev/rbd1p1
brw-rw---- 1 root disk 252, 2 Dec 10 12:28 /dev/rbd1p2
brw-rw---- 1 root disk 252, 3 Dec 10 12:28 /dev/rbd1p3
brw-rw---- 1 root disk 251, 0 Dec 10 12:28 /dev/rbd2
brw-rw---- 1 root disk 251, 1 Dec 10 12:28 /dev/rbd2p1
brw-rw---- 1 root disk 250, 0 Dec 10 12:24 /dev/rbd3

New scheme (modprobe rbd single_major=Y):

$ ls -l /dev/rbd*
brw-rw---- 1 root disk 253,   0 Dec 10 12:30 /dev/rbd0
brw-rw---- 1 root disk 253, 256 Dec 10 12:30 /dev/rbd1
brw-rw---- 1 root disk 253, 257 Dec 10 12:30 /dev/rbd1p1
brw-rw---- 1 root disk 253, 258 Dec 10 12:30 /dev/rbd1p2
brw-rw---- 1 root disk 253, 259 Dec 10 12:30 /dev/rbd1p3
brw-rw---- 1 root disk 253, 512 Dec 10 12:30 /dev/rbd2
brw-rw---- 1 root disk 253, 513 Dec 10 12:30 /dev/rbd2p1
brw-rw---- 1 root disk 253, 768 Dec 10 12:30 /dev/rbd3

(major 253 was assigned dynamically at module load time)

The new limit is 4096 images mapped at once, and it comes from the fact
that, as before, 256 minor numbers are reserved for each mapping.
(A follow-up commit changes the number of minors reserved and the way
we deal with partitions over that number.)

If single_major is set to true, two new sysfs interfaces show up:
/sys/bus/rbd/{add,remove}_single_major.  These are to be used instead
of /sys/bus/rbd/{add,remove}, which are disabled for backwards
compatibility reasons outlined above.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:59 +02:00
Ilya Dryomov
92c76dc036 rbd: wire up is_visible() sysfs callback for rbd bus
In preparation for single-major device number allocation scheme, wire
up attribute_group::is_visible() callback for rbd bus.  This allows us
to make the new single-major attributes conditional.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:58 +02:00
Ilya Dryomov
dd82fff1e8 rbd: add 'minor' sysfs rbd device attribute
Introduce /sys/bus/rbd/devices/<id>/minor sysfs attribute for exporting
rbd whole disk minor numbers.  This is a step towards single-major
device number allocation scheme, but also a good thing on its own.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:57 +02:00
Ilya Dryomov
f8a22fc238 rbd: switch to ida for rbd id assignments
Currently rbd ids are allocated using an atomic variable that keeps
track of the highest id currently in use and each new id is simply one
more than the value of that variable.  That's nice and cheap, but it
does mean that rbd ids are allowed to grow boundlessly, and, more
importantly, it's completely unpredictable.  So, in preparation for
single-major device number allocation scheme, which is going to
establish and rely on a constant mapping between rbd ids and device
numbers, switch to ida for rbd id assignments.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:56 +02:00
Ilya Dryomov
e1b4d96dea rbd: refactor rbd_init() a bit
Refactor rbd_init() a bit to make it more clear what's going on.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:55 +02:00
Ilya Dryomov
90da258b88 rbd: tweak "loaded" message and module description
Tweak "loaded" message, so that it looks like

[   30.184235] rbd: loaded

instead of

[   38.056564] rbd: loaded rbd (rados block device)

Also move (and slightly tweak) MODULE_DESCRIPTION so that all authors
are next to each other in modinfo output.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:54 +02:00
Ilya Dryomov
70eebd2006 rbd: rbd_device::dev_id is an int, format it as such
rbd_device::dev_id is an int, format it as such.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-12-31 20:31:53 +02:00
Jens Axboe
b28bc9b38c Linux 3.13-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJSwLfoAAoJEHm+PkMAQRiGi6QH/1U1B7lmHChDTw3jj1lfm9gA
 189Si4QJlnxFWCKHvKEL+pcaVuACU+aMGI8+KyMYK4/JfuWVjjj5fr/SvyHH2/8m
 LdSK8aHMhJ46uBS4WJ/l6v46qQa5e2vn8RKSBAyKm/h4vpt+hd6zJdoFrFai4th7
 k/TAwOAEHI5uzexUChwLlUBRTvbq4U8QUvDu+DeifC8cT63CGaaJ4qVzjOZrx1an
 eP6UXZrKDASZs7RU950i7xnFVDQu4PsjlZi25udsbeiKcZJgPqGgXz5ULf8ZH8RQ
 YCi1JOnTJRGGjyIOyLj7pyB01h7XiSM2+eMQ0S7g54F2s7gCJ58c2UwQX45vRWU=
 =/4/R
 -----END PGP SIGNATURE-----

Merge tag 'v3.13-rc6' into for-3.14/core

Needed to bring blk-mq uptodate, since changes have been going in
since for-3.14/core was established.

Fixup merge issues related to the immutable biovec changes.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-flush.c
	fs/btrfs/check-integrity.c
	fs/btrfs/extent_io.c
	fs/btrfs/scrub.c
	fs/logfs/dev_bdev.c
2013-12-31 09:51:02 -07:00
Linus Torvalds
c5fdd531b5 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 - fix for a memory leak on certain unplug events
 - a collection of bcache fixes from Kent and Nicolas
 - a few null_blk fixes and updates form Matias
 - a marking of static of functions in the stec pci-e driver

* 'for-linus' of git://git.kernel.dk/linux-block:
  null_blk: support submit_queues on use_per_node_hctx
  null_blk: set use_per_node_hctx param to false
  null_blk: corrections to documentation
  null_blk: warning on ignored submit_queues param
  null_blk: refactor init and init errors code paths
  null_blk: documentation
  null_blk: mem garbage on NUMA systems during init
  drivers: block: Mark the functions as static in skd_main.c
  bcache: New writeback PD controller
  bcache: bugfix for race between moving_gc and bucket_invalidate
  bcache: fix for gc and writeback race
  bcache: bugfix - moving_gc now moves only correct buckets
  bcache: fix for gc crashing when no sectors are used
  bcache: Fix heap_peek() macro
  bcache: Fix for can_attach_cache()
  bcache: Fix dirty_data accounting
  bcache: Use uninterruptible sleep in writeback
  bcache: kthread don't set writeback task to INTERUPTIBLE
  block: fix memory leaks on unplugging block device
  bcache: fix sparse non static symbol warning
2013-12-24 10:06:03 -08:00
Matias Bjørling
fc1bc35443 null_blk: support submit_queues on use_per_node_hctx
In the case of both the submit_queues param and use_per_node_hctx param
are used. We limit the number af submit_queues to the number of online
nodes.

If the submit_queues is a multiple of nr_online_nodes, its trivial. Simply map
them to the nodes. For example: 8 submit queues are mapped as node0[0,1],
node1[2,3], ...
If uneven, we are left with an uneven number of submit_queues that must be
mapped. These are mapped toward the first node and onward. E.g. 5
submit queues mapped onto 4 nodes are mapped as node0[0,1], node1[2], ...

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21 09:30:34 -07:00
Matias Bjørling
200052440d null_blk: set use_per_node_hctx param to false
The defaults for the module is to instantiate itself with blk-mq and a
submit queue for each CPU node in the system.

To save resources, initialize instead with a single submit queue.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21 09:30:33 -07:00
Matias Bjorling
d15ee6b1a4 null_blk: warning on ignored submit_queues param
Let the user know when the number of submission queues are being
ignored.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19 08:09:43 -07:00
Matias Bjorling
2d263a7856 null_blk: refactor init and init errors code paths
Simplify the initialization logic of the three block-layers.

- The queue initialization is split into two parts. This allows reuse of
  code when initializing the sq-, bio- and mq-based layers.
- Set submit_queues default value to 0 and always set it at init time.
- Simplify the init error code paths.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19 08:09:42 -07:00
Matias Bjorling
0c56010c83 null_blk: mem garbage on NUMA systems during init
For NUMA systems, initializing the blk-mq layer and using per node hctx.
We initialize submit queues to 1, while blk-mq nr_hw_queues is
initialized to the number of NUMA nodes.

This makes the null_init_hctx function overwrite memory outside of what
it allocated.  In my case it lead to writing garbage into struct
request_queue's mq_map.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-19 08:09:38 -07:00
Rashika Kheria
a26ba7fadd drivers: block: Mark the functions as static in skd_main.c
Mark functions skd_skmsg_state_to_str() and skd_skreq_state_to_str() as
static in skd_main.c because they are not used outside this file.

This eliminates the following warnings in skd_main.c:
drivers/block/skd_main.c:5272:13: warning: no previous prototype for ‘skd_skmsg_state_to_str’ [-Wmissing-prototypes]
drivers/block/skd_main.c:5284:13: warning: no previous prototype for ‘skd_skreq_state_to_str’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19 08:06:49 -07:00
Jiri Kosina
e23c34bb41 Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply fixes on top of newer things
in tree (efi-stub).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-12-19 15:08:32 +01:00
Keith Busch
9a6b94584d NVMe: Device resume error handling
Adds controller error handling on resume power management. If the device
fails to initialize, the device is queued for a reset. If the reset fails,
a thread is spawned to remove the pci device.

If the device resumes as "busy", the device is responding to admin
commands but will not create IO queues. In this case, we need to remove
the gendisks and free the IO queues since they can't be used and may be
holding bios in their lists.

From testing, the dma pools require a pci device so this had to change
the pci driver 'remove' to release the dma resources in line with that
call instead of after all references to the device are released.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-12-16 15:54:39 -05:00
Matthew Wilcox
68608c268b NVMe: Cache dev->pci_dev in a local pointer
Helps with line-length issues

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-12-16 15:54:35 -05:00
Matthew Wilcox
0a8d44cb33 NVMe: Fix lockdep warnings
During the initialisation path, the queue lock is taken without interrupt
protection.  It's perfectly safe to do so, because the interrupt handler
can't run at this point, but it confuses lockdep.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-12-16 15:54:34 -05:00
Keith Busch
320a382746 NVMe: compat SG_IO ioctl
For 32-bit versions of sg3-utils running on a 64-bit system. This is
mostly a copy from the relevent portions of fs/compat_ioctl.c, with
slight modifications for going through block_device_operations.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@linux.intel.com>
[fixed up CONFIG_COMPAT=n build problems]
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-12-16 15:49:40 -05:00
Matias Bjorling
57053d8c5c null_blk: mem garbage on NUMA systems during init
For NUMA systems, initializing the blk-mq layer and using per node hctx.
We initialize submit queues to 1, while blk-mq nr_hw_queues is
initialized to the number of NUMA nodes.

This makes the null_init_hctx function overwrite memory outside of what
it allocated.  In my case it lead to writing garbage into struct
request_queue's mq_map.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-15 12:17:16 -08:00
Linus Torvalds
5ee540613d Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "A small collection of fixes for the current series. It contains:

   - A fix for a use-after-free of a request in blk-mq.  From Ming Lei

   - A fix for a blk-mq bug that could attempt to dereference a NULL rq
     if allocation failed

   - Two xen-blkfront small fixes

   - Cleanup of submit_bio_wait() type uses in the kernel, unifying
     that.  From Kent

   - A fix for 32-bit blkg_rwstat reading.  I apologize for this one
     looking mangled in the shortlog, it's entirely my fault for missing
     an empty line between the description and body of the text"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: fix use-after-free of request
  blk-mq: fix dereference of rq->mq_ctx if allocation fails
  block: xen-blkfront: Fix possible NULL ptr dereference
  xen-blkfront: Silence pfn maybe-uninitialized warning
  block: submit_bio_wait() conversions
  Update of blkg_stat and blkg_rwstat may happen in bh context
2013-12-05 15:33:27 -08:00
Alan
6bbdc3984e vio: remove dangly makefile bits
The drivers are long gone but some config escaped the prune

Signed-off-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=57221
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-12-03 17:35:23 +01:00
Jens Axboe
e345d767f6 Merge branch 'stable/for-jens-3.13-take-two' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip into for-linus 2013-11-27 19:12:58 -07:00
Felipe Pena
2f089cb89d block: xen-blkfront: Fix possible NULL ptr dereference
In the blkif_release function the bdget_disk() call might returns
a NULL ptr which might be dereferenced on bdev->bd_openers checking

Signed-off-by: Felipe Pena <felipensp@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Added WARN per Roger's suggestion]
2013-11-26 11:24:01 -05:00
Tim Gardner
427bfe07e6 xen-blkfront: Silence pfn maybe-uninitialized warning
pfn cannot actually be used unless (!info->feature_persistent), nor is
pfn accessed in get_grant() unless (!info->feature_persistent), but silence
this warning anyway. gcc-4.8

drivers/block/xen-blkfront.c: In function 'do_blkif_request':
drivers/block/xen-blkfront.c:508:20: warning: 'pfn' may be used uninitialized in this function [-Wmaybe-uninitialized]
     gnt_list_entry = get_grant(&gref_head, pfn, info);
                    ^
drivers/block/xen-blkfront.c:492:19: note: 'pfn' was declared here
     unsigned long pfn;

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2013-11-26 11:23:53 -05:00
Geert Uytterhoeven
b3ce717205 block/z2ram: Remove duplicate external declarations
Remove the external declarations for m68k_realnum_memory and m68k_memory,
which are already provided by <asm/setup.h>, to avoid conflicts later.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2013-11-26 11:09:10 +01:00
Geert Uytterhoeven
6112ea0862 zorro: ZTWO_VADDR() should return "void __iomem *"
ZTWO_VADDR() converts from physical to virtual I/O addresses, so it should
return "void __iomem *" instead of "unsigned long".

This allows to drop several casts, but requires adding a few casts to
accomodate legacy driver frameworks that store "unsigned long" I/O
addresses.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2013-11-26 11:09:07 +01:00
Kent Overstreet
20d0189b10 block: Introduce new bio_split()
The new bio_split() can split arbitrary bios - it's not restricted to
single page bios, like the old bio_split() (previously renamed to
bio_pair_split()). It also has different semantics - it doesn't allocate
a struct bio_pair, leaving it up to the caller to handle completions.

Then convert the existing bio_pair_split() users to the new bio_split()
- and also nvme, which was open coding bio splitting.

(We have to take that BUG_ON() out of bio_integrity_trim() because this
bio_split() needs to use it, and there's no reason it has to be used on
bios marked as cloned; BIO_CLONED doesn't seem to have clearly
documented semantics anyways.)

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Neil Brown <neilb@suse.de>
2013-11-23 22:33:57 -08:00
Kent Overstreet
ee67891bf1 block: Rename bio_split() -> bio_pair_split()
This is prep work for introducing a more general bio_split().

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: NeilBrown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Sage Weil <sage@inktank.com>
2013-11-23 22:33:56 -08:00
Kent Overstreet
5341a6278b rbd: Refactor bio cloning
Now that we've got drivers converted to the new immutable bvec
primitives, bio splitting becomes much easier - this is how the new
bio_split() will work. (Someone more familiar with the ceph code could
probably use bio_clone_fast() instead of bio_clone() here).

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
2013-11-23 22:33:54 -08:00
Kent Overstreet
feb261e2ee aoe: Convert to immutable biovecs
Now that we've got a mechanism for immutable biovecs -
bi_iter.bi_bvec_done - we need to convert drivers to use primitives that
respect it instead of using the bvec array directly.

The aoe code no longer has to manually iterate over partial bvecs, so
some struct members go away - other struct members are effectively
renamed:

buf->resid	-> buf->iter.bi_size
buf->sector	-> buf->iter.bi_sector

f->bcnt		-> f->iter.bi_size
f->lba		-> f->iter.bi_sector

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
2013-11-23 22:33:52 -08:00
Kent Overstreet
003b5c5719 block: Convert drivers to immutable biovecs
Now that we've got a mechanism for immutable biovecs -
bi_iter.bi_bvec_done - we need to convert drivers to use primitives that
respect it instead of using the bvec array directly.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: NeilBrown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
2013-11-23 22:33:51 -08:00
Kent Overstreet
458b76ed2f block: Kill bio_segments()/bi_vcnt usage
When we start sharing biovecs, keeping bi_vcnt accurate for splits is
going to be error prone - and unnecessary, if we refactor some code.

So bio_segments() has to go - but most of the existing users just needed
to know if the bio had multiple segments, which is easier - add a
bio_multiple_segments() for them.

(Two of the current uses of bio_segments() are going to go away in a
couple patches, but the current implementation of bio_segments() is
unsafe as soon as we start doing driver conversions for immutable
biovecs - so implement a dumb version for bisectability, it'll go away
in a couple patches)

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
2013-11-23 22:33:51 -08:00
Kent Overstreet
4550dd6c6b block: Immutable bio vecs
This adds a mechanism by which we can advance a bio by an arbitrary
number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done
indicates the number of bytes completed in the current bvec.

Various driver code still needs to be updated to not refer to the bvec
directly before we can use this for interesting things, like efficient
bio splitting.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
2013-11-23 22:33:49 -08:00
Kent Overstreet
7988613b0e block: Convert bio_for_each_segment() to bvec_iter
More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
2013-11-23 22:33:49 -08:00
Kent Overstreet
a4ad39b1d1 block: Convert bio_iovec() to bvec_iter
For immutable biovecs, we'll be introducing a new bio_iovec() that uses
our new bvec iterator to construct a biovec, taking into account
bvec_iter->bi_bvec_done - this patch updates existing users for the new
usage.

Some of the existing users really do need a pointer into the bvec array
- those uses are all going to be removed, but we'll need the
functionality from immutable to remove them - so for now rename the
existing bio_iovec() -> __bio_iovec(), and it'll be removed in a couple
patches.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
2013-11-23 22:33:49 -08:00
Kent Overstreet
4f024f3797 block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6
2013-11-23 22:33:47 -08:00
Yuanhan Liu
044c8d4b15 kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanly
Remove CONFIG_USE_GENERIC_SMP_HELPERS left by commit 0a06ff068f
("kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS").

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-21 16:42:27 -08:00
Shaohua Li
f02b9ac35a virtio-blk: virtqueue_kick() must be ordered with other virtqueue operations
It isn't safe to call it without holding the vblk->vq_lock.

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Shaohua Li <shli@fusionio.com>

Fixed another condition of virtqueue_kick() not holding the lock.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-19 19:00:45 -07:00
Michael Opdenacker
481e5bad84 NVMe: remove deprecated IRQF_DISABLED
This patch proposes to remove the use of the IRQF_DISABLED flag

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-11-18 17:13:54 -05:00
Haiyan Hu
b80d5ccca3 NVMe: Avoid shift operation when writing cq head doorbell
Changes the type of dev->db_stride to unsigned and changes the value
stored there to be 1 << the current value. Then there is less
calculation to be done at completion time.

Signed-off-by: Haiyan Hu <huhaiyan@huawei.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2013-11-18 17:10:51 -05:00
Linus Torvalds
f412f2c60b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull second round of block driver updates from Jens Axboe:
 "As mentioned in the original pull request, the bcache bits were pulled
  because of their dependency on the immutable bio vecs.  Kent re-did
  this part and resubmitted it, so here's the 2nd round of (mostly)
  driver updates for 3.13.  It contains:

 - The bcache work from Kent.

 - Conversion of virtio-blk to blk-mq.  This removes the bio and request
   path, and substitutes with the blk-mq path instead.  The end result
   almost 200 deleted lines.  Patch is acked by Asias and Christoph, who
   both did a bunch of testing.

 - A removal of bootmem.h include from Grygorii Strashko, part of a
   larger series of his killing the dependency on that header file.

 - Removal of __cpuinit from blk-mq from Paul Gortmaker"

* 'for-linus' of git://git.kernel.dk/linux-block: (56 commits)
  virtio_blk: blk-mq support
  blk-mq: remove newly added instances of __cpuinit
  bcache: defensively handle format strings
  bcache: Bypass torture test
  bcache: Delete some slower inline asm
  bcache: Use ida for bcache block dev minor
  bcache: Fix sysfs splat on shutdown with flash only devs
  bcache: Better full stripe scanning
  bcache: Have btree_split() insert into parent directly
  bcache: Move spinlock into struct time_stats
  bcache: Kill sequential_merge option
  bcache: Kill bch_next_recurse_key()
  bcache: Avoid deadlocking in garbage collection
  bcache: Incremental gc
  bcache: Add make_btree_freeing_key()
  bcache: Add btree_node_write_sync()
  bcache: PRECEDING_KEY()
  bcache: bch_(btree|extent)_ptr_invalid()
  bcache: Don't bother with bucket refcount for btree node allocations
  bcache: Debug code improvements
  ...
2013-11-15 16:33:41 -08:00
Linus Torvalds
b746f9c794 Nothing really exciting: some groundwork for changing virtio endian, and
some robustness fixes for broken virtio devices, plus minor tweaks.
 
 [vs last pull request: added the virtio-scsi broken vq escape patch, which
 I somehow lost.]
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSgDsJAAoJENkgDmzRrbjxEE4P/jXqZHS/HdlxW9k0BjKKlEIF
 PdtCoP3UhWTdskXvy2pD8m6nYn214MEJYUIa4HFlIEZsdxhuexzQHY19Ynkjagyv
 57sRsUUm5fYQLIL7IUh2DUD1VU38hUFinno/y333szzvCj9qITDA/QABsiWxK8NO
 dq+Lmeixgrhc5yN9iryW+gZV+hekJIZ4LsU5ejSaJucKblzXUH8qIbmSthG7RTYJ
 tr4J7xTTXbhxY4CoC5Dpx2hvsFkvzaAIvI4Nr1mDjfq5cR8BaYvnC89U1IbhdAey
 p1AbZE58JLrY+Z8K8LBRGV2KjO8qSZ6R47hbZ9nAnodJYB7sZLyj6jUe1q+/htuC
 Dh9Xm9O4eW2xNaFk20dYeIF4UU5/HzdsbvG/IlH8x4sm8/K706ocYyAOHlzYUg2T
 k7gltrgDzDokMgb2R44gwnr4oaJ2q8Gne6JXswlPEv2eRs6vNnA5Xhc0rEHGkU6C
 gYn1vNFN6yx0vf2syG/Ce5pZtMxGpefKQkHzzWdq8FKr1B9s54dDuf2hls7J8A9t
 OQT1gE33yURSelf4Kh4k9zWXaWk/Ohv9l2R1cqpALnJ4/+q0fP5t7HdK500S7aax
 DxLeFeqvsBw7nlWgsGxQmt+fjITQFHhcDiwst0ehnt6RbDEW7XPIguz0K/gyhxYG
 +UNbl/5Gr64jnUX3YCzm
 =vY2L
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "Nothing really exciting: some groundwork for changing virtio endian,
  and some robustness fixes for broken virtio devices, plus minor
  tweaks"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_scsi: verify if queue is broken after virtqueue_get_buf()
  x86, asmlinkage, lguest: Pass in globals into assembler statement
  virtio: mmio: fix signature checking for BE guests
  virtio_ring: adapt to notify() returning bool
  virtio_net: verify if queue is broken after virtqueue_get_buf()
  virtio_console: verify if queue is broken after virtqueue_get_buf()
  virtio_blk: verify if queue is broken after virtqueue_get_buf()
  virtio_ring: add new function virtqueue_is_broken()
  virtio_test: verify if virtqueue_kick() succeeded
  virtio_net: verify if virtqueue_kick() succeeded
  virtio_ring: let virtqueue_{kick()/notify()} return a bool
  virtio_ring: change host notification API
  virtio_config: remove virtio_config_val
  virtio: use size-based config accessors.
  virtio_config: introduce size-based accessors.
  virtio_ring: plug kmemleak false positive.
  virtio: pm: use CONFIG_PM_SLEEP instead of CONFIG_PM
2013-11-15 13:28:47 +09:00
Wolfram Sang
16735d022f tree-wide: use reinit_completion instead of INIT_COMPLETION
Use this new function to make code more comprehensible, since we are
reinitialzing the completion, not initializing.

[akpm@linux-foundation.org: linux-next resyncs]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:21 +09:00
Jens Axboe
1cf7e9c68f virtio_blk: blk-mq support
Switch virtio-blk from the dual support for old-style requests and bios
to use the block-multiqueue.

Acked-by: Asias He <asias@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2013-11-14 08:40:44 -07:00
Jens Axboe
1355b37f11 Merge branch 'for-3.13/post-mq-drivers' into for-linus 2013-11-14 08:29:01 -07:00
Linus Torvalds
5eea9be8b2 Merge branch 'for-3.13/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "This is the block driver pull request for 3.13.  As with the core pull
  request just sent out, this was rebased on top of the core branch
  again after the immutable series was pulled.  This also means that
  bcache gets to sit the initial pull over.  I will send a second driver
  pull request in the merge window to get those fixes in, once they have
  been rebased and tested on top of the non-immutable stack.

  This pull request contains:

   - Add support for the sTec Kronos pci-e flash card from sTec.  Also
     has various cleanups for this driver, from myself, Bart, Mike
     Snizter, and Wei Yongjun.

   - Add surprise removal support for the micron mtip32xx driver from
     Micron.

   - Floppy documentation fix from Ben Harris.

   - debugfs bug fix for pktcdvd from Dan Carpenter.

   - Fix for the mtip32xx driver stack usage in the debugfs path,
     dynamically allocating those buffers instead.  From David Milburn.

   - Disable cpqarray in Kconfig.  The plan is to remove it on request
     of HP, but lets disable it for a few revisions just to see if
     anyone yells.

   - drbd fixes from Lars Ellenberg and Philipp Reisner.

   - Elevator switch fix for the s390 block driver from Heiko Carstens.

   - loop crash fix on IO to unassigned device from Mikulas Patocka.

   - A series of bug fixes for the IBM rsxx pci-e flash driver from
     Philip J Kelleher.

   - cciss probe fix from Stephen Cameron.

   - Xen block front/back fixes from Roger Pau Monne and Vegard Nossum"

* 'for-3.13/drivers' of git://git.kernel.dk/linux-block: (41 commits)
  floppy: Correct documentation of driver options when used as a module.
  pktcdvd: debugfs functions return NULL on error
  xen-blkfront: restore the non-persistent data path
  skd: fix formatting in skd_s1120.h
  skd: reorder construct/destruct code
  skd: cleanup skd_do_inq_page_da()
  skd: remove SKD_OMIT_FROM_SRC_DIST ifdefs
  skd: remove redundant skdev->pdev assignment from skd_pci_probe()
  skd: use <asm/unaligned.h>
  skd: remove SCSI subsystem specific includes
  skd: register block device only if some devices are present
  skd: fix error messages in skd_init()
  skd: fix error paths in skd_init()
  skd: fix unregister_blkdev() placement
  skd: more removal of bio-based code
  skd: cleanup the skd_*() function block wrapping
  skd: rip out bio path
  skd: fix error return code in skd_pci_probe()
  s390/dasd: hold request queue sysfs lock when calling elevator_init()
  cciss: return 0 from driver probe function on success, not 1
  ...
2013-11-14 12:13:05 +09:00
Linus Torvalds
0910c0bdf7 Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-block
Pull block IO core updates from Jens Axboe:
 "This is the pull request for the core changes in the block layer for
  3.13.  It contains:

   - The new blk-mq request interface.

     This is a new and more scalable queueing model that marries the
     best part of the request based interface we currently have (which
     is fully featured, but scales poorly) and the bio based "interface"
     which the new drivers for high IOPS devices end up using because
     it's much faster than the request based one.

     The bio interface has no block layer support, since it taps into
     the stack much earlier.  This means that drivers end up having to
     implement a lot of functionality on their own, like tagging,
     timeout handling, requeue, etc.  The blk-mq interface provides all
     these.  Some drivers even provide a switch to select bio or rq and
     has code to handle both, since things like merging only works in
     the rq model and hence is faster for some workloads.  This is a
     huge mess.  Conversion of these drivers nets us a substantial code
     reduction.  Initial results on converting SCSI to this model even
     shows an 8x improvement on single queue devices.  So while the
     model was intended to work on the newer multiqueue devices, it has
     substantial improvements for "classic" hardware as well.  This code
     has gone through extensive testing and development, it's now ready
     to go.  A pull request is coming to convert virtio-blk to this
     model will be will be coming as well, with more drivers scheduled
     for 3.14 conversion.

   - Two blktrace fixes from Jan and Chen Gang.

   - A plug merge fix from Alireza Haghdoost.

   - Conversion of __get_cpu_var() from Christoph Lameter.

   - Fix for sector_div() with 64-bit divider from Geert Uytterhoeven.

   - A fix for a race between request completion and the timeout
     handling from Jeff Moyer.  This is what caused the merge conflict
     with blk-mq/core, in case you are looking at that.

   - A dm stacking fix from Mike Snitzer.

   - A code consolidation fix and duplicated code removal from Kent
     Overstreet.

   - A handful of block bug fixes from Mikulas Patocka, fixing a loop
     crash and memory corruption on blk cg.

   - Elevator switch bug fix from Tomoki Sekiyama.

  A heads-up that I had to rebase this branch.  Initially the immutable
  bio_vecs had been queued up for inclusion, but a week later, it became
  clear that it wasn't fully cooked yet.  So the decision was made to
  pull this out and postpone it until 3.14.  It was a straight forward
  rebase, just pruning out the immutable series and the later fixes of
  problems with it.  The rest of the patches applied directly and no
  further changes were made"

* 'for-3.13/core' of git://git.kernel.dk/linux-block: (31 commits)
  block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
  block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
  block: Do not call sector_div() with a 64-bit divisor
  kernel: trace: blktrace: remove redundent memcpy() in compat_blk_trace_setup()
  block: Consolidate duplicated bio_trim() implementations
  block: Use rw_copy_check_uvector()
  block: Enable sysfs nomerge control for I/O requests in the plug list
  block: properly stack underlying max_segment_size to DM device
  elevator: acquire q->sysfs_lock in elevator_change()
  elevator: Fix a race in elevator switching and md device initialization
  block: Replace __get_cpu_var uses
  bdi: test bdi_init failure
  block: fix a probe argument to blk_register_region
  loop: fix crash if blk_alloc_queue fails
  blk-core: Fix memory corruption if blkcg_init_queue fails
  block: fix race between request completion and timeout handling
  blktrace: Send BLK_TN_PROCESS events to all running traces
  blk-mq: don't disallow request merges for req->special being set
  blk-mq: mq plug list breakage
  blk-mq: fix for flush deadlock
  ...
2013-11-14 12:08:14 +09:00
Linus Torvalds
8ceafbfa91 Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm
Pull DMA mask updates from Russell King:
 "This series cleans up the handling of DMA masks in a lot of drivers,
  fixing some bugs as we go.

  Some of the more serious errors include:
   - drivers which only set their coherent DMA mask if the attempt to
     set the streaming mask fails.
   - drivers which test for a NULL dma mask pointer, and then set the
     dma mask pointer to a location in their module .data section -
     which will cause problems if the module is reloaded.

  To counter these, I have introduced two helper functions:
   - dma_set_mask_and_coherent() takes care of setting both the
     streaming and coherent masks at the same time, with the correct
     error handling as specified by the API.
   - dma_coerce_mask_and_coherent() which resolves the problem of
     drivers forcefully setting DMA masks.  This is more a marker for
     future work to further clean these locations up - the code which
     creates the devices really should be initialising these, but to fix
     that in one go along with this change could potentially be very
     disruptive.

  The last thing this series does is prise away some of Linux's addition
  to "DMA addresses are physical addresses and RAM always starts at
  zero".  We have ARM LPAE systems where all system memory is above 4GB
  physical, hence having DMA masks interpreted by (eg) the block layers
  as describing physical addresses in the range 0..DMAMASK fails on
  these platforms.  Santosh Shilimkar addresses this in this series; the
  patches were copied to the appropriate people multiple times but were
  ignored.

  Fixing this also gets rid of some ARM weirdness in the setup of the
  max*pfn variables, and brings ARM into line with every other Linux
  architecture as far as those go"

* 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm: (52 commits)
  ARM: 7805/1: mm: change max*pfn to include the physical offset of memory
  ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7795/1: mm: dma-mapping: Add dma_max_pfn(dev) helper function
  ARM: 7794/1: block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()
  ARM: DMA-API: better handing of DMA masks for coherent allocations
  ARM: 7857/1: dma: imx-sdma: setup dma mask
  DMA-API: firmware/google/gsmi.c: avoid direct access to DMA masks
  DMA-API: dcdbas: update DMA mask handing
  DMA-API: dma: edma.c: no need to explicitly initialize DMA masks
  DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks
  DMA-API: crypto: remove last references to 'static struct device *dev'
  DMA-API: crypto: fix ixp4xx crypto platform device support
  DMA-API: others: use dma_set_coherent_mask()
  DMA-API: staging: use dma_set_coherent_mask()
  DMA-API: usb: use new dma_coerce_mask_and_coherent()
  DMA-API: usb: use dma_set_coherent_mask()
  DMA-API: parport: parport_pc.c: use dma_coerce_mask_and_coherent()
  DMA-API: net: octeon: use dma_coerce_mask_and_coherent()
  DMA-API: net: nxp/lpc_eth: use dma_coerce_mask_and_coherent()
  ...
2013-11-14 07:55:21 +09:00
Dan Carpenter
49c2856af7 pktcdvd: debugfs functions return NULL on error
My static checker complains correctly that this is potential NULL
dereference because debugfs functions return NULL on error.  They return
an ERR_PTR if they are configured out.

We don't need to check for ERR_PTR because if debugfs is stubbed out the
dummy functions won't complain about that.  We don't need to check the
values before calling debugfs_remove() because that accepts ERR_PTRs and
NULL pointers.

We don't need to set pkt->dfs_f_info to NULL in pkt_debugfs_dev_new()
because it was initialized with kzalloc() so I have removed that.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-11-08 09:10:31 -07:00
Roger Pau Monne
bfe11d6de1 xen-blkfront: restore the non-persistent data path
When persistent grants were added they were always used, even if the
backend doesn't have this feature (there's no harm in always using the
same set of pages). This restores the old data path when the backend
doesn't have persistent grants, removing the burden of doing a memcpy
when it is not actually needed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Felipe Franciosi <felipe.franciosi@citrix.com>
Cc: Felipe Franciosi <felipe.franciosi@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Fix up whitespace issues]
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
f1a3c61913 skd: fix formatting in skd_s1120.h
Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
542d7b0015 skd: reorder construct/destruct code
Reorder placement of skd_construct(), skd_cons_sg_list(), skd_destruct()
and skd_free_sg_list() functions. Then remove no longer needed function
prototypes.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
fec23f6311 skd: cleanup skd_do_inq_page_da()
skdev->pdev and skdev->pdev->bus are always different than NULL in
skd_do_inq_page_da() so simplify the code accordingly.

Also cache skdev->pdev value in pdev variable while at it.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
7f74e5e944 skd: remove SKD_OMIT_FROM_SRC_DIST ifdefs
SKD_OMIT_FROM_SRC_DIST is never defined.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
ebedd16dab skd: remove redundant skdev->pdev assignment from skd_pci_probe()
skdev->pdev is set to pdev twice in skd_pci_probe(), first time
through skd_construct() call and the second time directly in
the function. Remove the second assignment as it is not needed.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
4ca90b5332 skd: use <asm/unaligned.h>
Use <asm/unaligned.h> instead of <asm-generic/unaligned.h>.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
43504631d0 skd: remove SCSI subsystem specific includes
This is not a SCSI host driver so remove SCSI subsystem specific
includes.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
b8df6647c2 skd: register block device only if some devices are present
Register block device in skd_pci_probe() instead of in skd_init() so it
is registered only if some devices are present (currently it is always
registered when the driver is loaded). Please note that this change
depends on the fact that register_blkdev(0, ...) never returns 0.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
fbed149ab3 skd: fix error messages in skd_init()
* change priority level from KERN_INFO to KERN_ERR
* add "skd: " prefix
* do minor CodingStyle fixes

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
2b91c55222 skd: fix error paths in skd_init()
Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:30 -07:00
Bartlomiej Zolnierkiewicz
a073ae953e skd: fix unregister_blkdev() placement
register_blkdev() is called before pci_register_driver() in skd_init()
so unregister_blkdev() should be called after pci_unregister_driver()
in skd_exit(). Fix it.

Cc: Akhil Bhansali <abhansali@stec-inc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Mike Snitzer
38d4a1bb99 skd: more removal of bio-based code
Remove skd_flush_cmd structure and skd_flush_slab.
Remove skd_end_request wrapper around skd_end_request_blk.
Remove skd_requeue_request, use blk_requeue_request directly.
Cleanup some comments (remove "bio" info) and whitespace.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Jens Axboe
6a5ec65b9a skd: cleanup the skd_*() function block wrapping
Just call the block functions directly, don't wrap them
in skd helpers. With only one queueing model enabled, there's
no point in doing that.

Also kill the ->start_time and ->bio from the skd_request_context,
we don't use those anymore.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Jens Axboe
fcd37eb3c1 skd: rip out bio path
The skd driver has a selectable rq or bio based queueing model.
For 3.14, we want to turn this into a single blk-mq interface
instead. With the immutable biovecs being merged in 3.13, the
bio model would need patches to even work. So rip it out, with
a conversion pending for blk-mq in the next release.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Wei Yongjun
1762b57fcb skd: fix error return code in skd_pci_probe()
Fix to return -ENOMEM in the skd construct error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Stephen M. Cameron
b88fac630b cciss: return 0 from driver probe function on success, not 1
A return value of 1 is interpreted as an error

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
rchinthekindi
2e44b42718 skd: Replaced custom debug PRINTKs with pr_debug
Replaced DPRINTK() and VPRINTK() with pr_debug().

Signed-off-by: Ramprasad C <ramprasad.chinthekindi@hgst.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Akhil Bhansali
f721bb0dbd skd: Fix checkpatch ERRORS and removed unused functions
This patch fixes checkpatch.pl errors for assignment in if condition.
It also removes unused readq / readl function calls.

As Andrew had disabled the compilation of drivers for 32 bit,
I have modified format specifiers in few VPRINTKs to avoid warnings
during 64 bit compilation.

Signed-off-by: Akhil Bhansali <abhansali@stec-inc.com>
Reviewed-by: Ramprasad Chinthekindi <rchinthekindi@stec-inc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Philip J Kelleher
8c49a77ca4 rsxx: Fix possible kernel panic with invalid config.
This patch fixes a possible Kernel Panic on driver load if
the configuration on the card is messed up or not yet set.
The driver could possible give a 32 bit unsigned all Fs to
the kernel as the device's block size.

Now we only write the block size to the kernel if the
configuration from the card is valid.

Also, driver version is being updated.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Philip J Kelleher
e35f38bf73 rsxx: Disallow discards from being unmapped.
This patch fixes a bug in which discards were always
calling pci_unmap_page. Discards should never call the
pci_unmap_page function call because they are never mapped.

This caused a race condition on PowerPC systems when issuing
discards, writes, and reads all at the same time. The
pci_map_page function would eventually map logical address
0 for a read or write. Discards are always assigned a DMA
address of 0 because they are never mapped. So if
pci_map_page mapped address 0 for a DMA and a discard was
"unmapped" then the address would be freed and would cause
an EEH event to occur when Hardware accesses the address.

This was injected/uncovered in commit:
b347f9cf0bc8d42ee95ba1d3837fd93045ab336b

The pci_dma_mapping_error function declares -1 a DMA_ERROR
not 0 like initially thought So before we would never unmap
discards because they were considered NULL.

This patch should fall on top of commit id:
fc1967bb08a6184ed44ef990e1dd4389901b809c

Also, the driver version is being up dated.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Lars Ellenberg
35f47ef1a1 drbd: avoid to shrink max_bio_size due to peer re-configuration
For a long time, the receiving side has spread "too large" incoming
requests over multiple bios.  No need to shrink our max_bio_size
(max_hw_sectors) if the peer is reconfigured to use a different storage.

The problem manifests itself if we are not the top of the device stack
(DRBD is used a LVM PV).

A hardware reconfiguration on the peer may cause the supported
max_bio_size to shrink, and the connection handshake would now
unnecessarily shrink the max_bio_size on the active node.

There is no way to notify upper layers that they have to "re-stack"
their limits. So they won't notice at all, and may keep submitting bios
that are suddenly considered "too large for device".

We already check for compatibility and ignore changes on the peer,
the code only was masked out unless we have a fully established connection.
We just need to allow it a bit earlier during the handshake.

Also consider max_hw_sectors in our merge bvec function, just in case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Lars Ellenberg
d2da5b0cb5 drbd: fix decoding of bitmap vli rle for device sizes > 64 TB
Symptoms: disconnect after bitmap exchange due to
bitmap overflow (e:49731075554) while decoding bm RLE packet

In the decoding step of the variable length integer run length encoding
there was potentially an uncatched bitshift by wordsize (variable >> 64).

The result of which is "undefined" :(
(only "sometimes" the result is the desired 0)

Fix: don't do any bit shift magic for shift == 64, just assign.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
Philipp Reisner
57737adc96 drbd: Fix adding of new minors with freshly created meta data
Online adding of new minors with freshly created meta data
to an resource with an established connection failed, with a
wrong state transition on one side on one side of the new minor.

Freshly created meta-data has a la_size (last agreed size) of 0.
When we online add such devices, the code wrongly got into
the code path for resyncing new storage that was added while
the disk was detached.

Fixed that by making the GREW from ZERO a special case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
Philipp Reisner
b874d231e1 drbd: Fix an connection drop issue after enabling allow-two-primaries
Since drbd-8.4.0 it is possible to change the allow-two-primaries
network option while the connection is established.

The sequence code used to partially order packets from the
data socket with packets from the meta-data socket, still assued
that the allow-two-primaries option is constant while the
connection is established.

I.e.
On a node that has the RESOLVE_CONFLICTS bits set, after enabling
allow-two-primaries, when receiving the next data packet it timed out
while waiting for the necessary packets on the data socket to arrive
(wait_for_and_update_peer_seq() function).

Fixed that by always tracking the sequence number, but only waiting
for it if allow-two-primaries is set.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
Lars Ellenberg
69babf05cb drbd: fix NULL pointer deref in module init error path
If we want to iterate over the (as of yet still empty) list in the
cleanup path, we need to initialize the list before the first goto fail.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
Jens Axboe
7badfb1c34 block: disable cpqarray in Kconfig
Mike writes:

"cpqarray hasn't been used in over 12 years. It's doubtful that anyone
 still uses the board. It's time the driver was removed from the mainline
 kernel.  The only updates these days are minor and mostly done by people
 outside of HP."

If nobody yells, we'll remove it from the kernel tree completely
for 3.15.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00