Commit Graph

198 Commits

Author SHA1 Message Date
Hannes Reinecke c309b35171 scsi: Remove CONFIG_SCSI_MULTI_LUN
Obsolete; either use 'max_lun' if the host supports only a
limited number of LUNs or BLIST_NOLUN if the target has
problems addressing more than one LUN.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17 22:07:35 +02:00
Linus Torvalds b7e70ca9c7 Merge branch 'async-scsi-resume' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci
Pull async SCSI resume support from Dan Williams:
 "Allow disks and other devices to resume in parallel.

  This provides a tangible speed up for a non-esoteric use case (laptop
  resume):

    https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach"

* 'async-scsi-resume' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci:
  scsi: async sd resume
2014-04-11 17:23:52 -07:00
Dan Williams 3c31b52f96 scsi: async sd resume
async_schedule() sd resume work to allow disks and other devices to
resume in parallel.

This moves the entirety of scsi_device resume to an async context to
ensure that scsi_device_resume() remains ordered with respect to the
completion of the start/stop command.  For the duration of the resume,
new command submissions (that do not originate from the scsi-core) will
be deferred (BLKPREP_DEFER).

It adds a new ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain) as a container
of these operations.  Like scsi_sd_probe_domain it is flushed at
sd_remove() time to ensure async ops do not continue past the
end-of-life of the sdev.  The implementation explicitly refrains from
reusing scsi_sd_probe_domain directly for this purpose as it is flushed
at the end of dpm_resume(), potentially defeating some of the benefit.
Given sdevs are quiesced it is permissible for these resume operations
to bleed past the async_synchronize_full() calls made by the driver
core.

We defer the resolution of which pm callback to call until
scsi_dev_type_{suspend|resume} time and guarantee that the callback
parameter is never NULL.  With this in place the type of resume
operation is encoded in the async function identifier.

There is a concern that async resume could trigger PSU overload.  In the
enterprise, storage enclosures enforce staggered spin-up regardless of
what the kernel does making async scanning safe by default.  Outside of
that context a user can disable asynchronous scanning via a kernel
command line or CONFIG_SCSI_SCAN_ASYNC.  Honor that setting when
deciding whether to do resume asynchronously.

Inspired by Todd's analysis and initial proposal [2]:
https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach

Cc: Len Brown <len.brown@intel.com>
Cc: Phillip Susi <psusi@ubuntu.com>
[alan: bug fix and clean up suggestion]
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
[djbw: kick all resume work to the async queue]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2014-04-10 15:30:35 -07:00
Hannes Reinecke b3ae8780b4 [SCSI] Add EVPD page 0x83 and 0x80 to sysfs
EVPD page 0x83 is used to uniquely identify the device.
So instead of having each and every program issue a separate
SG_IO call to retrieve this information it does make far more
sense to display it in sysfs.

Some older devices (most notably tapes) will only report reliable
information in page 0x80 (Unit Serial Number). So export this
in the sysfs attribute 'vpd_pg80'.

[jejb: checkpatch fix]
[hare: attach after transport configure]
[fengguang.wu@intel.com: spotted problems with the original now fixed]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-27 08:25:33 -07:00
James Bottomley f2495e228f [SCSI] dual scan thread bug fix
In the highly unusual case where two threads are running concurrently through
the scanning code scanning the same target, we run into the situation where
one may allocate the target while the other is still using it.  In this case,
because the reap checks for STARGET_CREATED and kills the target without
reference counting, the second thread will do the wrong thing on reap.

Fix this by reference counting even creates and doing the STARGET_CREATED
check in the final put.

Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable@vger.kernel.org # delay backport for 2 months for field testing
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:18:59 -07:00
James Bottomley e63ed0d7a9 [SCSI] fix our current target reap infrastructure
This patch eliminates the reap_ref and replaces it with a proper kref.
On last put of this kref, the target is removed from visibility in
sysfs.  The final call to scsi_target_reap() for the device is done from
__scsi_remove_device() and only if the device was made visible.  This
ensures that the target disappears as soon as the last device is gone
rather than waiting until final release of the device (which is often
too long).

Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable@vger.kernel.org # delay backport by 2 months for field testing
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:18:59 -07:00
Martin K. Petersen 56f2a8016e [SCSI] Workaround for disks that report bad optimal transfer length
Not all disks fill out the VPD pages correctly. Add a blacklist flag
that allows us ignore the SBC-3 VPD pages for a given device. The
BLIST_SKIP_VPD_PAGES flag triggers our existing skip_vpd_pages
scsi_device parameter to bypass VPD scanning.

Also blacklist the offending Seagate drive model.

Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-24 13:00:10 -07:00
Martin K. Petersen 0816c9251a [SCSI] Allow error handling timeout to be specified
Introduce eh_timeout which can be used for error handling purposes. This
was previously hardcoded to 10 seconds in the SCSI error handling
code. However, for some fast-fail scenarios it is necessary to be able
to tune this as it can take several iterations (bus device, target, bus,
controller) before we give up.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-04 11:16:24 -07:00
James Bottomley fe709ed827 Merge SCSI misc branch into isci-for-3.6 tag 2012-10-02 08:55:12 +01:00
Martin K. Petersen d974e4265d [SCSI] Disable DIF on Hitachi Ultrastar 15K300
Hitachi Ultrastar 15K300 is quirky. Disable T10 PI (DIF).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-24 12:11:00 +04:00
James Bottomley 14216561e1 [SCSI] Fix 'Device not ready' issue on mpt2sas
This is a particularly nasty SCSI ATA Translation Layer (SATL) problem.

SAT-2 says (section 8.12.2)

        if the device is in the stopped state as the result of
        processing a START STOP UNIT command (see 9.11), then the SATL
        shall terminate the TEST UNIT READY command with CHECK CONDITION
        status with the sense key set to NOT READY and the additional
        sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
        REQUIRED;

mpt2sas internal SATL seems to implement this.  The result is very confusing
standby behaviour (using hdparm -y).  If you suspend a drive and then send
another command, usually it wakes up.  However, if the next command is a TEST
UNIT READY, the SATL sees that the drive is suspended and proceeds to follow
the SATL rules for this, returning NOT READY to all subsequent commands.  This
means that the ordering of TEST UNIT READY is crucial: if you send TUR and
then a command, you get a NOT READY to both back.  If you send a command and
then a TUR, you get GOOD status because the preceeding command woke the drive.

This bit us badly because

commit 85ef06d1d2
Author: Tejun Heo <tj@kernel.org>
Date:   Fri Jul 1 16:17:47 2011 +0200

    block: flush MEDIA_CHANGE from drivers on close(2)

Changed our ordering on TEST UNIT READY commands meaning that SATA drives
connected to an mpt2sas now suspend and refuse to wake (because the mpt2sas
SATL sees the suspend *before* the drives get awoken by the next ATA command)
resulting in lots of failed commands.

The standard is completely nuts forcing this inconsistent behaviour, but we
have to work around it.

The fix for this is twofold:

   1. Set the allow_restart flag so we wake the drive when we see it has been
      suspended

   2. Return all TEST UNIT READY status directly to the mid layer without any
      further error handling which prevents us causing error handling which
      may offline the device just because of a media check TUR.

Reported-by: Matthias Prager <linux@matthiasprager.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-08-22 09:42:54 +04:00
Dan Williams e96eb23d82 [SCSI] Revert "[SCSI] fix async probe regression"
This reverts commit 43a8d39d01.

Commit 43a8d39d fixed the fact that wait_for_device_probe() was unable
to flush sd probe work.  Now that sd probe work is once again flushable
via wait_for_device_probe() this workaround is no longer needed.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 09:25:56 +01:00
Dan Williams 492d542273 [SCSI] cleanup usages of scsi_complete_async_scans
Now that scsi registers its async scan work with the async subsystem,
wait_for_device_probe() is sufficient for ensuring all scanning is
complete.

[jejb: fix merge problems with eea03c20ae Make wait_for_device_probe() also do scsi_complete_async_scans()]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 09:25:22 +01:00
Dan Williams 6cdd55205d [SCSI] queue async scan work to an async_schedule domain
This is preparation to enable async_synchronize_full() to be used as a
replacement for scsi_complete_async_scans(), i.e. to stop leaking scsi
internal details where they are not needed.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 09:09:27 +01:00
Dan Williams 3b661a92e8 [SCSI] fix hot unplug vs async scan race
The following crash results from cases where the end_device has been
removed before scsi_sysfs_add_sdev has had a chance to run.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6
 ...
 Call Trace:
  [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3
  [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff8125e641>] kobject_add_varg+0x41/0x50
  [<ffffffff8125e70b>] kobject_add+0x64/0x66
  [<ffffffff8131122b>] device_add+0x12d/0x63a
  [<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56
  [<ffffffff8107de15>] ? module_refcount+0x89/0xa0
  [<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a
  [<ffffffff8132dcbb>] do_scan_async+0x9c/0x145

...teach scsi_sysfs_add_devices() to check for deleted devices() before
trying to add them, and teach scsi_remove_target() how to remove targets
that have not been added via device_add().

Cc: <stable@vger.kernel.org>
Reported-by: Dariusz Majchrzak <dariusz.majchrzak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:45 +01:00
Dan Williams 43a8d39d01 [SCSI] fix async probe regression
Commit a7a20d1 "[SCSI] sd: limit the scope of the async probe domain"
moved sd probe work out of reach of wait_for_device_probe().  Allow it
to be synced via scsi_complete_async_scans().

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-30 13:37:07 +04:00
Greg Kroah-Hartman f7a0d426f3 Merge 3.3-rc7 into usb-next
This resolves the conflict with drivers/usb/host/ehci-fsl.h that
happened with changes in Linus's and this branch at the same time.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-12 09:13:31 -07:00
Huajun Li 267a6ad4ae [SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost'
In do_scan_async(), calling scsi_autopm_put_host(shost) may reference
freed shost, and cause Posison overwitten warning.
Yes, this case can happen, for example, an USB is disconnected just
when do_scan_async() thread starts to run, then scsi_host_put() called
in scsi_finish_async_scan() will lead to shost be freed(because the
refcount of shost->shost_gendev decreases to 1 after USB disconnects),
at this point, if references shost again, system will show following
warning msg.

To make scsi_autopm_put_host(shost) always reference a valid shost,
put it just before scsi_host_put() in function
scsi_finish_async_scan().

[  299.281565] =============================================================================
[  299.281634] BUG kmalloc-4096 (Tainted: G          I ): Poison overwritten
[  299.281682] -----------------------------------------------------------------------------
[  299.281684]
[  299.281752] INFO: 0xffff880056c305d0-0xffff880056c305d0. First byte
0x6a instead of 0x6b
[  299.281816] INFO: Allocated in scsi_host_alloc+0x4a/0x490 age=1688
cpu=1 pid=2004
[  299.281870] 	__slab_alloc+0x617/0x6c1
[  299.281901] 	__kmalloc+0x28c/0x2e0
[  299.281931] 	scsi_host_alloc+0x4a/0x490
[  299.281966] 	usb_stor_probe1+0x5b/0xc40 [usb_storage]
[  299.282010] 	storage_probe+0xa4/0xe0 [usb_storage]
[  299.282062] 	usb_probe_interface+0x172/0x330 [usbcore]
[  299.282105] 	driver_probe_device+0x257/0x3b0
[  299.282138] 	__driver_attach+0x103/0x110
[  299.282171] 	bus_for_each_dev+0x8e/0xe0
[  299.282201] 	driver_attach+0x26/0x30
[  299.282230] 	bus_add_driver+0x1c4/0x430
[  299.282260] 	driver_register+0xb6/0x230
[  299.282298] 	usb_register_driver+0xe5/0x270 [usbcore]
[  299.282337] 	0xffffffffa04ab03d
[  299.282364] 	do_one_initcall+0x47/0x230
[  299.282396] 	sys_init_module+0xa0f/0x1fe0
[  299.282429] INFO: Freed in scsi_host_dev_release+0x18a/0x1d0 age=85
cpu=0 pid=2008
[  299.282482] 	__slab_free+0x3c/0x2a1
[  299.282510] 	kfree+0x296/0x310
[  299.282536] 	scsi_host_dev_release+0x18a/0x1d0
[  299.282574] 	device_release+0x74/0x100
[  299.282606] 	kobject_release+0xc7/0x2a0
[  299.282637] 	kobject_put+0x54/0xa0
[  299.282668] 	put_device+0x27/0x40
[  299.282694] 	scsi_host_put+0x1d/0x30
[  299.282723] 	do_scan_async+0x1fc/0x2b0
[  299.282753] 	kthread+0xdf/0xf0
[  299.282782] 	kernel_thread_helper+0x4/0x10
[  299.282817] INFO: Slab 0xffffea00015b0c00 objects=7 used=7 fp=0x
      (null) flags=0x100000000004080
[  299.282882] INFO: Object 0xffff880056c30000 @offset=0 fp=0x          (null)
[  299.282884]
...

Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
Cc: stable@kernel.org
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-18 08:52:48 -06:00
Alan Stern 09b6b51b0b SCSI & usb-storage: add flags for VPD pages and REPORT LUNS
This patch (as1507) adds a skip_vpd_pages flag to struct scsi_device
and a no_report_luns flag to struct scsi_target.  The first is used to
control whether sd will look at VPD pages for information on block
provisioning, limits, and characteristics.  The second prevents
scsi_report_lun_scan() from issuing a REPORT LUNS command.

The patch also modifies usb-storage to set the new flag bits for all
USB devices and targets, and to stop adjusting the scsi_level value.

Historically we have seen that USB mass-storage devices often don't
support VPD pages or REPORT LUNS properly.  Until now we have avoided
these things by setting the scsi_level to SCSI_2 for all USB devices.
But this has the side effect of storing the LUN bits into the second
byte of each CDB, and now we have a report of a device which doesn't
like that.  The best solution is to stop abusing scsi_level and
instead have separate flags for VPD pages and REPORT LUNS.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Perry Wagle <wagle@mac.com>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-08 17:36:41 -08:00
Tejun Heo 09ac46c429 block: misc updates to blk_get_queue()
* blk_get_queue() is peculiar in that it returns 0 on success and 1 on
  failure instead of 0 / -errno or boolean.  Update it such that it
  returns %true on success and %false on failure.

* Make sure the caller checks for the return value.

* Separate out __blk_get_queue() which doesn't check whether @q is
  dead and put it in blk.h.  This will be used later.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-12-14 00:33:38 +01:00
James Bottomley 4e6c82b361 [SCSI] fix WARNING: at drivers/scsi/scsi_lib.c:1704
On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote:
> Hi all,
>
> Starting some time last week I am getting the following during boot on
> our PPC970 blade:
>
> calling  .ipr_init+0x0/0x68 @ 1
> ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
> ipr 0000:01:01.0: Found IOA with IRQ: 26
> ipr 0000:01:01.0: Starting IOA initialization sequence.
> ipr 0000:01:01.0: Adapter firmware version: 06160039
> ipr 0000:01:01.0: IOA initialized.
> scsi0 : IBM 572E Storage Adapter
> ------------[ cut here ]------------
> WARNING: at drivers/scsi/scsi_lib.c:1704
> Modules linked in:
> NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70
> REGS: c0000000783c2f60 TRAP: 0700   Not tainted  (3.1.0-autokern1)
> MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24002024  XER: 20000002
> TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0
> GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0
> GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000
> GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000
> GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800
> GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000
> GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001
> GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938
> GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0
> NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90
> LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
> Call Trace:
> [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable)
> [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
> [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40
> [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650
> [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100
> [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0
> [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390
> [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50
> [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150
> [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210
> [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110
> [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0
> [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40
> [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340
> [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0
> [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140
> [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68
> [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0
> [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc
> [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70
> Instruction dump:
> ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010
> f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000
> ---[ end trace 759bed76a85e8dec ]---
> scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
> ------------[ cut here ]------------
>
> I get lots more of these.  The obvious commit to point the finger at
> is 3308511c93 ("[SCSI] Make scsi_free_queue() kill pending SCSI
> commands") but the root cause may be something different.

Caused by

commit f7c9c6bb14
Author: Anton Blanchard <anton@samba.org>
Date:   Thu Nov 3 08:56:22 2011 +1100

    [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev

Doesn't completely do the teardown.  The true fix is to do a proper
teardown instead of hand rolling it

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: stable@kernel.org	#2.6.38+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-09 12:05:23 -06:00
Anton Blanchard f7c9c6bb14 [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev
When looking at memory consumption issues I noticed quite a
lot of memory in the kmalloc-2048 bucket:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  6561   6471  98%    2.30K    243       27     15552K kmalloc-2048

Over 15MB. slub debug shows that cfq is responsible for almost
all of it:

# sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls
6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13

In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc
fails we don't free it with scsi_free_queue.

The patch below fixes the issue:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
   135     72  53%    2.30K      5       27       320K kmalloc-2048

# cat /sys/kernel/slab/kmalloc-2048/alloc_calls
3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>		#2.6.38+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-03 11:19:50 +04:00
James Bottomley e73e079bf1 [SCSI] Fix oops caused by queue refcounting failure
In certain circumstances, we can get an oops from a torn down device.
Most notably this is from CD roms trying to call scsi_ioctl.  The root
cause of the problem is the fact that after scsi_remove_device() has
been called, the queue is fully torn down.  This is actually wrong
since the queue can be used until the sdev release function is called.
Therefore, we add an extra reference to the queue which is released in
sdev->release, so the queue always exists.

Reported-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-06-02 18:34:43 +09:00
Jens Axboe 9937a5e2f3 scsi: remove performance regression due to async queue run
Commit c21e6beb removed our queue request_fn re-enter
protection, and defaulted to always running the queues from
kblockd to be safe. This was a known potential slow down,
but should be safe.

Unfortunately this is causing big performance regressions for
some, so we need to improve this logic. Looking into the details
of the re-enter, the real issue is on requeue of requests.

Requeue of requests upon seeing a BUSY condition from the device
ends up re-running the queue, causing traces like this:

scsi_request_fn()
        scsi_dispatch_cmd()
                scsi_queue_insert()
                        __scsi_queue_insert()
                                scsi_run_queue()
					scsi_request_fn()
						...

potentially causing the issue we want to avoid. So special
case the requeue re-run of the queue, but improve it to offload
the entire run of local queue and starved queue from a single
workqueue callback. This is a lot better than potentially
kicking off a workqueue run for each device seen.

This also fixes the issue of the local device going into recursion,
since the above mentioned commit never moved that queue run out
of line.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-17 11:04:44 +02:00
Kay Sievers 39aba963d9 driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices
This patch removes the old CONFIG_SYSFS_DEPRECATED_V2 config option,
but it keeps the logic around to handle block devices in the old manner
as some people like to run new kernel versions on old (pre 2007/2008)
distros.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:16:43 -07:00
Alan Stern bc4f24014d [SCSI] implement runtime Power Management
This patch (as1398b) adds runtime PM support to the SCSI layer.  Only
the machanism is provided; use of it is up to the various high-level
drivers, and the patch doesn't change any of them.  Except for sg --
the patch expicitly prevents a device from being runtime-suspended
while its sg device file is open.

The implementation is simplistic.  In general, hosts and targets are
automatically suspended when all their children are asleep, but for
them the runtime-suspend code doesn't actually do anything.  (A host's
runtime PM status is propagated up the device tree, though, so a
runtime-PM-aware lower-level driver could power down the host adapter
hardware at the appropriate times.)  There are comments indicating
where a transport class might be notified or some other hooks added.

LUNs are runtime-suspended by calling the drivers' existing suspend
handlers (and likewise for runtime-resume).  Somewhat arbitrarily, the
implementation delays for 100 ms before suspending an eligible LUN.
This is because there typically are occasions during bootup when the
same device file is opened and closed several times in quick
succession.

The way this all works is that the SCSI core increments a device's
PM-usage count when it is registered.  If a high-level driver does
nothing then the device will not be eligible for runtime-suspend
because of the elevated usage count.  If a high-level driver wants to
use runtime PM then it can call scsi_autopm_put_device() in its probe
routine to decrement the usage count and scsi_autopm_get_device() in
its remove routine to restore the original count.

Hosts, targets, and LUNs are not suspended while they are being probed
or removed, or while the error handler is running.  In fact, a fairly
large part of the patch consists of code to make sure that things
aren't suspended at such times.

[jejb: fix up compile issues in PM config variations]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:07:50 -05:00
Linus Torvalds e2e2400bd4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
  [SCSI] fix race in scsi_target_reap
  [SCSI] aacraid: Eliminate use after free
  [SCSI] arcmsr: Support HW reset for EH and polling scheme for scsi device
  [SCSI] bfa: fix system crash when reading sysfs fc_host statistics
  [SCSI] iscsi_tcp: remove sk_sleep check
  [SCSI] ipr: improve interrupt service routine performance
  [SCSI] ipr: set the data list length in the request control block
  [SCSI] ipr: fix a register read to use the correct address for 64 bit adapters
  [SCSI] ipr: include the resource path in the IOA status area structure
  [SCSI] ipr: implement fixes for 64 bit adapter support
  [SCSI] be2iscsi: correct return value in mgmt_invalidate_icds()
2010-05-27 10:28:11 -07:00
Alan Stern f9e8894ae5 [SCSI] fix race in scsi_target_reap
This patch (as1357) fixes a race in SCSI target allocation and
release.  Putting a target in the STARGET_DEL state isn't protected by
the host lock, so an old target structure could be reused by a new
device even though it's about to be deleted.  The cure is to change
the state while still holding the host lock.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-25 11:00:56 -05:00
Randy Dunlap 9f6aa5750d scsi_scan.c: fix/convert functions to use kernel-doc
scsi_scan.c: fix incorrectly formatted kernel-doc notation
& convert documentation of 2 functions into kernel-doc.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-24 07:31:20 -07:00
Alan Stern 12fb8c1574 [SCSI] don't kfree an initialized struct device
This patch (as1359) fixes a bug in scsi_alloc_target().  After a
device structure has been initialized (and especially after its name
has been set), it must not be freed directly.  One has to call
put_device() instead.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-04-11 09:24:15 -05:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Linus Torvalds b1bf936840 Merge branch 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
  block: don't access jiffies when initialising io_context
  cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
  block: fix for "Consolidate phys_segment and hw_segment limits"
  cfq-iosched: quantum check tweak
  blktrace: perform cleanup after setup error
  blkdev: fix merge_bvec_fn return value checks
  cfq-iosched: requests "in flight" vs "in driver" clarification
  cciss: Fix problem with scatter gather elements in the scsi half of the driver
  cciss: eliminate unnecessary pointer use in cciss scsi code
  cciss: do not use void pointer for scsi hba data
  cciss: factor out scatter gather chain block mapping code
  cciss: fix scatter gather chain block dma direction kludge
  cciss: simplify scatter gather code
  cciss: factor out scatter gather chain block allocation and freeing
  cciss: detect bad alignment of scsi commands at build time
  cciss: clarify command list padding calculation
  cfq-iosched: rethink seeky detection for SSDs
  cfq-iosched: rework seeky detection
  block: remove padding from io_context on 64bit builds
  block: Consolidate phys_segment and hw_segment limits
  ...
2010-03-01 09:00:29 -08:00
Martin K. Petersen 086fa5ff08 block: Rename blk_queue_max_sectors to blk_queue_max_hw_sectors
The block layer calling convention is blk_queue_<limit name>.
blk_queue_max_sectors predates this practice, leading to some confusion.
Rename the function to appropriately reflect that its intended use is to
set max_hw_sectors.

Also introduce a temporary wrapper for backwards compability.  This can
be removed after the merge window is closed.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-26 13:58:08 +01:00
Alan Stern d5469119f0 [SCSI] fix refcounting bug in scsi_get_host_dev
This patch (as1334) fixes a bug in scsi_get_host_dev().  It
incorrectly calls get_device() on the new device's target.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-02-18 10:52:39 -06:00
Alan Stern 75f8ee8e01 [SCSI] fix memory leak in scsi_report_lun_scan
This patch (as1333) fixes a bug in scsi_report_lun_scan().  If a
newly-allocated device can't be used, it should be deleted.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-02-18 10:52:10 -06:00
Linus Torvalds 382f51fe2f Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (222 commits)
  [SCSI] zfcp: Remove flag ZFCP_STATUS_FSFREQ_TMFUNCNOTSUPP
  [SCSI] zfcp: Activate fc4s attributes for zfcp in FC transport class
  [SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED
  [SCSI] zfcp: Update FSF error reporting
  [SCSI] zfcp: Improve ELS ADISC handling
  [SCSI] zfcp: Simplify handling of ct and els requests
  [SCSI] zfcp: Remove ZFCP_DID_MASK
  [SCSI] zfcp: Move WKA port to zfcp FC code
  [SCSI] zfcp: Use common code definitions for FC CT structs
  [SCSI] zfcp: Use common code definitions for FC ELS structs
  [SCSI] zfcp: Update FCP protocol related code
  [SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport
  [SCSI] zfcp: Assign scheduled work to driver queue
  [SCSI] zfcp: Remove STATUS_COMMON_REMOVE flag as it is not required anymore
  [SCSI] zfcp: Implement module unloading
  [SCSI] zfcp: Merge trace code for fsf requests in one function
  [SCSI] zfcp: Access ports and units with container_of in sysfs code
  [SCSI] zfcp: Remove suspend callback
  [SCSI] zfcp: Remove global config_mutex
  [SCSI] zfcp: Replace local reference counting with common kref
  ...
2009-12-09 19:42:25 -08:00
Vasu Dev 4a84067dbf [SCSI] add queue_depth ramp up code
Current FC HBA queue_depth ramp up code depends on last queue
full time. The sdev already  has last_queue_full_time field to
track last queue full time but stored value is truncated by
last four bits.

So this patch updates last_queue_full_time without truncating
last 4 bits to store full value and then updates its only
current usages in scsi_track_queue_full to ignore last four bits
to keep current usages same while also use this field
in added ramp up code.

Adds scsi_handle_queue_ramp_up to ramp up queue_depth on
successful completion of IO. The scsi_handle_queue_ramp_up will
do ramp up on all luns of a target, just same as ramp down done
on all luns on a target.

The ramp up is skipped in case the change_queue_depth is not
supported by LLD or already reached to added max_queue_depth.

Updates added max_queue_depth on every new update to default
queue_depth value.

The ramp up is also skipped if lapsed time since either last
queue ramp up or down is less than LLD specified
queue_ramp_up_period.

Adds queue_ramp_up_period to sysfs but only if change_queue_depth
is supported since ramp up and queue_ramp_up_period is needed only
in case change_queue_depth is supported first.

Initializes queue_ramp_up_period to 120HZ jiffies as initial
default value, it is same as used in existing lpfc and qla2xxx.

-v2
 Combined all ramp code into this single patch.

-v3
 Moves max_queue_depth initialization after slave_configure is
called from after slave_alloc calling done. Also adjusted
max_queue_depth check to skip ramp up if current queue_depth
is >= max_queue_depth.

-v4
 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f
to store or show its value.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04 12:00:44 -06:00
James Bottomley 860dc73608 [SCSI] fix async scan add/remove race resulting in an oops
Async scanning introduced a very wide window where the SCSI device is
up and running but has not yet been added to sysfs.  We delay the
adding until all scans have completed to retain the same ordering as
sync scanning.

This delay in visibility causes an oops if a device is removed before
we make it visible because the SCSI removal routines have an inbuilt
assumption that if a device is in SDEV_RUNNING state, it must be
visible (which is not necessarily true in the async scanning case).

Fix this by introducing an additional is_visible flag which we can use
to condition the tear down so we do the right thing for running but
not yet made visible.

Reported-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-11-26 09:43:39 -06:00
James Bottomley 37e6ba0072 [SCSI] fix memory leak in initialization
The root cause of the problem is the fact that dev_set_name() now
allocates storage instead of using the original array within the kobj.
That means that the SCSI assumption that if you haven't made the
containing object or any sub objects visible, you can just destroy it
(and its component devices) lock stock and barrel becomes false.

Fix this by doing the get of sdev_dev at parent time and thus do an
extra put of it in scsi_destroy_sdev() (and all other destruction
without add paths).

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-13 11:33:45 -05:00
Alan Stern 14faf12f7d [SCSI] Increase default timeout for INQUIRY
This patch (as1224) changes the default timeout for INQUIRY commands
from 3 seconds to 20 seconds, which is the value used by Windows for
USB Mass-Storage devices.  Some of these devices, like the Corsair
Flash Voyager (see Bugzilla #12188) really do need a long timeout.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-20 17:21:13 -05:00
Edward Goggin c53a284f8b [SCSI] initialize max_target_blocked in scsi_alloc_target
This patch initializes the max_target_blocked field of a scsi target
structure so that a queuecommand return value of
SCSI_MLQUEUE_TARGET_BUSY will actually result in having the
scsi_queue_insert blocking the device queue before requeuing the
command and running the queue.  Otherwise, can and does cause livelock
on single CPU configurations if/when open-iSCSI software initiator's
command PDU window fills.

Signed-off-by: Ed Goggin <egoggin@vmware.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-14 17:17:46 -04:00
Arjan van de Ven d4d5291c8c driver synchronization: make scsi_wait_scan more advanced
There is currently only one way for userspace to say "wait for my storage
device to get ready for the modules I just loaded": to load the
scsi_wait_scan module. Expectations of userspace are that once this
module is loaded, all the (storage) devices for which the drivers
were loaded before the module load are present.

Now, there are some issues with the implementation, and the async
stuff got caught in the middle of this: The existing code only
waits for the scsy async probing to finish, but it did not take
into account at all that probing might not have begun yet.
(Russell ran into this problem on his computer and the fix works for him)

This patch fixes this more thoroughly than the previous "fix", which
had some bad side effects (namely, for kernel code that wanted to wait for
the scsi scan it would also do an async sync, which would deadlock if you did
it from async context already.. there's a report about that on lkml):
The patch makes the module first wait for all device driver probes, and then it
will wait for the scsi parallel scan to finish.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21 19:40:00 -07:00
Boaz Harrosh 82443a58d3 [SCSI] add OSD_TYPE
- Define the OSD_TYPE scsi device and let it show up in scans

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-12 12:58:05 -05:00
James Smart c2f9e49f9b [SCSI] scsi_scan: add missing interim SDEV_DEL state if slave_alloc fails
We were running i/o and performing a bunch of hba resets in a loop.
This forces a lot of target removes and then rescans. Since the
resets are occuring during scan it's causing the scan i/o to timeout,
invoking error recovery, etc.  We end up getting some nasty crashing
in scsi_scan.c due to references to old sdevs that are failing
but had some lingering references that kept them around.

Fix by setting device state to SDEV_DEL if the LLD's slave_alloc
fails.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10 11:15:17 -05:00
Linus Torvalds cd764695b6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits)
  [SCSI] qla2xxx: Update version number to 8.03.00-k1.
  [SCSI] qla2xxx: Add ISP81XX support.
  [SCSI] qla2xxx: Use proper request/response queues with MQ instantiations.
  [SCSI] qla2xxx: Correct MQ-chain information retrieval during a firmware dump.
  [SCSI] qla2xxx: Collapse EFT/FCE copy procedures during a firmware dump.
  [SCSI] qla2xxx: Don't pollute kernel logs with ZIO/RIO status messages.
  [SCSI] qla2xxx: Don't fallback to interrupt-polling during re-initialization with MSI-X enabled.
  [SCSI] qla2xxx: Remove support for reading/writing HW-event-log.
  [SCSI] cxgb3i: add missing include
  [SCSI] scsi_lib: fix DID_RESET status problems
  [SCSI] fc transport: restore missing dev_loss_tmo callback to LLDD
  [SCSI] aha152x_cs: Fix regression that keeps driver from using shared interrupts
  [SCSI] sd: Correctly handle 6-byte commands with DIX
  [SCSI] sd: DIF: Fix tagging on platforms with signed char
  [SCSI] sd: DIF: Show app tag on error
  [SCSI] Fix error handling for DIF/DIX
  [SCSI] scsi_lib: don't decrement busy counters when inserting commands
  [SCSI] libsas: fix test for negative unsigned and typos
  [SCSI] a2091, gvp11: kill warn_unused_result warnings
  [SCSI] fusion: Move a dereference below a NULL test
  ...

Fixed up trivial conflict due to moving the async part of sd_probe
around in the async probes vs using dev_set_name() in naming.
2009-01-08 16:27:31 -08:00
Arjan van de Ven 4ace92fc11 fastboot: make scsi probes asynchronous
This patch makes part of the scsi probe (which is mostly device spin up and the
partition scan) asynchronous. Only the part that runs after getting the device
number allocated is asynchronous, ensuring that device numbering remains stable.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-01-07 08:46:13 -08:00
Kay Sievers 71610f55fa [SCSI] struct device - replace bus_id with dev_name(), dev_set_name()
[jejb: limit ioctl to returning 20 characters to avoid overrun
       on long device names and add a few more conversions]
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-01-02 10:22:16 -06:00
FUJITA Tomonori 5cd3bbfad0 [SCSI] retry with missing data for INQUIRY
This patch changes scsi_probe_lun() to retry INQUIRY if the device has
not actually sent back any INQUIRY data,

This enables the Thecus N2050 storage device to work better. The
firmware on that device starts up strangely; it sends no data in
response to the initial INQUIRY, and it sends the INQUIRY information
in response to the followup REQUEST SENSE. But after that it works
better, so retrying the INQUIRY is enough to get it going.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:24 -06:00
FUJITA Tomonori f4f4e47e4a [SCSI] add residual argument to scsi_execute and scsi_execute_req
scsi_execute() and scsi_execute_req() discard the residual length
information. Some callers need it. This adds residual argument
(optional) to scsi_execute and scsi_execute_req.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:24 -06:00
Mike Christie f0c0a376d0 [SCSI] Add helper code so transport classes/driver can control queueing (v3)
SCSI-ml manages the queueing limits for the device and host, but
does not do so at the target level. However something something similar
can come in userful when a driver is transitioning a transport object to
the the blocked state, becuase at that time we do not want to queue
io and we do not want the queuecommand to be called again.

The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers.
You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
a transport level queueing issue like the hw cannot allocate some
resource at the iscsi session/connection level, or the target has temporarily
closed or shrunk the queueing window, or if we are transitioning
to the blocked state.

bnx2i, when they rework their firmware according to netdev
developers requests, will also need to be able to limit queueing at this
level. bnx2i will hook into libiscsi, but will allocate a scsi host per
netdevice/hba, so unlike pure software iscsi/iser which is allocating
a host per session, it cannot set the scsi_host->can_queue and return
SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.

The iscsi class/driver can also set a scsi_target->can_queue value which
reflects the max commands the driver/class can support. For iscsi this
reflects the number of commands we can support for each session due to
session/connection hw limits, driver limits, and to also reflect the
session/targets's queueing window.

Changes:
v1 - initial patch.
v2 - Fix scsi_run_queue handling of multiple blocked targets.
Previously we would break from the main loop if a device was added back on
the starved list. We now run over the list and check if any target is
blocked.
v3 - Rediff for scsi-misc.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13 09:28:46 -04:00
James Bottomley 6f4267e3bd [SCSI] Update the SCSI state model to allow blocking in the created state
Brian King <brking@linux.vnet.ibm.com> reported that fibre channel
devices can oops during scanning if their ports block (because the
device goes from CREATED -> BLOCK -> RUNNING rather than CREATED ->
BLOCK -> CREATED).

Fix this by adding a new state: CREATED_BLOCK which can only transition
back to CREATED and disallow the CREATED -> BLOCK transition.  Now both
the created and blocked states that the mid-layer recognises can include
CREATED_BLOCK.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 11:46:13 -05:00
James Bottomley 0f1d87a2ac [SCSI] add inline functions for recognising created and blocked states
The created and blocked states are very shortly going to correspond to
mixed sdev_state states.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 11:46:13 -05:00
James Bottomley 01b291bd66 [SCSI] fix check of PQ and PDT bits for WLUNs
For IBM z series certain LUNs can no longer be accessed.

This is because kernel version 2.6.19 a check was introduced not to
create a generic SCSI device for devices that return PQ=1 and
PDT=0x1f. For WLUNs (see SAM-3, p. 41ff) generic SCSI devices should
be created unconditionally without looking at the PQ bit, so add a
check for WLUNs in with this test.

Acked-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-08-29 09:19:11 -05:00
Harvey Harrison cadbd4a5e3 [SCSI] replace __FUNCTION__ with __func__
[jejb: fixed up a ton of missed conversions.

 All of you are on notice this has happened, driver trees will now
 need to be rebased]

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: SCSI List <linux-scsi@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-27 10:31:49 -04:00
Julia Lawall 773e82f6cd [SCSI] scsi_scan.c: Release mutex in error handling code
The mutex is released on a successful return, so it would seem that it
should be released on an error return as well.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression l;
@@

mutex_lock(l);
... when != mutex_unlock(l)
    when any
    when strict
(
if (...) { ... when != mutex_unlock(l)
+   mutex_unlock(l);
    return ...;
}
|
mutex_unlock(l);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:56 -04:00
Adrian Bunk 453cd0f3ff [SCSI] make struct scsi_{host,target}_type static
Make the needlessly global struct scsi_{host,target}_type static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:36 -05:00
Hirofumi Nakagawa 801678c5a3 Remove duplicated unlikely() in IS_ERR()
Some drivers have duplicated unlikely() macros.  IS_ERR() already has
unlikely() in itself.

This patch cleans up such pointless code.

Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:25 -07:00
James Bottomley 643eb2d932 [SCSI] rework scsi_target allocation
The current target allocation code registeres each possible target
with sysfs; it will be deleted again if no useable LUN on this target
was found. This results in a string of 'target add/target remove' uevents.

Based on a patch by Hannes Reinecke <hare@suse.de> this patch reworks
the target allocation code so that only uevents for existing targets
are sent. The sysfs registration is split off from the existing
scsi_target_alloc() into a in a new scsi_add_target() function, which
should be called whenever an existing target is found. Only then a
uevent is sent, so we'll be generating events for existing targets
only.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-22 15:16:31 -05:00
Hannes Reinecke b0ed43360f [SCSI] add scsi_host and scsi_target to scsi_bus
This patch implements scsi_host and scsi_target device types
and adds both to the scsi_bus.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-22 15:16:29 -05:00
Randy Dunlap e59e4a0972 docbook: fix scsi source file
Fix docbook problem in SCSI source files.
These cause the generated docbook to be incorrect.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-03 10:47:13 -08:00
James Bottomley d52b3815a5 [SCSI] add missing transport configure points for target and host
While trying to convert the SPI transport class to attribute groups, I
discovered that we don't actually have any transport configure points
for either the target or the host.  This patch adds these missing
transport class triggers.  The host one is simply done after the add,
the target one tries to be more clever and add it after devices may have
been placed on the target (so the device configure will have set up the
target parameters).

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23 11:29:17 -06:00
Tony Battersby 25d7c363f2 [SCSI] move single_lun flag from scsi_device to scsi_target
Some SCSI tape medium changers that need the BLIST_SINGLELUN flag have
the medium changer at one LUN and the tape drive at a different LUN.
The inquiry string of the tape drive may be different from that of the
medium changer.  In order for single_lun to be effective, every
scsi_device under a given scsi_target must have it set.  This means that
there needs to be a blacklist entry for BOTH the medium changer AND the
tape drive, which is impractical because some medium changers may be
paired with a variety of different tape drive models.  It makes more
sense to put the single_lun flag in scsi_target instead of scsi_device,
which causes every device at a given target ID to inherit the single_lun
flag from one LUN.  This makes it possible to blacklist just the medium
changer and not the tape drive.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:44 -06:00
Rob Landley eb44820c28 [SCSI] Add Documentation and integrate into docbook build
Add Documentation/DocBook/scsi_midlayer.tmpl, add to Makefile, and update
lots of kerneldoc comments in drivers/scsi/*.

Updated with comments from Stefan Richter, Stephen M. Cameron,
 James Bottomley and Randy Dunlap.

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:40 -06:00
Jeff Garzik a341cd0f6a SCSI: add asynchronous event notification API
Originally based on a patch by Kristen Carlson Accardi @ Intel.
Copious input from James Bottomley.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-11-03 22:23:02 -04:00
Matthew Wilcox a57b1fccdf [SCSI] scsi_scan: Cope with kthread_run failing
If kthread_run failed, we would fail to scan the host, and leak the
allocated async_scan_data.  Since using a separate thread is just an
optimisation, do the scan synchronously if we fail to spawn a thread.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-10-12 14:51:06 -04:00
Masatake YAMATO 10f4b89a0f [SCSI] Fix signness of parameters in scsi module
In scsi module I've found some inconsistency between variable type
used in module_param_named and type passed to module_param_named as an
argument. Especially the inconsistency of `max_scsi_luns' parameter is
a bit serious because the description text says "last scsi LUN (should
be between 1 and 2^32-1)".

Signed-off-by: Masatake YAMATO <jet@gyve.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-10-12 14:49:11 -04:00
Matthew Wilcox 6b7f123f37 [SCSI] Fix async scanning double-add problems
Stress-testing and some thought has revealed some places where
asynchronous scanning needs some more attention to locking.

 - Since async_scan is a bit, we need to hold the host_lock while
   modifying it to prevent races against other CPUs modifying the word
   that bit is in.  This is probably a theoretical race for the moment,
   but other patches may change that.
 - The async_scan bit means not only that this host is being scanned
   asynchronously, but that all the devices attached to this host are not
   yet added to sysfs.  So we must ensure that this bit is always in sync.
   I've chosen to do this with the scan_mutex since it's already acquired
   in most of the right places.
 - If the host changes state to deleted while we're in the middle of
   a scan, we'll end up with some devices on the host's list which must
   be deleted.  Add a check to scsi_sysfs_add_devices() to ensure the
   host is still running.
 - To avoid the async_scan bit being protected by three locks, the
   async_scan_lock now only protects the scanning_list.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-10-12 14:38:24 -04:00
Linus Torvalds bc06cffdec Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits)
  [SCSI] ibmvscsi: convert to use the data buffer accessors
  [SCSI] dc395x: convert to use the data buffer accessors
  [SCSI] ncr53c8xx: convert to use the data buffer accessors
  [SCSI] sym53c8xx: convert to use the data buffer accessors
  [SCSI] ppa: coding police and printk levels
  [SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc
  [SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c
  [SCSI] remove the dead CYBERSTORMIII_SCSI option
  [SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
  [SCSI] Clean up scsi_add_lun a bit
  [SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
  [SCSI] sni_53c710: Cleanup
  [SCSI] qla4xxx: Fix underrun/overrun conditions
  [SCSI] megaraid_mbox: use mutex instead of semaphore
  [SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
  [SCSI] qla2xxx: update version to 8.02.00-k1.
  [SCSI] qla2xxx: add support for NPIV
  [SCSI] stex: use resid for xfer len information
  [SCSI] Add Brownie 1200U3P to blacklist
  [SCSI] scsi.c: convert to use the data buffer accessors
  ...
2007-07-15 16:51:54 -07:00
Matthew Wilcox 6d877688ef [SCSI] Clean up scsi_add_lun a bit
This patch tidies up scsi_add_lun a bit.  I rewrote the kerneldoc to match
the actual parameters, moved the check for RBC and MMC REPORT_LUN devices
away from the switch(), changed the setup of sdev->type to account for
BLIST_ISROM, moved the check for BLIST_NO_ULD_ATTACH further down in
the function, removed a bogus comment and fixed some whitespace issues.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:13:13 -05:00
Christof Schmitt 462b7859a0 [SCSI] zfcp: Report FCP LUN to SCSI midlayer
When reporting SCSI devices to the SCSI midlayer, use the FCP LUN as
LUN reported to the SCSI layer. With this approach, zfcp does not have
to create unique LUNS, and this code can be removed.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-06-19 19:51:02 -07:00
Hugh Dickins f2f027c6e9 [SCSI] fix CONFIG_SCSI_WAIT_SCAN=m
CONFIG_MODULES=y
CONFIG_SCSI=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m

2.6.21-rc5-mm2 VFS panics unable to find my root on /dev/sda2, but boots
okay if I change drivers/scsi/Kconfig to "default y" instead of "default m"
for SCSI_WAIT_SCAN.

Make sure there's a late_initcall to scsi_complete_async_scans when it's
built in, so a monolithic SCSI_SCAN_ASYNC kernel can rely on the scans
being completed before trying to mount root, even if they're slow.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-05-30 09:16:38 -05:00
James Bottomley 0272bf7271 [SCSI] fix scsi_wait_scan build problem
The #ifdef MODULE around the export of scsi_complete_async_scans()
which is the API the scsi_wait_scan module uses is incorrect and
causes the symbol to be undefined in certain circumstances leading to
a build failure.  Remove the defines.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-03-21 08:15:41 -06:00
Linus Torvalds 5fc77247f7 Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
  [SCSI] SCSI core: better initialization for sdev->scsi_level
  [SCSI] scsi_proc.c: display sdev->scsi_level correctly
  [SCSI] megaraid_sas: update version and author info
  [SCSI] megaraid_sas: return sync cache call with success
  [SCSI] megaraid_sas: replace pci_alloc_consitent with dma_alloc_coherent in ioctl path
  [SCSI] megaraid_sas: add bios_param in scsi_host_template
  [SCSI] megaraid_sas: do not process cmds if hw_crit_error is set
  [SCSI] scsi_transport.h should include scsi_device.h
  [SCSI] aic79xx: remove extra newline from info message
  [SCSI] scsi_scan.c: handle bad inquiry responses
  [SCSI] aic94xx: tie driver to the major number of the sequencer firmware
  [SCSI] lpfc: add PCI error recovery support
  [SCSI] megaraid: pci_module_init to pci_register_driver
  [SCSI] tgt: fix the user/kernel ring buffer interface
  [SCSI] sgiwd93: interfacing to wd33c93
  [SCSI] wd33c93: Fast SCSI with WD33C93B
2007-02-19 13:32:28 -08:00
Robert P. J. Day 405ae7d381 Replace remaining references to "driverfs" with "sysfs".
Globally, s/driverfs/sysfs/g.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 19:13:42 +01:00
Alan Stern 7c9d6f16f5 [SCSI] SCSI core: better initialization for sdev->scsi_level
This patch will affect the CDB in INQUIRY commands sent to LUNs above 0 
when LUN-0 reports a scsi_level of 0; the LUN bits will no longer be set 
in the second byte of the CDB.  This is as it should be.  Nevertheless, 
it's possible that some wacky device might be adversely affected.  I doubt 
anyone will complain...

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-02-16 11:22:55 -06:00
Alan Stern e423ee31db [SCSI] scsi_scan.c: handle bad inquiry responses
A particular USB device has been reporting short inquiry lengths.  The
SCSI code cannot operate properly unless we get an inquiry length of
36 or above (because of the way we parse vendor and product), so
assume at least 36 bytes are valid even if the device reports fewer.
This is wrong, but it's no worse than what we're doing now (using the
garbage beyond the last reported valid byte).

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-02-16 10:13:01 -06:00
James Bottomley 81b7bbd193 Merge branch 'linus'
Conflicts:

	drivers/scsi/ipr.c

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-02-10 13:45:43 -06:00
James Bottomley 30716e07ef Merge branch 'linus' 2007-01-31 11:24:00 -06:00
Matthew Wilcox 938e2ac0b7 [SCSI] Fix scsi_add_device() for async scanning
I had thought that all drivers which didn't call scsi_scan_host()
called scsi_scan_target().  Some, such as sbp2, mptsas and libata-scsi,
call scsi_add_device() or __scsi_add_device().  We just need to wait
for the currently executing async scans to complete first.  This is the
same code that's in scsi_scan_target(), except that we have to return
an error instead of void when we're declining to scan at all.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-27 09:02:36 -06:00
Kurt Garloff 3424a65d71 [SCSI] scsi_scan message cosmetic error
Hi,

Minor typo ...
In my first iteration of patches (that got merged), the
BLIST_ATTACH_PQ3 actually had the value 0x800000, but that
got changed later to avoid conflicts. This piece must have
been overlooked.
You could obviously do something like %x and then add the
bitflags, but that looks overkill for something that does
not tend to change.

Please merge.
(Patch applied against latest 2.6.20rc version that I tested.)

From: Kurt Garloff <kurt@garloff.de>
Subject: [SCSI SCAN] Fix logging message for PQ3 devices

The blacklist flags BLIST_ATTACH_PQ3 has value 0x1000000,
not 0x800000.

Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 13:55:56 -06:00
James Bottomley ddaf6fc854 [SCSI] scsi_scan: fix report lun problems with CDROM or RBC devices
Apparently no ATAPI CD/DVD actually supports REPORT LUNS (in spite of
claiming scsi-3 compliance, where it's mandatory) and worse, some
crash or flake out on being sent the command.  This may actually be
due to a conflict between SPC and MMC with MMC not listing REPORT LUNS
as mandatory.  The same standards conflict exists for RBC as well.

Fix all of this by reversing the blacklists for CDROM and RBC devices
(i.e. now they have to have the BLIST_REPORTLUNS2 flag set even if the
inquiry data returns scsi-3 compliance).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-06 08:56:58 -06:00
Matthew Wilcox 8bcc24127b [SCSI] Add missing completion to scsi_complete_async_scans()
If either scsi_complete_async_scans() is called a second time
before the first call has finished, or a host scan is started while
scsi_complete_async_scans() is still sleeping, it would fail to wake up
the other task, which would sleep forever.

I've changed the kernel-doc to make it clear that
scsi_complete_async_scans() only guarantees that scans which started
before it was called are guaranteed to have finished when it returns.
I considered making it wait until all scans are completed, but it can't
guarantee that no more scans will start before it returns anyway, and it
runs the risk of confusing other callers of scsi_complete_async_scans()
for hosts actually scanning.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-03 16:57:31 -06:00
David Howells 4796b71fbb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/pcmcia/ds.c

Fix up merge failures with Linus's head and fix new compile failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-06 15:01:18 +00:00
Matthew Wilcox 1aa8fab2ac [SCSI] Make scsi_scan_host work for drivers which find their own targets
If a driver can find its own targets, it can now fill in scan_finished and
(optionally) scan_start in the scsi_host_template.  Then, when it calls
scsi_scan_host(), it will be called back (from a thread if asynchronous
discovery is enabled), first to start the scan, and then at intervals to
check if the scan is completed.

Also make scsi_prep_async_scan and scsi_finish_async_scan static.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-22 16:42:42 -06:00
Matthew Wilcox 93b45af5c6 [SCSI] fix missing check for no scanning
Drivers that called scsi_scan_target() instead of scsi_scan_host() were
still adding devices; this needs to be under the control of userspace,
not the driver.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-22 16:41:52 -06:00
Matthew Wilcox 21db1882f7 [SCSI] Add Kconfig option for asynchronous SCSI scanning
Without this patch, the user has to add a kernel command line parameter
to get asynchronous SCSI scanning.  Now they can select the default at
compile time and still override it at boot time if they need to.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-22 16:41:09 -06:00
James Bottomley 0bd2af4683 Merge ../scsi-rc-fixes-2.6 2006-11-22 12:06:44 -06:00
David Howells 65f27f3844 WorkStruct: Pass the work_struct pointer instead of context data
Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct.  This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function.  This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated..  This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems.  But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default.  Special
initiators exist for the non-auto-release case (ending in _NAR).


Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:55:48 +00:00
Alan Stern 09123d230a [PATCH] SCSI core: always store >= 36 bytes of INQUIRY data
This patch (as810c) copies a minimum of 36 bytes of INQUIRY data, even if
the device claims that not all of them are valid.  Often badly behaved
devices put plausible data in the Vendor, Product, and Revision strings but
set the Additional Length byte to a small value.  Using potentially valid
data is certainly better than allocating a short buffer and then reading
beyond the end of it, which is what we do now.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-13 07:40:43 -08:00
Matthew Wilcox 3e082a910d [SCSI] Add ability to scan scsi busses asynchronously
Since it often takes around 20-30 seconds to scan a scsi bus, it's
highly advantageous to do this in parallel with other things.  The bulk
of this patch is ensuring that devices don't change numbering, and that
all devices are discovered prior to trying to start init.  For those
who build SCSI as modules, there's a new scsi_wait_scan module that will
ensure all bus scans are finished.

This patch only handles drivers which call scsi_scan_host.  Fibre Channel,
SAS, SATA, USB and Firewire all need additional work.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-11 13:44:25 -05:00
James Bottomley 884d25cc4f [SCSI] Fix refcount breakage with 'echo "1" > scan' when target already present
Spotted by: Dan Aloni <da-xx@monatomic.org>

The problem is there's inconsistent locking semantic usage of
scsi_alloc_target().  Two callers assume the target comes back with
reference unincremented and the third assumes its incremented.  Fix by
always making the reference incremented on return.  Also fix path in
target alloc that could consistently increment the parent lock.
Finally document scsi_alloc_target() so its callers know what the
expectations are.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-07 10:08:43 -05:00
Alan Stern e5b3cd4296 [SCSI] SCSI: sanitize INQUIRY strings
Sanitize the Vendor, Product, and Revision strings contained in an
INQUIRY result by setting all non-graphic or non-ASCII characters to ' '.
Since the standard disallows such characters, this will affect
only non-compliant devices.

To help maintain backward compatibility, NUL characters are treated
specially.  They are taken as string terminators; they and all the
following characters are set to ' '.  If some valid characters get
erased as a result... well, we weren't seeing them before so we haven't
lost anything.

The primary purpose of this change is to allow blacklist entries to
match devices with illegal Vendor or Product strings.

In addition, the patch updates a couple of function prototypes, giving
inq_result its correct type (unsigned char *).

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-02 13:36:59 -05:00
dave wysochanski 84961f28e9 [SCSI] Don't add scsi_device for devices that return PQ=1, PDT=0x1f
Some targets may return slight variations of PQ and PDT to indicate
no LUN mapped.  USB UFI setting PDT=0x1f but having reserved bits for
PQ is one example, and NetApp targets returning PQ=1 and PDT=0x1f is
another.  Both instances seem like reasonable responses according to
SPC-3 and UFI specs.

The current scsi_probe_and_add_lun() code adds a scsi_device
for targets that return PQ=1 and PDT=0x1f.  This causes LUNs of type
"UNKNOWN" to show up in /proc/scsi/scsi when no LUNs are mapped.
In addition, subsequent rescans fail to recognize LUNs that may be
added on the target, unless preceded by a write to the delete attribute
of the "UNKNOWN" LUN.

This patch addresses this problem by skipping over the scsi_add_lun()
when PQ=1,PDT=0x1f is encountered, and just returns
SCSI_SCAN_TARGET_PRESENT.

Signed-off-by: Dave Wysochanski <davidw@netapp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-08-19 13:37:40 -07:00
James Bottomley 19ac0db3e2 [SCSI] fix up short inquiry printing
A recent drivers base commit:

3e95637a48

Caused the bus to be added to dev_printk, so now our SCSI inquiry short
messages print like this:

scsiscsi 2:0:0:0: Direct access     IBM-ESXS ST973401SS       B519 PQ: 0 ANSI: 5

Just remove the "scsi" from the sdev_printk to compensate.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-08-06 18:19:19 -05:00
Matthew Wilcox 4ff36718ed [SCSI] Improve inquiry printing
- Replace scsi_device_types array API with scsi_device_type function API.
   Gets rid of a lot of common code, as well as being easier to use.
 - Add the new device types in SPC4 r05a, and rename some of the older ones.
 - Reformat the printing of inquiry data; now fits on one line and
   includes PQ.

I think I've addressed all the feedback from the previous versions.  My
current test box prints:

scsi 2:0:1:0: Direct access     HP 18.2G ATLAS10K3_18_SCA HP05 PQ: 0 ANSI: 2

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-08-06 15:59:26 -05:00
James Bottomley c4e00fac42 Merge ../scsi-misc-2.6
Conflicts:

	drivers/scsi/nsp32.c
	drivers/scsi/pcmcia/nsp_cs.c

Removal of randomness flag conflicts with SA_ -> IRQF_ global
replacement.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-07-03 09:41:12 -05:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Brian King 309bd27121 [SCSI] scsi: Device scanning oops for offlined devices (resend)
If a device gets offlined as a result of the Inquiry sent
during scanning, the following oops can occur. After the
disk gets put into the SDEV_OFFLINE state, the error handler
sends back the failed inquiry, which wakes the thread doing
the scan. This starts a race between the scanning thread
freeing the scsi device and the error handler calling
scsi_run_host_queues to restart the host. Since the disk
is in the SDEV_OFFLINE state, scsi_device_get will still
work, which results in __scsi_iterate_devices getting
a reference to the scsi disk when it shouldn't.

The following execution thread causes the oops:

CPU 0 (scan)				CPU 1 (eh)

---------------------------------------------------------
scsi_probe_and_add_lun
                        ....
                                        scsi_eh_offline_sdevs
                                        scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
                                        scsi_restart_operations
                                         scsi_run_host_queues
                                          __scsi_iterate_devices
                                           get_device
scsi_device_dev_release_usercontext
                                          scsi_run_queue
                                            <---OOPS--->

The patch fixes this by changing the state of the sdev to SDEV_DEL
before doing the final put_device, which should prevent the race
from occurring.

Original oops follows:

Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
 Exception: 700 at .kref_get+0x10/0x28
    LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
    pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
    lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
    sp: c00000002f447cb0
   msr: 9000000000009032
   dar: 1b8
 dsisr: 40000000
  current = 0xc0000000045fecd0
  paca    = 0xc00000000048ee80
    pid   = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-28 12:39:56 -04:00
Christoph Hellwig beb4048750 [SCSI] remove scsi_request infrastructure
With Achim patch the last user (gdth) is switched away from scsi_request
so we an kill it now.  Also disables some code in i2o_scsi that was
broken since the sg driver stopped using scsi_requests.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-10 16:24:40 -05:00
Amit Arora 091686d3b5 [SCSI] Return -EINVAL when "id == max_id" in scsi_scan_host_selected()
The scsi_scan_host_selected() should return -EINVAL when the id is equal
to the max_id. Currently it uses ">" when comparing with max_id, and
hence leaves the border case when "id==max_id".
The channel and lun have values valid from 0 up to,
and including, max_channel or max_lun. But, the valid values for id
range from 0 to max_id-1. This patch fixes the problem.

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-05-28 13:01:23 -04:00