linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-21 08:19:28 +00:00

Author	SHA1	Message	Date
Christoph Hellwig	ee52423604	scsi: zero per-cmd driver data before each I/O Without this drivers that don't clear the state themselves can see off effects. For example Hyper-V VMs using the storvsc driver will often hang during boot due to uncleared Test Unit Ready failures. Fixes: `e9c787e6` ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Dexuan Cui <decui@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-21 12:51:54 -07:00
Linus Torvalds	772c8f6f3b	for-4.11/linus-merge-signed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJYqeb8AAoJEPfTWPspceCmB3UP/3UtcPrzEm8w2cxB9MaWhZN3 J+jiwlO4vaqhm2HVzQtoJqfaqRlud/iDx5cIXE2S7FnIM54ZKs3CANbKu8X+b1zm eJije3zMI8A8qyftigbz6a/Y2kWE4ZqFEc9WU5CWawfTl3ImCVUi8+F5X0wOLU/h r50zAQOEyURH4G5usNl9q0olF6FonJ82AcYm1iJ0QP2wYWZRJauC0rRn8IT93tyK bZPHnGKdkd7km8yi3zr2GNWOfuZZuA0HWAaF4qfrHPZQ883gITFAUIlFb1f+2TNl DkQzRrBB2wPWPnlbfb9KejMkvL94hflzsLb5rHt835DyVXFRyjxsgyAI8A+LPGSz vqZ3rsbWj6H4F9z2CkZ+T+AP/ZSWDNjwc0RXPm9HYdR5CDeTxIUVvnFQ44YNsmTv Xd5BKrUJ2oKegAxQG6zcuFx23p8JzhT70l+mNrMdtyeKnDD9FRdDvhKG9AHeTipn o/DnGivhS3UMQoQ7D68KOO+kuhLDeo7my5XGsnjzMO/iHqg++7IP2HyYYs/Ba4qZ cYaCtSDQW71Zt0vsqa6dvPuXBveu4h8Qh8R7uAGjSGS9IAFFb4Cab2tiUdISE6PE YnMWzY+G6pT8imlLVOL5/QFuo2Q4pUsaL0AHpXMCN9TZnQtbqXa8eqwnKnQ0m2KN 7ut0IYYEPaYUX5xFn1K6 =z7AL -----END PGP SIGNATURE----- Merge tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: - blk-mq scheduling framework from me and Omar, with a port of the deadline scheduler for this framework. A port of BFQ from Paolo is in the works, and should be ready for 4.12. - Various fixups and improvements to the above scheduling framework from Omar, Paolo, Bart, me, others. - Cleanup of the exported sysfs blk-mq data into debugfs, from Omar. This allows us to export more information that helps debug hangs or performance issues, without cluttering or abusing the sysfs API. - Fixes for the sbitmap code, the scalable bitmap code that was migrated from blk-mq, from Omar. - Removal of the BLOCK_PC support in struct request, and refactoring of carrying SCSI payloads in the block layer. This cleans up the code nicely, and enables us to kill the SCSI specific parts of struct request, shrinking it down nicely. From Christoph mainly, with help from Hannes. - Support for ranged discard requests and discard merging, also from Christoph. - Support for OPAL in the block layer, and for NVMe as well. Mainly from Scott Bauer, with fixes/updates from various others folks. - Error code fixup for gdrom from Christophe. - cciss pci irq allocation cleanup from Christoph. - Making the cdrom device operations read only, from Kees Cook. - Fixes for duplicate bdi registrations and bdi/queue life time problems from Jan and Dan. - Set of fixes and updates for lightnvm, from Matias and Javier. - A few fixes for nbd from Josef, using idr to name devices and a workqueue deadlock fix on receive. Also marks Josef as the current maintainer of nbd. - Fix from Josef, overwriting queue settings when the number of hardware queues is updated for a blk-mq device. - NVMe fix from Keith, ensuring that we don't repeatedly mark and IO aborted, if we didn't end up aborting it. - SG gap merging fix from Ming Lei for block. - Loop fix also from Ming, fixing a race and crash between setting loop status and IO. - Two block race fixes from Tahsin, fixing request list iteration and fixing a race between device registration and udev device add notifiations. - Double free fix from cgroup writeback, from Tejun. - Another double free fix in blkcg, from Hou Tao. - Partition overflow fix for EFI from Alden Tondettar. * tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block: (156 commits) nvme: Check for Security send/recv support before issuing commands. block/sed-opal: allocate struct opal_dev dynamically block/sed-opal: tone down not supported warnings block: don't defer flushes on blk-mq + scheduling blk-mq-sched: ask scheduler for work, if we failed dispatching leftovers blk-mq: don't special case flush inserts for blk-mq-sched blk-mq-sched: don't add flushes to the head of requeue queue blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or not block: do not allow updates through sysfs until registration completes lightnvm: set default lun range when no luns are specified lightnvm: fix off-by-one error on target initialization Maintainers: Modify SED list from nvme to block Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN uapi: sed-opal fix IOW for activate lsp to use correct struct cdrom: Make device operations read-only elevator: fix loading wrong elevator type for blk-mq devices cciss: switch to pci_irq_alloc_vectors block/loop: fix race between I/O and set_status blk-mq-sched: don't hold queue_lock when calling exit_icq block: set make_request_fn manually in blk_mq_update_nr_hw_queues ...	2017-02-21 10:57:33 -08:00
Johannes Thumshirn	fd3fc0b4d7	scsi: don't BUG_ON() empty DMA transfers Don't crash the machine just because of an empty transfer. Use WARN_ON() combined with returning an error. Found by Dmitry Vyukov and syzkaller. [ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON() might still be a cause of excessive log spamming. NOTE! If this warning ever triggers, we may end up leaking resources, since this doesn't bother to try to clean the command up. So this WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is much worse. People really need to stop using BUG_ON() for "this shouldn't ever happen". It makes pretty much any bug worse. - Linus ] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-19 09:49:15 -08:00
Christoph Hellwig	aebf526b53	block: fold cmd_type into the REQ_OP_ space Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-31 14:00:44 -07:00
Christoph Hellwig	57292b58dd	block: introduce blk_rq_is_passthrough This can be used to check for fs vs non-fs requests and basically removes all knowledge of BLOCK_PC specific from the block layer, as well as preparing for removing the cmd_type field in struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-31 14:00:34 -07:00
Christoph Hellwig	82ed4db499	block: split scsi_request out of struct request And require all drivers that want to support BLOCK_PC to allocate it as the first thing of their private data. To support this the legacy IDE and BSG code is switched to set cmd_size on their queues to let the block layer allocate the additional space. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-27 15:08:35 -07:00
Christoph Hellwig	e9c787e65c	scsi: allocate scsi_cmnd structures as part of struct request Rely on the new block layer functionality to allocate additional driver specific data behind struct request instead of implementing it in SCSI itѕelf. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-27 15:08:35 -07:00
Christoph Hellwig	d48777a633	scsi: remove __scsi_alloc_queue Instead do an internal export of __scsi_init_queue for the transport classes that export BSG nodes. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-27 15:08:35 -07:00
Christoph Hellwig	0a6ac4ee7c	scsi: respect unchecked_isa_dma for blk-mq Currently blk-mq always allocates the sense buffer using normal GFP_KERNEL allocation. Refactor the cmd pool code to split the cmd and sense allocation and share the code to allocate the sense buffers as well as the sense buffer slab caches between the legacy and blk-mq path. Note that this switches to lazy allocation of the sense slab caches - the slab caches (not the actual allocations) won't be destroy until the scsi module is unloaded instead of keeping track of hosts using them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-27 15:08:35 -07:00
Linus Torvalds	34241af77b	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - the virtio_blk stack DMA corruption fix from Christoph, fixing and issue with VMAP stacks. - O_DIRECT blkbits calculation fix from Chandan. - discard regression fix from Christoph. - queue init error handling fixes for nbd and virtio_blk, from Omar and Jeff. - two small nvme fixes, from Christoph and Guilherme. - rename of blk_queue_zone_size and bdev_zone_size to _sectors instead, to more closely follow what we do in other places in the block layer. This interface is new for this series, so let's get the naming right before releasing a kernel with this feature. From Damien. * 'for-linus' of git://git.kernel.dk/linux-block: block: don't try to discard from __blkdev_issue_zeroout sd: remove __data_len hack for WRITE SAME nvme: use blk_rq_payload_bytes scsi: use blk_rq_payload_bytes block: add blk_rq_payload_bytes block: Rename blk_queue_zone_size and bdev_zone_size nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too nvme-rdma: fix nvme_rdma_queue_is_ready virtio_blk: fix panic in initialization error path nbd: blk_mq_init_queue returns an error code on failure, not NULL virtio_blk: avoid DMA to stack for the sense buffer do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned	2017-01-14 17:07:04 -08:00
Christoph Hellwig	fd102b125e	scsi: use blk_rq_payload_bytes Without that we'll pass a wrong payload size in cmd->sdb, which can lead to hangs with drivers that need the total transfer size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Chris Valean <v-chvale@microsoft.com> Reported-by: Dexuan Cui <decui@microsoft.com> Fixes: `f9d03f96` ("block: improve handling of the magic discard payload") Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-01-13 15:17:04 -07:00
Bart Van Assche	7dbbf0fa1b	scsi: scsi-mq: Wait for .queue_rq() if necessary Ensure that if scsi-mq is enabled that scsi_internal_device_block() waits until ongoing shost->hostt->queuecommand() calls have finished. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-12-20 17:01:28 -05:00
Linus Torvalds	a829a8445f	SCSI misc on 20161213 This update includes the usual round of major driver updates (ncr5380, lpfc, hisi_sas, megaraid_sas, ufs, ibmvscsis, mpt3sas). There's also an assortment of minor fixes, mostly in error legs or other not very user visible stuff. The major change is the pci_alloc_irq_vectors replacement for the old pci_msix_.. calls; this effectively makes IRQ mapping generic for the drivers and allows blk_mq to use the information. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJYUJOvAAoJEAVr7HOZEZN42B0P/1lj1W2N7y0LOAbR2MasyQvT fMD/SSip/v+R/zJiTv+5M/IDQT6ez62JnQGWyX3HZTob9VEfoqagbUuHH6y+fmib tHyqiYc9FC/NZSRX/0ib+rpnSVhC/YRSVV7RrAqilbpOKAzeU25FlN/vbz+Nv/XL ReVBl+2nGjJtHyWqUN45Zuf74c0zXOWPPUD0hRaNclK5CfZv5wDYupzHzTNSQTkj nWvwPYT0OgSMNe7mR+IDFyOe3UQ/OYyeJB0yBNqO63IiaUabT8/hgrWR9qJAvWh8 LiH+iSQ69+sDUnvWvFjuls/GzcFuuTljwJbm+FyTsmNHONPVY8JRCLNq7CNDJ6Vx HwpNuJdTSJpne4lAVBGPwgjs+GhlMvUP/xYVLWAXdaBxU9XGePrwqQDcFu1Rbx3P yfMiVaY1+e45OEjLRCbDAwFnMPevC3kyymIvSsTySJxhTbYrOsyrrWt5kwWsvE3r SKANsub+xUnpCkyg57nXRQStJSCiSfGIDsydKmMX+pf1SR4k6gCUQZlcchUX0uOa dcY6re0c7EJIQQiT7qeGP5TRBblxARocCA/Igx6b5U5HmuQ48tDFlMCps7/TE84V JBsBnmkXcEi/ALShL/Tui+3YKA1DfOtEnwHtXx/9Ecx/nxP2Sjr9LJwCKiONv8NY RgLpGfccrix34lQumOk5 =sPXh -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (ncr5380, lpfc, hisi_sas, megaraid_sas, ufs, ibmvscsis, mpt3sas). There's also an assortment of minor fixes, mostly in error legs or other not very user visible stuff. The major change is the pci_alloc_irq_vectors replacement for the old pci_msix_.. calls; this effectively makes IRQ mapping generic for the drivers and allows blk_mq to use the information" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (256 commits) scsi: qla4xxx: switch to pci_alloc_irq_vectors scsi: hisi_sas: support deferred probe for v2 hw scsi: megaraid_sas: switch to pci_alloc_irq_vectors scsi: scsi_devinfo: remove synchronous ALUA for NETAPP devices scsi: be2iscsi: set errno on error path scsi: be2iscsi: set errno on error path scsi: hpsa: fallback to use legacy REPORT PHYS command scsi: scsi_dh_alua: Fix RCU annotations scsi: hpsa: use %phN for short hex dumps scsi: hisi_sas: fix free'ing in probe and remove scsi: isci: switch to pci_alloc_irq_vectors scsi: ipr: Fix runaway IRQs when falling back from MSI to LSI scsi: dpt_i2o: double free on error path scsi: cxlflash: Migrate scsi command pointer to AFU command scsi: cxlflash: Migrate IOARRIN specific routines to function pointers scsi: cxlflash: Cleanup queuecommand() scsi: cxlflash: Cleanup send_tmf() scsi: cxlflash: Remove AFU command lock scsi: cxlflash: Wait for active AFU commands to timeout upon tear down scsi: cxlflash: Remove private command pool ...	2016-12-14 10:49:33 -08:00
Christoph Hellwig	f9d03f96b9	block: improve handling of the magic discard payload Instead of allocating a single unused biovec for discard requests, send them down without any payload. Instead we allow the driver to add a "special" payload using a biovec embedded into struct request (unioned over other fields never used while in the driver), and overloading the number of segments for this case. This has a couple of advantages: - we don't have to allocate the bio_vec - the amount of special casing for discard requests in the block layer is significantly reduced - using this same scheme for other request types is trivial, which will be important for implementing the new WRITE_ZEROES op on devices where it actually requires a payload (e.g. SCSI) - we can get rid of playing games with the request length, as we'll never touch it and completions will work just fine - it will allow us to support ranged discard operations in the future by merging non-contiguous discard bios into a single request - last but not least it removes a lot of code This patch is the common base for my WIP series for ranges discards and to remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES, so it would be good to get it in quickly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-12-09 08:30:51 -07:00
Bart Van Assche	669f044170	scsi: srp_transport: Move queuecommand() wait code to SCSI core Additionally, rename srp_wait_for_queuecommand() into scsi_wait_for_queuecommand() and add a comment about the queuecommand() call from scsi_send_eh_cmnd(). Note: this patch changes scsi_internal_device_block from a function that did not sleep into a function that may sleep. This is fine for all callers of this function: * scsi_internal_device_block() is called from the mpt3sas device while that driver holds the ioc->dm_cmds.mutex. This means that the mpt3sas driver calls this function from thread context. * scsi_target_block() is called by __iscsi_block_session() from kernel thread context and with IRQs enabled. * The SRP transport code also calls scsi_target_block() from kernel thread context while sleeping is allowed. * The snic driver also calls scsi_target_block() from a context from which sleeping is allowed. The scsi_target_block() call namely occurs immediately after a scsi_flush_work() call. [mkp: s/shost/sdev/] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-29 11:21:27 -05:00
Omar Sandoval	2868f13c30	scsi_lib: untangle 0 and BLK_MQ_RQ_QUEUE_OK Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having specific values. No functional change. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-15 12:50:31 -07:00
Christoph Hellwig	2d9c5c20c9	scsi: allow LLDDs to expose the queue mapping to blk-mq Just hand through the blk-mq map_queues method in the host template. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-11-08 17:30:00 -05:00
Bart Van Assche	2b053aca76	blk-mq: Add a kick_requeue_list argument to blk_mq_requeue_request() Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls are followed by kicking the requeue list. Hence add an argument to these two functions that allows to kick the requeue list. This was proposed by Christoph Hellwig. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-02 12:50:19 -06:00
Bart Van Assche	52d7f1b5c2	blk-mq: Avoid that requeueing starts stopped queues Since blk_mq_requeue_work() starts stopped queues and since execution of this function can be scheduled after a queue has been stopped it is not possible to stop queues without using an additional state variable to track whether or not the queue has been stopped. Hence modify blk_mq_requeue_work() such that it does not start stopped queues. My conclusion after a review of the blk_mq_stop_hw_queues() and blk_mq_{delay_,}kick_requeue_list() callers is as follows: * In the dm driver starting and stopping queues should only happen if __dm_suspend() or __dm_resume() is called and not if the requeue list is processed. * In the SCSI core queue stopping and starting should only be performed by the scsi_internal_device_block() and scsi_internal_device_unblock() functions but not by any other function. Although the blk_mq_stop_hw_queue() call in scsi_queue_rq() may help to reduce CPU load if a LLD queue is full, figuring out whether or not a queue should be restarted when requeueing a command would require to introduce additional locking in scsi_mq_requeue_cmd() to avoid a race with scsi_internal_device_block(). Avoid this complexity by removing the blk_mq_stop_hw_queue() call from scsi_queue_rq(). * In the NVMe core only the functions that call blk_mq_start_stopped_hw_queues() explicitly should start stopped queues. * A blk_mq_start_stopped_hwqueues() call must be added in the xen-blkfront driver in its blkif_recover() function. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-02 12:50:19 -06:00
Christoph Hellwig	e806402130	block: split out request-only flags into a new namespace A lot of the REQ_* flags are only used on struct requests, and only of use to the block layer and a few drivers that dig into struct request internals. This patch adds a new req_flags_t rq_flags field to struct request for them, and thus dramatically shrinks the number of common requests. It also removes the unfortunate situation where we have to fit the fields from the same enum into 32 bits for struct bio and 64 bits for struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-10-28 08:45:17 -06:00
Christoph Hellwig	7d7e0f90b7	blk-mq: remove ->map_queue All drivers use the default, so provide an inline version of it. If we ever need other queue mapping we can add an optional method back, although supporting will also require major changes to the queue setup code. This provides better code generation, and better debugability as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-09-15 08:42:03 -06:00
James Bottomley	a621bac304	scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands When SCSI was written, all commands coming from the filesystem (REQ_TYPE_FS commands) had data. This meant that our signal for needing to complete the command was the number of bytes completed being equal to the number of bytes in the request. Unfortunately, with the advent of flush barriers, we can now get zero length REQ_TYPE_FS commands, which confuse this logic because they satisfy the condition every time. This means they never get retried even for retryable conditions, like UNIT ATTENTION because we complete them early assuming they're done. Fix this by special casing the early completion condition to recognise zero length commands with errors and let them drop through to the retry code. Cc: stable@vger.kernel.org Reported-by: Sebastian Parschauer <s.parschauer@gmx.de> Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Tested-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-05-22 14:52:45 -04:00
Hannes Reinecke	d230823a1c	scsi_lib: Decode T10 vendor IDs Some arrays / HBAs will only present T10 vendor IDs, so we should be decoding them, too. [mkp: Fixed T10 spelling] Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Hannes Reinecke <hare@suse.com> Tested-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-05-10 21:34:16 -04:00
Ming Lin	9b1d6c8950	lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c Now it's ready to move the mempool based SG chained allocator code from SCSI driver to lib/sg_pool.c, which will be compiled only based on a Kconfig symbol CONFIG_SG_POOL. SCSI selects CONFIG_SG_POOL. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:14 -04:00
Ming Lin	65e8617fba	scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS Rename SCSI_MAX_SG_SEGMENTS to SG_CHUNK_SIZE, which means the amount we fit into a single scatterlist chunk. Rename SCSI_MAX_SG_CHAIN_SEGMENTS to SG_MAX_SEGMENTS. Will move these 2 generic definitions to scatterlist.h later. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Bart Van Assche <bart.vanassche@sandisk.com> (for ib_srp changes) Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:14 -04:00
Ming Lin	001d63be61	scsi: rename SG related struct and functions Rename SCSI specific struct and functions to more genenic names. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sgi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:13 -04:00
Ming Lin	22cc3d4c6f	scsi: replace "mq" with "first_chunk" in SG functions Parameter "bool mq" is block driver specific. Change it to "first_chunk" to make it more generic. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:13 -04:00
Ming Lin	91dbc08d64	scsi: replace "scsi_data_buffer" with "sg_table" in SG functions Replace parameter "struct scsi_data_buffer" with "struct sg_table" in SG alloc/free functions to make them generic. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:12 -04:00
James Bottomley	a7dee8f45f	Merge branch 'fixes' into misc	2016-03-15 15:24:44 -07:00
jiangyiwen	e1cd391117	SCSI: Free resources when we return BLKPREP_INVALID When called scsi_prep_fn return BLKPREP_INVALID, we should use the same code with BLKPREP_KILL in scsi_prep_return. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-02-26 17:25:32 -05:00
Hannes Reinecke	d3d328919f	scsi_dh: add 'rescan' callback If a device needs to be rescanned the device_handler might need to be rechecked, too. So add a 'rescan' callback to the device handler and call it upon scsi_rescan_device(). The rescan callback will be invoked from the Unit Attention handling of ASC/ASCQ 3F 03 (INQUIRY DATA HAS CHANGED). Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-02-23 21:27:02 -05:00
Hannes Reinecke	a8aa397858	scsi: Add scsi_vpd_tpg_id() Implement scsi_vpd_tpg_id() to extract the target port group id and the relative port id from SCSI VPD page 0x83. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-02 16:43:11 -05:00
Hannes Reinecke	9983bed390	scsi: Add scsi_vpd_lun_id() Add a function scsi_vpd_lun_id() to return a unique device identifcation based on the designation descriptors of VPD page 0x83. As devices might implement several descriptors the order of preference is: - NAA IEE Registered Extended - EUI-64 based 16-byte - EUI-64 based 12-byte - NAA IEEE Registered - NAA IEEE Extended A SCSI name string descriptor is preferred to all of them if the identification is longer than 16 bytes. The returned unique device identification will be formatted as a SCSI Name string to avoid clashes between different designator types. [mkp: Fixed up kernel doc comment from Johannes] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-02 16:40:19 -05:00
Mel Gorman	71baba4b92	mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM __GFP_WAIT was used to signal that the caller was in atomic context and could not sleep. Now it is possible to distinguish between true atomic context and callers that are not willing to sleep. The latter should clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing __GFP_WAIT behaves differently, there is a risk that people will clear the wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly indicate what it does -- setting it allows all reclaim activity, clearing them prevents it. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
Christoph Hellwig	f4829a9b7a	blk-mq: fix racy updates of rq->errors blk_mq_complete_request may be a no-op if the request has already been completed by others means (e.g. a timeout or cancellation), but currently drivers have to set rq->errors before calling blk_mq_complete_request, which might leave us with the wrong error value. Add an error parameter to blk_mq_complete_request so that we can defer setting rq->errors until we known we won the race to complete the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-10-01 10:10:55 +02:00
Christoph Hellwig	ee14c674e8	scsi_dh: kill struct scsi_dh_data Add a ->handler and a ->handler_data field to struct scsi_device and kill this indirection. Also move struct scsi_device_handler to scsi_dh.h so that changes to it don't require rebuilding every SCSI LLDD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-08-28 13:14:57 -07:00
Hannes Reinecke	14c3e677df	scsi: Add ALUA state change UA handling Log the ALUA state change unit attention correctly with the message log and emit an event to allow user-space tools to react to it. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-08-26 07:19:57 -07:00
Hannes Reinecke	0ae80ba91f	scsi: retry MODE SENSE on unit attention The 'sd' driver is calling scsi_mode_sense() to figure out internal details. But scsi_mode_sense() never checks for any pending unit attentions, so we're getting annoying error messages like: MODE SENSE: unimplemented page/subpage: 0x00/0x00 and a possible wrong decision for device cache handling. Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-07-30 12:37:20 -07:00
Tony Battersby	0c958ecc69	scsi: fix memory leak with scsi-mq Fix a memory leak with scsi-mq triggered by commands with large data transfer length. __sg_alloc_table() sets both table->nents and table->orig_nents to the same value. When the scatterlist is DMA-mapped, table->nents is overwritten with the (possibly smaller) size of the DMA-mapped scatterlist, while table->orig_nents retains the original size of the allocated scatterlist. scsi_free_sgtable() should therefore check orig_nents instead of nents, and all code that initializes sdb->table without calling __sg_alloc_table() should set both nents and orig_nents. Fixes: `d285203cf6` ("scsi: add support for a blk-mq based I/O path.") Cc: <stable@vger.kernel.org> # 3.17+ Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-07-30 10:40:33 -07:00
Bart Van Assche	bba0bdd7ad	Defer processing of REQ_PREEMPT requests for blocked devices SCSI transport drivers and SCSI LLDs block a SCSI device if the transport layer is not operational. This means that in this state no requests should be processed, even if the REQ_PREEMPT flag has been set. This patch avoids that a rescan shortly after a cable pull sporadically triggers the following kernel oops: BUG: unable to handle kernel paging request at ffffc9001a6bc084 IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib] Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100) Call Trace: [<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp] [<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp] [<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod] [<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod] [<ffffffff81223b37>] __blk_run_queue+0x27/0x30 [<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110 [<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0 [<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod] [<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod] [<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod] [<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod] [<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod] [<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod] [<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod] [<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod] [<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160 [<ffffffff811589de>] vfs_write+0xce/0x140 [<ffffffff81158b53>] sys_write+0x53/0xa0 [<ffffffff81464592>] system_call_fastpath+0x16/0x1b [<00007f611c9d9300>] 0x7f611c9d92ff Reported-by: Max Gurtuvoy <maxg@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-04-08 09:41:41 -07:00
Linus Torvalds	3e12cefbe1	Merge branch 'for-3.20/core' of git://git.kernel.dk/linux-block Pull core block IO changes from Jens Axboe: "This contains: - A series from Christoph that cleans up and refactors various parts of the REQ_BLOCK_PC handling. Contributions in that series from Dongsu Park and Kent Overstreet as well. - CFQ: - A bug fix for cfq for realtime IO scheduling from Jeff Moyer. - A stable patch fixing a potential crash in CFQ in OOM situations. From Konstantin Khlebnikov. - blk-mq: - Add support for tag allocation policies, from Shaohua. This is a prep patch enabling libata (and other SCSI parts) to use the blk-mq tagging, instead of rolling their own. - Various little tweaks from Keith and Mike, in preparation for DM blk-mq support. - Minor little fixes or tweaks from me. - A double free error fix from Tony Battersby. - The partition 4k issue fixes from Matthew and Boaz. - Add support for zero+unprovision for blkdev_issue_zeroout() from Martin" * 'for-3.20/core' of git://git.kernel.dk/linux-block: (27 commits) block: remove unused function blk_bio_map_sg block: handle the null_mapped flag correctly in blk_rq_map_user_iov blk-mq: fix double-free in error path block: prevent request-to-request merging with gaps if not allowed blk-mq: make blk_mq_run_queues() static dm: fix multipath regression due to initializing wrong request cfq-iosched: handle failure of cfq group allocation block: Quiesce zeroout wrapper block: rewrite and split __bio_copy_iov() block: merge __bio_map_user_iov into bio_map_user_iov block: merge __bio_map_kern into bio_map_kern block: pass iov_iter to the BLOCK_PC mapping functions block: add a helper to free bio bounce buffer pages block: use blk_rq_map_user_iov to implement blk_rq_map_user block: simplify bio_map_kern block: mark blk-mq devices as stackable block: keep established cmd_flags when cloning into a blk-mq request block: add blk-mq support to blk_insert_cloned_request() block: require blk_rq_prep_clone() be given an initialized clone request blk-mq: add tag allocation policy ...	2015-02-12 14:13:23 -08:00
Shaohua Li	24391c0dc5	blk-mq: add tag allocation policy This is the blk-mq part to support tag allocation policy. The default allocation policy isn't changed (though it's not a strict FIFO). The new policy is round-robin for libata. But it's a try-best implementation. If multiple tasks are competing, the tags returned will be mixed (which is unavoidable even with !mq, as requests from different tasks can be mixed in queue) Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-01-23 14:18:00 -07:00
Ewan D. Milne	91724c2061	scsi: Avoid crashing if device uses DIX but adapter does not support it This can happen if a multipathed device uses DIX and another path is added via an adapter that does not support it. Multipath should not allow this path to be added, but we should not depend upon that to avoid crashing. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2015-01-20 09:04:21 +01:00
Christoph Hellwig	70a0f2c189	scsi: ->queue_rq can't sleep The blk-mq ->queue_rq method is always called from process context, but might have preemption disabled. This means we still always have to use GFP_ATOMIC for memory allocations, and thus need to revert part of commit `3c356bde1` ("scsi: stop passing a gfp_mask argument down the command setup path"). Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>	2015-01-09 15:43:01 +01:00
Tony Battersby	120bb3e1e3	scsi: fix random memory corruption with scsi-mq + T10 PI This fixes random memory corruption triggered when all three of the following are true: * scsi-mq enabled * T10 Protection Information (DIF) enabled * SCSI host with sg_tablesize > SCSI_MAX_SG_SEGMENTS (128) The symptoms of this bug are unpredictable memory corruption, BUG()s, oopses, lockups, etc., any of which may appear to be completely unrelated to the root cause. Cc: <stable@vger.kernel.org> # 3.17.x, 3.18.x Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-12-15 10:31:33 +01:00
Linus Torvalds	caf292ae5b	Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-block Pull block driver core update from Jens Axboe: "This is the pull request for the core block IO changes for 3.19. Not a huge round this time, mostly lots of little good fixes: - Fix a bug in sysfs blktrace interface causing a NULL pointer dereference, when enabled/disabled through that API. From Arianna Avanzini. - Various updates/fixes/improvements for blk-mq: - A set of updates from Bart, mostly fixing buts in the tag handling. - Cleanup/code consolidation from Christoph. - Extend queue_rq API to be able to handle batching issues of IO requests. NVMe will utilize this shortly. From me. - A few tag and request handling updates from me. - Cleanup of the preempt handling for running queues from Paolo. - Prevent running of unmapped hardware queues from Ming Lei. - Move the kdump memory limiting check to be in the correct location, from Shaohua. - Initialize all software queues at init time from Takashi. This prevents a kobject warning when CPUs are brought online that weren't online when a queue was registered. - Single writeback fix for I_DIRTY clearing from Tejun. Queued with the core IO changes, since it's just a single fix. - Version X of the __bio_add_page() segment addition retry from Maurizio. Hope the Xth time is the charm. - Documentation fixup for IO scheduler merging from Jan. - Introduce (and use) generic IO stat accounting helpers for non-rq drivers, from Gu Zheng. - Kill off artificial limiting of max sectors in a request from Christoph" * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits) bio: modify __bio_add_page() to accept pages that don't start a new segment blk-mq: Fix uninitialized kobject at CPU hotplugging blktrace: don't let the sysfs interface remove trace from running list blk-mq: Use all available hardware queues blk-mq: Micro-optimize bt_get() blk-mq: Fix a race between bt_clear_tag() and bt_get() blk-mq: Avoid that __bt_get_word() wraps multiple times blk-mq: Fix a use-after-free blk-mq: prevent unmapped hw queue from being scheduled blk-mq: re-check for available tags after running the hardware queue blk-mq: fix hang in bt_get() blk-mq: move the kdump check to blk_mq_alloc_tag_set blk-mq: cleanup tag free handling blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map blk: introduce generic io stat accounting help function blk-mq: handle the single queue case in blk_mq_hctx_next_cpu genhd: check for int overflow in disk_expand_part_tbl() blk-mq: add blk_mq_free_hctx_request() blk-mq: export blk_mq_free_request() blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable ...	2014-12-13 14:14:23 -08:00
Christoph Hellwig	82042a2cdb	scsi: move scsi_dispatch_cmd to scsi_lib.c scsi_lib.c is where the rest of the I/O submission path lives, so move scsi_dispatch_cmd there and mark it static. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 19:57:03 +01:00
Christoph Hellwig	3c356bde19	scsi: stop passing a gfp_mask argument down the command setup path There is no reason for ULDs to pass in a flag on how to allocate the S/G lists. While we don't need GFP_ATOMIC for the blk-mq case because we don't hold locks, that decision can be made way down the chain without having to pass a pointless gfp_mask argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 19:56:55 +01:00
Christoph Hellwig	bb3ec62a17	scsi: remove scsi_next_command There's only one caller left, so inline it and reduce the blk-mq vs !blk-mq diff a litte bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 19:56:47 +01:00
Christoph Hellwig	125c99bc8b	scsi: add new scsi-command flag for tagged commands Currently scsi piggy backs on the block layer to define the concept of a tagged command. But we want to be able to have block-level host-wide tags assigned even for untagged commands like the initial INQUIRY, so add a new SCSI-level flag for commands that are tagged at the scsi level, so that even commands without that set can have tags assigned to them. Note that this alredy is the case for the blk-mq code path, and this just lets the old path catch up with it. We also set this flag based upon sdev->simple_tags instead of the block queue flag, so that it is entirely independent of the block layer tagging, and thus always correct even if a driver doesn't use block level tagging yet. Also remove the old blk_rq_tagged; it was only used by SCSI drivers, and removing it forces them to look for the proper replacement. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-12 11:19:40 +01:00
Bart Van Assche	efec4b90f1	scsi: add support for multiple hardware queues Allow a SCSI LLD to declare how many hardware queues it supports by setting Scsi_Host.nr_hw_queues before calling scsi_add_host(). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:16:09 +01:00
Hannes Reinecke	f1569ff1d5	scsi: ratelimit I/O error messages There can be quite a lot of I/O error messages, even on smaller machines. So we need to ratelimit them to not overwhelm logging. Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Robert Elliott <elliott@hp.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:16:08 +01:00
Hannes Reinecke	c11c004b1c	scsi: simplify scsi_log_(send\|completion) Simplify scsi_log_(send\|completion) by externalizing scsi_mlreturn_string() and always print the command address. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:16:05 +01:00
Hannes Reinecke	4753cbc0a1	scsi: use 'bool' as return value for scsi_normalize_sense() Convert scsi_normalize_sense() and friends to return 'bool' instead of an integer. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Reviewed-by: Yoshihiro Yunomae <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:16:01 +01:00
Hannes Reinecke	d811b848eb	scsi: use sdev as argument for sense code printing We should be using the standard dev_printk() variants for sense code printing. [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen] [hch: folded bracing fix from Dan Carpenter] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:15:58 +01:00
Mark Rustad	037e6d8654	scsi: resolve some missing-field-initializers warnings Resolve some missing-field-initializers warnings by using designated initialization. [hch: W=2 with modern gcc warns about this. Pretty pointless to me, but I'd prefer to keep us warning free] Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:15:53 +01:00
Jens Axboe	74c450521d	blk-mq: add a 'list' parameter to ->queue_rq() Since we have the notion of a 'last' request in a chain, we can use this to have the hardware optimize the issuing of requests. Add a list_head parameter to queue_rq that the driver can use to temporarily store hw commands for issue when 'last' is true. If we are doing a chain of requests, pass in a NULL list for the first request to force issue of that immediately, then batch the remainder for deferred issue until the last request has been sent. Instead of adding yet another argument to the hot ->queue_rq path, encapsulate the passed arguments in a blk_mq_queue_data structure. This is passed as a constant, and has been tested as faster than passing 4 (or even 3) args through ->queue_rq. Update drivers for the new ->queue_rq() prototype. There are no functional changes in this patch for drivers - if they don't use the passed in list, then they will just queue requests individually like before. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-10-29 11:14:52 -06:00
Christoph Hellwig	b1dd2aac4c	scsi: set REQ_QUEUE for the blk-mq case To generate the right SPI tag messages we need to properly set QUEUE_FLAG_QUEUED in the request_queue and mirror it to the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Cc: stable@vger.kernel.org	2014-10-28 09:53:43 +01:00
Linus Torvalds	d3dc366bba	Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block Pull core block layer changes from Jens Axboe: "This is the core block IO pull request for 3.18. Apart from the new and improved flush machinery for blk-mq, this is all mostly bug fixes and cleanups. - blk-mq timeout updates and fixes from Christoph. - Removal of REQ_END, also from Christoph. We pass it through the ->queue_rq() hook for blk-mq instead, freeing up one of the request bits. The space was overly tight on 32-bit, so Martin also killed REQ_KERNEL since it's no longer used. - blk integrity updates and fixes from Martin and Gu Zheng. - Update to the flush machinery for blk-mq from Ming Lei. Now we have a per hardware context flush request, which both cleans up the code should scale better for flush intensive workloads on blk-mq. - Improve the error printing, from Rob Elliott. - Backing device improvements and cleanups from Tejun. - Fixup of a misplaced rq_complete() tracepoint from Hannes. - Make blk_get_request() return error pointers, fixing up issues where we NULL deref when a device goes bad or missing. From Joe Lawrence. - Prep work for drastically reducing the memory consumption of dm devices from Junichi Nomura. This allows creating clone bio sets without preallocating a lot of memory. - Fix a blk-mq hang on certain combinations of queue depths and hardware queues from me. - Limit memory consumption for blk-mq devices for crash dump scenarios and drivers that use crazy high depths (certain SCSI shared tag setups). We now just use a single queue and limited depth for that" * 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits) block: Remove REQ_KERNEL blk-mq: allocate cpumask on the home node bio-integrity: remove the needless fail handle of bip_slab creating block: include func name in __get_request prints block: make blk_update_request print prefix match ratelimited prefix blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio block: fix alignment_offset math that assumes io_min is a power-of-2 blk-mq: Make bt_clear_tag() easier to read blk-mq: fix potential hang if rolling wakeup depth is too high block: add bioset_create_nobvec() block: use bio_clone_fast() in blk_rq_prep_clone() block: misplaced rq_complete tracepoint sd: Honor block layer integrity handling flags block: Replace strnicmp with strncasecmp block: Add T10 Protection Information functions block: Don't merge requests if integrity flags differ block: Integrity checksum flag block: Relocate bio integrity flags block: Add a disk flag to block integrity profile block: Add prefix to block integrity profile flags ...	2014-10-18 11:53:51 -07:00
Linus Torvalds	9a50aaefc1	SCSI for-linus on 20141007 This patch set consists of the usual driver updates (megaraid_sas, arcmsr, be2iscsi, lpfc, mpt2sas, mpt3sas, qla2xxx, ufs) plus several assorted fixes and miscellaneous updates (including the pci_msix_enable_range() changes that have been pending for a while). Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJUNHvtAAoJEDeqqVYsXL0MzaoIAJ/R2JW/Xm50rD6iUj6RfjEc 6OOi3sOJe6yivPFLLTmIZyLcgHuZKPVXsjcjBXENsrjJeyu2aTq+vs2bOJN9BRYU gHGyAEhVPKsvecYhEj/78ClRIMzwkr7KQMQConbClDa0sVr62M/dPQVHNvjaTDeS rtmPGZbNpv9rCl0itNBLMrnOBT/MduuWtS2VNCAkV5yFU8kvEax5pizB+W4ztjoe BnVnF8OJC70wAM4vpiUcgwCR5AGmYv5SQKn3AHNBayrJic0MLUSIhrnCptc9TSir zWJAyoW2iQY1LKmihjwjDlXP40jbfOaBacEycqTUKNkfMRKbBl3qQa4IUrR1XsQ= =I1Yg -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This patch set consists of the usual driver updates (megaraid_sas, arcmsr, be2iscsi, lpfc, mpt2sas, mpt3sas, qla2xxx, ufs) plus several assorted fixes and miscellaneous updates (including the pci_msix_enable_range() changes that have been pending for a while)" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (202 commits) scsi: add a CONFIG_SCSI_MQ_DEFAULT option ufs: definitions for phy interface ufs: tune bkops while power managment events ufs: Add support for clock scaling using devfreq framework ufs: Add freq-table-hz property for UFS device ufs: Add support for clock gating ufs: refactor configuring power mode ufs: add UFS power management support ufs: introduce well known logical unit in ufs ufs: manually add well known logical units ufs: Active Power Mode - configuring bActiveICCLevel ufs: improve init sequence ufs: refactor query descriptor API support ufs: add voting support for host controller power ufs: Add clock initialization support ufs: Add regulator enable support ufs: Allow vendor specific initialization scsi: don't add scsi_device if its already visible scsi: fix the type for well known LUs scsi: fix comment in struct Scsi_Host definition ...	2014-10-07 21:29:18 -04:00
Christoph Hellwig	fe052529e4	scsi: move blk_mq_start_request call earlier Some ATA drivers need the dma drain size workaround, and thus need to call blk_mq_start_request before the S/G mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-22 12:00:08 -06:00
Christoph Hellwig	0152fb6b57	blk-mq: pass a reserved argument to the timeout handler Allow blk-mq to pass an argument to the timeout handler to indicate if we're timing out a reserved or regular command. For many drivers those need to be handled different. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-22 12:00:07 -06:00
Christoph Hellwig	c8a446ad69	blk-mq: rename blk_mq_end_io to blk_mq_end_request Now that we've changed the driver API on the submission side use the opportunity to fix up the name on the completion side to fit into the general scheme. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-22 12:00:07 -06:00
Christoph Hellwig	e2490073cd	blk-mq: call blk_mq_start_request from ->queue_rq When we call blk_mq_start_request from the core blk-mq code before calling into ->queue_rq there is a racy window where the timeout handler can hit before we've fully set up the driver specific part of the command. Move the call to blk_mq_start_request into the driver so the driver can start the request only once it is fully set up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-22 12:00:07 -06:00
Christoph Hellwig	bf57229745	blk-mq: remove REQ_END Pass an explicit parameter for the last request in a batch to ->queue_rq instead of using a request flag. Besides being a cleaner and non-stateful interface this is also required for the next patch, which fixes the blk-mq I/O submission code to not start a time too early. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-22 12:00:07 -06:00
Daniel Gryniewicz	f81426a84b	[SCSI] fix for bidi use after free When ending a bi-directionional SCSI request, blk_finish_request() cleans up and frees the request, but scsi_release_bidi_buffers() tries to indirect through the request to find it's data buffers. This causes a panic due to a null pointer dereference. Move the call to scsi_release_bidi_buffers() before the call to blk_finish_request(). Signed-off-by: Daniel Gryniewicz <dang@linuxbox.com> Reviewed-by: Webb Scales <webbnh@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-09-19 13:23:33 +01:00
Kashyap.Desai@avagotech.com	64bdcbc449	scsi: add use_cmd_list flag Add a use_cmd_list flag in struct Scsi_Host to request keeping track of all outstanding commands per device. Default behaviour is not to keep track of cmd_list per sdev, as this may introduce lock contention. (overhead is more on multi-node NUMA.), and only enable it on the two drivers that need it. Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-09-15 16:01:58 -07:00
Jens Axboe	b207892b06	Merge branch 'for-linus' into for-3.18/core A bit of churn on the for-linus side that would be nice to have in the core bits for 3.18, so pull it in to catch us up and make forward progress easier. Signed-off-by: Jens Axboe <axboe@fb.com> Conflicts: block/scsi_ioctl.c	2014-09-11 09:31:18 -06:00
Linus Torvalds	522a15db95	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "A smaller collection of fixes that have come up since the initial merge window pull request. This contains: - error handling cleanup and support for larger than 16 byte cdbs in sg_io() from Christoph. The latter just matches what bsg and friends support, sg_io() got left out in the merge. - an option for brd to expose partitions in /proc/partitions. They are hidden by default for compat reasons. From Dmitry Monakhov. - a few blk-mq fixes from me - killing a dead/unused flag, fix for merging happening even if turned off, and correction of a few comments. - removal of unnecessary ->owner setting in systemace. From Michal Simek. - two related fixes for a problem with nesting freezing of queues in blk-mq. One from Ming Lei removing an unecessary freeze operation, and another from Tejun fixing the nesting regression introduced in the merge window. - fix for a BUG_ON() at bio_endio time when protection info is attached and the IO has an error. From Sagi Grimberg. - two scsi_ioctl bug fixes for regressions with scsi-mq from Tony Battersby. - a cfq weight update fix and subsequent comment update from Toshiaki Makita" * 'for-linus' of git://git.kernel.dk/linux-block: cfq-iosched: Add comments on update timing of weight cfq-iosched: Fix wrong children_weight calculation block: fix error handling in sg_io fix regression in SCSI_IOCTL_SEND_COMMAND scsi-mq: fix requests that use a separate CDB buffer block: support > 16 byte CDBs for SG_IO block: cleanup error handling in sg_io brd: add ram disk visibility option block: systemace: Remove .owner field for driver blk-mq: blk_mq_freeze_queue() should allow nesting blk-mq: correct a few wrong/bad comments block: Fix BUG_ON when pi errors occur blk-mq: don't allow merges if turned off for the queue blk-mq: get rid of unused BLK_MQ_F_SHOULD_SORT flag blk-mq: fix WARNING "percpu_ref_kill() called more than once!"	2014-08-29 11:21:49 -07:00
Joe Lawrence	a492f07545	block,scsi: fixup blk_get_request dead queue scenarios The blk_get_request function may fail in low-memory conditions or during device removal (even if __GFP_WAIT is set). To distinguish between these errors, modify the blk_get_request call stack to return the appropriate ERR_PTR. Verify that all callers check the return status and consider IS_ERR instead of a simple NULL pointer check. For consistency, make a similar change to the blk_mq_alloc_request leg of blk_get_request. It may fail if the queue is dead, or the caller was unwilling to wait. Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd] Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd] Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-08-28 10:03:46 -06:00
Tony Battersby	6f4a16266f	scsi-mq: fix requests that use a separate CDB buffer This patch fixes code such as the following with scsi-mq enabled: rq = blk_get_request(...); blk_rq_set_block_pc(rq); rq->cmd = my_cmd_buffer; /* separate CDB buffer */ blk_execute_rq_nowait(...); Code like this appears in e.g. sg_start_req() in drivers/scsi/sg.c (for large CDBs only). Without this patch, scsi_mq_prep_fn() will set rq->cmd back to rq->__cmd, causing the wrong CDB to be sent to the device. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-08-22 15:04:31 -05:00
Guenter Roeck	480cadc2b7	scsi: Fix qemu boot hang problem The latest kernel fails to boot qemu arm images when using scsi for disk access. Boot gets stuck after the following messages. brd: module loaded sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103) sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 93 sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking sym0: SCSI BUS has been reset. scsi host0: sym-2.2.3 Bisect points to commit `71e75c97f9` ("scsi: convert device_busy to atomic_t"). Code inspection shows the following suspicious change in scsi_request_fn. out_delay: - if (sdev->device_busy == 0 && !scsi_device_blocked(sdev)) + if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } 'sdev->device_busy == 0' was replaced with 'atomic_read(&sdev->device_busy)', meaning the logic was reversed. Changing this expression to '!atomic_read(&sdev->device_busy)' fixes the problem. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Jens Axboe <axboe@fb.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Reviewed-by: Webb Scales <webbnh@hp.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-19 12:42:26 -05:00
Christoph Hellwig	d285203cf6	scsi: add support for a blk-mq based I/O path. This patch adds support for an alternate I/O path in the scsi midlayer which uses the blk-mq infrastructure instead of the legacy request code. Use of blk-mq is fully transparent to drivers, although for now a host template field is provided to opt out of blk-mq usage in case any unforseen incompatibilities arise. In general replacing the legacy request code with blk-mq is a simple and mostly mechanical transformation. The biggest exception is the new code that deals with the fact the I/O submissions in blk-mq must happen from process context, which slightly complicates the I/O completion handler. The second biggest differences is that blk-mq is build around the concept of preallocated requests that also include driver specific data, which in SCSI context means the scsi_cmnd structure. This completely avoids dynamic memory allocations for the fast path through I/O submission. Due the preallocated requests the MQ code path exclusively uses the host-wide shared tag allocator instead of a per-LUN one. This only affects drivers actually using the block layer provided tag allocator instead of their own. Unlike the old path blk-mq always provides a tag, although drivers don't have to use it. For now the blk-mq path is disable by defauly and must be enabled using the "use_blk_mq" module parameter. Once the remaining work in the block layer to make blk-mq more suitable for slow devices is complete I hope to make it the default and eventually even remove the old code path. Based on the earlier scsi-mq prototype by Nicholas Bellinger. Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and various sugestions and code contributions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:16:28 -04:00
Christoph Hellwig	c53c6d6a68	scatterlist: allow chaining to preallocated chunks Blk-mq drivers usually preallocate their S/G list as part of the request, but if we want to support the very large S/G lists currently supported by the SCSI code that would tie up a lot of memory in the preallocated request pool. Add support to the scatterlist code so that it can initialize a S/G list that uses a preallocated first chunks and dynamically allocated additional chunks. That way the scsi-mq code can preallocate a first page worth of S/G entries as part of the request, and dynamically extend the S/G list when needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:16:21 -04:00
Christoph Hellwig	f6d47e74fc	scsi: unwind blk_end_request_all and blk_end_request_err calls Replace the calls to the various blk_end_request variants with opencode equivalents. Blk-mq is using a model that gives the driver control between the bio updates and the actual completion, and making the old code follow that same model allows us to keep the code more similar for both paths. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:16:13 -04:00
Christoph Hellwig	2ccbb00808	scsi: only maintain target_blocked if the driver has a target queue limit This saves us an atomic operation for each I/O submission and completion for the usual case where the driver doesn't set a per-target can_queue value. Only a few iscsi hardware offload drivers set the per-target can_queue value at the moment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:16:05 -04:00
Christoph Hellwig	cd9070c9c5	scsi: fix the {host,target,device}_blocked counter mess Seems like these counters are missing any sort of synchronization for updates, as a over 10 year old comment from me noted. Fix this by using atomic counters, and while we're at it also make sure they are in the same cacheline as the _busy counters and not needlessly stored to in every I/O completion. With the new model the _busy counters can temporarily go negative, so all the readers are updated to check for > 0 values. Longer term every successful I/O completion will reset the counters to zero, so the temporarily negative values will not cause any harm. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:15:48 -04:00
Christoph Hellwig	71e75c97f9	scsi: convert device_busy to atomic_t Avoid taking the queue_lock to check the per-device queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Unlike the host and target busy counters this doesn't allow us to avoid the queue_lock in the request_fn due to the way the interface works, but it'll allow us to prepare for using the blk-mq code, which doesn't use the queue_lock at all, and it at least avoids a queue_lock round trip in scsi_device_unbusy, which is still important given how busy the queue_lock is. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:43:45 -04:00
Christoph Hellwig	7466501608	scsi: convert host_busy to atomic_t Avoid taking the host-wide host_lock to check the per-host queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:43:43 -04:00
Christoph Hellwig	7ae65c0f96	scsi: convert target_busy to an atomic_t Avoid taking the host-wide host_lock to check the per-target queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:39:00 -04:00
Christoph Hellwig	cf68d334dd	scsi: push host_lock down into scsi_{host,target}_queue_ready Prepare for not taking a host-wide lock in the dispatch path by pushing the lock down into the places that actually need it. Note that this patch is just a preparation step, as it will actually increase lock roundtrips and thus decrease performance on its own. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:38:53 -04:00
Christoph Hellwig	3b5382c459	scsi: set ->scsi_done before calling scsi_dispatch_cmd The blk-mq code path will set this to a different function, so make the code simpler by setting it up in a legacy-request specific place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:38:48 -04:00
Christoph Hellwig	d0d3bbf96e	scsi: centralize command re-queueing in scsi_dispatch_fn Make sure we only have the logic for requeing commands in one place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:38:41 -04:00
Christoph Hellwig	de3e8bf331	scsi: split __scsi_queue_insert Factor out a helper to set the _blocked values, which we'll reuse for the blk-mq code path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:38:35 -04:00
Christoph Hellwig	6af7a4ffa2	scsi: add scsi_setup_cmnd helper Factor out command setup code that will be shared with the blk-mq code path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com>	2014-07-25 07:38:08 -04:00
Christoph Hellwig	4f1e576575	scsi: mark scsi_setup_blk_pc_cmnd static Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:16:29 +02:00
Christoph Hellwig	5158a899d8	scsi: set sc_data_direction in common code The data direction fiel in the SCSI command is derived only from the block request structure. Move setting it up into common code instead of duplicating it in the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:11:41 +02:00
Christoph Hellwig	3868cf8ea7	scsi: restructure command initialization for TYPE_FS requests We should call the device handler prep_fn for all TYPE_FS requests, not just simple read/write calls that are handled by the disk driver. Restructure the common I/O code to call the prep_fn handler and zero out the CDB, and just leave the call to scsi_init_io to the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:11:27 +02:00
Christoph Hellwig	635d98b1d0	scsi: move the nr_phys_segments assert into scsi_init_io scsi_init_io should only be called for requests that transfer data, so move the assert that a request has segments from the callers into scsi_init_io. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:10:53 +02:00
Maurizio Lombardi	e6c11dbb8d	scsi_lib: remove the description string in scsi_io_completion() During IO with fabric faults, one generally sees several "Unhandled error code" messages in the syslog as shown below: sd 4:0:6:2: [sdbw] Unhandled error code sd 4:0:6:2: [sdbw] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 4:0:6:2: [sdbw] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 end_request: I/O error, dev sdbw, sector 0 This comes from scsi_io_completion (in scsi_lib.c) while handling error codes other than DID_RESET or not deferred sense keys i.e. this is actually handled by the SCSI mid layer. But what gets displayed here is "Unhandled error code" which is quite misleading as it indicates something that is not addressed by the mid layer. The description string is based on the sense key and sometimes on the additional sense code; since the ACTION_FAIL case always prints the sense key and the additional sense code, this patch removes the description string completely because it does not add useful information. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:46 +02:00
Christoph Hellwig	f1bea55d5a	scsi: remove various exports that were only used by scsi_tgt Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:07:45 +02:00
Hannes Reinecke	91921e016a	scsi: use dev_printk variants where possible Using dev_printk variants prefixes the logging message with the originating device, which makes debugging easier. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:42 +02:00
James Bottomley	89fb4cd1f7	scsi: handle flush errors properly Flush commands don't transfer data and thus need to be special cased in the I/O completion handler so that we can propagate errors to the block layer and filesystem. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Reported-by: Steven Haber <steven@qumulo.com> Tested-by: Steven Haber <steven@qumulo.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 21:56:34 +02:00
Linus Torvalds	23d4ed53b7	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "Final small batch of fixes to be included before -rc1. Some general cleanups in here as well, but some of the blk-mq fixes we need for the NVMe conversion and/or scsi-mq. The pull request contains: - Support for not merging across a specified "chunk size", if set by the driver. Some NVMe devices perform poorly for IO that crosses such a chunk, so we need to support it generically as part of request merging avoid having to do complicated split logic. From me. - Bump max tag depth to 10Ki tags. Some scsi devices have a huge shared tag space. Before we failed with EINVAL if a too large tag depth was specified, now we truncate it and pass back the actual value. From me. - Various blk-mq rq init fixes from me and others. - A fix for enter on a dying queue for blk-mq from Keith. This is needed to prevent oopsing on hot device removal. - Fixup for blk-mq timer addition from Ming Lei. - Small round of performance fixes for mtip32xx from Sam Bradshaw. - Minor stack leak fix from Rickard Strandqvist. - Two __init annotations from Fabian Frederick" * 'for-linus' of git://git.kernel.dk/linux-block: block: add __init to blkcg_policy_register block: add __init to elv_register block: ensure that bio_add_page() always accepts a page for an empty bio blk-mq: add timer in blk_mq_start_request blk-mq: always initialize request->start_time block: blk-exec.c: Cleaning up local variable address returnd mtip32xx: minor performance enhancements blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init() blk-mq: don't allow queue entering for a dying queue blk-mq: bump max tag depth to 10K tags block: add blk_rq_set_block_pc() block: add notion of a chunk size for request merging	2014-06-11 08:41:17 -07:00
Linus Torvalds	1c54fc1efe	SCSI for-linus on 20140609 This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJTlcwiAAoJEDeqqVYsXL0M5qsIALzVPLd4yxA16zCiaPQUeIV5 mfYmwISFlN+qW3AcUeSH4D13YgegCjEBfqaDMWvIkgouxLy/7jpxtChutq3MCzUE cDT1B9+ZrzoqBISRNHEh/gx5F1MOF2VPuqG2pe0J90wyRCNzJscB6PbtWMAo86CA 2eu7wq3K9FXxCC1qY0PzwBLXHqUcgk5GWiK9CM/k4W0NiTVeNmwPeh5i91IQnBHx E2l7NAXgNLyCf5tyeswvZ4pW0T3hlaswNmBB4qC8oJm4U6UqMN+tk4ML63Pz7uPe 4mlHG0uI8Vbdi13iv1EDUZ9Vo8iqVrzP2UAhakgP9poKSGqE4d/MD0EKNGQB2Es= =cBY8 -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (130 commits) include/scsi/osd_protocol.h: remove unnecessary __constant mvsas: Recognise device/subsystem 9485/9485 as 88SE9485 Revert "be2iscsi: Fix processing cqe for cxn whose endpoint is freed" mptfusion: fix msgContext in mptctl_hp_hostinfo acornscsi: remove linked command support scsi/NCR5380: dprintk macro fusion: Remove use of DEF_SCSI_QCMD fusion: Add free msg frames to the head, not tail of list mpt2sas: Add free smids to the head, not tail of list mpt2sas: Remove use of DEF_SCSI_QCMD mpt2sas: Remove uses of serial_number mpt3sas: Remove use of DEF_SCSI_QCMD mpt3sas: Remove uses of serial_number qla2xxx: Use kmemdup instead of kmalloc + memcpy qla4xxx: Use kmemdup instead of kmalloc + memcpy qla2xxx: fix incorrect debug printk be2iscsi: Bump the driver version be2iscsi: Fix processing cqe for cxn whose endpoint is freed be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed be2iscsi: Fix memory corruption in MBX path ...	2014-06-09 18:54:06 -07:00
Jens Axboe	f27b087b81	block: add blk_rq_set_block_pc() With the optimizations around not clearing the full request at alloc time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC up to the user allocating the request. Add a blk_rq_set_block_pc() that sets the command type to REQ_TYPE_BLOCK_PC, and properly initializes the members associated with this type of request. Update callers to use this function instead of manipulating rq->cmd_type directly. Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed attempt. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-06-06 07:57:37 -06:00
Linus Torvalds	681a289548	Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next Pull block core updates from Jens Axboe: "It's a big(ish) round this time, lots of development effort has gone into blk-mq in the last 3 months. Generally we're heading to where 3.16 will be a feature complete and performant blk-mq. scsi-mq is progressing nicely and will hopefully be in 3.17. A nvme port is in progress, and the Micron pci-e flash driver, mtip32xx, is converted and will be sent in with the driver pull request for 3.16. This pull request contains: - Lots of prep and support patches for scsi-mq have been integrated. All from Christoph. - API and code cleanups for blk-mq from Christoph. - Lots of good corner case and error handling cleanup fixes for blk-mq from Ming Lei. - A flew of blk-mq updates from me: * Provide strict mappings so that the driver can rely on the CPU to queue mapping. This enables optimizations in the driver. * Provided a bitmap tagging instead of percpu_ida, which never really worked well for blk-mq. percpu_ida relies on the fact that we have a lot more tags available than we really need, it fails miserably for cases where we exhaust (or are close to exhausting) the tag space. * Provide sane support for shared tag maps, as utilized by scsi-mq * Various fixes for IO timeouts. * API cleanups, and lots of perf tweaks and optimizations. - Remove 'buffer' from struct request. This is ancient code, from when requests were always virtually mapped. Kill it, to reclaim some space in struct request. From me. - Remove 'magic' from blk_plug. Since we store these on the stack and since we've never caught any actual bugs with this, lets just get rid of it. From me. - Only call part_in_flight() once for IO completion, as includes two atomic reads. Hopefully we'll get a better implementation soon, as the part IO stats are now one of the more expensive parts of doing IO on blk-mq. From me. - File migration of block code from {mm,fs}/ to block/. This includes bio.c, bio-integrity.c, bounce.c, and ioprio.c. From me, from a discussion on lkml. That should describe the meat of the pull request. Also has various little fixes and cleanups from Dave Jones, Shaohua Li, Duan Jiong, Fengguang Wu, Fabian Frederick, Randy Dunlap, Robert Elliott, and Sam Bradshaw" * 'for-3.16/core' of git://git.kernel.dk/linux-block: (100 commits) blk-mq: push IPI or local end_io decision to __blk_mq_complete_request() blk-mq: remember to start timeout handler for direct queue block: ensure that the timer is always added blk-mq: blk_mq_unregister_hctx() can be static blk-mq: make the sysfs mq/ layout reflect current mappings blk-mq: blk_mq_tag_to_rq should handle flush request block: remove dead code in scsi_ioctl:blk_verify_command blk-mq: request initialization optimizations block: add queue flag for disabling SG merging block: remove 'magic' from struct blk_plug blk-mq: remove alloc_hctx and free_hctx methods blk-mq: add file comments and update copyright notices blk-mq: remove blk_mq_alloc_request_pinned blk-mq: do not use blk_mq_alloc_request_pinned in blk_mq_map_request blk-mq: remove blk_mq_wait_for_tags blk-mq: initialize request in __blk_mq_alloc_request blk-mq: merge blk_mq_alloc_reserved_request into blk_mq_alloc_request blk-mq: add helper to insert requests from irq context blk-mq: remove stale comment for blk_mq_complete_request() blk-mq: allow non-softirq completions ...	2014-06-02 09:29:34 -07:00
Christoph Hellwig	a1b73fc194	scsi: reintroduce scsi_driver.init_command Instead of letting the ULD play games with the prep_fn move back to the model of a central prep_fn with a callback to the ULD. This already cleans up and shortens the code by itself, and will be required to properly support blk-mq in the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-05-19 12:35:09 +02:00
Christoph Hellwig	bc85dc500f	scsi: remove scsi_end_request By folding scsi_end_request into its only caller we can significantly clean up the completion logic. We can use simple goto labels now to only have a single place to finish or requeue command there instead of the previous convoluted logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-05-19 12:33:41 +02:00
Christoph Hellwig	c682adf3e1	scsi: explicitly release bidi buffers Instead of trying to guess when we have a BIDI buffer in scsi_release_buffers add a function to explicitly free the BIDI ressoures in the one place that handles them. This avoids needing a special __scsi_release_buffers for the case where we already have freed the request as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>	2014-05-19 12:33:41 +02:00
Alan Stern	644373a421	[SCSI] Fix command result state propagation We're seeing a case where the contents of scmd->result isn't being reset after a SCSI command encounters an error, is resubmitted, times out and then gets handled. The error handler acts on the stale result of the previous error instead of the timeout. Fix this by properly zeroing the scmd->status before the command is resubmitted. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-04-21 14:27:26 -07:00
Christoph Hellwig	68c03d9193	[SCSI] don't reference freed command in scsi_prep_return Patch commit `0479633686` Author: Christoph Hellwig <hch@infradead.org> Date: Thu Feb 20 14:20:55 2014 -0800 [SCSI] do not manipulate device reference counts in scsi_get/put_command Introduced a use after free:I in the kill case of scsi_prep_return we have to release our device reference, but we do this trying to reference the just freed command. Use the local sdev pointer instead. Fixes: `0479633686` Reported-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-04-21 07:57:22 -07:00
Christoph Hellwig	5e012aad85	[SCSI] don't reference freed command in scsi_init_sgtable Patch commit `0479633686` Author: Christoph Hellwig <hch@infradead.org> Date: Thu Feb 20 14:20:55 2014 -0800 [SCSI] do not manipulate device reference counts in scsi_get/put_command Introduced a use after free: when scsi_init_io fails we have to release our device reference, but we do this trying to reference the just freed command. Add a local scsi_device pointer to fix this. Fixes: `0479633686` Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-04-21 07:57:21 -07:00
Jens Axboe	b4f42e2831	block: remove struct request buffer member This was used in the olden days, back when onions were proper yellow. Basically it mapped to the current buffer to be transferred. With highmem being added more than a decade ago, most drivers map pages out of a bio, and rq->buffer isn't pointing at anything valid. Convert old style drivers to just use bio_data(). For the discard payload use case, just reference the page in the bio. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-15 14:03:02 -06:00
Jens Axboe	f89e0dd9d1	Linux 3.15-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTSv89AAoJEHm+PkMAQRiGd7AIAKL45dFRhX96W53uzGlpXtv7 Ecs9CMY5mtsB/rqtV/NSQaUELlNb4Ilb4lITh7NaLWAZxDJ12GwVsIbFoaBj7Ypx tfBNNxffGGnTTn2C1PpPQmLytLBvqXIVHMaPpDvnHYJl6g9BihshLzyMrsrizqnA DPJ0xMdgGp6BLC4qm0ZH1mS2q+TyXLN2ZCjJ1lp6NNYwBnwOe/ABjnUa0Ztu7aii 837N8h6nEuKHTr6DgrYHEgYVc3DPHPHaly/UJ8v4U30myzRv83YkD5DsBSUjSLj8 CzxJEnECa1XljLJK7SjRHy5pSP9tcUlmjx3Fk7IpQixmT+A15cov6fQ967jNlDw= =Hxnc -----END PGP SIGNATURE----- Merge tag 'v3.15-rc1' into for-3.16/core We don't like this, but things have diverged with the blk-mq fixes in 3.15-rc1. So merge it in.	2014-04-15 14:02:24 -06:00
Martin K. Petersen	2bfad21ecc	scsi: Make sure cmd_flags are 64-bit cmd_flags in struct request is now 64 bits wide but the scsi_execute functions truncated arguments passed to int leading to errors. Make sure the flags parameters are u64. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Jens Axboe <axboe@fb.com> CC: Jan Kara <jack@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-09 20:26:20 -06:00
Jens Axboe	59c3d45e48	block: remove 'q' parameter from kblockd_schedule_*_work() The queue parameter is never used, just get rid of it. Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-09 10:17:00 -06:00
Christoph Hellwig	134997a041	[SCSI] remove a useless get/put_device pair in scsi_requeue_command Avoid a spurious device get/put pair by cleaning up scsi_requeue_command and folding scsi_unprep_request into it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:25 -07:00
Bart Van Assche	27e9e0f12a	[SCSI] remove a useless get/put_device pair in scsi_next_command Eliminate a get_device() / put_device() pair from scsi_next_command(). Both are atomic operations hence removing these slightly improves performance. [hch: slight changes due to different context] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:25 -07:00
Bart Van Assche	613be1f626	[SCSI] remove a useless get/put_device pair in scsi_request_fn SCSI devices may only be removed by calling scsi_remove_device(). That function must invoke blk_cleanup_queue() before the final put of sdev->sdev_gendev. Since blk_cleanup_queue() waits for the block queue to drain and then tears it down, scsi_request_fn cannot be active anymore after blk_cleanup_queue() has returned and hence the get_device()/put_device() pair in scsi_request_fn is unnecessary. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:25 -07:00
Christoph Hellwig	0479633686	[SCSI] do not manipulate device reference counts in scsi_get/put_command Many callers won't need this and we can optimize them away. In addition the handling in the __-prefixed variants was inconsistant to start with. Based on an earlier patch from Bart Van Assche. [jejb: fix kerneldoc probelm picked up by Fengguang Wu] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:24 -07:00
Christoph Hellwig	21a05df547	[SCSI] avoid taking host_lock in scsi_run_queue unless nessecary If we don't have starved devices we don't need to take the host lock to iterate over them. Also split the function up to be more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:24 -07:00
Eiichi Tsukata	ee60b2c52e	[SCSI] Add timeout to avoid infinite command retry Currently, scsi error handling in scsi_io_completion() tries to unconditionally requeue scsi command when device keeps some error state. For example, UNIT_ATTENTION causes infinite retry with action == ACTION_RETRY. This is because retryable errors are thought to be temporary and the scsi device will soon recover from those errors. Normally, such retry policy is appropriate because the device will soon recover from temporary error state. But there is no guarantee that device is able to recover from error state immediately. Some hardware error can prevent device from recovering. This patch adds timeout in scsi_io_completion() to avoid infinite command retry in scsi_io_completion(). Once scsi command retry time is longer than this timeout, the command is treated as failure. Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:19:19 -07:00
Russell King	e83b366487	Fix uses of dma_max_pfn() when converting to a limiting address We must use a 64-bit for this, otherwise overflowed bits get lost, and that can result in a lower than intended value set. Fixes: `8e0cb8a1f6` ("ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations") Fixes: `7d35496dd9` ("ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations") Tested-Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-02-17 23:08:41 +00:00
Santosh Shilimkar	7d35496dd9	ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations DMA bounce limit is the maximum direct DMA'able memory beyond which bounce buffers has to be used to perform dma operations. SCSI driver relies on dma_mask but its calculation is based on max_*pfn which don't have uniform meaning across architectures. So make use of dma_max_pfn() which is expected to return the DMAable maximum pfn value across architectures. Cc: linux-scsi@vger.kernel.org Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-31 14:49:26 +00:00
Linus Torvalds	357397a141	Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata changes from Tejun Heo: "Two interesting changes. - libata acpi handling has been restructured so that the association between ata devices and ACPI handles are less convoluted. This change shouldn't change visible behavior. - Queued TRIM support, which enables sending TRIM to the device without draining in-flight RW commands, is added. Currently only enabled for ahci (and likely to stay that way for the foreseeable future). Other changes are driver-specific updates / fixes" * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: bugfix: Remove __le32 in ata_tf_to_fis() libata: acpi: Remove ata_dev_acpi_handle stub in libata.h libata: Add support for queued DSM TRIM libata: Add support for SEND/RECEIVE FPDMA QUEUED libata: Add H2D FIS "auxiliary" port flag libata: Populate host-to-device FIS "auxiliary" field ata: acpi: rework the ata acpi bind support sata, highbank: send extra clock cycles in SGPIO patterns sata, highbank: set tx_atten override bits devicetree: create a separate binding description for sata_highbank drivers/ata/sata_rcar.c: simplify use of devm_ioremap_resource sata highbank: enable 64-bit DMA mask when using LPAE ata: pata_samsung_cf: add missing __iomem annotation ata: pata_arasan: Staticize local symbols sata_mv: Remove unneeded CONFIG_HAVE_CLK ifdefs ata: use dev_get_platdata() sata_mv: Remove unneeded forward declaration libata: acpi: remove dead code for ata_acpi_(un)bind libata: move 'struct ata_taskfile' and friends from ata.h to libata.h	2013-09-03 18:19:53 -07:00
Ewan D. Milne	279afdfe78	[SCSI] Generate uevents on certain unit attention codes Generate a uevent when the following Unit Attention ASC/ASCQ codes are received: 2A/01 MODE PARAMETERS CHANGED 2A/09 CAPACITY DATA HAS CHANGED 38/07 THIN PROVISIONING SOFT THRESHOLD REACHED 3F/03 INQUIRY DATA HAS CHANGED 3F/0E REPORTED LUNS DATA HAS CHANGED Log kernel messages when the following Unit Attention ASC/ASCQ codes are received that are not as specific as those above: 2A/xx PARAMETERS CHANGED 3F/xx TARGET OPERATING CONDITIONS HAVE CHANGED Added logic to set expecting_lun_change for other LUNs on the target after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate uevents are not generated, and clear expecting_lun_change when a REPORT LUNS command completes, in accordance with the SPC-3 specification regarding reporting of the 3F 0E ASC/ASCQ UA. [jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and unused variable fix, both reported by Fengguang Wu] Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-26 18:52:27 +04:00
Hannes Reinecke	7e782af576	[SCSI] Return ENODATA on medium error When a medium error is detected the SCSI stack should return ENODATA to the upper layers. [jejb: fix whitespace error] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-23 12:54:53 -04:00
Hannes Reinecke	a9d6ceb838	[SCSI] return ENOSPC on thin provisioning failure When the thin provisioning hard threshold is reached we should return ENOSPC to inform upper layers about this fact. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-23 12:43:54 -04:00
Hannes Reinecke	0f7f6234d3	[SCSI] Document enhanced error codes Document the various error codes returned on I/O failure. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-23 12:31:59 -04:00
Aaron Lu	f1bc1e4c44	ata: acpi: rework the ata acpi bind support Binding ACPI handle to SCSI device has several drawbacks, namely: 1 During ATA device initialization time, ACPI handle will be needed while SCSI devices are not created yet. So each time ACPI handle is needed, instead of retrieving the handle by ACPI_HANDLE macro, a namespace scan is performed to find the handle for the corresponding ATA device. This is inefficient, and also expose a restriction on calling path not holding any lock. 2 The binding to SCSI device tree makes code complex, while at the same time doesn't bring us any benefit. All ACPI handlings are still done in ATA module, not in SCSI. Rework the ATA ACPI binding code to bind ACPI handle to ATA transport devices(ATA port and ATA device). The binding needs to be done only once, since the ATA transport devices do not go away with hotplug. And due to this, the flush_work call in hotplug handler for ATA bay is no longer needed. Tested on an Intel test platform for binding and runtime power off for ODD(ZPODD) and hard disk; on an ASUS S400C for binding and normal boot and S3, where its SATA port node has _SDD and _GTF control methods when configured as an AHCI controller and its PATA device node has _GTF control method when configured as an IDE controller. SATA PMP binding and ATA hotplug is not tested. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Tested-by: Dirk Griesbach <spamthis@freenet.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-08-23 12:09:23 -04:00
Bart Van Assche	0516c08d10	[SCSI] enable destruction of blocked devices which fail LUN scanning If something goes wrong during LUN scanning, e.g. a transport layer failure occurs, then __scsi_remove_device() can get invoked by the LUN scanning code for a SCSI device in state SDEV_CREATED_BLOCK and before the SCSI device has been added to sysfs (is_visible == 0). Make sure that even in this case the transition into state SDEV_DEL occurs. This avoids that __scsi_remove_device() can get invoked a second time by scsi_forget_host() if this last function is invoked from another thread than the thread that performs LUN scanning. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-07-09 12:14:09 +01:00
James Bottomley	e2eb7244bc	[SCSI] Fix race between starved list and device removal scsi_run_queue() examines all SCSI devices that are present on the starved list. Since scsi_run_queue() unlocks the SCSI host lock a SCSI device can get removed after it has been removed from the starved list and before its queue is run. Protect against that race condition by holding a reference on the queue while running it. Reported-by: Chanho Min <chanho.min@lge.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-07-09 12:14:08 +01:00
Lin Ming	9b21493c45	[SCSI] sd: use REQ_PM in sd's runtime suspend operation With the introduction of REQ_PM, modify sd's runtime suspend operation functions to use that flag so that the operations to put the device into runtime suspended state(i.e. sync cache and stop device) will not affect its runtime PM status. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-06 12:48:17 -07:00
Rafael J. Wysocki	53540098b2	ACPI / glue: Add .match() callback to struct acpi_bus_type USB uses the .find_bridge() callback from struct acpi_bus_type incorrectly, because as a result of the way it is used by USB every device in the system that doesn't have a bus type or parent is passed to usb_acpi_find_device() for inspection. What USB actually needs, though, is to call usb_acpi_find_device() for USB ports that don't have a bus type defined, but have usb_port_device_type as their device type, as well as for USB devices. To fix that replace the struct bus_type pointer in struct acpi_bus_type used for matching devices to specific subsystems with a .match() callback to be used for this purpose and update the users of struct acpi_bus_type, including USB, accordingly. Define the .match() callback routine for USB, usb_acpi_bus_match(), in such a way that it will cover both USB devices and USB ports and remove the now redundant .find_bridge() callback pointer from usb_acpi_bus. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jeff Garzik <jgarzik@pobox.com>	2013-03-04 14:23:40 +01:00
Aaron Lu	6f4c827e68	[libata] scsi: no poll when ODD is powered off When the ODD is powered off, any action the user did to the ODD that would generate a media event will trigger an ACPI interrupt, so the poll for media event is no longer necessary. And the poll will also cause a runtime status change, which will stop the ODD from staying in powered off state, so the poll should better be stopped. But since we don't have access to the gendisk structure in LLDs, here comes the disk_events_disable_depth for scsi device. This field is a hint set by LLDs to convey information to upper layer drivers. A value of 0 means media poll is necessary for the device, while values above 0 means media poll is not needed and should better be skipped. So we can increase its value when we are to power off the ODD in ATA layer and decrease its value when the ODD is powered on, effectively silence the media events poll. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2013-01-25 15:36:43 -05:00
Linus Torvalds	60da5bf47d	Merge branch 'for-3.8/core' of git://git.kernel.dk/linux-block Pull block layer core updates from Jens Axboe: "Here are the core block IO bits for 3.8. The branch contains: - The final version of the surprise device removal fixups from Bart. - Don't hide EFI partitions under advanced partition types. It's fairly wide spread these days. This is especially dangerous for systems that have both msdos and efi partition tables, where you want to keep them in sync. - Cleanup of using -1 instead of the proper NUMA_NO_NODE - Export control of bdi flusher thread CPU mask and default to using the home node (if known) from Jeff. - Export unplug tracepoint for MD. - Core improvements from Shaohua. Reinstate the recursive merge, as the original bug has been fixed. Add plugging for discard and also fix a problem handling non pow-of-2 discard limits. There's a trivial merge in block/blk-exec.c due to a fix that went into 3.7-rc at a later point than -rc4 where this is based." * 'for-3.8/core' of git://git.kernel.dk/linux-block: block: export block_unplug tracepoint block: add plug for blkdev_issue_discard block: discard granularity might not be power of 2 deadline: Allow 0ms deadline latency, increase the read speed partitions: enable EFI/GPT support by default bsg: Remove unused function bsg_goose_queue() block: Make blk_cleanup_queue() wait until request_fn finished block: Avoid scheduling delayed work on a dead queue block: Avoid that request_fn is invoked on a dead queue block: Let blk_drain_queue() caller obtain the queue lock block: Rename queue dead flag bdi: add a user-tunable cpu_list for the bdi flusher threads block: use NUMA_NO_NODE instead of -1 block: recursive merge requests block CFQ: avoid moving request to different queue	2012-12-17 08:27:23 -08:00
Bart Van Assche	3f3299d5c0	block: Rename queue dead flag QUEUE_FLAG_DEAD is used to indicate that queuing new requests must stop. After this flag has been set queue draining starts. However, during the queue draining phase it is still safe to invoke the queue's request_fn, so QUEUE_FLAG_DYING is a better name for this flag. This patch has been generated by running the following command over the kernel source tree: git grep -lEw 'blk_queue_dead\|QUEUE_FLAG_DEAD' \| xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g' \ -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g'; \ sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \ include/linux/blkdev.h; \ sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \ -e 's/Dead queue/A dying queue/' block/blk-core.c Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: James Bottomley <JBottomley@Parallels.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Jens Axboe <axboe@kernel.dk> Cc: Chanho Min <chanho.min@lge.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-12-06 14:30:58 +01:00
Martin K. Petersen	5db44863b6	[SCSI] sd: Implement support for WRITE SAME Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk driver. - We set the default maximum to 0xFFFF because there are several devices out there that only support two-byte block counts even with WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK LIMITS VPD. - max_write_same_blocks can be overriden per-device basis in sysfs. - The UNMAP discovery heuristics remain unchanged but the discard limits are tweaked to match the "real" WRITE SAME commands. - In the error handling logic we now distinguish between WRITE SAME with and without UNMAP set. The discovery process heuristics are: - If the device reports a SCSI level of SPC-3 or greater we'll issue READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is supported. If that's the case we will use it. - If the device supports the block limits VPD and reports a MAXIMUM WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16). - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond 0xFFFFFFFF or the block count exceeds 0xFFFF. - no_write_same is set for ATA, FireWire and USB. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-13 22:45:42 -08:00
James Bottomley	fe709ed827	Merge SCSI misc branch into isci-for-3.6 tag	2012-10-02 08:55:12 +01:00
Vikas Chaudhary	0e58076b37	[SCSI] scsi_lib: Set the device state from transport-offline to running FC and iSCSI class set SCSI devices to transport-offline state after fast_io_fail/replacement_timeout has fired, but after relogin, function scsi_internal_device_unblock() is not setting scsi device state to running. Due to this the devices even after being relogged in remain offline. Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-14 17:59:22 +01:00
Mike Snitzer	27c419739b	[SCSI] scsi_lib: fix scsi_io_completion's SG_IO error propagation The following v3.4-rc1 commit unmasked an existing bug in scsi_io_completion's SG_IO error handling: `47ac56d` [SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR Given that certain ILLEGAL_REQUEST are now properly categorized as TARGET_ERROR the host_byte is being set (before host_byte wasn't ever set for these ILLEGAL_REQUEST). In scsi_io_completion, initialize req->errors with cmd->result _after_ the SG_IO block that calls __scsi_error_from_host_byte (which may modify the host_byte). Before this fix: cdb to send: 12 01 01 00 00 00 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=20000, flags=0, status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b, 00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0x10, driver_status=0x8, resid=0, duration=0, info=0x1}) = 0 SCSI Status: Check Condition Sense Information: sense buffer empty After: cdb to send: 12 01 01 00 00 00 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=20000, flags=0, status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b, 00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0, driver_status=0x8, resid=0, duration=0, info=0x1}) = 0 SCSI Status: Check Condition Sense Information: Fixed format, current; Sense key: Illegal Request Additional sense: Invalid field in cdb Raw sense data (in hex): 70 00 05 00 00 00 00 0b 00 00 00 00 24 00 00 00 00 00 00 Reported-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Babu Moger <babu.moger@netapp.com> Cc: stable@vger.kernel.org # 3.4 Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-08-22 09:42:31 +04:00
Jeff Garzik	8407884dd9	Merge branch 'master' [vanilla Linus master] into libata-dev.git/upstream Two bits were appended to the end of the bitfield list in struct scsi_device. Resolve that conflict by including both bits. Conflicts: include/scsi/scsi_device.h	2012-07-25 15:58:48 -04:00
Bart Van Assche	b485462aca	[SCSI] Stop accepting SCSI requests before removing a device Avoid that the code for requeueing SCSI requests triggers a crash by making sure that that code isn't scheduled anymore after a device has been removed. Also, source code inspection of __scsi_remove_device() revealed a race condition in this function: no new SCSI requests must be accepted for a SCSI device after device removal started. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:41 +01:00
Bart Van Assche	84feb1664e	[SCSI] Change return type of scsi_queue_insert() into void The return value of scsi_queue_insert() is ignored by all its callers, hence change the return type of this function into void. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:41 +01:00
Bart Van Assche	940f5d47e2	[SCSI] Avoid dangling pointer in scsi_requeue_command() When we call scsi_unprep_request() the command associated with the request gets destroyed and therefore drops its reference on the device. If this was the only reference, the device may get released and we end up with a NULL pointer deref when we call blk_requeue_request. Reported-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> [jejb: enhance commend and add commit log for stable] Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:40 +01:00
Bart Van Assche	67bd941300	[SCSI] Fix device removal NULL pointer dereference Use blk_queue_dead() to test whether the queue is dead instead of !sdev. Since scsi_prep_fn() may be invoked concurrently with __scsi_remove_device(), keep the queuedata (sdev) pointer in __scsi_remove_device(). This patch fixes a kernel oops that can be triggered by USB device removal. See also http://www.spinics.net/lists/linux-scsi/msg56254.html. Other changes included in this patch: - Swap the blk_cleanup_queue() and kfree() calls in scsi_host_dev_release() to make that code easier to grasp. - Remove the queue dead check from scsi_run_queue() since the queue state can change anyway at any point in that function where the queue lock is not held. - Remove the queue dead check from the start of scsi_request_fn() since it is redundant with the scsi_device_online() check. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:40 +01:00
Mike Christie	d075498c98	[SCSI] remove old comment from block/unblock functions We do not hold the host lock when calling these functions, so remove comment. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:23 +01:00
Mike Christie	5d9fb5cc1b	[SCSI] core, classes, mpt2sas: have scsi_internal_device_unblock take new state This has scsi_internal_device_unblock/scsi_target_unblock take the new state to set the devices as an argument instead of always setting to running. The patch also converts users of these functions. This allows the FC and iSCSI class to transition devices from blocked to transport-offline, so that when fast_io_fail/replacement_timeout has fired we do not set the devices back to running. Instead, we set them to SDEV_TRANSPORT_OFFLINE. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:22 +01:00
Mike Christie	1b8d262061	[SCSI] add new SDEV_TRANSPORT_OFFLINE state This patch adds a new state SDEV_TRANSPORT_OFFLINE. It will be used by transport classes to offline devices for cases like when the fast_io_fail/recovery_tmo fires. In those cases we want all IO to fail, and we have not yet escalated to dev_loss_tmo behavior where we are removing the devices. Currently to handle this state, transport classes are setting the scsi_device's state to running, setting their internal session/port structs state to something that indicates failed, and then failing IO from some transport check in the queuecommand. The reason for the new value is so that users can distinguish between a device failure that is a result of a transport problem vs the wide range of errors that devices get offlined for when a scsi command times out and we offline the devices there. It also fixes the confusion as to why the transport class is failing IO, but has set the device state from blocked to running. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:21 +01:00
Holger Macht	de50ada55b	[SCSI] add wrapper to access and set scsi_bus_type in struct acpi_bus_type For being able to bind ata devices against acpi devices, scsi_bus_type needs to be set as bus in struct acpi_bus_type. So add wrapper to scsi_lib to accomplish that. Signed-off-by: Holger Macht <holger@homac.de> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2012-06-29 11:38:09 -04:00
Jun'ichi Nomura	b7e94a1686	[SCSI] Fix dm-multipath starvation when scsi host is busy block congestion control doesn't have any concept of fairness across multiple queues. This means that if SCSI reports the host as busy in the queue congestion control it can result in an unfair starvation situation in dm-mp if there are multiple multipath devices on the same host. For example: http://www.redhat.com/archives/dm-devel/2012-May/msg00123.html The fix for this is to report only the sdev busy state (and ignore the host busy state) in the block congestion control call back. The host is still congested, but the SCSI subsystem will sort out the congestion in a fair way because it knows the relation between the queues and the host. [jejb: fixed up trailing whitespace] Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-05-23 09:34:17 +01:00
James Bottomley	e346933365	isci update for 3.5 1/ Rework remote-node-context (RNC) handling for proper management of the silicon state machine in error handling and hot-plug conditions. Further details below, suffice to say if the RNC is mismanaged the silicon state machines may lock up. 2/ Refactor the initialization code to be reused for suspend/resume support 3/ Miscellaneous bug fixes to address discovery issues and hardware compatibility. RNC rework details from Jeff Skirvin: In the controller, devices as they appear on a SAS domain (or direct-attached SATA devices) are represented by memory structures known as "Remote Node Contexts" (RNCs). These structures are transferred from main memory to the controller using a set of register commands; these commands include setting up the context ("posting"), removing the context ("invalidating"), and commands to control the scheduling of commands and connections to that remote device ("suspensions" and "resumptions"). There is a similar path to control RNC scheduling from the protocol engine, which interprets the results of command and data transmission and reception. In general, the controller chooses among non-suspended RNCs to find one that has work requiring scheduling the transmission of command and data frames to a target. Likewise, when a target tries to return data back to the initiator, the state of the RNC is used by the controller to determine how to treat the incoming request. As an example, if the RNC is in the state "TX/RX Suspended", incoming SSP connection requests from the target will be rejected by the controller hardware. When an RNC is "TX Suspended", it will not be selected by the controller hardware to start outgoing command or data operations (with certain priority-based exceptions). As mentioned above, there are two sources for management of the RNC states: commands from driver software, and the result of transmission and reception conditions of commands and data signaled by the controller hardware. As an example of the latter, if an outgoing SSP command ends with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will transition to the "TX Suspended" state, and this is signaled by the controller hardware in the status to the completion of the pending command as well as signaled in a controller hardware event. Examples of the former are included in the patch changelogs. Driver software is required to suspend the RNC in a "TX/RX Suspended" condition before any outstanding commands can be terminated. Failure to guarantee this can lead to a complete hardware hang condition. Earlier versions of the driver software did not guarantee that an RNC was correctly managed before I/O termination, and so operated in an unsafe way. Further, the driver performed unnecessary contortions to preserve the remote device command state and so was more complicated than it needed to be. A simplifying driver assumption is that once an I/O has entered the error handler path without having completed in the target, the requirement on the driver is that all use of the sas_task must end. Beyond that, recovery of operation is dependent on libsas and other components to reset, rediscover and reconfigure the device before normal operation can restart. In the driver, this simplifying assumption meant that the RNC management could be reduced to entry into the suspended state, terminating the targeted I/O request, and resuming the RNC as needed for device-specific management such as an SSP Abort Task or LUN Reset Management request. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPtYFXAAoJEB7SkWpmfYgCJkcP/1VvsuuitNy/YM9P1tb/RvQ7 ytJzjGtWiAABHVWwjgB+Ng7hUTaP2r6l8KeNfwxpwXyNdBAUNysYEUBHAfPsKzKz espTmw3wVCnREajgKXZwFp9aTj8DcYFB6vKcC/ddACt3uRNjjA9En36+6797r8Vg YdebyjFX2FxwoUj0icTUiV/OXgb8w723imnCl8bfOhhFRi4eFZ4EJ23AdMkUya1i uYePAPJPSQJuU/87gNIx4JcR0qHJ1ziGPEY+XC47CzEeXbBTSPgWOwanQ6KPoXRJ XVxamfcKAjRdtwQ4m1vYBSE32RTdrjhujVbkGiPi6QaEbCLjhLSCIYyuS3XMckV+ TCZ16o5kd/I6ZtZOeP4zZRGnNBkOPzY44qiJeKffjWDhTrFacx4XWJB/ftWPgEA5 N2zFH3RM4sY0FUJ3I/Qe5CERNdCXMtcj+UAf3nHpAIVcv46Lp+qoSkdEx05uuuiN +D/dSlubktuvuzmB5WisL3qrjNEkkLTAGQpZs1j0ojLEBm0XAgV5EzqmHiZ0GOPD OQNFxeei9SlqgtKIIP0bymRispPrG2HVCOvExYMxzKR6fjxofZLAs/aWOsdhxgMq TlAyZJ6OmGI+KX68HHzoMpT9iquvmP64WGkfHzCx296BfSKiruLh/Jzt5gGwv+Z1 5tlpnUr9dUTxx7qkQXvj =HYvO -----END PGP SIGNATURE----- Merge tag 'isci-for-3.5' into misc isci update for 3.5 1/ Rework remote-node-context (RNC) handling for proper management of the silicon state machine in error handling and hot-plug conditions. Further details below, suffice to say if the RNC is mismanaged the silicon state machines may lock up. 2/ Refactor the initialization code to be reused for suspend/resume support 3/ Miscellaneous bug fixes to address discovery issues and hardware compatibility. RNC rework details from Jeff Skirvin: In the controller, devices as they appear on a SAS domain (or direct-attached SATA devices) are represented by memory structures known as "Remote Node Contexts" (RNCs). These structures are transferred from main memory to the controller using a set of register commands; these commands include setting up the context ("posting"), removing the context ("invalidating"), and commands to control the scheduling of commands and connections to that remote device ("suspensions" and "resumptions"). There is a similar path to control RNC scheduling from the protocol engine, which interprets the results of command and data transmission and reception. In general, the controller chooses among non-suspended RNCs to find one that has work requiring scheduling the transmission of command and data frames to a target. Likewise, when a target tries to return data back to the initiator, the state of the RNC is used by the controller to determine how to treat the incoming request. As an example, if the RNC is in the state "TX/RX Suspended", incoming SSP connection requests from the target will be rejected by the controller hardware. When an RNC is "TX Suspended", it will not be selected by the controller hardware to start outgoing command or data operations (with certain priority-based exceptions). As mentioned above, there are two sources for management of the RNC states: commands from driver software, and the result of transmission and reception conditions of commands and data signaled by the controller hardware. As an example of the latter, if an outgoing SSP command ends with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will transition to the "TX Suspended" state, and this is signaled by the controller hardware in the status to the completion of the pending command as well as signaled in a controller hardware event. Examples of the former are included in the patch changelogs. Driver software is required to suspend the RNC in a "TX/RX Suspended" condition before any outstanding commands can be terminated. Failure to guarantee this can lead to a complete hardware hang condition. Earlier versions of the driver software did not guarantee that an RNC was correctly managed before I/O termination, and so operated in an unsafe way. Further, the driver performed unnecessary contortions to preserve the remote device command state and so was more complicated than it needed to be. A simplifying driver assumption is that once an I/O has entered the error handler path without having completed in the target, the requirement on the driver is that all use of the sas_task must end. Beyond that, recovery of operation is dependent on libsas and other components to reset, rediscover and reconfigure the device before normal operation can restart. In the driver, this simplifying assumption meant that the RNC management could be reduced to entry into the suspended state, terminating the targeted I/O request, and resuming the RNC as needed for device-specific management such as an SSP Abort Task or LUN Reset Management request.	2012-05-21 12:17:30 +01:00
Dan Williams	a7a20d1039	[SCSI] sd: limit the scope of the async probe domain sd injects and synchronizes probe work on the global kernel-wide domain. This runs into conflict with PM that wants to perform resume actions in async context: [ 494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds. [ 494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.360809] kworker/u:3 D 0000000000000000 0 554 2 0x00000000 [ 494.420739] ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8 [ 494.484392] ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160 [ 494.548038] ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398 [ 494.611685] Call Trace: [ 494.632649] [<ffffffff8149dd25>] schedule+0x5a/0x5c [ 494.674687] [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112 [ 494.734177] [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50 [ 494.787134] [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48 [ 494.837900] [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17 [ 494.891567] [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70 <-- here we wait for async contexts to complete [ 494.943783] [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a [ 495.002547] [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod] [ 495.051861] [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf [ 495.104807] [<ffffffff812fe9bd>] device_release_driver+0x25/0x32 <-- here we take device_lock() [ 853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds. [ 853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 853.635119] kworker/u:4 D ffff88013097b5d0 0 549 2 0x00000000 [ 853.695129] ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8 [ 853.758990] ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000 [ 853.822796] 0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000 [ 853.886633] Call Trace: [ 853.907631] [<ffffffff8149dd25>] schedule+0x5a/0x5c [ 853.949670] [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351 [ 854.001225] [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4 [ 854.049082] [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4 [ 854.097011] [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36 <-- here we wait for device_lock() [ 854.145591] [<ffffffff81304bd7>] device_resume+0x58/0x1c4 [ 854.192066] [<ffffffff81304d61>] async_resume+0x1e/0x45 [ 854.237019] [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173 <-- ...while running in async context Provide a 'scsi_sd_probe_domain' so that async probe actions actions can be flushed without regard for the state of PM, and allow for the resume path to handle devices that have transitioned from SDEV_QUIESCE to SDEV_DEL prior to resume. Acked-by: Alan Stern <stern@rowland.harvard.edu> [alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume] Signed-off-by: Dan Williams <dan.j.williams@intel.com> [jejb: remove unneeded config guards in include file] Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-05-17 09:10:46 +01:00
Lin Ming	6f381fa344	[SCSI] scsi_lib: use correct DMA device in __scsi_alloc_queue Currently, __scsi_alloc_queue uses SCSI host's parent device as DMA device to set segment boundary. But the parent device may not refer to the DMA device. For example, for ATA disk, SCSI host's parent device now refers to ATA port. Since commit d139b9b([SCSI] scsi_lib_dma: fix bug with dma maps on nested scsi objects), a new field Scsi_Host->dma_dev was introduced to refer to the real DMA device. Use ->dma_dev in __scsi_alloc_queue to correctly set segment boundary. Bug report: http://marc.info/?l=linux-ide&m=133177818318187&w=2 Reported-and-tested-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-04-22 18:56:18 +01:00
Linus Torvalds	424a6f6ef9	SCSI updates on 20120319 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPZxSnAAoJEDeqqVYsXL0M0Y4IAMX0vrTVZbg6psA5/gMcWGRP CkFXEQ8n0PL2SCaj6BoDqamJFe5Nc7dnqxM0fGawB4S9vr3rHhiOlwO+NbV9zFYC 2skBTpeL3sjgtN/jTBdfeeAa7xTYpu/XGyei0NS1A5c2AyMVXV0uYV2s4VNZxe44 tVIn1OEzM2giZ9EB1OZslDMrg5XXm3MBIUECP0LbWUhBm/35caSFKzMXRwhh7WiK +AVmc2AZYtdEwuknDyiH7KlsaoB3vGL9pPrAUJzIgEhy2pOo2A7W72HfA4Fj+y6a uF9HBS5zciMp1+sGWry62AjNbWgin9BRlozBEO/lJhIfMGDV1nXEIJsOkOgkdoE= =1TxB -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 SCSI updates from James Bottomley: "The update includes the usual assortment of driver updates (lpfc, qla2xxx, qla4xxx, bfa, bnx2fc, bnx2i, isci, fcoe, hpsa) plus a huge amount of infrastructure work in the SAS library and transport class as well as an iSCSI update. There's also a new SCSI based virtio driver." * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (177 commits) [SCSI] qla4xxx: Update driver version to 5.02.00-k15 [SCSI] qla4xxx: trivial cleanup [SCSI] qla4xxx: Fix sparse warning [SCSI] qla4xxx: Add support for multiple session per host. [SCSI] qla4xxx: Export CHAP index as sysfs attribute [SCSI] scsi_transport: Export CHAP index as sysfs attribute [SCSI] qla4xxx: Add support to display CHAP list and delete CHAP entry [SCSI] iscsi_transport: Add support to display CHAP list and delete CHAP entry [SCSI] pm8001: fix endian issue with code optimization. [SCSI] pm8001: Fix possible racing condition. [SCSI] pm8001: Fix bogus interrupt state flag issue. [SCSI] ipr: update PCI ID definitions for new adapters [SCSI] qla2xxx: handle default case in qla2x00_request_firmware() [SCSI] isci: improvements in driver unloading routine [SCSI] isci: improve phy event warnings [SCSI] isci: debug, provide state-enum-to-string conversions [SCSI] scsi_transport_sas: 'enable' phys on reset [SCSI] libsas: don't recover end devices attached to disabled phys [SCSI] libsas: fixup target_port_protocols for expanders that don't report sata [SCSI] libsas: set attached device type and target protocols for local phys ...	2012-03-22 12:55:29 -07:00
Cong Wang	77dfce076c	scsi: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:19 +08:00
Martin K. Petersen	66a651aa7a	[SCSI] Ensure discard failure gets treated as a target problem The error reported up the stack for a discard failure did not clearly indicate that the command was processed and subsequently failed by the target device. Return -EREMOTEIO so multipathing does not classify this condition as a path failure. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-02-19 09:38:01 -06:00
Moger, Babu	2082ebc45a	[SCSI] fix the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE) This patch fixes the host byte settings DID_TARGET_FAILURE and DID_NEXUS_FAILURE. The function __scsi_error_from_host_byte, tries to reset the host byte to DID_OK. But that does not happen because of the OR operation. Here is the flow. scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition, result will be set as DID_NEXUS_FAILURE (=0x11). Then in __scsi_error_from_host_byte, when we do OR with DID_OK. Purpose is to reset it back to DID_OK. But that does not happen. This patch fixes this issue. Signed-off-by: Babu Moger <babu.moger@netapp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-02-19 08:08:59 -06:00
Shaohua Li	466c08c71a	[SCSI] don't change sdev starvation list order without request dispatched The sdev is deleted from starved list and then try to dispatch from this device. It's quite possible the sdev can't eventually dispatch a request, then the sdev will be in starved list tail. This isn't fair. There are two cases here: 1. unplug path. scsi_request_fn() calls to scsi_target_queue_ready(), then the dev is removed from starved list, but quite possible host queue isn't ready, the dev is moved to starved list without dispatching any request. 2. scsi_run_queue path. It deletes the dev from starved list first (both global and local starved lists), then handles the dev. Then we could have the same process like case 1. This patch fixes the first case. Case 2 isn't fixed, because there is a rare case scsi_run_queue finds host isn't busy but scsi_request_fn finds host is busy (other CPU is faster to get host queue depth). Not deleting the dev from starved list in scsi_run_queue will keep scsi_run_queue looping (though this is very rare case, because host will become busy). Fortunately fixing case 1 already gives big improvement for starvation in my test. In a 12 disk JBOD setup, running file creation under EXT4, this gives 12% more throughput. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-01-16 11:54:04 +04:00
Hannes Reinecke	745718132c	[SCSI] Silencing 'killing requests for dead queue' When we tear down a device we try to flush all outstanding commands in scsi_free_queue(). However the check in scsi_request_fn() is imperfect as it only signals that we _might start_ aborting commands, not that we've actually aborted some. So move the printk inside the scsi_kill_request function, this will also give us a hint about which commands are aborted. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-11-09 12:07:07 -06:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Paul Gortmaker	09703660ed	scsi: Add export.h for EXPORT_SYMBOL/THIS_MODULE as required For the basic SCSI infrastructure files that are exporting symbols but not modules themselves, add in the basic export.h header file to allow the exports. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:31:23 -04:00
Bart Van Assche	3308511c93	[SCSI] Make scsi_free_queue() kill pending SCSI commands Make sure that SCSI device removal via scsi_remove_host() does finish all pending SCSI commands. Currently that's not the case and hence removal of a SCSI host during I/O can cause a deadlock. See also "blkdev_issue_discard() hangs forever if underlying storage device is removed" (http://bugzilla.kernel.org/show_bug.cgi?id=40472). See also http://lkml.org/lkml/2011/8/27/6. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-10-30 13:20:28 +04:00
James Smart	573e591353	[SCSI] scsi_lib: pause between error retries During cable pull tests on our 16G FC adapter, we are seeing errors, typically reads to close targets, which fail due to CRC or framing errors caused by the cable being pull (return status DID_ERROR). The adapter detects the error on one of the first frames received, marks the FC exchange as dead (further frames go to bit bucket) and signals the host of the error. This action is so quick, and coupled with fast host CPUs, creates a scenario in which the midlayer sees the failure and retries the io almost immediately. We've seen link traces with the retry on the link while the original i/o is still being processed by the target. We're also seeing the time window for the "link to pull-apart" and the physical interface to report disconnected to be in the few millisecond range. Which means, we're encountering scenarios where the full retry count is exhausted (all with error) by the midlayer before the link disconnect state is detected. We looked at 8G FC behavior and occasionally see the same behavior, but as the link was slower, it rarely could exhaust all retries before the link reported disconnect. What is needed is a slight delay between io retries due to DID_ERROR to cover this error. It is inappropriate to put this delay in the driver, as the error is indistinguishable from other link-related errors, nor does the driver track whether the io is a retry or not. This is also easier than tracking between-io-error bursts that are seen in this scenario. The patch below updates the retry path so that it inserts a delay as if the target was busy. The busy delay is on the order of 6ms. This delay is sufficient to ensure the link down condition is reported before the retry count is exhausted (at most 1 retry is seen). Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com> Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-07-27 14:06:01 +04:00
James Bottomley	bfe159a512	[SCSI] fix crash in scsi_dispatch_cmd() USB surprise removal of sr is triggering an oops in scsi_dispatch_command(). What seems to be happening is that USB is hanging on to a queue reference until the last close of the upper device, so the crash is caused by surprise remove of a mounted CD followed by attempted unmount. The problem is that USB doesn't issue its final commands as part of the SCSI teardown path, but on last close when the block queue is long gone. The long term fix is probably to make sr do the teardown in the same way as sd (so remove all the lower bits on ejection, but keep the upper disk alive until last close of user space). However, the current oops can be simply fixed by not allowing any commands to be sent to a dead queue. Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-07-21 14:21:18 -07:00
Linus Torvalds	a2b9c1f620	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: don't delay blk_run_queue_async scsi: remove performance regression due to async queue run blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup block: rescan partitions on invalidated devices on -ENOMEDIA too cdrom: always check_disk_change() on open block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers	2011-05-18 06:49:02 -07:00
Jens Axboe	9937a5e2f3	scsi: remove performance regression due to async queue run Commit `c21e6beb` removed our queue request_fn re-enter protection, and defaulted to always running the queues from kblockd to be safe. This was a known potential slow down, but should be safe. Unfortunately this is causing big performance regressions for some, so we need to improve this logic. Looking into the details of the re-enter, the real issue is on requeue of requests. Requeue of requests upon seeing a BUSY condition from the device ends up re-running the queue, causing traces like this: scsi_request_fn() scsi_dispatch_cmd() scsi_queue_insert() __scsi_queue_insert() scsi_run_queue() scsi_request_fn() ... potentially causing the issue we want to avoid. So special case the requeue re-run of the queue, but improve it to offload the entire run of local queue and starved queue from a single workqueue callback. This is a lot better than potentially kicking off a workqueue run for each device seen. This also fixes the issue of the local device going into recursion, since the above mentioned commit never moved that queue run out of line. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-17 11:04:44 +02:00
James Bottomley	c055f5b261	[SCSI] fix oops in scsi_run_queue() The recent commit closing the race window in device teardown: commit `86cbfb5607` Author: James Bottomley <James.Bottomley@suse.de> Date: Fri Apr 22 10:39:59 2011 -0500 [SCSI] put stricter guards on queue dead checks is causing a potential NULL deref in scsi_run_queue() because the q->queuedata may already be NULL by the time this function is called. Since we shouldn't be running a queue that is being torn down, simply add a NULL check in scsi_run_queue() to forestall this. Tested-by: Jim Schutt <jaschut@sandia.gov> Cc: stable@kernel.org Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-05-03 15:30:00 -05:00
Jens Axboe	c21e6beba8	block: get rid of QUEUE_FLAG_REENTER We are currently using this flag to check whether it's safe to call into ->request_fn(). If it is set, we punt to kblockd. But we get a lot of false positives and excessive punts to kblockd, which hurts performance. The only real abuser of this infrastructure is SCSI. So export the async queue run and convert SCSI over to use that. There's room for improvement in that SCSI need not always use the async call, but this fixes our performance issue and they can fix that up in due time. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-04-19 13:32:46 +02:00
Christoph Hellwig	24ecfbe27f	block: add blk_run_queue_async Instead of overloading __blk_run_queue to force an offload to kblockd add a new blk_run_queue_async helper to do it explicitly. I've kept the blk_queue_stopped check for now, but I suspect it's not needed as the check we do when the workqueue items runs should be enough. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-04-18 11:41:33 +02:00
Linus Torvalds	6c51038900	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits) Documentation/iostats.txt: bit-size reference etc. cfq-iosched: removing unnecessary think time checking cfq-iosched: Don't clear queue stats when preempt. blk-throttle: Reset group slice when limits are changed blk-cgroup: Only give unaccounted_time under debug cfq-iosched: Don't set active queue in preempt block: fix non-atomic access to genhd inflight structures block: attempt to merge with existing requests on plug flush block: NULL dereference on error path in __blkdev_get() cfq-iosched: Don't update group weights when on service tree fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away block: Require subsystems to explicitly allocate bio_set integrity mempool jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging fs: make fsync_buffers_list() plug mm: make generic_writepages() use plugging blk-cgroup: Add unaccounted time to timeslice_used. block: fixup plugging stubs for !CONFIG_BLOCK block: remove obsolete comments for blkdev_issue_zeroout. blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. ... Fix up conflicts in fs/{aio.c,super.c}	2011-03-24 10:16:26 -07:00
Linus Torvalds	c55d267de2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (170 commits) [SCSI] scsi_dh_rdac: Add MD36xxf into device list [SCSI] scsi_debug: add consecutive medium errors [SCSI] libsas: fix ata list corruption issue [SCSI] hpsa: export resettable host attribute [SCSI] hpsa: move device attributes to avoid forward declarations [SCSI] scsi_debug: Logical Block Provisioning (SBC3r26) [SCSI] sd: Logical Block Provisioning update [SCSI] Include protection operation in SCSI command trace [SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try) [SCSI] target: Fix volume size misreporting for volumes > 2TB [SCSI] bnx2fc: Broadcom FCoE offload driver [SCSI] fcoe: fix broken fcoe interface reset [SCSI] fcoe: precedence bug in fcoe_filter_frames() [SCSI] libfcoe: Remove stale fcoe-netdev entries [SCSI] libfcoe: Move FCOE_MTU definition from fcoe.h to libfcoe.h [SCSI] libfc: introduce __fc_fill_fc_hdr that accepts fc_hdr as an argument [SCSI] fcoe, libfc: initialize EM anchors list and then update npiv EMs [SCSI] Revert "[SCSI] libfc: fix exchange being deleted when the abort itself is timed out" [SCSI] libfc: Fixing a memory leak when destroying an interface [SCSI] megaraid_sas: Version and Changelog update ... Fix up trivial conflicts due to whitespace differences in drivers/scsi/libsas/{sas_ata.c,sas_scsi_host.c}	2011-03-17 17:54:40 -07:00
Martin K. Petersen	c98a0eb0e9	[SCSI] sd: Logical Block Provisioning update SBC3r26 contains many changes to the Logical Block Provisioning interfaces (formerly known as Thin Provisioning ditto). This patch implements support for both the old and new schemes using the same heuristic as before (whether the LBP VPD page is present). The new code also allows the provisioning mode (i.e. choice of command) to be overridden on a per-device basis via sysfs. Two additional modes are supported in this version: - WRITE SAME(10) with the UNMAP bit set - WRITE SAME(10) without the UNMAP bit set. This allows us to support devices that predate the TP/LBP enhancements in SBC3 and which work by way zero-detection Switching between modes has been consolidated in a helper function that also updates the block layer topology according to the limitations of the chosen command. I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10) if WRITE SAME(16) fails, etc. but found several devices that got cranky. So for now we'll disable discard if one of the commands fail. The user still has the option of selecting a different mode in sysfs. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-03-14 18:37:34 -05:00
Martin K. Petersen	72f7d322fd	[SCSI] Include protection operation in SCSI command trace When debugging DIF/DIX it is very helpful to be able to see which DIX operation is associated with the scsi_cmnd. Include the protection op in the SCSI command trace. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-03-14 18:36:02 -05:00
Jens Axboe	4c63f5646e	Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-10 08:58:35 +01:00
Jens Axboe	a488e74976	scsi: convert to blk_delay_queue() It was always abuse to reuse the plugging infrastructure for this, convert it to the (new) real API for delaying queueing a bit. A default delay of 3 msec is defined, to match the previous behaviour. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-10 08:45:54 +01:00
Tejun Heo	1654e7411a	block: add @force_kblockd to __blk_run_queue() __blk_run_queue() automatically either calls q->request_fn() directly or schedules kblockd depending on whether the function is recursed. blk-flush implementation needs to be able to explicitly choose kblockd. Add @force_kblockd. All the current users are converted to specify %false for the parameter and this patch doesn't introduce any behavior change. stable: This is prerequisite for fixing ide oops caused by the new blk-flush implementation. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-02 08:48:05 -05:00
Hannes Reinecke	63583cca74	[SCSI] Add detailed SCSI I/O errors Instead of just passing 'EIO' for any I/O error we should be notifying the upper layers with more details about the cause of this error. Update the possible I/O errors to: - ENOLINK: Link failure between host and target - EIO: Retryable I/O error - EREMOTEIO: Non-retryable I/O error - EBADE: I/O error restricted to the I_T_L nexus 'Retryable' in this context means that an I/O error _might_ be restricted to the I_T_L nexus (vulgo: path), so retrying on another nexus / path might succeed. 'Non-retryable' in general refers to a target failure, so this error will always be generated regardless of the I_T_L nexus it was send on. I/O errors restricted to the I_T_L nexus might be retried on another nexus / path, but they should _not_ be queued if no paths are available. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-02-12 10:33:08 -06:00
Linus Torvalds	275220f0fc	Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits) block: ensure that completion error gets properly traced blktrace: add missing probe argument to block_bio_complete block cfq: don't use atomic_t for cfq_group block cfq: don't use atomic_t for cfq_queue block: trace event block fix unassigned field block: add internal hd part table references block: fix accounting bug on cross partition merges kref: add kref_test_and_get bio-integrity: mark kintegrityd_wq highpri and CPU intensive block: make kblockd_workqueue smarter Revert "sd: implement sd_check_events()" block: Clean up exit_io_context() source code. Fix compile warnings due to missing removal of a 'ret' variable fs/block: type signature of major_to_index(int) to major_to_index(unsigned) block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p) cfq-iosched: don't check cfqg in choose_service_tree() fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors cdrom: export cdrom_check_events() sd: implement sd_check_events() sr: implement sr_check_events() ...	2011-01-13 10:45:01 -08:00
Linus Torvalds	da40d036fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (147 commits) [SCSI] arcmsr: fix write to device check [SCSI] lpfc: lower stack use in lpfc_fc_frame_check [SCSI] eliminate an unnecessary local variable from scsi_remove_target() [SCSI] libiscsi: use bh locking instead of irq with session lock [SCSI] libiscsi: do not take host lock in queuecommand [SCSI] be2iscsi: fix null ptr when accessing task hdr [SCSI] be2iscsi: fix gfp use in alloc_pdu [SCSI] libiscsi: add more informative failure message during iscsi scsi eh [SCSI] gdth: Add missing call to gdth_ioctl_free [SCSI] bfa: remove unused defintions and misc cleanups [SCSI] bfa: remove inactive functions [SCSI] bfa: replace bfa_assert with WARN_ON [SCSI] qla2xxx: Use sg_next to fetch next sg element while walking sg list. [SCSI] qla2xxx: Fix to avoid recursive lock failure during BSG timeout. [SCSI] qla2xxx: Remove code to not reset ISP82xx on failure. [SCSI] qla2xxx: Display mailbox register 4 during 8012 AEN for ISP82XX parts. [SCSI] qla2xxx: Don't perform a BIG_HAMMER if Get-ID (0x20) mailbox command fails on CNAs. [SCSI] qla2xxx: Remove redundant module parameter permission bits [SCSI] qla2xxx: Add sysfs node for displaying board temperature. [SCSI] qla2xxx: Code cleanup to remove unwanted comments and code. ...	2011-01-07 12:47:02 -08:00
Hillf Danton	fd01a6632d	[SCSI] fix the return value of scsi_target_queue_read() It seems that zero should be returned if scsi_target_is_busy(starget) is true, no matter if sdev is on the starved list. Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-12-21 12:37:28 -06:00
Linus Torvalds	7f8635cc9e	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: cciss: fix cciss_revalidate panic block: max hardware sectors limit wrapper block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead blk-throttle: Correct the placement of smp_rmb() blk-throttle: Trim/adjust slice_end once a bio has been dispatched block: check for proper length of iov entries earlier in blk_rq_map_user_iov() drbd: fix for spin_lock_irqsave in endio callback drbd: don't recvmsg with zero length	2010-12-20 09:19:46 -08:00
Martin K. Petersen	e692cb668f	block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead When stacking devices, a request_queue is not always available. This forced us to have a no_cluster flag in the queue_limits that could be used as a carrier until the request_queue had been set up for a metadevice. There were several problems with that approach. First of all it was up to the stacking device to remember to set queue flag after stacking had completed. Also, the queue flag and the queue limits had to be kept in sync at all times. We got that wrong, which could lead to us issuing commands that went beyond the max scatterlist limit set by the driver. The proper fix is to avoid having two flags for tracking the same thing. We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the block layer merging functions. The queue_limit 'no_cluster' is turned into 'cluster' to avoid double negatives and to ease stacking. Clustering defaults to being enabled as before. The queue flag logic is removed from the stacking function, and explicitly setting the cluster flag is no longer necessary in DM and MD. Reported-by: Ed Lin <ed.lin@promise.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-17 08:35:53 +01:00
Tejun Heo	9f8a2c23c6	scsi: replace sr_test_unit_ready() with scsi_test_unit_ready() The usage of TUR has been confusing involving several different commits updating different parts over time. Currently, the only differences between scsi_test_unit_ready() and sr_test_unit_ready() are, * scsi_test_unit_ready() also sets sdev->changed on NOT_READY. * scsi_test_unit_ready() returns 0 if TUR ended with UNIT_ATTENTION or NOT_READY. Due to the above two differences, sr is using its own sr_test_unit_ready(), but sd - the sole user of the above extra handling - doesn't even need them. Where scsi_test_unit_ready() is used in sd_media_changed(), the code is looking for device ready w/ media present state which is true iff TUR succeeds w/o sense data or UA, and when the device is not ready for whatever reason sd_media_changed() explicitly marks media as missing so there's no reason to set sdev->changed automatically from scsi_test_unit_ready() on NOT_READY. Drop both special handlings from scsi_test_unit_ready(), which makes it equivalant to sr_test_unit_ready(), and replace sr_test_unit_ready() with scsi_test_unit_ready(). Also, drop the unnecessary explicit NOT_READY check from sd_media_changed(). Checking return value is enough for testing device readiness. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-16 17:53:39 +01:00
James Bottomley	459dbf72e4	[SCSI] Eliminate error handler overload of the SCSI serial number The error handler is using the test cmd->serial_number == 0 in the abort routines to signal that the command to be aborted has already completed normally. This design was to close a race window in the original error handler where a command could go through the normal completion routines after it timed out but before error handling was started. Mike Anderson pointed out that when we converted our timeout and softirq completions, we picked up atomicity here because the block layer now mediates this with the REQ_ATOM_COMPLETE flag and guarantees that either the command times out or our done routine is called, but ensures we can't get both occurring. That makes the serial number zero check redundant and it can be removed. Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-12-09 09:41:16 -06:00
Mike Christie	986fe6c7f5	[SCSI] Fix regressions in scsi_internal_device_block Deleting a SCSI device on a blocked fc_remote_port (before fast_io_fail_tmo fires) results in a hanging thread: STACK: 0 schedule+1108 [0x5cac48] 1 schedule_timeout+528 [0x5cb7fc] 2 wait_for_common+266 [0x5ca6be] 3 blk_execute_rq+160 [0x354054] 4 scsi_execute+324 [0x3b7ef4] 5 scsi_execute_req+162 [0x3b80ca] 6 sd_sync_cache+138 [0x3cf662] 7 sd_shutdown+138 [0x3cf91a] 8 sd_remove+112 [0x3cfe4c] 9 __device_release_driver+124 [0x3a08b8] 10 device_release_driver+60 [0x3a0a5c] 11 bus_remove_device+266 [0x39fa76] 12 device_del+340 [0x39d818] 13 __scsi_remove_device+204 [0x3bcc48] 14 scsi_remove_device+66 [0x3bcc8e] 15 sysfs_schedule_callback_work+50 [0x260d66] 16 worker_thread+622 [0x162326] 17 kthread+160 [0x1680b0] 18 kernel_thread_starter+6 [0x10aaea] During the delete, the SCSI device is in moved to SDEV_CANCEL. When the FC transport class later calls scsi_target_unblock, this has no effect, since scsi_internal_device_unblock ignores SCSI devics in this state. It looks like all these are regressions caused by: `5c10e63c94` [SCSI] limit state transitions in scsi_internal_device_unblock Fix by rejecting offline and cancel in the state transition. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> [jejb: Original patch by Christof Schmitt, modified by Mike Christie] Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-10-25 09:48:32 -05:00
Linus Torvalds	e9dd2b6837	Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits) cfq-iosched: Fix a gcc 4.5 warning and put some comments block: Turn bvec_k{un,}map_irq() into static inline functions block: fix accounting bug on cross partition merges block: Make the integrity mapped property a bio flag block: Fix double free in blk_integrity_unregister block: Ensure physical block size is unsigned int blkio-throttle: Fix possible multiplication overflow in iops calculations blkio-throttle: limit max iops value to UINT_MAX blkio-throttle: There is no need to convert jiffies to milli seconds blkio-throttle: Fix link failure failure on i386 blkio: Recalculate the throttled bio dispatch time upon throttle limit change blkio: Add root group to td->tg_list blkio: deletion of a cgroup was causes oops blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n block: set the bounce_pfn to the actual DMA limit rather than to max memory block: revert bad fix for memory hotplug causing bounces Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK block: set the bounce_pfn to the actual DMA limit rather than to max memory block: Prevent hang_check firing during long I/O cfq: improve fsync performance for small files ... Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h	2010-10-22 17:00:32 -07:00
Martin K. Petersen	13f05c8d8e	block/scsi: Provide a limit on the number of integrity segments Some controllers have a hardware limit on the number of protection information scatter-gather list segments they can handle. Introduce a max_integrity_segments limit in the block layer and provide a new scsi_host_template setting that allows HBA drivers to provide a value suitable for the hardware. Add support for honoring the integrity segment limit when merging both bios and requests. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2010-09-10 20:50:10 +02:00
James Bottomley	3a5c19c23d	[SCSI] fix use-after-free in scsi_init_io() we're using a pointer through a freed command to reset the request, which has shown up as an oops with slab poisoning: Reported-by: Tejun Heo <tj@kernel.org> Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-09-09 09:58:18 -05:00
Bartlomiej Zolnierkiewicz	d6e9fb46cd	scsi: remove superfluous NULL pointer check from scsi_kill_request() Dan's list included: drivers/scsi/scsi_lib.c +1365 scsi_kill_request(9) warning: variable derefenced in initializer 'cmd' drivers/scsi/scsi_lib.c +1365 scsi_kill_request(9) warning: variable derefenced before check 'cmd' We dereference cmd (and possible OOPS if cmd == NULL) before starting the request so just remove the superfluous debugging code altogether. [ bart: the potential NULL pointer dereference was finally fixed in (much later than mine) commit `03b1470` but my patch is still valid ] Reported-by: Dan Carpenter <error27@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eugene Teo <eteo@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:00 -07:00
FUJITA Tomonori	610a63498f	scsi: fix discard page leak We leak a page allocated for discard on some error conditions (e.g. scsi_prep_state_check returns BLKPREP_DEFER in scsi_setup_blk_pc_cmnd). We unprep on requests that weren't prepped in the error path of scsi_init_io. It makes the error path to clean up scsi commands messy. Let's strictly apply the rule that we can't unprep on a request that wasn't prepped. Calling just scsi_put_command() in the error path of scsi_init_io() is enough. We don't set REQ_DONTPREP yet. scsi_setup_discard_cmnd can safely free a page on the error case with the above rule. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:24:28 +02:00
James Bottomley	28018c242a	block: implement an unprep function corresponding directly to prep Reviewed-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:23:47 +02:00
Christoph Hellwig	33659ebbae	block: remove wrappers for request type/flags Remove all the trivial wrappers for the cmd_type and cmd_flags fields in struct requests. This allows much easier grepping for different request types instead of unwinding through macros. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:17:56 +02:00
Linus Torvalds	b1bf936840	Merge branch 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits) block: don't access jiffies when initialising io_context cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds block: fix for "Consolidate phys_segment and hw_segment limits" cfq-iosched: quantum check tweak blktrace: perform cleanup after setup error blkdev: fix merge_bvec_fn return value checks cfq-iosched: requests "in flight" vs "in driver" clarification cciss: Fix problem with scatter gather elements in the scsi half of the driver cciss: eliminate unnecessary pointer use in cciss scsi code cciss: do not use void pointer for scsi hba data cciss: factor out scatter gather chain block mapping code cciss: fix scatter gather chain block dma direction kludge cciss: simplify scatter gather code cciss: factor out scatter gather chain block allocation and freeing cciss: detect bad alignment of scsi commands at build time cciss: clarify command list padding calculation cfq-iosched: rethink seeky detection for SSDs cfq-iosched: rework seeky detection block: remove padding from io_context on 64bit builds block: Consolidate phys_segment and hw_segment limits ...	2010-03-01 09:00:29 -08:00
Martin K. Petersen	8a78362c4e	block: Consolidate phys_segment and hw_segment limits Except for SCSI no device drivers distinguish between physical and hardware segment limits. Consolidate the two into a single segment limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-02-26 13:58:08 +01:00
Martin K. Petersen	086fa5ff08	block: Rename blk_queue_max_sectors to blk_queue_max_hw_sectors The block layer calling convention is blk_queue_<limit name>. blk_queue_max_sectors predates this practice, leading to some confusion. Rename the function to appropriately reflect that its intended use is to set max_hw_sectors. Also introduce a temporary wrapper for backwards compability. This can be removed after the merge window is closed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-02-26 13:58:08 +01:00
Douglas Gilbert	e7efe5932b	[SCSI] skip sense logging for some ATA PASS-THROUGH cdbs Further to the lsml thread titled: "does scsi_io_completion need to dump sense data for ata pass through (ck_cond = 1) ?" This is a patch to skip logging when the sense data is associated with a SENSE_KEY of "RECOVERED_ERROR" and the additional sense code is "ATA PASS-THROUGH INFORMATION AVAILABLE". This only occurs with the SAT ATA PASS-THROUGH commands when CK_COND=1 (in the cdb). It indicates that the sense data contains ATA registers. Smartmontools uses such commands on ATA disks connected via SAT. Periodic checks such as those done by smartd cause nuisance entries into logs that are: - neither errors nor warnings - pointless unless the cdb that caused them are also logged Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-01-18 10:48:16 -06:00
Boaz Harrosh	63c43b0ec1	[SCSI] scsi_lib: Fix bug in completion of bidi commands Because of the terrible structuring of scsi-bidi-commands it breaks some of the life time rules of a scsi-command. It is now not allowed to free up the block-request before cleanup and partial deallocation of the scsi-command. (Which is not so for none bidi commands) The right fix to this problem would be to make bidi command a first citizen by allocating a scsi_sdb pointer at scsi command just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated as part of the get/put_command (Again like the prot_sdb) and the current decoupling of scsi_cmnd and blk-request should be kept. For now make sure scsi_release_buffers() is called before the call to blk_end_request_all() which might cause the suicide of the block requests. At best the leak of bidi buffers, at worse a crash, as there is a race between the existence of the bidi_request and the free of the associated bidi_sdb. The reason this was never hit before is because only OSD has the potential of doing asynchronous bidi commands. (So does bsg but it is never used) And OSD clients just happen to do all their bidi commands synchronously, up until recently. CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-01-17 12:16:18 -06:00
Martin K. Petersen	d8705f11d8	[SCSI] Correctly handle thin provisioning write error A thin provisioned device may temporarily be out of sufficient allocation units to fulfill a write request. In that case it will return a space allocation in progress error. Wait a bit and retry the write. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-12-10 08:54:15 -06:00
Jiri Slaby	03b147083a	[SCSI] scsi_lib: fix potential NULL dereference Stanse found a potential NULL dereference in scsi_kill_request. Instead of triggering BUG() in 'if (unlikely(cmd == NULL))' branch, the kernel will Oops earlier on cmd dereference. Move the dereferences after the if. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-12-04 12:00:51 -06:00
Mike Christie	ad63082626	[SCSI] fix propogation of integrity errors When the Integrity check is done in scsi_io_completion it will set error to -EILSEQ. However, at this point error is no longer used, and blk_end_request_err has -EIO hardcoded. It looks like there was just porting mistake with this patch http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3e695f89c5debb735e4ff051e9e58d8fb4e95110 and we meant to send error upwards, so this patch changes the hard coded EIO to the error variable. I have only boot tested this patch. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-10-29 13:03:27 -04:00
Linus Torvalds	355bbd8cb8	Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits) block: use blkdev_issue_discard in blk_ioctl_discard Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads block: don't assume device has a request list backing in nr_requests store block: Optimal I/O limit wrapper cfq: choose a new next_req when a request is dispatched Seperate read and write statistics of in_flight requests aoe: end barrier bios with EOPNOTSUPP block: trace bio queueing trial only when it occurs block: enable rq CPU completion affinity by default cfq: fix the log message after dispatched a request block: use printk_once cciss: memory leak in cciss_init_one() splice: update mtime and atime on files block: make blk_iopoll_prep_sched() follow normal 0/1 return convention cfq-iosched: get rid of must_alloc flag block: use interrupts disabled version of raise_softirq_irqoff() block: fix comment in blk-iopoll.c block: adjust default budget for blk-iopoll block: fix long lines in block/blk-iopoll.c block: add blk-iopoll, a NAPI like approach for block devices ...	2009-09-14 17:55:15 -07:00
Tejun Heo	da6c5c720c	scsi,block: update SCSI to handle mixed merge failures Update scsi_io_completion() such that it only fails requests till the next error boundary and retry the leftover. This enables block layer to merge requests with different failfast settings and still behave correctly on errors. Allow merge of requests of different failfast settings. As SCSI is currently the only subsystem which follows failfast status, there's no need to worry about other block drivers for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Niel Lambrechts <niel.lambrechts@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Martin K. Petersen	002b1eb2c0	[SCSI] Print failed commands When a request fails we print the sense data but not the actual command that failed. Add a printout of the operation + CDB for failed commands. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-08-22 17:51:51 -05:00
Hannes Reinecke	b391277a56	sd, sr: fix Driver 'sd' needs updating message If a SCSI ULD driver sets blk_queue_prep_rq(), it should clean it up itself on remove(), and not from the bus callbacks. This removes the need to hook into bus->remove(), which should not be used at the same time as driver->remove(). [jejb: fix sdkp initialisation problem due to mismerge] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-21 12:01:27 -05:00
James Bottomley	82681a318f	[SCSI] Merge branch 'linus' Conflicts: drivers/message/fusion/mptsas.c fixed up conflict between req->data_len accessors and mptsas driver updates. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-12 10:02:03 -05:00
Takahiro Yasui	5c10e63c94	[SCSI] limit state transitions in scsi_internal_device_unblock scsi timeout on two or more devices may cause extremely long execution time for user applications because SDEV_OFFLINE state is changed to SDEV_RUNNING state during scsi error recovery procedures triggered by a bus reset or a host reset of scsi LLD, and scsi timeout can happens on the same devices many times. This happens because scsi_internal_device_unblock() changes device's state to SDEV_RUNNING even if a device in other states than SDEV_BLOCK, while the following two transitions are required in this function. SDEV_BLOCK -> SDEV_RUNNING SDEV_CREATED_BLOCK -> SDEV_CREATED Otherwise, it returns -EINVAL. Signed-off-by: Takahiro Yasui <tyasui@redhat.com> [matthew@wil.cx: supplied rewritten base for patch] Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-05-23 15:44:05 -05:00
Jens Axboe	e4b636366c	Merge branch 'master' into for-2.6.31 Conflicts: drivers/block/hd.c drivers/block/mg_disk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 20:25:34 +02:00
Boaz Harrosh	ac36552a52	scsi_lib: remove unused variable The last request completion cleanup in scsi_lib left an unused this_count variable in scsi_io_completion(). (It was used before in a code segment that now uses blk_end_request_all()) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-19 19:54:09 +02:00
Tejun Heo	e458824f9d	scsi: fix resid_len mis-conversion in scsi_end_request() Commit `c3a4d78c58` introduced rq->data_len and converted residual count users to it. While converting, it mistakenly converted scsi_end_request() to finish requests with residual count when it wants to do is fully complete the request. Fix it by using blk_end_request_all() instead. This bug was spotted by Boaz Harrosh. Signed-off-by: Tejun Heo <tj@kernel.org> Spotted-by: Boaz Harrosh <bharrosh@panasas.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-12 08:49:32 +02:00
FUJITA Tomonori	e6bb7a96c2	scsi: simplify the bidi completion Let's use blk_end_request_all() instead of blk_end_bidi_request(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 11:06:47 +02:00
Tejun Heo	9934c8c045	block: implement and enforce request peek/start/fetch Till now block layer allowed two separate modes of request execution. A request is always acquired from the request queue via elv_next_request(). After that, drivers are free to either dequeue it or process it without dequeueing. Dequeue allows elv_next_request() to return the next request so that multiple requests can be in flight. Executing requests without dequeueing has its merits mostly in allowing drivers for simpler devices which can't do sg to deal with segments only without considering request boundary. However, the benefit this brings is dubious and declining while the cost of the API ambiguity is increasing. Segment based drivers are usually for very old or limited devices and as converting to dequeueing model isn't difficult, it doesn't justify the API overhead it puts on block layer and its more modern users. Previous patches converted all block low level drivers to dequeueing model. This patch completes the API transition by... * renaming elv_next_request() to blk_peek_request() * renaming blkdev_dequeue_request() to blk_start_request() * adding blk_fetch_request() which is combination of peek and start * disallowing completion of queued (not started) requests * applying new API to all LLDs Renamings are for consistency and to break out of tree code so that it's apparent that out of tree drivers need updating. [ Impact: block request issue API cleanup, no functional change ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Miller <mike.miller@hp.com> Cc: unsik Kim <donari75@gmail.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Laurent Vivier <Laurent@lvivier.info> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Stefan Weinhuber <wein@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:52:18 +02:00
Tejun Heo	1011c1b9f2	block: blk_rq_[cur_]_{sectors\|bytes}() usage cleanup With the previous changes, the followings are now guaranteed for all requests in any valid state. * blk_rq_sectors() == blk_rq_bytes() >> 9 * blk_rq_cur_sectors() == blk_rq_cur_bytes() >> 9 Clean up accessor usages. Notable changes are * nbd,i2o_block: end_all used instead of explicit byte count * scsi_lib: unnecessary conditional on request type removed [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:55 +02:00
Tejun Heo	b079041030	block: cleanup rq->data_len usages With recent unification of fields, it's now guaranteed that rq->data_len always equals blk_rq_bytes(). Convert all non-IDE direct users to accessors. IDE will be converted in a separate patch. Boaz: spotted incorrect data_len/resid_len conversion in osd. [ Impact: convert direct rq->data_len usages to blk_rq_bytes() ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:55 +02:00
Tejun Heo	83096ebf12	block: convert to pos and nr_sectors accessors With recent cleanups, there is no place where low level driver directly manipulates request fields. This means that the 'hard' request fields always equal the !hard fields. Convert all rq->sectors, nr_sectors and current_nr_sectors references to accessors. While at it, drop superflous blk_rq_pos() < 0 test in swim.c. [ Impact: use pos and nr_sectors accessors ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Mike Miller <mike.miller@hp.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Dario Ballabio <ballabio_dario@emc.com> Cc: David S. Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: unsik Kim <donari75@gmail.com> Cc: Laurent Vivier <Laurent@lvivier.info> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
Tejun Heo	5b93629b45	block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones Implement accessors - blk_rq_pos(), blk_rq_sectors() and blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors and rq->hard_cur_sectors respectively and convert direct references of the said fields to the accessors. This is in preparation of request data length handling cleanup. Geert : suggested adding const to struct request * parameter to accessors Sergei : spotted error in patch description [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Ackec-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:53 +02:00
Tejun Heo	c3a4d78c58	block: add rq->resid_len rq->data_len served two purposes - the length of data buffer on issue and the residual count on completion. This duality creates some headaches. First of all, block layer and low level drivers can't really determine what rq->data_len contains while a request is executing. It could be the total request length or it coulde be anything else one of the lower layers is using to keep track of residual count. This complicates things because blk_rq_bytes() and thus [__]blk_end_request_all() relies on rq->data_len for PC commands. Drivers which want to report residual count should first cache the total request length, update rq->data_len and then complete the request with the cached data length. Secondly, it makes requests default to reporting full residual count, ie. reporting that no data transfer occurred. The residual count is an exception not the norm; however, the driver should clear rq->data_len to zero to signify the normal cases while leaving it alone means no data transfer occurred at all. This reverse default behavior complicates code unnecessarily and renders block PC on some drivers (ide-tape/floppy) unuseable. This patch adds rq->resid_len which is used only for residual count. While at it, remove now unnecessasry blk_rq_bytes() caching in ide_pc_intr() as rq->data_len is not changed anymore. Boaz : spotted missing conversion in osd Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape [ Impact: cleanup residual count handling, report 0 resid by default ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:53 +02:00
Tejun Heo	731ec497e5	block: kill rq->data Now that all block request data transfer is done via bio, rq->data isn't used. Kill it. While at it, make the roles of rq->special and buffer clear. [ Impact: drop now unncessary field from struct request ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com>	2009-04-28 07:37:36 +02:00
Tejun Heo	40cbbb781d	block: implement and use [__]blk_end_request_all() There are many [__]blk_end_request() call sites which call it with full request length and expect full completion. Many of them ensure that the request actually completes by doing BUG_ON() the return value, which is awkward and error-prone. This patch adds [__]blk_end_request_all() which takes @rq and @error and fully completes the request. BUG_ON() is added to to ensure that this actually happens. Most conversions are simple but there are a few noteworthy ones. * cdrom/viocd: viocd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/block/dasd: dasd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/char/tape_block: tapeblock_end_request() replaced with direct calls to blk_end_request_all(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Miller <mike.miller@hp.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-04-28 07:37:35 +02:00
Mike Christie	b4efdd586b	[SCSI] fix q->lock not held warning when target is busy We cannot call blk_plug_device from scsi_target_queue_ready because the q lock is not held. And we do not need to call it from there because when we return 0, the scsi_request_fn not_ready handling will plug the queue for us if needed. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-04-27 09:48:10 -05:00
James Bottomley	a9bddd7463	[SCSI] fix recovered error handling We have a problem with recovered error handling in that any command which goes down as BLOCK_PC but which returns a sense code of RECOVERED ERROR gets completed with -EIO. For actual SG_IO commands, this doesn't matter at all, since the error return code gets dropped in favour of req->errors which contain the SCSI completion code. However, if this command is part of the block system, then it will pay attention to the returned error code. In particularly if a SYNCHRONIZE CACHE from a barrier command completes with RECOVERED ERROR, the resulting -EIO on the barrier causes block to error the request and return it to the filesystem. Fix this by converting the -EIO for recovered error to zero, plus remove the printing of this from sd and sr so the message isn't double printed. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-04-03 09:22:55 -05:00
FUJITA Tomonori	f078727b25	[SCSI] remove scsi_req_map_sg No one uses scsi_execute_async with data transfer now. We can remove scsi_req_map_sg. Only scsi_eh_lock_door uses scsi_execute_async. scsi_eh_lock_door doesn't handle sense and the callback. So we can remove scsi_io_context too. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-03-12 12:58:10 -05:00
James Bottomley	126c098296	[SCSI] fix ABORTED_COMMAND looping forever problem Instead of terminating after five retries, commands terminated by ABORTED_COMMAND sense are retrying forever. The problem was introduced by: commit `b60af5b0ad` Author: Alan Stern <stern@rowland.harvard.edu> Date: Mon Nov 3 15:56:47 2008 -0500 [SCSI] simplify scsi_io_completion() Which introduced an error whereby ABORTED_COMMAND now gets erroneously retried in scsi_io_completion. Fix this by returning the behaviour back to the default no retry. Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-02-21 20:29:38 -06:00
James Bottomley	79ed242972	[SCSI] scsi_lib: fix DID_RESET status problems Andrew Vaszquez said: > There's a problem that is causing commands returned by the LLD with > a DID_RESET status to be reissued with cleared cmd->sdb data which > in our tests are manifesting in firmware detected overruns. Here's > a snippet of a READ_10 scsi_cmnd upon completion by the storage The problem is caused by: commit `b60af5b0ad` Author: Alan Stern <stern@rowland.harvard.edu> Date: Mon Nov 3 15:56:47 2008 -0500 [SCSI] simplify scsi_io_completion() Because scsi_release_buffers() is called before commands that go through the ACTION_RETRY and ACTION_DELAYED_RETRY legs are requeued. However, they're not re-prepared, so nothing ever reallocates the buffer resources to them. Fix this by releasing the buffers only if we're not going to go down these legs (but scsi_release_buffers() on all legs including two in scsi_end_request(); this latter needs a special version __scsi_release_buffers() because the final one can be called after the request has been freed, so the bidi test in scsi_release_buffers(), which touches the request has to be skipped). Reported-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-07 15:15:44 -06:00
Martin K. Petersen	3e695f89c5	[SCSI] Fix error handling for DIF/DIX patch commit `b60af5b0ad` Author: Alan Stern <stern@rowland.harvard.edu> Date: Mon Nov 3 15:56:47 2008 -0500 [SCSI] simplify scsi_io_completion() broke DIX error handling. Also, we are now using EILSEQ to indicate integrity errors to the upper layers (as opposed to regular EIO failures). This allows filesystems to inspect buffers and decide whether to retry the I/O. Update scsi_io_completion() accordingly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-05 09:02:28 -06:00
James Bottomley	4f5299ac4e	[SCSI] scsi_lib: don't decrement busy counters when inserting commands A bug was introduced by commit `b60af5b0ad` Author: Alan Stern <stern@rowland.harvard.edu> Date: Mon Nov 3 15:56:47 2008 -0500 [SCSI] simplify scsi_io_completion() because the simplification uses scsi_queue_insert(). The problem with this function is that it expects to be called from the completion path while the command is still outstanding, so it decrements the device and host busy counts to do the requeue. The problem is that scsi_io_completion() is a path executed well after these counts have already been decremented, leading to a double decrement if the command goes down any error path leading to ACTION_DELAYED_RETRY. The fix is to allow a private function __scsi_queue_insert() with a flag to say whether the busy counters should be decremented. This is made static to scsi_lib.c to discourage other use. Reported-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-05 08:54:11 -06:00
Alan Stern	3dbf6a5404	[SCSI] Fix uninitialized variable error in scsi_io_completion This patch (as1191) adds a missing "default" case in scsi_io_completion(), thereby fixing an "uninitialized variable" error. It also adds a missing newline to a log entry. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-02 10:57:41 -06:00
FUJITA Tomonori	f4f4e47e4a	[SCSI] add residual argument to scsi_execute and scsi_execute_req scsi_execute() and scsi_execute_req() discard the residual length information. Some callers need it. This adds residual argument (optional) to scsi_execute and scsi_execute_req. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-12-29 11:24:24 -06:00
Alan Stern	b60af5b0ad	[SCSI] simplify scsi_io_completion() This patch (as1142b) consolidates a lot of repetitious code in scsi_io_completion(). It also fixes a few comments. Most importantly, however, it clearly distinguishes among the three sorts of retries that can be done when a command fails to complete: Unprepare the request and resubmit it, so that a new command will be created for it. Requeue the request directly so that it will be retried immediately using the same command. Requeue the request so that it will be retried following a short delay. Complete the remainder of the request with an I/O error. [jejb: Updates 1. For several error conditions, we would now print the sense twice in slightly different ways, so unify the location of sense printing. 2. I added more descriptions to actual failure conditions for better debugging 3. according to spec, ABORTED_COMMAND is supposed to be retried (except on DIF failure). Our old behaviour of erroring it looks to be a bug. 4. I'd prefer not to default initialise the action variable because that ensures that every leg of the error handler has an associated action and the compiler will warn if someone later accidentally misses one or removes one. ] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-12-29 11:24:18 -06:00
James Bottomley	02bd3499a3	[SCSI] scsi_lib: only call scsi_unprep_request() under queue lock It's called under that lock everywhere else and it does alter the request state, so it should be. This one occurance in scsi_requeue_command() could open a window where req->special is set to NULL while the requests is going through either timeout or completion processing leading to NULL pointer derefs of the sort complained of in bugzillas 12020 and 12195. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-12-13 14:31:03 -06:00
Mike Christie	2a3a59e5c9	[SCSI] Fix hang in starved list processing Close possible infinite loop with interrupts off when devices are added back to the starved list. Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=11898 Reported-by: <alex.shi@intel.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-11-16 08:13:58 -06:00
Kiyoshi Ueda	6c5121b78b	[SCSI] export busy state via q->lld_busy_fn() This patch implements q->lld_busy_fn() for scsi mid layer to export its busy state for request stacking drivers. For efficiency, no lock is taken to check the busy state of shost/starget/sdev, since the returned value is not guaranteed and may be changed after request stacking drivers call the function, regardless of taking lock or not. When scsi can't dispatch I/Os anymore and needs to kill I/Os (e.g. !sdev), scsi needs to return 'not busy'. Otherwise, request stacking drivers may hold requests forever. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-23 11:42:16 -05:00
Kiyoshi Ueda	9d11251709	[SCSI] refactor sdev/starget/shost busy checking This patch refactors the busy checking codes of scsi_device, Scsi_Host and scsi_target. There should be no functional change. This is a preparation for another patch which exports scsi's busy state to the block layer for request stacking drivers. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-23 11:42:16 -05:00
James Bottomley	32c356d76d	[SCSI] fix removable device inability to detect disk changes On Tue, 12 Aug 2008 15:08:14 +0200 Giuliano Pochini <pochini@shiny.it> wrote: > Fujitsu magneto-optical drive, Adaptec 29160 and > Linux Jay 2.6.26 #7 SMP Sun Aug 10 18:34:22 CEST 2008 ppc 7455, altivec supported PowerMac3,6 GNU/Linux > > When I insert a disk and I mount it, scsi_test_unit_ready() is called and > the do-while loop gets sshdr->sense_key == UNIT_ATTENTION in the first > cycle and 0 in the second one. So the if below misses the UNIT_ATTENTION > and sdev->changed = 1 is not executed. At this point bad things can > happen... I'm not sure how to fix this. Any clue ? The problem is essentially caused by us eating UNIT_ATTENTION conditions in scsi_test_unit_ready(). Fix by updating the ->changed flag when this happens if the media is removable. [pochini@shiny.it: updates to tidy up patch] Signed-off-by: Giuliano Pochini <pochini@shiny.it> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-23 11:41:16 -05:00
Mike Christie	4a27446f3e	[SCSI] modify scsi to handle new fail fast flags. This checks the errors the scsi-ml determined were retryable and returns if we should fast fail it based on the request fail fast flags. Without the patch, drivers like lpfc, qla2xxx and fcoe would return DID_ERROR for what it determines is a temporary communication problem. There is no loss of connectivity at that time and the driver thinks that it would be fast to retry at the driver level. SCSI-ml will however sees fast fail on the request and DID_ERROR and will fast fail the io. This will then cause dm-multipath to fail the path and possibley switch target controllers when we should be retrying at the scsi layer. We also were fast failing device errors to dm multiapth when unless the scsi_dh modules think otherwis we want to retry at the scsi layer because multipath can only retry the IO like scsi should have done. multipath is a little dumber though because it does not what the error was for and assumes that it should fail the paths. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:52 -04:00
Mike Christie	f0c0a376d0	[SCSI] Add helper code so transport classes/driver can control queueing (v3) SCSI-ml manages the queueing limits for the device and host, but does not do so at the target level. However something something similar can come in userful when a driver is transitioning a transport object to the the blocked state, becuase at that time we do not want to queue io and we do not want the queuecommand to be called again. The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers. You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit a transport level queueing issue like the hw cannot allocate some resource at the iscsi session/connection level, or the target has temporarily closed or shrunk the queueing window, or if we are transitioning to the blocked state. bnx2i, when they rework their firmware according to netdev developers requests, will also need to be able to limit queueing at this level. bnx2i will hook into libiscsi, but will allocate a scsi host per netdevice/hba, so unlike pure software iscsi/iser which is allocating a host per session, it cannot set the scsi_host->can_queue and return SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport. The iscsi class/driver can also set a scsi_target->can_queue value which reflects the max commands the driver/class can support. For iscsi this reflects the number of commands we can support for each session due to session/connection hw limits, driver limits, and to also reflect the session/targets's queueing window. Changes: v1 - initial patch. v2 - Fix scsi_run_queue handling of multiple blocked targets. Previously we would break from the main loop if a device was added back on the starved list. We now run over the list and check if any target is blocked. v3 - Rediff for scsi-misc. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:46 -04:00
Linus Torvalds	ef5bef357c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits) [SCSI] zfcp: fix double dbf id usage [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev [SCSI] zfcp: fix erp list usage without using locks [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport [SCSI] zfcp: fix deadlock caused by shared work queue tasks [SCSI] zfcp: put threshold data in hba trace [SCSI] zfcp: Simplify zfcp data structures [SCSI] zfcp: Simplify get_adapter_by_busid [SCSI] zfcp: remove all typedefs and replace them with standards [SCSI] zfcp: attach and release SAN nameserver port on demand [SCSI] zfcp: remove unused references, declarations and flags [SCSI] zfcp: Update message with input from review [SCSI] zfcp: add queue_full sysfs attribute [SCSI] scsi_dh: suppress comparison warning [SCSI] scsi_dh: add Dell product information into rdac device handler [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option [SCSI] qla2xxx: fix printk format warnings [SCSI] qla2xxx: Update version number to 8.02.01-k8. [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing. [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling. ...	2008-10-10 10:53:26 -07:00
Jens Axboe	242f9dcb8b	block: unify request timeout handling Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:13 +02:00
James Bottomley	6f4267e3bd	[SCSI] Update the SCSI state model to allow blocking in the created state Brian King <brking@linux.vnet.ibm.com> reported that fibre channel devices can oops during scanning if their ports block (because the device goes from CREATED -> BLOCK -> RUNNING rather than CREATED -> BLOCK -> CREATED). Fix this by adding a new state: CREATED_BLOCK which can only transition back to CREATED and disallow the CREATED -> BLOCK transition. Now both the created and blocked states that the mid-layer recognises can include CREATED_BLOCK. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-03 11:46:13 -05:00
James Bottomley	44ea91c597	[SCSI] Fix hang with split requests Sometimes, particularly for USB devices with the last sector bug, requests get completed in chunks. There's a bug in this in that if one of the chunks gets an error, we complete that chunk with an error but never move on to the remaining ones, leading to the request hanging (because it's not fully completed). Fix this by completing all remaining chunks if an error is encountered. Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-09-23 12:29:01 -07:00
Harvey Harrison	cadbd4a5e3	[SCSI] replace __FUNCTION__ with __func__ [jejb: fixed up a ton of missed conversions. All of you are on notice this has happened, driver trees will now need to be rebased] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: SCSI List <linux-scsi@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-27 10:31:49 -04:00
Mike Christie	6bd522f6a2	[SCSI] scsi_lib: use blk_rq_tagged in scsi_request_fn I goofed and did not see the macro for checking if a request is tagged. This patch has us use blk_rq_tagged instead of digging into the req->tag. Patch was made over scsi-misc. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:59 -04:00
Martin K. Petersen	511e44f42e	[SCSI] Do not retry a request whose data integrity check failed If initiator or target reject the I/O due to DIF errors there is no point in retrying. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:55 -04:00
Martin K. Petersen	7027ad72a6	[SCSI] Support devices with protection information Implement support for DMA of protection information for devices that are data integrity capable. - Add support for mapping an extra scatter-gather list containing the protection information. - Allocate protection scsi_data_buffer if host is DIX (integrity DMA) capable. - Accessor function for checking whether a device has protection enabled. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:55 -04:00
Mike Christie	ecefe8a975	[SCSI] fix shared tag map tag allocation When drivers use a shared tag map we can end up with more requests than tags, because the tag map is shost->can_queue tags and there can be sdevs * sdev->queue_depth requests. In scsi_request_fn if tag allocation fails we just drop down to just dequeueing the tag without a tag. The problem is that drivers using the shared tag map rely on a valid tag always being set, because it will use the tag number to lookup commands later. This patch has us check if we got a valid tag when the host lock is held right before we check if the host queue is ready. We do the check here because to allocate the tag we need the q lock, but if the tag is bad we want to add the device/q onto the starved list which requires the host lock. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:50 -04:00
Linus Torvalds	89a93f2f48	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits) [SCSI] scsi_dh: fix kconfig related build errors [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h [SCSI] make struct scsi_{host,target}_type static [SCSI] fix locking in host use of blk_plug_device() [SCSI] zfcp: Cleanup external header file [SCSI] zfcp: Cleanup code in zfcp_erp.c [SCSI] zfcp: zfcp_fsf cleanup. [SCSI] zfcp: consolidate sysfs things into one file. [SCSI] zfcp: Cleanup of code in zfcp_aux.c [SCSI] zfcp: Cleanup of code in zfcp_scsi.c [SCSI] zfcp: Move status accessors from zfcp to SCSI include file. [SCSI] zfcp: Small QDIO cleanups [SCSI] zfcp: Adapter reopen for large number of unsolicited status [SCSI] zfcp: Fix error checking for ELS ADISC requests [SCSI] zfcp: wait until adapter is finished with ERP during auto-port [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver [SCSI] sg: Add target reset support [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC [SCSI] sd: Move scsi_disk() accessor function to sd.h ...	2008-07-15 18:58:04 -07:00
James Bottomley	2476b4d042	[SCSI] fix locking in host use of blk_plug_device() scsi_lib.c:scsi_host_queue_ready() plugs the device with incorrect locking. It should actually have the queue lock held, but it's holding the host lock. Fix this by eliminating the call. The host ready has no need to plug the queue because if it returns 0 in scsi_request_function control transfers to not_ready which acquires the queue lock and plugs the device if its at zero depth. Reported-by: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:36 -05:00
Martin K. Petersen	6362abd3e0	[SCSI] Rename scsi_bidi_sdb_cache The data integrity changes need to dynamically allocate scsi_data_buffers too. Rename scsi_bidi_sdb_cache for clarity. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:24 -05:00
Alan Stern	bdb2b8cab4	[SCSI] erase invalid data returned by device This patch (as1108) fixes a problem that can occur with certain USB mass-storage devices: They return invalid data together with a residue indicating that the data should be ignored. Rather than leave the invalid data in a transfer buffer, where it can get misinterpreted, the patch clears the invalid portion of the buffer. This solves a problem (wrong write-protect setting detected) reported by Maciej Rutecki and Peter Teoh. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Peter Teoh <htmldeveloper@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-06 11:33:08 -05:00
Chandra Seetharaman	a6a8d9f87e	[SCSI] scsi_dh: add infrastructure for SCSI Device Handlers Some of the storage devices (that can be accessed through multiple paths), do need some special handling for 1. Activating the passive path of the storage access. 2. Decode and handle the special sense codes returned by the devices. 3. Handle the I/Os being sent to the passive path, especially during the device probe time. when accessed through multiple paths. As of today this special device handling is done at the dm-multipath layer using dm-handlers. That works well for (1); for (2) to be handled at dm layer, scsi sense information need to be exported from SCSI to dm-layer, which is not very attractive; (3) cannot be done at all at the dm layer. Device handler has been moved to SCSI mainly to handle (2) and (3) properly. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-06-05 09:23:40 -05:00
Linus Torvalds	d626e3bf72	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: [SCSI] aic94xx: fix section mismatch [SCSI] u14-34f: Fix 32bit only problem [SCSI] dpt_i2o: sysfs code [SCSI] dpt_i2o: 64 bit support [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent [SCSI] dpt_i2o: use standard __init / __exit code [SCSI] megaraid_sas: fix suspend/resume sections [SCSI] aacraid: Add Power Management support [SCSI] aacraid: Fix jbod operations scan issues [SCSI] aacraid: Fix warning about macro side-effects [SCSI] add support for variable length extended commands [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer [SCSI] bsg: add large command support [SCSI] aacraid: Fix down_interruptible() to check the return value correctly [SCSI] megaraid_sas; Update the Version and Changelog [SCSI] ibmvscsi: Handle non SCSI error status [SCSI] bug fix for free list handling [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions [SCSI] megaraid_mbox: fix Dell CERC firmware problem	2008-05-02 13:52:35 -07:00
Boaz Harrosh	db4742dd8f	[SCSI] add support for variable length extended commands Add support for variable-length, extended, and vendor specific CDBs to scsi-ml. It is now possible for initiators and ULD's to issue these types of commands. LLDs need not change much. All they need is to raise the .max_cmd_len to the longest command they support (see iscsi patch). - clean-up some code paths that did not expect commands to be larger than 16, and change cmd_len members' type to short as char is not enough. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-05-02 11:33:25 -05:00
Boaz Harrosh	64a87b244b	[SCSI] Let scsi_cmnd->cmnd use request->cmd buffer - struct scsi_cmnd had a 16 bytes command buffer of its own. This is an unnecessary duplication and copy of request's cmd. It is probably left overs from the time that scsi_cmnd could function without a request attached. So clean that up. - Once above is done, few places, apart from scsi-ml, needed adjustments due to changing the data type of scsi_cmnd->cmnd. - Lots of drivers still use MAX_COMMAND_SIZE. So I have left that #define but equate it to BLK_MAX_CDB. The way I see it and is reflected in the patch below is. MAX_COMMAND_SIZE - means: The longest fixed-length () SCSI CDB as per the SCSI standard and is not related to the implementation. BLK_MAX_CDB. - The allocated space at the request level - I have audit all ISA drivers and made sure none use ->cmnd in a DMA Operation. Same audit was done by Andi Kleen. ()fixed-length here means commands that their size can be determined by their opcode and the CDB does not carry a length specifier, (unlike the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly true and the SCSI standard also defines extended commands and vendor specific commands that can be bigger than 16 bytes. The kernel will support these using the same infrastructure used for VARLEN CDB's. So in effect MAX_COMMAND_SIZE means the maximum size command scsi-ml supports without specifying a cmd_len by ULD's Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-05-02 10:18:22 -05:00
Nick Piggin	75ad23bc0f	block: make queue flags non-atomic We can save some atomic ops in the IO path, if we clearly define the rules of how to modify the queue flags. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 14:48:33 +02:00
James Bottomley	fa8e36c39b	[SCSI] fix barrier failure issue Currently, if the barrier command fails, the error return isn't seen by the block layer and it proceeds on regardless. The problem is that SCSI always returns no error for REQ_TYPE_BLOCK_PC ... it expects the submitter to pick the errors out of req->errors, which the block barrier functions don't do. Since it appears that the way SG_IO and scsi_execute_request() work they discard the block error return and always use req->errors, the best fix for this is to have the SCSI layer return an error to block if one actually occurred (this also allows us to filter out spurious errors, like deferred sense). This patch is a bug fix that will need backporting to stable, but it's also quite a big change and in need of testing, so we'll incubate in the main kernel tree and backport at the -rc2 or so stage if no problems turn up. Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-07 12:19:10 -05:00
Adrian Bunk	8c5e03d3cf	[SCSI] make scsi_end_bidi_request() static This patch makes the needlessly global scsi_end_bidi_request() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-07 12:19:08 -05:00
Kay Sievers	4d1566ed21	[SCSI] fix media change events for polled devices Commit: `a341cd0f` (SCSI: add asynchronous event notification API) breaks: `285e9670` (sr,sd: send media state change modification events) by introducing an event filter, which is removed here, to make events, we are depending on, happen again. Fix this by removing the event filter. It's pretty much broken at the moment, since a user can't set it (the attribute being read only). A proper fix will be to make the event discriminator distinguish between AN and Polled media change events. Cc: David Zeuthen <david@fubar.dk> Cc: kristen accardi <kaccardi@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-03-19 11:51:28 -05:00
Tejun Heo	6b00769fe1	block: add request->raw_data_len With padding and draining moved into it, block layer now may extend requests as directed by queue parameters, so now a request has two sizes - the original request size and the extended size which matches the size of area pointed to by bios and later by sgs. The latter size is what lower layers are primarily interested in when allocating, filling up DMA tables and setting up the controller. Both padding and draining extend the data area to accomodate controller characteristics. As any controller which speaks SCSI can handle underflows, feeding larger data area is safe. So, this patch makes the primary data length field, request->data_len, indicate the size of full data area and add a separate length field, request->raw_data_len, for the unmodified request size. The latter is used to report to higher layer (userland) and where the original request size should be fed to the controller or device. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-19 11:36:35 +01:00
Tony Battersby	4d2de3a50c	[SCSI] fix BUG when sum(scatterlist) > bufflen When sending a SCSI command to a tape drive via the SCSI Generic (sg) driver, if the command has a data transfer length more than scatter_elem_sz (32 KB default) and not a multiple of 512, then I either hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else the command never completes (depending on the LLDD). When constructing scatterlists, the sg driver rounds up the scatterlist element sizes to be a multiple of 512. This can result in sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg() incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to bufflen. When the command completes, req_bio_endio() detects that bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the command to be resubmitted, resulting in BUG_ON or the command never completing. This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather than to sum(scatterlist lengths), which fixes the problem. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-02-07 18:02:44 -06:00

... 3 4 5 6 7 ...

598 commits