Commit graph

722062 commits

Author SHA1 Message Date
Manish Rangankar
967823d6c3 scsi: qedi: Fix kernel crash during port toggle
BUG: unable to handle kernel NULL pointer dereference at 0000000000000100

[  985.596918] IP: _raw_spin_lock_bh+0x17/0x30
[  985.601581] PGD 0 P4D 0
[  985.604405] Oops: 0002 [#1] SMP
:
[  985.704533] CPU: 16 PID: 1156 Comm: qedi_thread/16 Not tainted 4.16.0-rc2 #1
[  985.712397] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.4.3 01/17/2017
[  985.720747] RIP: 0010:_raw_spin_lock_bh+0x17/0x30
[  985.725996] RSP: 0018:ffffa4b1c43d3e10 EFLAGS: 00010246
[  985.731823] RAX: 0000000000000000 RBX: ffff94a31bd03000 RCX: 0000000000000000
[  985.739783] RDX: 0000000000000001 RSI: ffff94a32fa16938 RDI: 0000000000000100
[  985.747744] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000a33
[  985.755703] R10: 0000000000000000 R11: ffffa4b1c43d3af0 R12: 0000000000000000
[  985.763662] R13: ffff94a301f40818 R14: 0000000000000000 R15: 000000000000000c
[  985.771622] FS:  0000000000000000(0000) GS:ffff94a32fa00000(0000) knlGS:0000000000000000
[  985.780649] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  985.787057] CR2: 0000000000000100 CR3: 000000067a009006 CR4: 00000000001606e0
[  985.795017] Call Trace:
[  985.797747]  qedi_fp_process_cqes+0x258/0x980 [qedi]
[  985.803294]  qedi_percpu_io_thread+0x10f/0x1b0 [qedi]
[  985.808931]  kthread+0xf5/0x130
[  985.812434]  ? qedi_free_uio+0xd0/0xd0 [qedi]
[  985.817298]  ? kthread_bind+0x10/0x10
[  985.821372]  ? do_syscall_64+0x6e/0x1a0

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:54 -05:00
Darren Trapp
2b5b96473e scsi: qla2xxx: Fix FC-NVMe LUN discovery
Commit a4239945b8 ("scsi: qla2xxx: Add switch command to simplify
fabric discovery") introduced a regression because it did not consider
the FC-NVMe code path, which broke NVMe LUN discovery.

Fixes: a4239945b8 ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:53 -05:00
Hannes Reinecke
e39a97353e scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()
When converting __scsi_error_from_host_byte() to BLK_STS error codes the
case DID_OK was forgotten, resulting in it always returning an error.
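
The omission can be illustrated with a minimal userspace sketch. The enum values and the function name error_from_host_byte() below are stand-ins invented for illustration, not the kernel's definitions; the only point is that without an explicit success case, the default branch turns success into an I/O error.

```c
/* Hypothetical stand-ins for the kernel's host-byte and block-status codes. */
enum host_byte { HOST_OK = 0, HOST_TRANSPORT_FAILFAST, HOST_NO_SPACE };
enum blk_status { STS_OK = 0, STS_TRANSPORT, STS_NOSPC, STS_IOERR };

/* Maps a host byte to a block status; HOST_OK needs its own case. */
enum blk_status error_from_host_byte(enum host_byte hb)
{
    switch (hb) {
    case HOST_OK:
        return STS_OK;          /* the kind of case that had been left out */
    case HOST_TRANSPORT_FAILFAST:
        return STS_TRANSPORT;
    case HOST_NO_SPACE:
        return STS_NOSPC;
    default:
        return STS_IOERR;       /* anything unlisted is a generic I/O error */
    }
}
```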

Fixes: 2a842acab1 ("block: introduce new block status code type")
Cc: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:52 -05:00
Bart Van Assche
3be8828fc5 scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops
Avoid that the recently introduced call_rcu() call in the SCSI core
triggers a double call_rcu() call.

Reported-by: Natanael Copa <ncopa@alpinelinux.org>
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=198861
Fixes: 3bd6f43f5c ("scsi: core: Ensure that the SCSI error handler gets woken up")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Natanael Copa <ncopa@alpinelinux.org>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Alexandre Oliva <oliva@gnu.org>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:52 -05:00
Hannes Reinecke
fa83e65885 scsi: qla2xxx: ensure async flags are reset correctly
The fcport flags FCF_ASYNC_ACTIVE and FCF_ASYNC_SENT are used to
throttle the state machine, so we need to ensure to always set and unset
them correctly. Not doing so will lead to the state machine getting
confused and no login attempt into remote ports.

Cc: Quinn Tran <quinn.tran@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Fixes: 3dbec59bdf ("scsi: qla2xxx: Prevent multiple active discovery commands per session")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:51 -05:00
Hannes Reinecke
07ea4b6026 scsi: qla2xxx: do not check login_state if no loop id is assigned
When no loop id is assigned in qla24xx_fcport_handle_login() the login
state needs to be ignored; it will get set later on in
qla_chk_n2n_b4_login().

Cc: Quinn Tran <quinn.tran@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Fixes: 040036bb0b ("scsi: qla2xxx: Delay loop id allocation at login")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:51 -05:00
Hannes Reinecke
1c6cacf4ea scsi: qla2xxx: Fixup locking for session deletion
Commit d8630bb95f ('Serialize session deletion by using work_lock')
tries to fixup a deadlock when deleting sessions, but fails to take into
account the locking rules. This patch resolves the situation by
introducing a separate lock for processing the GNLIST response, and
ensures that sess_lock is released before calling
qlt_schedule_sess_delete().

Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Quinn Tran <quinn.tran@cavium.com>
Fixes: d8630bb95f ("scsi: qla2xxx: Serialize session deletion by using work_lock")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:50 -05:00
himanshu.madhani@cavium.com
1514839b36 scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
This patch fixes NULL pointer crash due to active timer running for abort
IOCB.

From crash dump analysis it was discovered that get_next_timer_interrupt()
encountered a corrupted entry on the timer list.

 #9 [ffff95e1f6f0fd40] page_fault at ffffffff914fe8f8
    [exception RIP: get_next_timer_interrupt+440]
    RIP: ffffffff90ea3088  RSP: ffff95e1f6f0fdf0  RFLAGS: 00010013
    RAX: ffff95e1f6451028  RBX: 000218e2389e5f40  RCX: 00000001232ad600
    RDX: 0000000000000001  RSI: ffff95e1f6f0fdf0  RDI: 0000000001232ad6
    RBP: ffff95e1f6f0fe40   R8: ffff95e1f6451188   R9: 0000000000000001
    R10: 0000000000000016  R11: 0000000000000016  R12: 00000001232ad5f6
    R13: ffff95e1f6450000  R14: ffff95e1f6f0fdf8  R15: ffff95e1f6f0fe10
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Looking at the assembly of get_next_timer_interrupt(), address came
from %r8 (ffff95e1f6451188) which is pointing to list_head with single
entry at ffff95e5ff621178.

 0xffffffff90ea307a <get_next_timer_interrupt+426>:      mov    (%r8),%rdx
 0xffffffff90ea307d <get_next_timer_interrupt+429>:      cmp    %r8,%rdx
 0xffffffff90ea3080 <get_next_timer_interrupt+432>:      je     0xffffffff90ea30a7 <get_next_timer_interrupt+471>
 0xffffffff90ea3082 <get_next_timer_interrupt+434>:      nopw   0x0(%rax,%rax,1)
 0xffffffff90ea3088 <get_next_timer_interrupt+440>:      testb  $0x1,0x18(%rdx)

 crash> rd ffff95e1f6451188 10
 ffff95e1f6451188:  ffff95e5ff621178 ffff95e5ff621178   x.b.....x.b.....
 ffff95e1f6451198:  ffff95e1f6451198 ffff95e1f6451198   ..E.......E.....
 ffff95e1f64511a8:  ffff95e1f64511a8 ffff95e1f64511a8   ..E.......E.....
 ffff95e1f64511b8:  ffff95e77cf509a0 ffff95e77cf509a0   ...|.......|....
 ffff95e1f64511c8:  ffff95e1f64511c8 ffff95e1f64511c8   ..E.......E.....

 crash> rd ffff95e5ff621178 10
 ffff95e5ff621178:  0000000000000001 ffff95e15936aa00   ..........6Y....
 ffff95e5ff621188:  0000000000000000 00000000ffffffff   ................
 ffff95e5ff621198:  00000000000000a0 0000000000000010   ................
 ffff95e5ff6211a8:  ffff95e5ff621198 000000000000000c   ..b.............
 ffff95e5ff6211b8:  00000f5800000000 ffff95e751f8d720   ....X... ..Q....

 ffff95e5ff621178 belongs to freed mempool object at ffff95e5ff621080.

 CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
 ffff95dc7fd74d00 mnt_cache                384      19785     24948    594    16k
   SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
   ffffdc5dabfd8800  ffff95e5ff620000     1     42         29    13
   FREE / [ALLOCATED]
    ffff95e5ff621080  (cpu 6 cache)

Examining the contents of that memory reveals a pointer to a constant string
in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().

 crash> rd ffffffffc059277c 20
 ffffffffc059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
 ffffffffc059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
 ffffffffc059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
 ffffffffc05927ac:  636976656420676e 786c252074612065   ng device at %lx
 ffffffffc05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
 ffffffffc05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
 ffffffffc05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
 ffffffffc05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
 ffffffffc05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
 ffffffffc059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma

 crash> struct -ox srb_iocb
 struct srb_iocb {
           union {
               struct {...} logio;
               struct {...} els_logo;
               struct {...} tmf;
               struct {...} fxiocb;
               struct {...} abt;
               struct ct_arg ctarg;
               struct {...} mbx;
               struct {...} nack;
    [0x0 ] } u;
    [0xb8] struct timer_list timer;
    [0x108] void (*timeout)(void *);
 }
 SIZE: 0x110

 crash> ! bc
 ibase=16
 obase=10
 B8+40
 F8

The object is a srb_t, and at offset 0xf8 within that structure
(i.e. ffff95e5ff621080 + f8 -> ffff95e5ff621178) is a struct timer_list.

Cc: <stable@vger.kernel.org> #4.4+
Fixes: 4440e46d5d ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous handling.")
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:33 -05:00
Sreekanth Reddy
c666d3be99 scsi: mpt3sas: wait for and flush running commands on shutdown/unload
This patch finishes all outstanding SCSI IO commands (but not other commands,
e.g., task management) in the shutdown and unload paths.

It first waits for the commands to complete (this is done after setting
'ioc->remove_host = 1', which prevents new commands from being queued), then it
flushes commands that might still be running.

This avoids triggering error handling (e.g., abort command) for all commands
possibly completed by the adapter after interrupts are disabled.

[mauricfo: introduced something in commit message.]

Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-21 22:59:39 -05:00
Mauricio Faria de Oliveira
9ff549ffb4 scsi: mpt3sas: fix oops in error handlers after shutdown/unload
This patch adds checks for 'ioc->remove_host' in the SCSI error handlers, so as
not to access pointers/resources potentially freed in the PCI shutdown/module
unload path.  The error handlers may be invoked after shutdown/unload,
depending on other components.

This problem was observed with kexec on a system with a mpt3sas based adapter
and an infiniband adapter which takes long enough to shut down:

The mpt3sas driver finished shutting down / disabled interrupt handling, thus
some commands have not finished and timed out.

Since the system was still running (waiting for the infiniband adapter to
shut down), the scsi error handler for task abort of mpt3sas was invoked, and
hit an oops -- either in scsih_abort() because 'ioc->scsi_lookup' was NULL
without commit dbec4c9040 ("scsi: mpt3sas: lockless command submission"), or
later up in scsih_host_reset() (with or without that commit), because it
eventually called mpt3sas_base_get_iocstate().

After the above commit, the oops in scsih_abort() does not occur anymore
(_scsih_scsi_lookup_find_by_scmd() is no longer called), but that commit is
too big and out of the scope of linux-stable, where this patch might help, so
still go for the changes.

Also, this might help to prevent similar errors in the future, in case code
changes and possibly tries to access freed stuff.

Note the fix in scsih_host_reset() is still important anyway.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-21 22:45:45 -05:00
Michael Kelley (EOSG)
9cfad4a5f4 scsi: storvsc: Spread interrupts when picking a channel for I/O requests
Update the algorithm in storvsc_do_io to look for a channel
starting with the current CPU + 1 and wrap around (within the
current NUMA node). This spreads VMbus interrupts more evenly
across CPUs. Previous code always started with first CPU in
the current NUMA node, skewing the interrupt load to that CPU.
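
The wrap-around search described above can be sketched in plain C. This is an illustration only, not the storvsc code: the function name next_cpu_wrap(), the 64-bit mask representation (bit i stands for CPU i), and the ncpus parameter are all assumptions made for the example.

```c
#include <stdint.h>

/*
 * Returns the first CPU set in node_mask strictly after `cpu`, wrapping
 * around to the lowest CPUs and finally considering `cpu` itself.
 * Returns -1 if the mask is empty. Requires 0 <= cpu < ncpus <= 64.
 */
int next_cpu_wrap(uint64_t node_mask, int cpu, int ncpus)
{
    for (int step = 1; step <= ncpus; step++) {
        int candidate = (cpu + step) % ncpus;

        if (node_mask & (UINT64_C(1) << candidate))
            return candidate;
    }
    return -1;
}
```

Starting the search at cpu + 1 means successive callers on different CPUs tend to land on different channels, whereas always starting at the first CPU of the node returns the same channel every time.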

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-21 22:30:40 -05:00
Shivasharan S
9ff97fa8db scsi: megaraid_sas: Do not use 32-bit atomic request descriptor for Ventura controllers
Problem Statement: Sending I/O through 32-bit descriptors to Ventura series
controllers results in I/O timeouts under certain conditions.

This error only occurs on systems with high I/O activity on Ventura series
controllers.

The changes in this patch prevent the driver from using 32-bit descriptors, so
that it uses 64-bit descriptors instead.

Cc: <stable@vger.kernel.org>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-15 18:23:37 -05:00
Manish Rangankar
1bc5ad3a6a scsi: qla4xxx: skip error recovery in case of register disconnect.
A system crashes when continuously removing/re-adding the storage
controller.

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:41 -05:00
Meelis Roos
00c20cdc79 scsi: aacraid: fix shutdown crash when init fails
When aacraid init fails with "AAC0: adapter self-test failed.", shutdown
leads to UBSAN warning and then oops:

[154316.118423] ================================================================================
[154316.118508] UBSAN: Undefined behaviour in drivers/scsi/scsi_lib.c:2328:27
[154316.118566] member access within null pointer of type 'struct Scsi_Host'
[154316.118631] CPU: 2 PID: 14530 Comm: reboot Tainted: G        W        4.15.0-dirty #89
[154316.118701] Hardware name: Hewlett Packard HP NetServer/HP System Board, BIOS 4.06.46 PW 06/25/2003
[154316.118774] Call Trace:
[154316.118848]  dump_stack+0x48/0x65
[154316.118916]  ubsan_epilogue+0xe/0x40
[154316.118976]  __ubsan_handle_type_mismatch+0xfb/0x180
[154316.119043]  scsi_block_requests+0x20/0x30
[154316.119135]  aac_shutdown+0x18/0x40 [aacraid]
[154316.119196]  pci_device_shutdown+0x33/0x50
[154316.119269]  device_shutdown+0x18a/0x390
[...]
[154316.123435] BUG: unable to handle kernel NULL pointer dereference at 000000f4
[154316.123515] IP: scsi_block_requests+0xa/0x30

This is because aac_shutdown() does

        struct Scsi_Host *shost = pci_get_drvdata(dev);
        scsi_block_requests(shost);

and that assumes shost has been assigned with pci_set_drvdata().

However, pci_set_drvdata(pdev, shost) is done in aac_probe_one() far
after bailing out with an error from calling the init function
((*aac_drivers[index].init)(aac)), and when the init function fails, no
error is returned from aac_probe_one(), so the PCI layer assumes there is
a driver attached and tries to shut it down later.

Fix it by returning error from aac_probe_one() when card-specific init
function fails.

This fixes reboot on my HP NetRAID-4M with dead battery.

Signed-off-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:40 -05:00
Nilesh Javali
2c08fe64e4 scsi: qedi: Cleanup local str variable
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Chris Leech <cleech@redhat.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:40 -05:00
Andrew Vasquez
1683ce57f5 scsi: qedi: Fix truncation of CHAP name and secret
The data in NVRAM is not guaranteed to be NUL terminated.  Since
snprintf expects byte-stream to accommodate null byte, the CHAP secret
is truncated.  Use sprintf instead of snprintf to fix the truncation of
CHAP name and secret.
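
The truncation can be reproduced in userspace. The sketch below is an assumption-laden illustration, not the qedi code: FIELD_LEN and both helper names are made up, and the precision-limited "%.*s" form is one standard way to read a fixed-width field that has no terminating NUL.

```c
#include <stdio.h>

#define FIELD_LEN 4  /* hypothetical fixed-width NVRAM field size */

/* Size-limited copy: snprintf reserves one byte of the limit for the NUL,
 * so only FIELD_LEN - 1 characters of the field survive. */
void copy_with_snprintf(char *dst, const char *field)
{
    snprintf(dst, FIELD_LEN, "%.*s", FIELD_LEN, field);
}

/* Precision-limited copy: reads at most FIELD_LEN bytes from the field and
 * then appends the NUL; dst must hold at least FIELD_LEN + 1 bytes. */
void copy_with_sprintf(char *dst, const char *field)
{
    sprintf(dst, "%.*s", FIELD_LEN, field);
}
```

With a four-byte field holding 'a', 'b', 'c', 'd' and no terminator, the first helper yields "abc" and the second yields "abcd".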

Signed-off-by: Andrew Vasquez <andrew.vasquez@cavium.com>
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Chris Leech <cleech@redhat.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:39 -05:00
Himanshu Madhani
f376722502 scsi: qla2xxx: Fix incorrect handle for abort IOCB
This patch fixes incorrect handle used for abort IOCB.

Fixes: b027a5ace4 ("scsi: qla2xxx: Fix queue ID for async abort with Multiqueue")
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:39 -05:00
Quinn Tran
eaf75d1815 scsi: qla2xxx: Fix double free bug after firmware timeout
This patch is based on Max's original patch.

When the qla2xxx firmware is unavailable, eventually
qla2x00_sp_timeout() is reached, which calls the timeout function and
frees the srb_t instance.

The timeout function always resolves to qla2x00_async_iocb_timeout(),
which invokes another callback function called "done".  All of these
qla2x00_*_sp_done() callbacks also free the srb_t instance; after
returning to qla2x00_sp_timeout(), it is freed again.

The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
and add it to those code paths in qla2x00_async_iocb_timeout() which
do not already free the object.
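
The ownership rule behind the fix can be sketched as follows. Everything here is hypothetical (struct srb, the free_count field, and all function names are invented for the example); a counter stands in for the real release so that a double free shows up as a count of 2 instead of memory corruption.

```c
struct srb {
    int free_count;                 /* counts releases, for illustration */
    void (*done)(struct srb *sp);   /* completion callback, may be null */
};

/* Stand-in for the real release of the object. */
static void srb_release(struct srb *sp)
{
    sp->free_count++;
}

/* A completion callback that releases the object itself. */
void done_and_release(struct srb *sp)
{
    srb_release(sp);
}

/* Inner timeout path: releases only when no callback will do it. */
void iocb_timeout(struct srb *sp)
{
    if (sp->done)
        sp->done(sp);       /* the callback owns the release */
    else
        srb_release(sp);    /* no callback, so release here */
}

/* Outer timer handler: does not release again after the inner path returns. */
void sp_timeout(struct srb *sp)
{
    iocb_timeout(sp);
}
```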

This is what it looks like with KASAN:

BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
Read of size 8 at addr ffff88278147a590 by task swapper/2/0

Allocated by task 1502:
save_stack+0x33/0xa0
kasan_kmalloc+0xa0/0xd0
kmem_cache_alloc+0xb8/0x1c0
mempool_alloc+0xd6/0x260
qla24xx_async_gnl+0x3c5/0x1100

Freed by task 0:
save_stack+0x33/0xa0
kasan_slab_free+0x72/0xc0
kmem_cache_free+0x75/0x200
qla24xx_async_gnl_sp_done+0x556/0x9e0
qla2x00_async_iocb_timeout+0x1c7/0x420
qla2x00_sp_timeout+0x16d/0x250
call_timer_fn+0x36/0x200

The buggy address belongs to the object at ffff88278147a440
which belongs to the cache qla2xxx_srbs of size 344
The buggy address is located 336 bytes inside of
344-byte region [ffff88278147a440, ffff88278147a598)

Reported-by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Max Kellermann <mk@cm4all.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:38 -05:00
Michael Kelley (EOSG)
cabe92a55e scsi: storvsc: Increase cmd_per_lun for higher speed devices
Increase cmd_per_lun to allow more I/Os in progress per device,
particularly for NVMe's.  The Hyper-V host side can handle the higher
count with no issues.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-13 21:35:37 -05:00
Bart Van Assche
50dbd09c56 scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els()
Ensure that upon return the tgt->ha->tgt.sess_lock spin lock is unlocked
no matter which code path is taken through this function.  This was
detected by sparse.

Fixes: 82abdcaf3e ("scsi: qla2xxx: Allow target mode to accept PRLI in dual mode")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Quinn Tran <quinn.tran@cavium.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-06 18:11:58 -05:00
Bart Van Assche
f5572475e9 scsi: scsi_dh: Document alua_rtpg_queue() arguments
Since commit 3a025e1d1c ("Add optional check for bad kernel-doc
comments") building with W=1 causes warnings to appear for issues in
kernel-doc headers. This patch avoids that the following warnings are
reported when building with W=1:

drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'pg'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'sdev'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'qdata'
drivers/scsi/device_handler/scsi_dh_alua.c:867: warning: No description found for parameter 'force'

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 22:26:26 -05:00
Corentin Labbe
2e8233ab17 scsi: Remove Makefile entry for oktagon files
Remove line using non-existent files which were removed in
commit 642978beb4 ("[SCSI] remove m68k NCR53C9x based drivers")

[mkp: tweaked patch description]

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 22:26:25 -05:00
Corentin Labbe
7c0dde2b3d scsi: aic7xxx: remove aiclib.c
aiclib.c is unused (and contains no code) since commit 1ff927306e
("[SCSI] aic7xxx: remove aiclib.c")

13 years later, finish the cleaning by removing it from tree.

[mkp: tweaked patch description]

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 22:26:14 -05:00
Bart Van Assche
c02189e12c scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
A left shift must shift less than the bit width of the left argument.
Avoid triggering undefined behavior if ha->mbx_count == 32.
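
One common way to build such a mask without undefined behavior is to shift a 64-bit one, so the shift count stays below the operand width even when it equals 32. The helper name mbx_mask() below is an assumption for illustration and is not taken from the driver.

```c
#include <stdint.h>

/* Returns a mask with the low `count` bits set, valid for 0 <= count <= 32.
 * The 64-bit operand keeps a shift by 32 well defined. */
uint32_t mbx_mask(unsigned int count)
{
    return (uint32_t)((UINT64_C(1) << count) - 1);
}
```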

This patch avoids that UBSAN reports the following complaint:

UBSAN: Undefined behaviour in drivers/scsi/qla2xxx/qla_isr.c:275:14
shift exponent 32 is too large for 32-bit type 'int'
Call Trace:
 dump_stack+0x4e/0x6c
 ubsan_epilogue+0xd/0x3b
 __ubsan_handle_shift_out_of_bounds+0x112/0x14c
 qla2x00_mbx_completion+0x1c5/0x25d [qla2xxx]
 qla2300_intr_handler+0x1ea/0x3bb [qla2xxx]
 qla2x00_mailbox_command+0x77b/0x139a [qla2xxx]
 qla2x00_mbx_reg_test+0x83/0x114 [qla2xxx]
 qla2x00_chip_diag+0x354/0x45f [qla2xxx]
 qla2x00_initialize_adapter+0x2c2/0xa4e [qla2xxx]
 qla2x00_probe_one+0x1681/0x392e [qla2xxx]
 pci_device_probe+0x10b/0x1f1
 driver_probe_device+0x21f/0x3a4
 __driver_attach+0xa9/0xe1
 bus_for_each_dev+0x6e/0xb5
 driver_attach+0x22/0x3c
 bus_add_driver+0x1d1/0x2ae
 driver_register+0x78/0x130
 __pci_register_driver+0x75/0xa8
 qla2x00_module_init+0x21b/0x267 [qla2xxx]
 do_one_initcall+0x5a/0x1e2
 do_init_module+0x9d/0x285
 load_module+0x20db/0x38e3
 SYSC_finit_module+0xa8/0xbc
 SyS_finit_module+0x9/0xb
 do_syscall_64+0x77/0x271
 entry_SYSCALL64_slow_path+0x25/0x25

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:33:10 -05:00
Dan Carpenter
a7043e9529 scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
My static checker complains about an out of bounds read:

    drivers/message/fusion/mptctl.c:2786 mptctl_hp_targetinfo()
    error: buffer overflow 'hd->sel_timeout' 255 <= u32max.

It's true that we probably should have a bounds check here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:32:06 -05:00
Dan Carpenter
e6f791d953 scsi: sym53c8xx_2: iterator underflow in sym_getsync()
We wanted to exit the loop with "div" set to zero, but instead, if we
don't hit the break then "div" is -1 when we finish the loop.  It leads
to an array underflow a few lines later.
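
The difference between the two loop forms can be seen in a small standalone sketch. The table contents and function names are invented for the example; only the loop conditions matter.

```c
/* Post-decrement: when nothing matches, the final failed test still
 * decrements div, so it leaves the loop at -1. */
int scan_post_decrement(const int *table, int n, int want)
{
    int div = n;

    while (div-- > 0)
        if (table[div] == want)
            break;
    return div;
}

/* Pre-decrement: when nothing matches, the loop stops with div == 0,
 * which is still a valid index into the table. */
int scan_pre_decrement(const int *table, int n, int want)
{
    int div = n;

    while (--div > 0)
        if (table[div] == want)
            break;
    return div;
}
```

Note that the pre-decrement form never tests entry 0; it treats it as the fallback when no higher entry matches.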

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:29:16 -05:00
Chad Dupuis
ecf7ff4994 scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
When a request times out we set the io_req flag BNX2FC_FLAG_IO_COMPL so
that if a subsequent completion comes in on that task ID we will ignore
it.  The issue is that in the check for this flag there is a missing
return so we will continue to process a request which may have already
been returned to the ownership of the SCSI layer.  This can cause
unpredictable results.

Solution is to add in the missing return.

[mkp: typo plus title shortening]

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:27:02 -05:00
Colin Ian King
52797a1d4b scsi: csiostor: remove redundant assignment to pointer 'ln'
The pointer ln is assigned a value that is never read, it is re-assigned
a new value in the list_for_each loop hence the initialization is
redundant and can be removed.

Cleans up clang warning:
drivers/scsi/csiostor/csio_lnode.c:117:21: warning: Value stored to 'ln'
during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:24:07 -05:00
Sujit Reddy Thumma
84af7e8b89 scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
WRITE_SAME command is not supported by UFS. Enable a quirk for the upper
level drivers to not send WRITE SAME command.

[mkp: botched patch, applied by hand]

Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:19:53 -05:00
Tyrel Datwyler
c398136527 scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info
The fcp_rsp_info structure as defined in the FC spec has an initial 3
bytes reserved field. The ibmvfc driver mistakenly defined this field as
4 bytes resulting in the rsp_code field being defined in what should be
the start of the second reserved field and thus always being reported as
zero by the driver.
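
The effect of the one-byte shift can be checked with offsetof(). The two structures below are hypothetical reconstructions from the description above (a leading reserved field of 3 versus 4 bytes), not copies of the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout with a 3-byte leading reserved field, as the text describes. */
struct rsp_info_3 {
    uint8_t reserved[3];
    uint8_t rsp_code;
    uint8_t reserved2[4];
};

/* Mistaken layout: a 4-byte reserved field pushes rsp_code one byte later. */
struct rsp_info_4 {
    uint8_t reserved[4];
    uint8_t rsp_code;
    uint8_t reserved2[3];
};

/* Read the response code from a raw buffer using each layout. */
uint8_t rsp_code_3(const uint8_t *raw)
{
    return raw[offsetof(struct rsp_info_3, rsp_code)];
}

uint8_t rsp_code_4(const uint8_t *raw)
{
    return raw[offsetof(struct rsp_info_4, rsp_code)];
}
```

With a response code stored in byte 3 of the buffer, the mistaken layout reads byte 4 instead, which lies in the second reserved field.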

Ideally, we should wire ibmvfc up with libfc for the sake of code
deduplication, and ease of maintaining standardized structures in a
single place. However, for now simply fixup the definition in ibmvfc for
backporting to distros on older kernels. Wiring up with libfc will be
done in a followup patch.

Cc: <stable@vger.kernel.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:16:27 -05:00
Quinn Tran
2ce87cc5b2 scsi: qla2xxx: Fix memory corruption during hba reset test
This patch fixes memory corruption while performing HBA Reset test.

Following stack trace is seen:

[  466.397219] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[  466.433669] IP: [<ffffffffc06f5dd0>] qlt_free_session_done+0x260/0x5f0 [qla2xxx]
[  466.467731] PGD 0
[  466.476718] Oops: 0000 [#1] SMP

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:14:30 -05:00
Tomas Henzl
4a8842de8d scsi: mpt3sas: fix an out of bound write
cpu_msix_table is allocated to store online cpus, but pci_irq_get_affinity
may return cpu_possible_mask which is then used to access cpu_msix_table.
That can write past the end of the table.  The fix limits access to online
cpus only; I've also added an additional test to protect from an unlikely
change in cpu_online_mask.

[mkp: checkpatch]

Fixes: 1d55abc0e9 ("scsi: mpt3sas: switch to pci_alloc_irq_vectors")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-30 21:10:44 -05:00
Himanshu Madhani
a2390348c1 scsi: qla2xxx: Fix logo flag for qlt_free_session_done()
Commit 3515832cc6 ("scsi: qla2xxx: Reset the logo flag, after target
re-login.") fixed the target re-login after session relogin is complete,
but missed out the qlt_free_session_done() path.

This patch clears send_els_logo flag in qlt_free_session_done()
callback.

[mkp: checkpatch]

Fixes: 3515832cc6 ("scsi: qla2xxx: Reset the logo flag, after target re-login.")
Signed-off-by: Himanshu Madhani <hmadhani@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:07:39 -05:00
Arnd Bergmann
45596c7889 scsi: arcmsr: avoid do_gettimeofday
The arcmsr uses its own implementation of time_to_tm(), along with
do_gettimeofday() to read the current time. While the algorithm used
here is fine in principle, it suffers from two problems:

- it assigns the seconds portion of the timeval to a 32-bit unsigned
  integer that overflows in 2106 even on 64-bit architectures.

- do_gettimeofday() returns a time_t that overflows in 2038 on all
  32-bit systems.

This changes the time retrieval function to ktime_get_real_seconds(),
which returns a proper 64-bit value, and replaces the open-coded
time_to_tm() algorithm with a call to the safe time64_to_tm().

I checked the way all numbers are indexed and found that months are given in
range 0..11 while the days are in range 1..31, same as 'struct tm', but
the year value that the firmware expects starts in 2000 while 'struct
tm' is based on year 1900, so it needs a small adjustment.
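
Both points can be illustrated with two tiny helpers. The names are invented for the example, and the year adjustment simply encodes the 1900-versus-2000 base described above rather than the driver's actual code.

```c
#include <stdint.h>

/* struct tm counts years from 1900; the firmware is described as
 * counting from 2000, so the difference is a fixed 100 years. */
int fw_year_from_tm_year(int tm_year)
{
    return tm_year - 100;
}

/* Storing 64-bit seconds in an unsigned 32-bit integer wraps to zero
 * once the value reaches 2^32, which is the 2106 overflow. */
uint32_t seconds_truncated(int64_t secs)
{
    return (uint32_t)secs;
}
```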

[mkp: checkpatch tweaks]

Fixes: b416c09947 ("scsi: arcmsr: Add a function to set date and time to firmware")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:04:02 -05:00
Hannes Reinecke
9c661a49e4 scsi: core: Add VENDOR_SPECIFIC sense code definitions
Some older devices will return vendor specific sense codes, so we should
be adding a definition for it.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:04:02 -05:00
Manish Rangankar
a1a20ffde2 scsi: qedi: Drop cqe response during connection recovery
We get stuck in the loop when firmware sends a cqe response during
connection recovery.

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:04:01 -05:00
Arnd Bergmann
96d5eaa9bb scsi: fas216: fix sense buffer initialization
While testing with the ARM specific memset() macro removed, I ran into a
compiler warning that shows an old bug:

drivers/scsi/arm/fas216.c: In function 'fas216_rq_sns_done':
drivers/scsi/arm/fas216.c:2014:40: error: argument to 'sizeof' in 'memset' call is the same expression as the destination; did you mean to provide an explicit length? [-Werror=sizeof-pointer-memaccess]

It turns out that the definition of the scsi_cmd structure changed back
in linux-2.6.25, so now we clear only four bytes (sizeof(pointer))
instead of 96 (SCSI_SENSE_BUFFERSIZE). I did not check whether we
actually need to initialize the buffer here, but it's clear that if we
do it, we should use the correct size.
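
A minimal user-space sketch of this class of bug, assuming a simplified stand-in structure: struct demo_cmd, clear_sense() and SENSE_BUFFERSIZE are hypothetical names, and only the value 96 comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SENSE_BUFFERSIZE 96 /* mirrors SCSI_SENSE_BUFFERSIZE as quoted above */

/* Simplified stand-in for a command whose sense buffer is a pointer. */
struct demo_cmd {
	unsigned char *sense_buffer;
};

/* Clear the whole sense buffer by passing memset() an explicit length. */
static void clear_sense(struct demo_cmd *cmd)
{
	assert(cmd != NULL && cmd->sense_buffer != NULL);
	/*
	 * sizeof(cmd->sense_buffer) is the size of the pointer, so using
	 * it as the length would clear only the first few bytes.
	 */
	memset(cmd->sense_buffer, 0, SENSE_BUFFERSIZE);
}
```

When the member was an array, sizeof() on it gave the full buffer length; once it became a pointer, the same expression silently shrank to the pointer size, which is what the compiler warning points out.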

Fixes: de25deb180 ("[SCSI] use dynamically allocated sense buffer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:04:01 -05:00
Christopher Díaz Riveros
f36cfe6a06 scsi: ibmvfc: Remove unneeded semicolons
Trivial fix removes unneeded semicolons after switch blocks.

This issue was detected by using the Coccinelle software.

Signed-off-by: Christopher Díaz Riveros <chrisadr@gentoo.org>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:04:00 -05:00
Xiang Chen
0d762b3af2 scsi: hisi_sas: fix a bug in hisi_sas_dev_gone()
When a device is gone, a NULL pointer can be dereferenced in the
free_device callback during a SAS controller reset, because we have
already cleared the sas_dev structure.

It is enough to set dev_type to SAS_PHY_UNUSED without clearing the
sas_dev structure, as all of its members are re-initialized once a
device is found.
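
The idea can be sketched in user-space C as follows. All names here (struct demo_dev, struct demo_hba, DEMO_DEV_UNUSED, demo_dev_gone()) are hypothetical stand-ins and do not reflect the driver's real structures.

```c
#include <assert.h>
#include <stddef.h>

enum demo_dev_type { DEMO_DEV_UNUSED = 0, DEMO_DEV_END };

struct demo_hba {
	int id;
};

struct demo_dev {
	enum demo_dev_type dev_type;
	struct demo_hba *hba; /* still dereferenced by later callbacks */
};

/* Mark the slot as free without wiping pointers that others may use. */
static void demo_dev_gone(struct demo_dev *dev)
{
	assert(dev != NULL);
	/*
	 * A memset(dev, 0, sizeof(*dev)) here would also reset dev->hba
	 * to NULL; a callback running later would then dereference NULL.
	 */
	dev->dev_type = DEMO_DEV_UNUSED;
}
```

Leaving the other members in place is safe under the assumption stated in the message: they are all re-initialized when the slot is reused for a newly found device.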

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:59 -05:00
Xiaofei Tan
6379c56070 scsi: hisi_sas: directly attached disk LED feature for v2 hw
This patch implements the LED feature for directly attached disks on v2
hw. As libsas provides the lldd_write_gpio() interface for this feature,
we just need to implement that interface following the SGPIO API.

We use a CPLD for the hardware part of this feature; the base address
of the CPLD should be configured through ACPI or DT tables.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:59 -05:00
Xiaofei Tan
56ad7fcd4f scsi: hisi_sas: devicetree: bindings: add LED feature for v2 hw
Add directly attached disk LED feature for v2 hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:58 -05:00
Shivasharan S
f870bcbe9a scsi: megaraid_sas: NVMe passthrough command support
NVMe passthrough via MFI interface. Current MegaRAID product supports
different types of encapsulation via the MFI framework.

The NVMe native command should be framed by the application and embedded
in the MFI frame as payload. The driver will provide an interface to
send the MFI frame along with the payload (in this case, the NVMe native
command) to the firmware. The driver already has a similar interface for
SATA and SMP passthrough.

1. Driver will pass MFI command to the firmware if the latter supports
   NVMe encapsulated processing (not all SAS3.5 firmware supports this
   feature).

2. Driver exposes sysfs entry support_nvme_encapsulation. This is
   required for backward compatibility for applications using earlier
   driver versions that did not process IOCTL frames and could result in
   host hang.

   This is already fixed as part of commit 82add4e1b3 ("scsi:
   megaraid_sas: Incorrect processing of IOCTL frames for SMP/STP
   commands")

[mkp: clarified commit message]

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:58 -05:00
Arnd Bergmann
b45093dd76 scsi: megaraid: use ktime_get_real for firmware time
do_gettimeofday() overflows in 2038 on 32-bit architectures and is
deprecated, so convert this driver to call ktime_get_real()
directly. This also simplifies the calculation.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:57 -05:00
Arnd Bergmann
22807aa812 scsi: fnic: use 64-bit timestamps
struct timespec is deprecated since it overflows in 2038 on 32-bit
architectures, so we should use timespec64 consistently.

I'm slightly adapting the format strings here, to make sure we print the
nanoseconds with the correct number of leading zeroes.
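
A small user-space sketch of why the padding matters. The helper name format_timestamp() and the exact format string are assumptions chosen for illustration, not copied from the driver.

```c
#include <assert.h>
#include <stdio.h>

/* Format "seconds.nanoseconds" with nanoseconds zero-padded to 9 digits. */
static int format_timestamp(char *buf, size_t len, long long sec, long nsec)
{
	assert(buf != NULL && nsec >= 0 && nsec < 1000000000L);
	/*
	 * A plain "%ld" would print 1000 ns after second 5 as "5.1000",
	 * which reads as 5.1 s; "%09ld" yields "5.000001000".
	 */
	return snprintf(buf, len, "%lld.%09ld", sec, nsec);
}
```

The seconds are passed as long long so that the value stays 64 bits wide on 32-bit targets as well.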

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:57 -05:00
Wei Yongjun
e89cabf26e scsi: qedf: Fix error return code in __qedf_probe()
Fix to return error code -ENOMEM from the error handling case instead of
0, as done elsewhere in this function.
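
The pattern being fixed can be sketched in user-space C like this. demo_probe() and its simulate_alloc_failure parameter are hypothetical and exist only to make the failing path easy to exercise.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical probe routine with two allocations and goto-based cleanup. */
static int demo_probe(int simulate_alloc_failure)
{
	int rc = 0;
	void *ctx;
	void *queue;

	ctx = malloc(64);
	if (!ctx) {
		rc = -ENOMEM;
		goto err;
	}

	queue = simulate_alloc_failure ? NULL : malloc(64);
	if (!queue) {
		/* Without this assignment rc would still be 0 here. */
		rc = -ENOMEM;
		goto err_free_ctx;
	}

	free(queue);
	free(ctx);
	return 0;

err_free_ctx:
	free(ctx);
err:
	assert(rc != 0); /* an error path must never report success */
	return rc;
}
```

If the assignment in the second failure branch is forgotten, the function cleans up correctly but still returns 0, so the caller believes the probe succeeded.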

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:56 -05:00
Xose Vazquez Perez
3f884a0a8b scsi: devinfo: fix format of the device list
Replace "" with NULL for product revision level, and merge TEXEL
duplicate entries.

Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: SCSI ML <linux-scsi@vger.kernel.org>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:56 -05:00
himanshu.madhani@cavium.com
c93a9a16f1 scsi: qla2xxx: Update driver version to 10.00.00.05-k
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:55 -05:00
Anil Gurumurthy
92d71570b6 scsi: qla2xxx: Add XCB counters to debugfs
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:54 -05:00
Darren Trapp
b027a5ace4 scsi: qla2xxx: Fix queue ID for async abort with Multiqueue
[mkp: sparse warning]

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-22 20:03:26 -05:00
himanshu.madhani@cavium.com
8a7eac2fd1 scsi: qla2xxx: Fix warning for code indentation in __qla24xx_handle_gpdb_event()
This patch fixes the following smatch warning:

drivers/scsi/qla2xxx/qla_init.c:1054 __qla24xx_handle_gpdb_event() warn: inconsistent indenting

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-17 01:34:24 -05:00