linux-stable

Commit Graph

Author	SHA1	Message	Date
Dan Carpenter	037fbd3fcf	scsi: scsi_debug: Delete some bogus error checking Smatch complains that "dentry" is never initialized. These days everyone initializes all their stack variables to zero so this means that it will trigger a warning every time this function is run. Really, debugfs functions are not supposed to be checked for errors in normal code. For example, if we updated this code to check the correct variable then it would print a warning if CONFIG_DEBUGFS was disabled. We don't want that. Just delete the check. Fixes: `f084fe52c6` ("scsi: scsi_debug: Add debugfs interface to fail target reset") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/c602c9ad-5e35-4e18-a47f-87ed956a9ec2@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-11-08 21:42:26 -05:00
Dan Carpenter	860c3d03bb	scsi: scsi_debug: Fix some bugs in sdebug_error_write() There are two bug in this code: 1) If count is zero, then it will lead to a NULL dereference. The kmalloc() will successfully allocate zero bytes and the test for "if (buf[0] == '-')" will read beyond the end of the zero size buffer and Oops. 2) The code does not ensure that the user's string is properly NUL terminated which could lead to a read overflow. Fixes: `a9996d722b` ("scsi: scsi_debug: Add interface to manage error injection for a single device") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/7733643d-e102-4581-8d29-769472011c97@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-11-08 21:42:26 -05:00
Wenchao Hao	573c2d066e	scsi: scsi_debug: Add param to control sdev's allow_restart Add new module param "allow_restart" to control scsi_device's allow_restart flag. This flag determines if EH is triggered after a command completes with sense_key 0x6, ASC 0x4 and ASCQ 0x2. EH would be triggered if allow_restart=1 in this condition. The new param can be used with the error injection capability to test how commands completing with sense_key 0x6, ASC 0x4 and ASCQ 0x2 are handled. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-11-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:12 -04:00
Wenchao Hao	f084fe52c6	scsi: scsi_debug: Add debugfs interface to fail target reset The interface is found at /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t> identifies the target to inject errors on. It's a simple bool type interface which would make this target's reset fail if set to 'Y'. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-10-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	0267811625	scsi: scsi_debug: Add new error injection type: Reset LUN failed Add error injection type 4 to make scsi_debug_device_reset() return FAILED. Fail abort command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x4 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "4 -10 0x12" > ${error} will make the device return FAILED when trying to reset LUN with inquiry command 10 times. error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "4 -10 0xff" > ${error} will make the device return FAILED when trying to reset LUN 10 times. Usually we do not care about what command it is when trying to perform reset LUN, so 0xff could be applied. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-9-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	5551ce9288	scsi: scsi_debug: Add new error injection type: Abort Failed Add error injection type 3 to make scsi_debug_abort() return FAILED. Fail abort command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x3 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "3 -10 0x12" > ${error} will make the device return FAILED when aborting inquiry command 10 times. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-8-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	3359227432	scsi: scsi_debug: Set command result and sense data if error is injected If a fail command error is injected, set the command's status and sense data then finish this SCSI command. Set SCSI command's status and sense data format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x2 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error Count \| \| \| \| 0: the rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| 4 \| x8 \| Host byte in scsi_cmd::status \| \| \| \| [scsi_cmd::status has 32 bits holding these 3 bytes] \| +--------+------+-------------------------------------------------------+ \| 5 \| x8 \| Driver byte in scsi_cmd::status \| +--------+------+-------------------------------------------------------+ \| 6 \| x8 \| SCSI Status byte in scsi_cmd::status \| +--------+------+-------------------------------------------------------+ \| 7 \| x8 \| SCSI Sense Key in scsi_cmnd \| +--------+------+-------------------------------------------------------+ \| 8 \| x8 \| SCSI ASC in scsi_cmnd \| +--------+------+-------------------------------------------------------+ \| 9 \| x8 \| SCSI ASCQ in scsi_cmnd \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >${error} will make device's read command return with media error with additional sense of "Unrecovered read error" (UNC): Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-7-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	33bccf55c2	scsi: scsi_debug: Return failed value if error is injected If a fail queuecommand error is injected, return the failed value defined in the rule from queuecommand. Make queuecommand return format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x1 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| 4 \| x32 \| The queuecommand() return value we want \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "1 1 0x12 0x1055" > ${error} will make each INQUIRY command sent to that device return 0x1055 (SCSI_MLQUEUE_HOST_BUSY). Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-6-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	32be8b6e22	scsi: scsi_debug: Time out command if the error is injected If a timeout error is injected, return 0 from scsi_debug_queuecommand to make the command time out. Time out SCSI command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x0 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "0 -10 0x12" > ${error} will make the device's inquiry command time out 10 times. echo "0 1 0x12" > ${error} will make the device's inquiry time out each time it is invoked on this device. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	962d77cd4c	scsi: scsi_debug: Define grammar to remove added error injection The grammar to remove error injection is a line with fixed 3 columns separated by spaces. First column is fixed to "-". It tells this is a removal operation. Second column is the error code to match. Third column is the scsi command to match. For example the following command would remove timeout injection of inquiry command: echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-4-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	a9996d722b	scsi: scsi_debug: Add interface to manage error injection for a single device This new facility uses the debugfs pseudo file system which is typically mounted under the /sys/kernel/debug directory and requires root permissions to access. The interface file is found at /sys/kernel/debug/scsi_debug/<h:c:t:l>/error where <h:c:t:l> identifies the device (logical unit (LU)) to inject errors on. For the following description the ${error} environment variable is assumed to be set to/sys/kernel/debug/scsi_debug/1:0:0:0/error where 1:0:0:0 is a pseudo device (LU) owned by the scsi_debug driver. Rules are written to ${error} in the normal sysfs fashion (e.g. 'echo "0 -2 0x12" > ${error}'). More than one rule can be active on a device at a time and inactive rules (i.e. those whose error count is 0) remain in the rule listing. The existing rules can be read with 'cat ${error}' with oneline output for each rule. The interface format is line-by-line, each line is an error injection rule. Each rule contains integers separated by spaces, the first three columns correspond to "Error code", "Error count" and "SCSI command", other columns depend on Error code. General rule format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error code \| \| \| \| 0: timeout SCSI command \| \| \| \| 1: fail queuecommand, make queuecommand return \| \| \| \| given value \| \| \| \| 2: fail command, finish command with SCSI status, \| \| \| \| sense key and ASC/ASCQ values \| \| \| \| 3: make abort commands for specific command fail \| \| \| \| 4: make reset lun for specific command fail \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| ... \| xxx \| Error type specific fields \| +--------+------+-------------------------------------------------------+ Notes: - When multiple error inject rules are added for the same SCSI command, the one with smaller error code will take effect (and the others will be ignored). - If the same error (i.e. same Error code and SCSI command) is added, the older one will be overwritten.. - Currently, the basic types are (u8/u16/u32/u64/s8/s16/s32/s64) and the hexadecimal types (x8/x16/x32/x64). - Where a hexadecimal value is expected (e.g. Column 3: SCSI command opcode) the "0x" prefix is optional on the value (e.g. the INQUIRY opcode can be given as '0x12' or '12'). - When the Error count is negative, reading ${error} will show that value incrementing, stopping when it gets to 0. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-3-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:11 -04:00
Wenchao Hao	6e2d15f59b	scsi: scsi_debug: Create scsi_debug directory in the debugfs filesystem Create directory scsi_debug in the root of the debugfs filesystem. Prepare to add interface for manage error injection. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-2-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-16 20:50:10 -04:00
Maurizio Lombardi	23815df5af	scsi: scsi_debug: Remove dead code The ramdisk rwlocks are not used anymore. Fixes: `87c715dcde` ("scsi: scsi_debug: Add per_host_store option") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230628150638.53218-1-mlombard@redhat.com Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-07-05 21:37:43 -04:00
John Garry	0c028b6a11	scsi: scsi_debug: Abort commands from scsi_debug_device_reset() Currently scsi_debug_device_reset() does not do much apart from setting the SDEBUG_UA_POR ("Power on, reset, or bus device reset") flag, which is eventually passed back to the SCSI midlayer later for a "unit attention" command. There is a report that blktest scsi/007 test fails due to commit `1107c7b24e` ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd"). The problem there is that there are dangling scsi_debug queued commands when we attempt to remove the driver. scsi/007 test triggers SCSI EH and attempts to abort a timed-out command. Function scsi_debug_device_reset() is called as part of the EH, but does not deal with outstanding erroneous command. Prior to the named commit, removing the driver caused all dangling queued commands to be stopped - this should have not been necessary. Fix by aborting outstanding commands on a scsi_device basis from scsi_debug_device_reset(). Fixes: `1107c7b24e` ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd") Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202304071111.e762fcbd-yujie.liu@intel.com Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230416175654.159163-1-john.g.garry@oracle.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-18 22:58:50 -04:00
Harshit Mogalapalli	b32283d753	scsi: scsi_debug: Fix missing error code in scsi_debug_init() Smatch reports: drivers/scsi/scsi_debug.c:6996 scsi_debug_init() warn: missing error code 'ret' Although it is unlikely that KMEM_CACHE might fail, but if it does then ret might be zero. So to fix this explicitly mark ret as "-ENOMEM" and then goto driver_unreg. Fixes: `1107c7b24e` ("scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20230406074607.3637097-1-harshit.m.mogalapalli@oracle.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:15:00 -04:00
Martin K. Petersen	dc70c9615c	Merge patch series "Fix shost command overloading issues" John Garry <john.g.garry@oracle.com> says: It's easy to get scsi_debug to error on throughput testing when we have multiple shosts: $ lsscsi [7:0:0:0] disk Linux scsi_debug 0191 [0:0:0:0] disk Linux scsi_debug 0191 $ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read --bs=4k --iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg --eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256 ... fio-3.28 Starting 40 processes [ 27.521809] hrtimer: interrupt took 33067 ns [ 27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s [ 27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s fio: io_u error [ 27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00 on file /dev/sda[ 27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s The issue is related to how the driver manages submit queues and tags. A single array of submit queues - sdebug_q_arr - with its own set of tags is shared among all shosts. As such, for occasions when we have more than one host it is possible to overload the submit queues and run out of tags. Another separate issue that we may reduce the shost submit queue depth, sdebug_max_queue, dynamically causing the shost to be overloaded. How many IOs which the shost may be sent is fixed at can_queue at init time, which is the same initial value for sdebug_max_queue. So reducing sdebug_max_queue means that the shost may be sent more IOs than it is configured to handle, causing overloading. This series removes the scsi_debug submit queue concept and uses pre-existing APIs to manage and examine tags, like scsi_block_requests() and blk_mq_tagset_busy_iter(). Using standard APIs makes the driver more maintainable and extensible in future. A restriction is also added to allow sdebug_max_queue only be modified when no shosts are present, i.e. we need to remove shosts, modify sdebug_max_queue, and then re-add the shosts. Link: https://lore.kernel.org/r/20230327074310.1862889-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:10:40 -04:00
John Garry	f1437cd1e5	scsi: scsi_debug: Drop sdebug_queue It's easy to get scsi_debug to error on throughput testing when we have multiple shosts: $ lsscsi [7:0:0:0] disk Linux scsi_debug 0191 [0:0:0:0] disk Linux scsi_debug 0191 $ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read --bs=4k --iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg --eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256 ... fio-3.28 Starting 40 processes [ 27.521809] hrtimer: interrupt took 33067 ns [ 27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s [ 27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s fio: io_u error [ 27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00 on file /dev/sda[ 27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s The issue is related to how the driver manages submit queues and tags. A single array of submit queues - sdebug_q_arr - with its own set of tags is shared among all shosts. As such, for occasions when we have more than one shost it is possible to overload the submit queues and run out of tags. The struct sdebug_queue is to manage tags and hold the associated queued command entry pointer (for that tag). Since the tagset iters are now used for functions like sdebug_blk_mq_poll(), there is no need to manage these queues. Indeed, blk-mq already provides what we need for managing tags and queues. Drop sdebug_queue and all its usage in the driver. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-12-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	57f7225a4f	scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts The shost->can_queue value is initially used to set per-HW queue context tag depth in the block layer. This ensures that the shost is not sent too many commands which it can deal with. However lowering sdebug_max_queue separately means that we can easily overload the shost, as in the following example: $ cat /sys/bus/pseudo/drivers/scsi_debug/max_queue 192 $ cat /sys/class/scsi_host/host0/can_queue 192 $ echo 100 > /sys/bus/pseudo/drivers/scsi_debug/max_queue $ cat /sys/class/scsi_host/host0/can_queue 192 $ fio --filename=/dev/sda --direct=1 --rw=read --bs=4k --iodepth=256 --runtime=1200 --numjobs=10 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error iops-test-job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256 ... fio-3.28 Starting 10 processes [ 111.269885] scsi_io_completion_action: 400 callbacks suppressed [ 111.269885] blk_print_req_error: 400 callbacks suppressed [ 111.269889] I/O error, dev sda, sector 440 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2 [ 111.269892] sd 0:0:0:0: [sda] tag#132 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s [ 111.269897] sd 0:0:0:0: [sda] tag#132 CDB: Read(10) 28 00 00 00 01 68 00 00 08 00 [ 111.277058] I/O error, dev sda, sector 360 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2 [...] Ensure that this cannot happen by allowing sdebug_max_queue be modified only when we have no shosts. As such, any shost->can_queue value will match sdebug_max_queue, and sdebug_max_queue cannot be modified separately. Since retired_max_queue is no longer set, remove support. Continue to apply the restriction that sdebug_host_max_queue cannot be modified when sdebug_host_max_queue is set. Adding support for that would mean extra code, and no one has complained about this restriction previously. A command like the following may be used to remove a shost: echo -1 > /sys/bus/pseudo/drivers/scsi_debug/add_host Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-11-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	12f3eef016	scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store() The functions to update ndelay and delay value first check whether we have any in-flight IO for any host. It does this by checking if any tag is used in the global submit queues. We can achieve the same by setting the host as blocked and then ensuring that we have no in-flight commands with scsi_host_busy(). Note that scsi_host_busy() checks SCMD_STATE_INFLIGHT flag, which is only set per command after we ensure that the host is not blocked, i.e. we see more commands active after the check for scsi_host_busy() returns 0. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-10-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	9c559c9b47	scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued() Instead of iterating all deferred commands in the submission queue structures, use blk_mq_tagset_busy_iter(), which is a standard API for this. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-9-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	600d9ead39	scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll() Instead of iterating all deferred commands in the submission queue structures, use blk_mq_tagset_busy_iter(), which is a standard API for this. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-8-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	1107c7b24e	scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd Eventually we will drop the sdebug_queue struct as it is not really required, so start with making the sdebug_queued_cmd dynamically allocated for the lifetime of the scsi_cmnd in the driver. As an interim measure, make sdebug_queued_cmd.sd_dp a pointer to struct sdebug_defer. Also keep a value of the index allocated in sdebug_queued_cmd.qc_arr in struct sdebug_queued_cmd. To deal with an races in accessing the scsi cmnd allocated struct sdebug_queued_cmd, add a spinlock for the scsi command in its priv area. Races may be between scheduling a command for completion, aborting a command, and the command actually completing and freeing the struct sdebug_queued_cmd. [mkp: typo fix] Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-7-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	a0473bf31d	scsi: scsi_debug: Use scsi_block_requests() to block queues The feature to block queues is quite dubious, since it races with in-flight IO. Indeed, it seems unnecessary for block queues for any times we do so. Anyway, to keep the same behaviour, use standard SCSI API to stop IO being sent - scsi_{un}block_requests(). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-6-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:22 -04:00
John Garry	25b80b2c75	scsi: scsi_debug: Protect block_unblock_all_queues() with mutex There is no reason that calls to block_unblock_all_queues() from different context can't race with one another, so protect with the sdebug_host_list_mutex. There's no need for a more fine-grained per shost locking here (and we don't have a per-host lock anyway). Also simplify some touched code in sdebug_change_qdepth(). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-5-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:21 -04:00
John Garry	0aaa3fad4f	scsi: scsi_debug: Change shost list lock to a mutex The shost list lock, sdebug_host_list_lock, is a spinlock. We would only lock in non-atomic context in this driver, so use a mutex instead, which is friendlier if we need to schedule when iterating. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-4-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:21 -04:00
John Garry	00f9d622e8	scsi: scsi_debug: Don't iter all shosts in clear_luns_changed_on_target() In clear_luns_changed_on_target(), we iter all devices for all shosts to conditionally clear the SDEBUG_UA_LUNS_CHANGED flag in the per-device uas_bm. One condition to see whether we clear the flag is to test whether the host for the device under consideration is the same as the matching device's (devip) host. This check will only ever pass for devices for the same shost, so only iter the devices for the matching device shost. We can now drop the spinlock'ing of the sdebug_host_list_lock in the same function. This will allow us to use a mutex instead of the spinlock for the global shost lock, as clear_luns_changed_on_target() could be called in non-blocking context, in scsi_debug_queuecommand() -> make_ua() -> clear_luns_changed_on_target() (which is why required a spinlock). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-3-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:21 -04:00
John Garry	6500d2045d	scsi: scsi_debug: Fix check for sdev queue full There is a report that the blktests scsi/004 test for "TASK SET FULL" (TSF) now fails. The condition upon we should issue this TSF is when the sdev queue is full. The check for a full queue has an off-by-1 error. Previously we would increment the number of requests in the queue after testing if the queue would be full, i.e. test if one less than full. Since we now use scsi_device_busy() to count the number of requests in the queue, this would already account for the current request, so fix the test for queue full accordingly. Fixes: `151f0ec9dd` ("scsi: scsi_debug: Drop sdebug_dev_info.num_in_q") Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202303201334.18b30edc-oliver.sang@intel.com Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230327074310.1862889-2-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 22:09:21 -04:00
Lizhe	c45b380429	scsi: scsi_debug: Remove redundant driver match function If there is no driver match function, the driver core assumes that each candidate pair (driver, device) matches, see driver_match_device(). Drop the pseudo_lld bus match function that always returned 1. This results in the same behaviour as when there is no match function. [mkp+jgg: patch description] Signed-off-by: Lizhe <sensor1010@163.com> Link: https://lore.kernel.org/r/20230319042732.278691-1-sensor1010@163.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:43:26 -04:00
John Garry	548ebb335f	scsi: scsi_debug: Add poll mode deferred completions to statistics Currently commands completed via poll mode are not included in the statistics gathering for deferred completions and missed CPUs. Poll mode completions should be treated the same as other deferred completion types, so add poll mode completions to the statistics. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-12-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	f037b5cb07	scsi: scsi_debug: Get command abort feature working again The command abort feature allows us to test aborting a command which has timed-out. The idea is that for specific commands we just don't call scsi_done() and allow the request to timeout, which ensures SCSI EH kicks-in we try to abort the command. Since commit `4a0c6f432d` ("scsi: scsi_debug: Add new defer type for mq_poll") this does not seem to work. The issue is that we clear the sd_dp->aborted flag in schedule_resp() before the completion callback has run. When the completion callback actually runs, it calls scsi_done() as normal as sd_dp->aborted unset. This is all very racy. Fix by not clearing sd_dp->aborted in schedule_resp(). Also move the call to blk_abort_request() from schedule_resp() to sdebug_q_cmd_complete(), which makes the code have a more logical sequence. I also note that this feature only works for commands which are classed as "SDEG_RES_IMMED_MASK", but only practically triggered with prior RW commands. So for my experiment I need to run fio to trigger the error on the "nth" command (see inject_on_this_cmd()), and then run something like sg_sync to queue a command to actually trigger the abort. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-11-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	151f0ec9dd	scsi: scsi_debug: Drop sdebug_dev_info.num_in_q In schedule_resp(), under certain conditions we check whether the per-device queue is full (num_in_q == queue depth - 1) and we may inject a "task set full" (TSF) error if it is. However how we read num_in_q is racy - many threads may see the same "queue is full" value (and also issue a TSF). There is per-queue locking in reading per-device num_in_q, but that would not help. Replace how we read num_in_q at this location with a call to scsi_device_busy(). Calling scsi_device_busy() is likewise racy (as reading num_in_q), so nothing lost or gained. Calling scsi_device_busy() is also slow as it needs to read all bits in the per-device budget bitmap, but we can live with that since we're just a simulator and it's only under a certain configs which we would see this. Also move the "task set full" print earlier as it would only be called now under this condition. However, previously it may not have been called - like returning early - but keep it simple and always call it. At this point we can drop sdebug_dev_info.num_in_q - it is difficult to maintain properly and adds extra normal case command processing. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-10-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	0befb87909	scsi: scsi_debug: Drop check for num_in_q exceeding queue depth The per-device num_in_q value cannot exceed the device queue depth, so drop the check. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-9-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	9c2303820b	scsi: scsi_debug: Drop scsi_debug_host_reset() device NULL pointer check The check for device pointer for the SCSI command is unnecessary, so drop it. The only caller is scsi_try_host_reset() -> eh_host_reset_handler(), and there that pointer cannot be NULL. Indeed, there is already code later in the same function which does not check the device pointer for the SCSI command. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-8-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	519bfc14c1	scsi: scsi_debug: Drop scsi_debug_bus_reset() NULL pointer checks The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so drop them. Likewise, drop the NULL check for sdbg_host. The only caller is scsi_try_bus_reset() -> eh_bus_reset_handler(), and there those pointers cannot be NULL. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-7-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	a15df530a1	scsi: scsi_debug: Drop scsi_debug_target_reset() NULL pointer checks The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so drop them. Likewise, drop the NULL check for sdbg_host. The only caller is scsi_try_target_reset() -> eh_target_reset_handler(), and there those pointers cannot be NULL. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-6-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:37 -04:00
John Garry	a19226f844	scsi: scsi_debug: Drop scsi_debug_device_reset() NULL pointer checks The SCSI cmnd pointer arg would never be NULL, so drop the check. In addition, its SCSI device pointer would never be NULL (so drop that check also). The only caller is scsi_try_bus_device_reset(), and the command and its device pointer could not be NULL when calling eh_device_reset_handler() there. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-5-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:36 -04:00
John Garry	06be9fbebb	scsi: scsi_debug: Drop scsi_debug_abort() NULL pointer checks The SCSI cmnd pointer arg would never be NULL, so drop the check. In addition, its SCSI device pointer would never be NULL. The only caller is scsi_send_eh_cmnd() -> scsi_abort_eh_cmnd() -> scsi_try_to_abort_cmd() -> scsi_try_to_abort_cmd(), and in the origin of that chain those pointers cannot be NULL. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-4-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:36 -04:00
John Garry	d280a4ef22	scsi: scsi_debug: Stop setting devip->sdbg_host twice In sdebug_device_create(), the devip->sdbg_host pointer is needlessly set twice, so stop doing that. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20230313093114.1498305-3-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:36 -04:00
John Garry	785d6b7cf3	scsi: scsi_debug: Don't hold driver host struct pointer in host->hostdata[] This driver stores just a pointer to the driver host structure in host->hostdata[]. Most other drivers actually have the driver host structure allocated in host->hostdata[], but this driver is different as we allocate that memory separately before allocating the shost memory. However there is no need to allocate this memory only in host->hostdata[] when we can already look up the driver host structure from shost->dma_dev, so add a macro for this - shost_to_sdebug_host(). Rename to_sdebug_host() -> dev_to_sdebug_host() to avoid ambiguity. Also remove a check for !sdbg_host in find_build_dev_info(), as this cannot be true. Other similar checks will be later removed. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230313093114.1498305-2-john.g.garry@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-16 23:25:36 -04:00
Martin K. Petersen	6b1c374c45	Merge branch '6.2/scsi-queue' into 6.2/scsi-fixes Pull in remaining patches from the 6.2 queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-12-30 16:29:34 +00:00
Linus Torvalds	aa5ad10f6c	SCSI misc on 20221213 Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are some core changes, mostly around reworking some of our user context assumptions in device put and moving some code around. The remaining updates are bug fixes and minor changes. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY5jjrSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishR9iAPwN++uF BNlCD36duS8LslKQMPAmFxWt3d/4RWAHsXj2WQEAtu9q8K9PSe1ueb4y+rAEG4oj 2AUQhR3v9ciWBBKlDog= =JYJC -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are some core changes, mostly around reworking some of our user context assumptions in device put and moving some code around. The remaining updates are bug fixes and minor changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits) scsi: sg: Fix get_user() in call sg_scsi_ioctl() scsi: megaraid_sas: Fix some spelling mistakes in comment scsi: core: Use SCSI_SCAN_INITIAL in do_scsi_scan_host() scsi: core: Use SCSI_SCAN_RESCAN in __scsi_add_device() scsi: ufs: ufs-mediatek: Remove unnecessary return code scsi: ufs: core: Fix the polling implementation scsi: libsas: Do not export sas_ata_wait_after_reset() scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset scsi: libsas: Add smp_ata_check_ready_type() scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset" scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()" scsi: ufs: ufs-mediatek: Modify the return value scsi: ufs: ufs-mediatek: Remove unneeded code scsi: device_handler: alua: Call scsi_device_put() from non-atomic context scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of alua_check_vpd()" scsi: snic: Fix possible UAF in snic_tgt_create() scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices ...	2022-12-14 08:58:51 -08:00
John Garry	c411a42fb9	scsi: scsi_debug: Delete unreachable code in inquiry_vpd_b0() The 2nd return statement in inquiry_vpd_b0() is unreachable, so delete it. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20221213142122.1011886-1-john.g.garry@oracle.com Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-12-14 03:09:38 +00:00
Linus Torvalds	268325bda5	Random number generator updates for Linux 6.2-rc1. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE= =QRhK -----END PGP SIGNATURE----- Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...	2022-12-12 16:22:22 -08:00
Yang Yingliang	e6d773f93a	scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper() Afer commit `1fa5ae857b` ("driver core: get rid of struct device's bus_id string array"), the name of device is allocated dynamically, it needs be freed when device_register() returns error. As comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(), and sdbg_host is freed in sdebug_release_adapter(). When the device release is not set, it means the device is not initialized. We can not call put_device() in this case. Use kfree() to free memory. Fixes: `1fa5ae857b` ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112131010.3757845-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-26 00:06:13 +00:00
Harshit Mogalapalli	07f2ca139d	scsi: scsi_debug: Fix a warning in resp_report_zones() As 'alloc_len' is user controlled data, if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `7db0e0c819` ("scsi: scsi_debug: Fix buffer size of REPORT ZONES command") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070612.2121535-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-25 23:52:18 +00:00
Harshit Mogalapalli	ed0f17b748	scsi: scsi_debug: Fix a warning in resp_verify() As 'vnum' is controlled by user, so if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `c3e2fe9222` ("scsi: scsi_debug: Implement VERIFY(10), add VERIFY(16)") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070031.2121068-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-25 23:51:32 +00:00
Jason A. Donenfeld	8032bf1233	treewide: use get_random_u32_below() instead of deprecated function This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-11-18 02:15:15 +01:00
Harshit Mogalapalli	216e179724	scsi: scsi_debug: Fix a warning in resp_write_scat() As 'lbdof_blen' is coming from user, if the size in kzalloc() is >= MAX_ORDER then we hit a warning. Call trace: sg_ioctl sg_ioctl_common scsi_ioctl sg_scsi_ioctl blk_execute_rq blk_mq_sched_insert_request blk_mq_run_hw_queue __blk_mq_delay_run_hw_queue __blk_mq_run_hw_queue blk_mq_sched_dispatch_requests __blk_mq_sched_dispatch_requests blk_mq_dispatch_rq_list scsi_queue_rq scsi_dispatch_cmd scsi_debug_queuecommand schedule_resp resp_write_scat If you try to allocate a memory larger than(>=) MAX_ORDER, then kmalloc() will definitely fail. It creates a stack trace and messes up dmesg. The user controls the size here so if they specify a too large size it will fail. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: `481b5e5c79` ("scsi: scsi_debug: add resp_write_scat function") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221111100526.1790533-1-harshit.m.mogalapalli@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-17 18:12:21 +00:00
Yuan Can	e208a1d795	scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper() If device_register() fails in sdebug_add_host_helper(), it will goto clean and sdbg_host will be freed, but sdbg_host->host_list will not be removed from sdebug_host_list, then list traversal may cause UAF. Fix it. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221117084421.58918-1-yuancan@huawei.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-17 17:48:32 +00:00
Bart Van Assche	ecb8c2580d	scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC From ZBC-1: - RC BASIS = 0: The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a contiguous range of zones that are not sequential write required zones starting with the first zone. - RC BASIS = 1: The RETURNED LOGICAL BLOCK ADDRESS field indicates the LBA of the last logical block on the logical unit. The current scsi_debug READ CAPACITY response does not comply with the above if there are one or more sequential write required zones. SCSI initiators need a way to retrieve the largest valid LBA from SCSI devices. Reporting the largest valid LBA if there are one or more sequential zones requires to set the RC BASIS field in the READ CAPACITY response to one. Hence this patch. Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221102193248.3177608-1-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-08 02:07:52 +00:00

1 2 3 4 5 ...

348 Commits