linux-stable/block
Jens Axboe 492e0aba08 block: treat poll queue enter similarly to timeouts
commit 33391eecd6 upstream.

We ran into an issue where a production workload would randomly grind to
a halt and not continue until the pending IO had timed out. This turned
out to be a complicated interaction between queue freezing and polled
IO:

1) You have an application that does polled IO. At any point in time,
   there may be polled IO pending.

2) You have a monitoring application that issues a passthrough command,
   which is marked with side effects such that it needs to freeze the
   queue.

3) Passthrough command is started, which calls blk_freeze_queue_start()
   on the device. At this point the queue is marked frozen, and any
   attempt to enter the queue will fail (for non-blocking) or block.

4) Now the driver calls blk_mq_freeze_queue_wait(), which will return
   when the queue is quiesced and pending IO has completed.

5) The pending IO is polled IO, but any attempt to poll IO through the
   normal iocb_bio_iopoll() -> bio_poll() will fail when it gets to
   bio_queue_enter() as the queue is frozen. Rather than poll and
   complete IO, the polling threads will sit in a tight loop attempting
   to poll, but failing to enter the queue to do so.

The end result is that progress for either application will be stalled
until all pending polled IO has timed out. This causes obvious huge
latency issues for the application doing polled IO, but also long delays
for passthrough command.

Fix this by treating queue enter for polled IO just like we do for
timeouts. This allows quick quiesce of the queue as we still poll and
complete this IO, while still disallowing queueing up new IO.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-16 19:06:31 +01:00
..
partitions block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-31 16:17:11 -08:00
badblocks.c block/badblocks: Remove redundant assignments 2022-04-23 07:15:26 -06:00
bdev.c block: update the stable_writes flag in bdev_add 2024-01-10 17:10:32 +01:00
bfq-cgroup.c block, bfq: fix uaf for bfqq in bic_set_bfqq() 2023-02-09 11:28:06 +01:00
bfq-iosched.c block, bfq: Fix division by zero error on zero wsum 2023-05-24 17:32:38 +01:00
bfq-iosched.h block, bfq: remove unused variable for bfq_queue 2022-10-20 05:46:49 -07:00
bfq-wf2q.c block, bfq: remove useless parameter for bfq_add/del_bfqq_busy() 2022-08-22 10:07:56 -06:00
bio-integrity.c block: factor out a bvec_set_page helper 2023-09-23 11:11:08 +02:00
bio.c block: prevent an integer overflow in bvec_try_merge_hw_page 2024-02-05 20:12:53 +00:00
blk-cgroup-fc-appid.c cgroup: Homogenize cgroup_get_from_id() return value 2022-08-26 10:57:41 -10:00
blk-cgroup-rwstat.c
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c blk-cgroup: bypass blkcg_deactivate_policy after destroying 2023-12-20 17:00:21 +01:00
blk-cgroup.h blk-cgroup: pass a gendisk to blkcg_init_queue and blkcg_exit_queue 2022-09-26 19:09:31 -06:00
blk-core.c block: treat poll queue enter similarly to timeouts 2024-02-16 19:06:31 +01:00
blk-crypto-fallback.c blk-crypto: dynamically allocate fallback profile 2023-08-23 17:52:39 +02:00
blk-crypto-internal.h blk-mq: release crypto keyslot before reporting I/O complete 2023-05-11 23:03:00 +09:00
blk-crypto-profile.c blk-crypto: use dynamic lock class for blk_crypto_profile::lock 2023-07-23 13:49:21 +02:00
blk-crypto-sysfs.c blk-crypto: show crypto capabilities in sysfs 2022-02-28 06:40:23 -07:00
blk-crypto.c blk-crypto: make blk_crypto_evict_key() more robust 2023-05-11 23:03:01 +09:00
blk-flush.c block: change request end_io handler to pass back a return value 2022-09-30 07:49:09 -06:00
blk-ia-ranges.c block: simplify disk_set_independent_access_ranges 2022-06-29 08:36:46 -06:00
blk-integrity.c blk-crypto: remove blk_crypto_unregister() 2021-11-29 06:38:51 -07:00
blk-ioc.c block: fix default IO priority handling again 2022-06-27 06:29:12 -06:00
blk-iocost.c blk-iocost: Fix an UBSAN shift-out-of-bounds warning 2024-02-16 19:06:29 +01:00
blk-iolatency.c blk-cgroup: pass a gendisk to blkcg_schedule_throttle 2022-09-26 19:17:28 -06:00
blk-ioprio.c blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: fix blkdev_issue_secure_erase 2022-09-15 00:25:17 -06:00
blk-map.c block: fix bio-cache for passthru IO 2023-06-05 09:26:21 +02:00
blk-merge.c blk-mq: release crypto keyslot before reporting I/O complete 2023-05-11 23:03:00 +09:00
blk-mq-cpumap.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-rdma.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-sched.c blk-mq: correct stale comment of .get_budget 2023-03-10 09:32:44 +01:00
blk-mq-sched.h block: move blk_mq_sched_assign_ioc to blk-ioc.c 2021-11-29 06:41:29 -07:00
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2022-12-31 13:33:03 +01:00
blk-mq-tag.c blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-mq-tag.h blk-mq: blk_mq_tag_busy is no need to return a value 2022-06-27 06:29:12 -06:00
blk-mq-virtio.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq.c blk-mq: fix IO hang from sbitmap wakeup race 2024-02-05 20:12:59 +00:00
blk-mq.h blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2021-12-22 23:38:29 -05:00
blk-pm.h
blk-rq-qos.c block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below 2022-07-12 14:38:52 -06:00
blk-rq-qos.h block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO 2022-09-21 19:50:53 -06:00
blk-settings.c block: make BLK_DEF_MAX_SECTORS unsigned 2024-01-25 15:27:30 -08:00
blk-stat.c blk-stat: fix QUEUE_FLAG_STATS clear 2023-05-11 23:03:00 +09:00
blk-stat.h block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-sysfs.c block: fix use-after-free of q->q_usage_counter 2023-10-10 22:00:37 +02:00
blk-throttle.c blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2023-12-20 17:00:21 +01:00
blk-throttle.h blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-timeout.c
blk-wbt.c blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init() 2022-10-09 07:48:16 -06:00
blk-wbt.h blk-wbt: remove wbt_track stub 2022-03-31 12:58:38 -06:00
blk-zoned.c block: adapt blk_mq_plug() to not plug for writes that require a zone lock 2022-09-29 07:45:47 -06:00
blk.h block: Revert "block: Do not reread partition table on exclusively open device" 2023-03-17 08:50:20 +01:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c blk-mq: Drop blk_mq_ops.timeout 'reserved' arg 2022-07-06 06:33:53 -06:00
bsg.c scsi: core: bsg: Remove usage of the deprecated ida_simple_xxx() API 2022-06-21 21:22:51 -04:00
disk-events.c block: increment diskseq on all media change events 2023-07-19 16:21:47 +02:00
elevator.c blk-mq: use quiesced elevator switch when reinitializing queues 2022-09-27 09:58:56 -06:00
elevator.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
fops.c block: Don't invalidate pagecache for invalid falloc modes 2024-01-10 17:10:20 +01:00
genhd.c block: add check of 'minors' and 'first_minor' in device_add_disk() 2024-01-25 15:27:28 -08:00
holder.c block: remove WARN_ON() from bd_link_disk_holder 2022-06-23 07:48:05 -06:00
ioctl.c block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-31 16:17:11 -08:00
ioprio.c block: Fix handling of tasks without ioprio in ioprio_get(2) 2022-06-27 06:29:12 -06:00
Kconfig block: remove "select BLK_RQ_IO_DATA_LEN" from BLK_CGROUP_IOCOST dependency 2022-06-29 08:35:57 -06:00
Kconfig.iosched block: only build the icq tracking code when needed 2021-12-16 10:59:02 -07:00
kyber-iosched.c block/kyber: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
Makefile blk-cgroup: move blkcg_{get,set}_fc_appid out of line 2022-05-02 14:06:20 -06:00
mq-deadline.c block/mq-deadline: use correct way to throttling write requests 2023-09-13 09:42:42 +02:00
opal_proto.h block: sed-opal: Add ioctl to return device status 2022-08-22 07:52:51 -06:00
sed-opal.c block: sed-opal: kmalloc the cmd/resp buffers 2022-11-08 07:14:35 -07:00
t10-pi.c block: add pi for extended integrity 2022-03-07 12:48:35 -07:00