linux-stable/block
Jan Kara 162bc7bd73 bfq: Do not let waker requests skip proper accounting
[ Upstream commit c65e6fd460 ]

Commit 7cc4ffc555 ("block, bfq: put reqs of waker and woken in
dispatch list") added a condition to bfq_insert_request() which added
waker's requests directly to dispatch list. The rationale was that
completing waker's IO is needed to get more IO for the current queue.
Although this rationale is valid, there is a hole in it. The waker does
not necessarily serve the IO only for the current queue and maybe it's
current IO is not needed for current queue to make progress. Furthermore
injecting IO like this completely bypasses any service accounting within
bfq and thus we do not properly track how much service is waker's queue
getting or that the waker is actually doing any IO. Depending on the
conditions this can result in the waker getting too much or too few
service.

Consider for example the following job file:

[global]
directory=/mnt/repro/
rw=write
size=8g
time_based
runtime=30
ramp_time=10
blocksize=1m
direct=0
ioengine=sync

[slowwriter]
numjobs=1
prioclass=2
prio=7
fsync=200

[fastwriter]
numjobs=1
prioclass=2
prio=0
fsync=200

Despite processes have very different IO priorities, they get the same
about of service. The reason is that bfq identifies these processes as
having waker-wakee relationship and once that happens, IO from
fastwriter gets injected during slowwriter's time slice. As a result bfq
is not aware that fastwriter has any IO to do and constantly schedules
only slowwriter's queue. Thus fastwriter is forced to compete with
slowwriter's IO all the time instead of getting its share of time based
on IO priority.

Drop the special injection condition from bfq_insert_request(). As a
result, requests will be tracked and queued in a normal way and on next
dispatch bfq_select_queue() can decide whether the waker's inserted
requests should be injected during the current queue's timeslice or not.

Fixes: 7cc4ffc555 ("block, bfq: put reqs of waker and woken in dispatch list")
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211125133645.27483-8-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27 11:03:18 +01:00
..
partitions block: fix incorrect references to disk objects 2021-10-18 11:20:38 -06:00
badblocks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
bdev.c block: genhd: fix double kfree() in __alloc_disk_node() 2021-10-02 07:29:20 -06:00
bfq-cgroup.c block, bfq: reset last_bfqq_created on group change 2021-10-17 07:03:02 -06:00
bfq-iosched.c bfq: Do not let waker requests skip proper accounting 2022-01-27 11:03:18 +01:00
bfq-iosched.h block, bfq: cleanup the repeated declaration 2021-08-25 06:45:33 -06:00
bfq-wf2q.c block: Introduce IOPRIO_NR_LEVELS 2021-08-18 07:21:12 -06:00
bio-integrity.c block: use bvec_virt in bio_integrity_{process,free} 2021-08-16 10:50:32 -06:00
bio.c block: don't call rq_qos_ops->done_bio if the bio isn't tracked 2021-09-24 11:04:39 -06:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-05 11:32:15 -07:00
blk-cgroup-rwstat.h
blk-cgroup.c blk-cgroup: fix missing put device in error path from blkg_conf_pref() 2021-11-25 09:48:41 +01:00
blk-core.c blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
blk-crypto-fallback.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
blk-crypto-internal.h block: make blk_crypto_rq_bio_prep() able to fail 2020-10-05 10:47:43 -06:00
blk-crypto.c blk-crypto: fix check for too-large dun_bytes 2021-08-25 06:45:00 -06:00
blk-exec.c block: return errors from blk_execute_rq() 2021-06-30 15:35:45 -06:00
blk-flush.c blk-mq: fix is_flush_rq 2021-08-17 20:17:34 -06:00
blk-integrity.c block: flush the integrity workqueue in blk_integrity_unregister 2021-09-14 20:03:30 -06:00
blk-ioc.c block: remove retry loop in ioc_release_fn() 2020-07-16 10:22:15 -06:00
blk-iocost.c iocost: Fix divide-by-zero on donation from low hweight cgroup 2021-12-22 09:32:48 +01:00
blk-iolatency.c for-5.15/block-2021-08-30 2021-08-30 18:52:11 -07:00
blk-ioprio.c block: Introduce the ioprio rq-qos policy 2021-06-21 15:03:40 -06:00
blk-ioprio.h block: Introduce the ioprio rq-qos policy 2021-06-21 15:03:40 -06:00
blk-lib.c block: export blk_next_bio() 2021-06-17 15:51:20 +02:00
blk-map.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
blk-merge.c io_uring-bio-cache.5-2021-08-30 2021-08-30 19:30:30 -07:00
blk-mq-cpumap.c blk-mq: remove the calling of local_memory_node() 2020-10-20 07:08:17 -06:00
blk-mq-debugfs-zoned.c
blk-mq-debugfs.c block: decode QUEUE_FLAG_HCTX_ACTIVE in debugfs output 2021-10-04 06:58:39 -06:00
blk-mq-debugfs.h
blk-mq-pci.c
blk-mq-rdma.c
blk-mq-sched.c blk-mq-sched: Fix blk_mq_sched_alloc_tags() error handling 2021-07-27 16:44:38 -06:00
blk-mq-sched.h blk: Fix lock inversion between ioc lock and bfqd lock 2021-06-24 18:43:55 -06:00
blk-mq-sysfs.c block: remove blk-mq-sysfs dead code 2021-08-02 13:37:29 -06:00
blk-mq-tag.c blk-mq: avoid to iterate over stale request 2021-09-12 19:32:43 -06:00
blk-mq-tag.h blk-mq: Some tag allocation code refactoring 2021-05-24 06:47:22 -06:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
blk-mq.h blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
blk-pm.c scsi: block: Fix a race in the runtime power management code 2020-12-09 11:41:41 -05:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c rq-qos: fix missed wake-ups in rq_qos_throttle try two 2021-06-08 15:12:57 -06:00
blk-rq-qos.h block: Introduce the ioprio rq-qos policy 2021-06-21 15:03:40 -06:00
blk-settings.c block: Fix partition check for host-aware zoned block devices 2021-10-27 06:58:01 -06:00
blk-stat.c blk-stat: make q->stats->lock irqsafe 2020-09-01 16:48:46 -06:00
blk-stat.h
blk-sysfs.c blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
blk-throttle.c blk-throttle: fix UAF by deleteing timer in blk_throtl_exit() 2021-09-07 08:36:56 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk-wbt: prevent NULL pointer dereference in wb_timer_fn 2021-11-18 19:16:34 +01:00
blk-wbt.h blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled() 2021-06-21 15:03:41 -06:00
blk-zoned.c block: Hold invalidate_lock in BLKRESETZONE ioctl 2021-11-18 19:17:15 +01:00
blk.h block: bump max plugged deferred size from 16 to 32 2021-11-18 19:16:16 +01:00
bounce.c block: use memcpy_from_bvec in __blk_queue_bounce 2021-08-02 13:37:28 -06:00
bsg-lib.c scsi: bsg-lib: Fix commands without data transfer in bsg_transport_sg_io_fn() 2021-08-01 13:21:40 -04:00
bsg.c scsi: bsg: Fix device unregistration 2021-09-14 00:22:15 -04:00
disk-events.c block: return errors from disk_alloc_events 2021-08-23 12:55:45 -06:00
elevator.c block: avoid to quiesce queue in elevator_init_mq 2021-12-01 09:04:56 +01:00
fops.c block: hold ->invalidate_lock in blkdev_fallocate 2021-09-24 11:06:58 -06:00
genhd.c blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
holder.c block: add back the bd_holder_dir reference in bd_link_disk_holder 2021-08-20 21:14:26 -06:00
ioctl.c block: Hold invalidate_lock in BLKZEROOUT ioctl 2021-11-18 19:17:15 +01:00
ioprio.c block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2) 2021-12-14 10:57:16 +01:00
Kconfig SCSI misc on 20210902 2021-09-02 15:09:46 -07:00
Kconfig.iosched Revert "block/mq-deadline: Add cgroup support" 2021-08-11 13:47:26 -06:00
keyslot-manager.c - Fix DM integrity's HMAC support to provide enhanced security of 2021-02-22 10:22:54 -08:00
kyber-iosched.c kyber: avoid q->disk dereferences in trace points 2021-10-15 21:02:57 -06:00
Makefile block-5.15-2021-09-11 2021-09-11 10:19:51 -07:00
mq-deadline.c block/mq-deadline: Move dd_queued() to fix defined but not used warning 2021-09-02 06:34:45 -06:00
opal_proto.h block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
sed-opal.c block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
t10-pi.c block: use bvec_kmap_local in t10_pi_type1_{prepare,complete} 2021-08-02 13:37:28 -06:00