linux-stable/block
Peter Zijlstra 18dd7b964c block/mq: Cure cpu hotplug lock inversion
[ Upstream commit eabe06595d ]

By poking at /debug/sched_features I triggered the following splat:

 [] ======================================================
 [] WARNING: possible circular locking dependency detected
 [] 4.11.0-00873-g964c8b7-dirty #694 Not tainted
 [] ------------------------------------------------------
 [] bash/2109 is trying to acquire lock:
 []  (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff8120cb8b>] static_key_slow_dec+0x1b/0x50
 []
 [] but task is already holding lock:
 []  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
 []
 [] which lock already depends on the new lock.
 []
 []
 [] the existing dependency chain (in reverse order) is:
 []
 [] -> #2 (&sb->s_type->i_mutex_key#4){+++++.}:
 []        lock_acquire+0x100/0x210
 []        down_write+0x28/0x60
 []        start_creating+0x5e/0xf0
 []        debugfs_create_dir+0x13/0x110
 []        blk_mq_debugfs_register+0x21/0x70
 []        blk_mq_register_dev+0x64/0xd0
 []        blk_register_queue+0x6a/0x170
 []        device_add_disk+0x22d/0x440
 []        loop_add+0x1f3/0x280
 []        loop_init+0x104/0x142
 []        do_one_initcall+0x43/0x180
 []        kernel_init_freeable+0x1de/0x266
 []        kernel_init+0xe/0x100
 []        ret_from_fork+0x31/0x40
 []
 [] -> #1 (all_q_mutex){+.+.+.}:
 []        lock_acquire+0x100/0x210
 []        __mutex_lock+0x6c/0x960
 []        mutex_lock_nested+0x1b/0x20
 []        blk_mq_init_allocated_queue+0x37c/0x4e0
 []        blk_mq_init_queue+0x3a/0x60
 []        loop_add+0xe5/0x280
 []        loop_init+0x104/0x142
 []        do_one_initcall+0x43/0x180
 []        kernel_init_freeable+0x1de/0x266
 []        kernel_init+0xe/0x100
 []        ret_from_fork+0x31/0x40

 []  *** DEADLOCK ***
 []
 [] 3 locks held by bash/2109:
 []  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff81292bcd>] vfs_write+0x17d/0x1a0
 []  #1:  (debugfs_srcu){......}, at: [<ffffffff8155a90d>] full_proxy_write+0x5d/0xd0
 []  #2:  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
 []
 [] stack backtrace:
 [] CPU: 9 PID: 2109 Comm: bash Not tainted 4.11.0-00873-g964c8b7-dirty #694
 [] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
 [] Call Trace:

 []  lock_acquire+0x100/0x210
 []  get_online_cpus+0x2a/0x90
 []  static_key_slow_dec+0x1b/0x50
 []  static_key_disable+0x20/0x30
 []  sched_feat_write+0x131/0x170
 []  full_proxy_write+0x97/0xd0
 []  __vfs_write+0x28/0x120
 []  vfs_write+0xb5/0x1a0
 []  SyS_write+0x49/0xa0
 []  entry_SYSCALL_64_fastpath+0x23/0xc2

This is because of the cpu hotplug lock rework. Break the chain at #1
by reversing the lock acquisition order. This way i_mutex_key#4 no
longer depends on cpu_hotplug_lock and things are good.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24 11:00:22 +01:00
partitions partitions/efi: Fix integer overflow in GPT size calculation 2017-10-08 10:26:06 +02:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-12-20 10:07:29 +01:00
bio-integrity.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
bio.c bio_copy_user_iov(): don't ignore ->iov_offset 2017-10-18 09:35:41 +02:00
blk-cgroup.c blkcg: fix double free of new_blkg in blkcg_init_queue 2018-03-22 09:17:35 +01:00
blk-core.c block: wake up all tasks blocked in get_request() 2017-12-14 09:28:23 +01:00
blk-exec.c block: Fix spelling in a source code comment 2016-07-20 21:28:22 -06:00
blk-flush.c block: flush: fix IO hang in case of flood fua req 2016-10-26 07:49:27 -06:00
blk-integrity.c block: fix blk_integrity_register to use template's interval_exp if not 0 2017-05-20 14:28:36 +02:00
blk-ioc.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
blk-lib.c block: require write_same and discard requests align to logical block size 2016-10-11 15:06:30 -07:00
blk-map.c blk_rq_map_user_iov: fix error override 2018-02-25 11:05:42 +01:00
blk-merge.c block: make sure a big bio is split into at most 256 bvecs 2016-08-24 08:17:24 -06:00
blk-mq-cpumap.c blk-mq: allow the driver to pass in a queue mapping 2016-09-15 08:42:03 -06:00
blk-mq-pci.c blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL 2017-08-24 17:12:20 -07:00
blk-mq-sysfs.c blk-mq: initialize mq kobjects in blk_mq_init_allocated_queue() 2017-12-14 09:28:21 +01:00
blk-mq-tag.c blk-mq: Fix tagset reinit in the presence of cpu hot-unplug 2017-12-20 10:07:20 +01:00
blk-mq-tag.h Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block 2016-10-09 17:29:33 -07:00
blk-mq.c block/mq: Cure cpu hotplug lock inversion 2018-03-24 11:00:22 +01:00
blk-mq.h blk-mq: initialize mq kobjects in blk_mq_init_allocated_queue() 2017-12-14 09:28:21 +01:00
blk-settings.c block: kill off q->flush_flags 2016-04-13 13:33:19 -06:00
blk-softirq.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
blk-sysfs.c blk-mq: register device instead of disk 2016-09-21 07:56:16 -06:00
blk-tag.c block: support different tag allocation policy 2015-01-23 14:15:46 -07:00
blk-throttle.c blk-throttle: make sure expire time isn't too big 2018-03-22 09:17:44 +01:00
blk-timeout.c block: Fix a race between blk_cleanup_queue() and timeout handling 2017-11-30 08:39:06 +00:00
blk.h blk-mq: remove ->map_queue 2016-09-15 08:42:03 -06:00
bounce.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2015-09-19 18:57:09 -07:00
bsg-lib.c Revert "bsg-lib: don't free job in bsg_prepare_job" 2017-10-21 17:21:33 +02:00
bsg.c sg_write()/bsg_write() is not fit to be called under KERNEL_DS 2017-01-09 08:32:25 +01:00
cfq-iosched.c cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode 2017-06-14 15:05:58 +02:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
deadline-iosched.c block: do not merge requests without consulting with io scheduler 2016-07-20 21:35:12 -06:00
elevator.c block: Fix secure erase 2016-08-16 09:16:51 -06:00
genhd.c block: fix bdi vs gendisk lifetime mismatch 2016-08-04 14:19:16 -06:00
ioctl.c block: invalidate the page cache when issuing BLKZEROOUT 2016-10-11 15:06:30 -07:00
ioprio.c block: fix use-after-free in sys_ioprio_get() 2016-07-01 08:39:24 -06:00
Kconfig Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block 2016-10-09 17:29:33 -07:00
Kconfig.iosched blkcg: make CONFIG_BLK_CGROUP bool 2012-03-06 21:27:21 +01:00
Makefile Merge branch 'for-4.9/block-smp' of git://git.kernel.dk/linux-block 2016-10-09 17:32:20 -07:00
noop-iosched.c elevator: use list_{first,prev,next}_entry 2015-11-16 15:21:48 -07:00
partition-generic.c block: get rid of blk_integrity_revalidate() 2017-05-14 14:00:22 +02:00
scsi_ioctl.c block: allow WRITE_SAME commands with the SG_IO ioctl 2017-03-22 12:43:38 +01:00
t10-pi.c block: Consolidate static integrity profile properties 2015-10-21 14:42:38 -06:00