linux-stable/drivers/md
Roman Gushchin b8a9d66d04 md/raid5: fix locking in handle_stripe_clean_event()
After commit 566c09c534 ("raid5: relieve lock contention in get_active_stripe()")
__find_stripe() is called under conf->hash_locks + hash.
But handle_stripe_clean_event() calls remove_hash() under
conf->device_lock.

Under some circumstances the hash chain can become circular,
and we get an infinite loop in __find_stripe() with interrupts
disabled and the hash lock held. This leads to a hard lockup on
multiple CPUs and a subsequent system crash.
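
The locking rule behind this fix can be illustrated with a small userspace analogy. This is only a sketch and not the raid5.c change itself: every name in it (toy_conf, toy_stripe, toy_find, toy_remove, TOY_NR_BUCKETS) is hypothetical, and pthread mutexes stand in for the kernel spinlocks. The point it shows is that lookup and removal on one hash chain take the same per-bucket lock, so a walker never sees a chain that is being relinked under a different lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_NR_BUCKETS 8	/* hypothetical size, not the kernel's value */

/* Hypothetical stand-in for a stripe linked into a hash chain. */
struct toy_stripe {
	unsigned long sector;
	struct toy_stripe *next;
};

/* One lock per bucket; it guards every access to that bucket's chain. */
struct toy_conf {
	pthread_mutex_t hash_locks[TOY_NR_BUCKETS];
	struct toy_stripe *buckets[TOY_NR_BUCKETS];
};

static size_t toy_hash(unsigned long sector)
{
	return (size_t)(sector % TOY_NR_BUCKETS);
}

static void toy_conf_init(struct toy_conf *conf)
{
	assert(conf);
	for (size_t i = 0; i < TOY_NR_BUCKETS; i++) {
		pthread_mutex_init(&conf->hash_locks[i], NULL);
		conf->buckets[i] = NULL;
	}
}

/* Link sh at the head of its chain, under the bucket lock. */
static void toy_insert(struct toy_conf *conf, struct toy_stripe *sh)
{
	size_t h = toy_hash(sh->sector);

	pthread_mutex_lock(&conf->hash_locks[h]);
	sh->next = conf->buckets[h];
	conf->buckets[h] = sh;
	pthread_mutex_unlock(&conf->hash_locks[h]);
}

/* Walk the chain under the bucket lock; returns NULL if not found. */
static struct toy_stripe *toy_find(struct toy_conf *conf, unsigned long sector)
{
	size_t h = toy_hash(sector);
	struct toy_stripe *sh;

	pthread_mutex_lock(&conf->hash_locks[h]);
	for (sh = conf->buckets[h]; sh; sh = sh->next)
		if (sh->sector == sector)
			break;
	pthread_mutex_unlock(&conf->hash_locks[h]);
	return sh;
}

/*
 * Unlink sh under the SAME bucket lock that toy_find() takes.
 * Unlinking under some other lock is the pattern the commit
 * message describes as the bug. Returns 1 if sh was unlinked.
 */
static int toy_remove(struct toy_conf *conf, struct toy_stripe *sh)
{
	size_t h = toy_hash(sh->sector);
	int removed = 0;

	pthread_mutex_lock(&conf->hash_locks[h]);
	for (struct toy_stripe **pp = &conf->buckets[h]; *pp; pp = &(*pp)->next) {
		if (*pp == sh) {
			*pp = sh->next;
			sh->next = NULL;
			removed = 1;
			break;
		}
	}
	pthread_mutex_unlock(&conf->hash_locks[h]);
	return removed;
}
```

In this sketch, two stripes with sectors 3 and 11 land in the same bucket, so removing one while another thread looks up the other is serialized by a single mutex.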

I was able to reproduce this behavior on raid6 over 6 ssd disks.
The devices_handle_discard_safely option should be set to enable trim
support. The following script was used:

for i in `seq 1 32`; do
    dd if=/dev/zero of=large$i bs=10M count=100 &
done

neilb: original was against a 3.x kernel.  I forward-ported
  to 4.3-rc.  This version is suitable for any kernel since
  commit 59fc630b8b ("RAID5: batch adjacent full stripe write")
  (v4.1+).  I'll post a version for earlier kernels to stable.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Fixes: 566c09c534 ("raid5: relieve lock contention in get_active_stripe()")
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org> # 3.13 - 4.2
2015-10-31 10:53:50 +11:00
bcache bcache: remove driver private bio splitting code 2015-08-13 12:31:40 -06:00
persistent-data dm: remove unlikely() before IS_ERR() 2015-08-12 11:32:21 -04:00
bitmap.c md/bitmap: don't pass -1 to bitmap_storage_alloc. 2015-10-02 17:24:13 +10:00
bitmap.h md-cluster: re-add capabilities 2015-04-22 07:59:39 +10:00
dm-bio-prison.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm-bio-prison.h dm bio prison: add dm_cell_promote_or_release() 2015-05-29 14:19:06 -04:00
dm-bio-record.h
dm-bufio.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm-bufio.h
dm-builtin.c
dm-cache-block-types.h dm cache: revert "remove remainder of distinct discard block size" 2014-11-10 15:25:30 -05:00
dm-cache-metadata.c dm cache: add fail io mode and needs_check flag 2015-06-11 17:13:00 -04:00
dm-cache-metadata.h dm cache: add fail io mode and needs_check flag 2015-06-11 17:13:00 -04:00
dm-cache-policy-cleaner.c dm cache: fix NULL pointer when switching from cleaner policy 2015-10-09 09:16:29 -04:00
dm-cache-policy-internal.h dm cache: age and write back cache entries even without active IO 2015-06-11 17:13:01 -04:00
dm-cache-policy-mq.c dm cache policy smq: move 'dm-cache-default' module alias to SMQ 2015-08-12 11:27:29 -04:00
dm-cache-policy-smq.c dm cache policy smq: change the mutex to a spinlock 2015-08-12 11:32:19 -04:00
dm-cache-policy.c
dm-cache-policy.h dm cache: age and write back cache entries even without active IO 2015-06-11 17:13:01 -04:00
dm-cache-target.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-crypt.c dm crypt: constrain crypt device's max_segment_size to PAGE_SIZE 2015-09-14 12:04:24 -04:00
dm-delay.c dm: do not override error code returned from dm_get_device() 2015-08-12 11:32:21 -04:00
dm-era-target.c block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
dm-exception-store.c dm snapshot: add new persistent store option to support overflow 2015-10-09 16:57:03 -04:00
dm-exception-store.h dm snapshot: add new persistent store option to support overflow 2015-10-09 16:57:03 -04:00
dm-flakey.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-io.c block: remove bio_get_nr_vecs() 2015-08-13 12:32:04 -06:00
dm-ioctl.c char: make misc_deregister a void function 2015-08-05 10:35:49 -07:00
dm-kcopyd.c
dm-linear.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-log-userspace-base.c dm log userspace base: fix compile warning 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.c dm log userspace transfer: match wait_for_completion_timeout return type 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h
dm-log-writes.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-log.c
dm-mpath.c dm-mpath, scsi_dh: request scsi_dh modules in scsi_dh, not dm-mpath 2015-08-28 13:14:55 -07:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid.c dm raid: fix round up of default region size 2015-10-02 12:02:31 -04:00
dm-raid1.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-region-hash.c
dm-round-robin.c
dm-service-time.c
dm-snap-persistent.c dm snapshot persistent: fix missing cleanup in persistent_ctr error path 2015-10-13 12:20:54 -04:00
dm-snap-transient.c dm snapshot: add new persistent store option to support overflow 2015-10-09 16:57:03 -04:00
dm-snap.c dm snapshot: add new persistent store option to support overflow 2015-10-09 16:57:03 -04:00
dm-stats.c dm stats: report precise_timestamps and histogram in @stats_list output 2015-08-18 17:20:03 -04:00
dm-stats.h dm stats: support precise timestamps 2015-06-17 12:40:40 -04:00
dm-stripe.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-switch.c
dm-sysfs.c dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr 2015-04-15 12:10:17 -04:00
dm-table.c block: Replace SG_GAPS with new queue limits mask 2015-08-19 14:26:02 -07:00
dm-target.c dm: allocate requests in target when stacking on blk-mq devices 2015-02-09 13:06:47 -05:00
dm-thin-metadata.c dm thin metadata: delete btrees when releasing metadata snapshot 2015-08-12 10:42:51 -04:00
dm-thin-metadata.h dm thin metadata: add dm_thin_remove_range() 2015-06-11 17:13:04 -04:00
dm-thin.c dm thin: fix missing pool reference count decrement in pool_ctr error path 2015-10-13 12:20:55 -04:00
dm-uevent.c
dm-uevent.h
dm-verity.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-zero.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm.c dm: fix request-based dm error reporting 2015-10-06 10:08:16 -04:00
dm.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
faulty.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
Kconfig SCSI misc on 20150911 2015-09-11 18:15:18 -07:00
linear.c block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
linear.h
Makefile dm cache: add stochastic-multi-queue (smq) policy 2015-06-11 17:12:59 -04:00
md-cluster.c md-cluster: remove inappropriate try_module_get from join() 2015-08-31 19:43:17 +02:00
md-cluster.h Fix read-balancing during node failure 2015-07-24 13:37:59 +10:00
md.c md: clear CHANGE_PENDING in readonly array 2015-10-02 17:23:44 +10:00
md.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
multipath.c md: drop null test before destroy functions 2015-10-02 17:23:44 +10:00
multipath.h
raid0.c md/raid0: apply base queue limits *before* disk_stack_limits 2015-10-02 17:23:44 +10:00
raid0.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
raid1.c md/raid1: don't clear bitmap bit when bad-block-list write fails. 2015-10-24 16:24:22 +11:00
raid1.h md/raid1: ensure device failure recorded before write request returns. 2015-08-31 19:43:23 +02:00
raid5.c md/raid5: fix locking in handle_stripe_clean_event() 2015-10-31 10:53:50 +11:00
raid5.h md/raid5: ensure device failure recorded before write request returns. 2015-08-31 19:43:59 +02:00
raid10.c md/raid10: fix the 'new' raid10 layout to work correctly. 2015-10-24 16:24:25 +11:00
raid10.h md/raid10: ensure device failure recorded before write request returns. 2015-08-31 19:43:45 +02:00