linux-stable/fs/proc
Eric W. Biederman ace0c791e6 proc/sysctl: Don't grab i_lock under sysctl_lock.
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:
> This patch has locking problem. I've got lockdep splat under LTP.
>
> [ 6633.115456] ======================================================
> [ 6633.115502] [ INFO: possible circular locking dependency detected ]
> [ 6633.115553] 4.9.10-debug+ #9 Tainted: G             L
> [ 6633.115584] -------------------------------------------------------
> [ 6633.115627] ksm02/284980 is trying to acquire lock:
> [ 6633.115659]  (&sb->s_type->i_lock_key#4){+.+...}, at: [<ffffffff816bc1ce>] igrab+0x1e/0x80
> [ 6633.115834] but task is already holding lock:
> [ 6633.115882]  (sysctl_lock){+.+...}, at: [<ffffffff817e379b>] unregister_sysctl_table+0x6b/0x110
> [ 6633.116026] which lock already depends on the new lock.
> [ 6633.116026]
> [ 6633.116080]
> [ 6633.116080] the existing dependency chain (in reverse order) is:
> [ 6633.116117]
> -> #2 (sysctl_lock){+.+...}:
> -> #1 (&(&dentry->d_lockref.lock)->rlock){+.+...}:
> -> #0 (&sb->s_type->i_lock_key#4){+.+...}:
>
> d_lock nests inside i_lock
> sysctl_lock nests inside d_lock in d_compare
>
> This patch adds i_lock nesting inside sysctl_lock.

Al Viro <viro@ZenIV.linux.org.uk> replied:
> Once ->unregistering is set, you can drop sysctl_lock just fine.  So I'd
> try something like this - use rcu_read_lock() in proc_sys_prune_dcache(),
> drop sysctl_lock() before it and regain after.  Make sure that no inodes
> are added to the list ones ->unregistering has been set and use RCU list
> primitives for modifying the inode list, with sysctl_lock still used to
> serialize its modifications.
>
> Freeing struct inode is RCU-delayed (see proc_destroy_inode()), so doing
> igrab() is safe there.  Since we don't drop inode reference until after we'd
> passed beyond it in the list, list_for_each_entry_rcu() should be fine.

I agree with Al Viro's analsysis of the situtation.

Fixes: d6cffbbe9a ("proc/sysctl: prune stale dentries during unregistering")
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Tested-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-02-22 08:34:53 +13:00
..
array.c fs/proc/array.c: slightly improve render_sigset_t 2016-12-12 18:55:09 -08:00
base.c proc: Better ownership of files for non-dumpable tasks in user namespaces 2017-01-24 12:03:09 +13:00
cmdline.c
consoles.c
cpuinfo.c
devices.c
fd.c proc: Better ownership of files for non-dumpable tasks in user namespaces 2017-01-24 12:03:09 +13:00
fd.h proc: unsigned file descriptors 2016-09-27 18:47:38 -04:00
generic.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
inode.c proc/sysctl: prune stale dentries during unregistering 2017-02-13 17:00:06 +13:00
internal.h proc/sysctl: prune stale dentries during unregistering 2017-02-13 17:00:06 +13:00
interrupts.c
Kconfig fs, proc: add help for CONFIG_PROC_CHILDREN 2015-07-17 16:39:52 -07:00
kcore.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
kmsg.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
loadavg.c
Makefile fs/proc: Add compiler check for -Wno-override-init to support gcc < 4.2 2016-08-03 12:45:23 -04:00
meminfo.c meminfo: break apart a very long seq_printf with #ifdefs 2016-10-07 18:46:30 -07:00
namespaces.c proc: Pass file mode to proc_pid_make_inode 2016-11-14 15:39:48 -05:00
nommu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
page.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
proc_net.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
proc_sysctl.c proc/sysctl: Don't grab i_lock under sysctl_lock. 2017-02-22 08:34:53 +13:00
proc_tty.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
root.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
softirqs.c
stat.c seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char 2016-10-07 18:46:30 -07:00
task_mmu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
task_nommu.c fs/proc: Stop trying to report thread stacks 2016-10-20 09:21:41 +02:00
thread_self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
uptime.c
version.c
vmcore.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00