Commit Graph

4454 Commits

Author SHA1 Message Date
Arnd Bergmann 3db03b4afb [PATCH] rename the provided execve functions to kernel_execve
Some architectures provide an execve function that does not set errno, but
instead returns the result code directly.  Rename these to kernel_execve to
get the right semantics there.  Moreover, there is no reasone for any of these
architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
remove these right away.

[akpm@osdl.org: build fix]
[bunk@stusta.de: build fix]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:23 -07:00
Kirill Korotaev 73ea41302b [PATCH] IPC namespace - utils
This patch adds basic IPC namespace functionality to
IPC utils:
- init_ipc_ns
- copy/clone/unshare/free IPC ns
- /proc preparations

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Kirill Korotaev 25b21cb2f6 [PATCH] IPC namespace core
This patch set allows to unshare IPCs and have a private set of IPC objects
(sem, shm, msg) inside namespace.  Basically, it is another building block of
containers functionality.

This patch implements core IPC namespace changes:
- ipc_namespace structure
- new config option CONFIG_IPC_NS
- adds CLONE_NEWIPC flag
- unshare support

[clg@fr.ibm.com: small fix for unshare of ipc namespace]
[akpm@osdl.org: build fix]
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Serge E. Hallyn 071df104f8 [PATCH] namespaces: utsname: implement CLONE_NEWUTS flag
Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare.

[clg@fr.ibm.com: IPC unshare fix]
[bunk@stusta.de: cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Serge E. Hallyn bf47fdcda6 [PATCH] namespaces: utsname: remove system_utsname
The system_utsname isn't needed now that kernel/sysctl.c is fixed.
Nuke it.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn 4865ecf131 [PATCH] namespaces: utsname: implement utsname namespaces
This patch defines the uts namespace and some manipulators.
Adds the uts namespace to task_struct, and initializes a
system-wide init namespace.

It leaves a #define for system_utsname so sysctl will compile.
This define will be removed in a separate patch.

[akpm@osdl.org: build fix, cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn e9ff3990f0 [PATCH] namespaces: utsname: switch to using uts namespaces
Replace references to system_utsname to the per-process uts namespace
where appropriate.  This includes things like uname.

Changes: Per Eric Biederman's comments, use the per-process uts namespace
	for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c

[jdike@addtoit.com: UML fix]
[clg@fr.ibm.com: cleanup]
[akpm@osdl.org: build fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn 0bdd7aab7f [PATCH] namespaces: utsname: introduce temporary helpers
Define utsname() and init_utsname() which return &system_utsname.  Users of
system_utsname will be changed to use these helpers, after which
system_utsname will disappear.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn 1651e14e28 [PATCH] namespaces: incorporate fs namespace into nsproxy
This moves the mount namespace into the nsproxy.  The mount namespace count
now refers to the number of nsproxies point to it, rather than the number of
tasks.  As a result, the unshare_namespace() function in kernel/fork.c no
longer checks whether it is being shared.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Serge E. Hallyn ab516013ad [PATCH] namespaces: add nsproxy
This patch adds a nsproxy structure to the task struct.  Later patches will
move the fs namespace pointer into this structure, and introduce a new utsname
namespace into the nsproxy.

The vserver and openvz functionality, then, would be implemented in large part
by virtualizing/isolating more and more resources into namespaces, each
contained in the nsproxy.

[akpm@osdl.org: build fix]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Peter Zijlstra 12fd352038 [PATCH] nfsd: lockdep annotation
while doing a kernel make modules_install install over an NFS mount.

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9550 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9550:
   #0:  (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9580 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9580:
   #0:  (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Neil Brown <neilb@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Greg Banks bfd241600a [PATCH] knfsd: make rpc threads pools numa aware
Actually implement multiple pools.  On NUMA machines, allocate a svc_pool per
NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool.  Enqueue
sockets on the svc_pool corresponding to the CPU on which the socket bh is run
(i.e.  the NIC interrupt CPU).  Threads have their cpu mask set to limit them
to the CPUs in the svc_pool that owns them.

This is the patch that allows an Altix to scale NFS traffic linearly
beyond 4 CPUs and 4 NICs.

Incorporates changes and feedback from Neil Brown, Trond Myklebust, and
Christoph Hellwig.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Greg Banks a74554429e [PATCH] knfsd: add svc_set_num_threads
Currently knfsd keeps its own list of all nfsd threads in nfssvc.c; add a new
way of managing the list of all threads in a svc_serv.  Add
svc_create_pooled() to allow creation of a svc_serv whose threads are managed
by the sunrpc code.  Add svc_set_num_threads() to manage the number of threads
in a service, either per-pool or globally across the service.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks 9a24ab5749 [PATCH] knfsd: add svc_get
add svc_get() for those occasions when we need to temporarily bump up
svc_serv->sv_nrthreads as a pseudo refcount.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks 3262c816a3 [PATCH] knfsd: split svc_serv into pools
Split out the list of idle threads and pending sockets from svc_serv into a
new svc_pool structure, and allocate a fixed number (in this patch, 1) of
pools per svc_serv.  The new structure contains a lock which takes over
several of the duties of svc_serv->sv_lock, which is now relegated to
protecting only sv_tempsocks, sv_permsocks, and sv_tmpcnt in svc_serv.

The point is to move the hottest fields out of svc_serv and into svc_pool,
allowing a following patch to arrange for a svc_pool per NUMA node or per CPU.
 This is a major step towards making the NFS server NUMA-friendly.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks 5685f0fa1c [PATCH] knfsd: convert sk_reserved to atomic_t
Convert the svc_sock->sk_reserved variable from an int protected by
svc_serv->sv_lock, to an atomic.  This reduces (by 1) the number of places we
need to take the (effectively global) svc_serv->sv_lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks 1a68d952af [PATCH] knfsd: use new lock for svc_sock deferred list
Protect the svc_sock->sk_deferred list with a new lock svc_sock->sk_defer_lock
instead of svc_serv->sv_lock.  Using the more fine-grained lock reduces the
number of places we need to take the svc_serv lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks c45c357d7d [PATCH] knfsd: convert sk_inuse to atomic_t
Convert the svc_sock->sk_inuse counter from an int protected by
svc_serv->sv_lock, to an atomic.  This reduces the number of places we need to
take the (effectively global) svc_serv->sv_lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks 36bdfc8bae [PATCH] knfsd: move tempsock aging to a timer
Following are 11 patches from Greg Banks which combine to make knfsd more
Numa-aware.  They reduce hitting on 'global' data structures, and create some
data-structures that can be node-local.

knfsd threads are bound to a particular node, and the thread to handle a new
request is chosen from the threads that are attach to the node that received
the interrupt.

The distribution of threads across nodes can be controlled by a new file in
the 'nfsd' filesystem, though the default approach of an even spread is
probably fine for most sites.

Some (old) numbers that show the efficacy of these patches: N == number of
NICs == number of CPUs == nmber of clients.  Number of NUMA nodes == N/2

N	Throughput, MiB/s	CPU usage, % (max=N*100)
	Before	After		Before	After
	---	------	----		-----	-----
	4	312	435		350	228
	6	500	656		501	418
	8	562	804		690	589

This patch:

Move the aging of RPC/TCP connection sockets from the main svc_recv() loop to
a timer which uses a mark-and-sweep algorithm every 6 minutes.  This reduces
the amount of work that needs to be done in the main RPC loop and the length
of time we need to hold the (effectively global) svc_serv->sv_lock.

[akpm@osdl.org: cleanup]
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
NeilBrown 6fb2b47fa1 [PATCH] knfsd: Drop 'serv' option to svc_recv and svc_process
It isn't needed as it is available in rqstp->rq_server, and dropping it allows
some local vars to be dropped.

[akpm@osdl.org: build fix]
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:18 -07:00
NeilBrown b41b66d63c [PATCH] knfsd: allow sockets to be passed to nfsd via 'portlist'
Userspace should create and bind a socket (but not connectted) and write the
'fd' to portlist.  This will cause the nfs server to listen on that socket.

To close a socket, the name of the socket - as read from 'portlist' can be
written to 'portlist' with a preceding '-'.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:18 -07:00
NeilBrown 80212d59e3 [PATCH] knfsd: define new nfsdfs file: portlist - contains list of ports
This file will list all ports that nfsd has open.
Default when TCP enabled will be
   ipv4 udp 0.0.0.0 2049
   ipv4 tcp 0.0.0.0 2049

Later, the list of ports will be settable.

'portlist' chosen rather than 'ports', to avoid unnecessary confusion with
non-mainline patches which created 'ports' with different semantics.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:18 -07:00
NeilBrown 6658d3a7bb [PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions
We have an array 'nfsd_version' which lists the available versions of nfsd,
and 'nfsd_versions' (poor choice there :-() which lists the currently active
versions.

Then we have a bitmap - nfsd_versbits which says which versions are wanted.
The bits in this bitset cause content to be copied from nfsd_version to
nfsd_versions when nfsd starts.

This patch removes nfsd_versbits and moves information directly from
nfsd_version to nfsd_versions when requests for version changes arrive.

Note that this doesn't make it possible to change versions while the server is
running.  This is because serv->sv_xdrsize is calculated when a service is
created, and used when threads are created, and xdrsize depends on the active
versions.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
NeilBrown 24e36663c3 [PATCH] knfsd: be more selective in which sockets lockd listens on
Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set.

However as lockd performs services of the client as well, this is a problem.
If CONFIG_NfSD_TCP is not set, and a tcp mount is used, the server will not be
able to call back to lockd.

So:
 - add an option to lockd_up saying which protocol is needed
 - Always open sockets for which an explicit port was given, otherwise
   only open a socket of the type required
 - Change nfsd to do one lockd_up per socket rather than one per thread.

This
 - removes the dependancy on CONFIG_NFSD_TCP
 - means that lockd may open sockets other than at startup
 - means that lockd will *not* listen on UDP if the only
   mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know if this
might be a problem with some servers.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
NeilBrown bc591ccff2 [PATCH] knfsd: add a callback for when last rpc thread finishes
nfsd has some cleanup that it wants to do when the last thread exits, and
there will shortly be some more.  So collect this all into one place and
define a callback for an rpc service to call when the service is about to be
destroyed.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
Greg Banks 0f532f3861 [PATCH] cpumask: add highest_possible_node_id
cpumask: add highest_possible_node_id(), analogous to
highest_possible_processor_id().

[pj@sgi.com: fix typo]
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
bibo,mao 99219a3fbc [PATCH] kretprobe spinlock deadlock patch
kprobe_flush_task() possibly calls kfree function during holding
kretprobe_lock spinlock, if kfree function is probed by kretprobe that will
incur spinlock deadlock.  This patch moves kfree function out scope of
kretprobe_lock.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:16 -07:00
Ananth N Mavinakayanahalli 3a872d89ba [PATCH] Kprobes: Make kprobe modules more portable
In an effort to make kprobe modules more portable, here is a patch that:

o Introduces the "symbol_name" field to struct kprobe.
  The symbol->address resolution now happens in the kernel in an
  architecture agnostic manner. 64-bit powerpc users no longer have
  to specify the ".symbols"
o Introduces the "offset" field to struct kprobe to allow a user to
  specify an offset into a symbol.
o The legacy mechanism of specifying the kprobe.addr is still supported.
  However, if both the kprobe.addr and kprobe.symbol_name are specified,
  probe registration fails with an -EINVAL.
o The symbol resolution code uses kallsyms_lookup_name(). So
  CONFIG_KPROBES now depends on CONFIG_KALLSYMS
o Apparantly kprobe modules were the only legitimate out-of-tree user of
  the kallsyms_lookup_name() EXPORT. Now that the symbol resolution
  happens in-kernel, remove the EXPORT as suggested by Christoph Hellwig
o Modify tcp_probe.c that uses the kprobe interface so as to make it
  work on multiple platforms (in its earlier form, the code wouldn't
  work, say, on powerpc)

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:16 -07:00
Eric W. Biederman 2425c08b37 [PATCH] usb: fixup usb so it uses struct pid
The problem with remembering a user space process by its pid is that it is
possible that the process will exit, pid wrap around will occur.
Converting to a struct pid avoid that problem, and paves the way for
implementing a pid namespace.

Also since usb is the only user of kill_proc_info_as_uid rename
kill_proc_info_as_uid to kill_pid_info_as_uid and have the new version take
a struct pid.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:15 -07:00
Sukadev Bhattiprolu 3fbc964864 [PATCH] Define struct pspace
Define a per-container pid space object.  And create one instance of this
object, init_pspace, to define the entire pid space.  Subsequent patches
will provide/use interfaces to create/destroy pid spaces.

Its a subset/rework of Eric Biederman's patch
http://lkml.org/lkml/2006/2/6/285 .

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:15 -07:00
Sukadev Bhattiprolu aa5a6662f9 [PATCH] Move pidmap to pspace.h
Move struct pidmap and PIDMAP_ENTRIES to a new file, include/linux/pspace.h
where it will be used in subsequent patches to define pid spaces.

Its a subset of Eric Biederman's patch http://lkml.org/lkml/2006/2/6/285

[akpm@osdl.org: cleanups]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:15 -07:00
Oleg Nesterov d387cae075 [PATCH] pid: simplify pid iterators
I think it is hardly possible to read the current do_each_task_pid().  The
new version is much simpler and makes the code smaller.

Only the do_each_task_pid change is tested, the do_each_pid_task isn't.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:15 -07:00
Jeff Dike b68e31d0eb [PATCH] const struct tty_operations
As part of an SMP cleanliness pass over UML, I consted a bunch of
structures in order to not have to document their locking.  One of these
structures was a struct tty_operations.  In order to const it in UML
without introducing compiler complaints, the declaration of
tty_set_operations needs to be changed, and then all of its callers need to
be fixed.

This patch declares all struct tty_operations in the tree as const.  In all
cases, they are static and used only as input to tty_set_operations.  As an
extra check, I ran an i386 allyesconfig build which produced no extra
warnings.

53 drivers are affected.  I checked the history of a bunch of them, and in
most cases, there have been only a handful of maintenance changes in the
last six months.  serial_core.c was the busiest one that I looked at.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:14 -07:00
Eric W. Biederman 609d7fa956 [PATCH] file: modify struct fown_struct to use a struct pid
File handles can be requested to send sigio and sigurg to processes.  By
tracking the destination processes using struct pid instead of pid_t we make
the interface safe from all potential pid wrap around problems.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:14 -07:00
Eric W. Biederman bde0d2c98b [PATCH] vt: Make vt_pid a struct pid (making it pid wrap around safe).
I took a good hard look at the locking and it appears the locking on vt_pid
is the console semaphore.  Every modified path is called under the console
semaphore except reset_vc when it is called from fn_SAK or do_SAK both of
which appear to be in interrupt context.  In addition I need to be careful
because in the presence of an oops the console_sem may be arbitrarily
dropped.

Which leads me to conclude the current locking is inadequate for my needs.

Given the weird cases we could hit because of oops printing instead of
introducing an extra spin lock to protect the data and keep the pid to
signal and the signal to send in sync, I have opted to use xchg on just the
struct pid * pointer instead.

Due to console_sem we will stay in sync between vt_pid and vt_mode except
for a small window during a SAK, or oops handling.  SAK handling should
kill any user space process that care, and oops handling we are broken
anyway.  Besides the worst that can happen is that I try to send the wrong
signal.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:14 -07:00
Eric W. Biederman 81af8d67d4 [PATCH] vt: rework the console spawning variables
This is such a rare path it took me a while to figure out how to test
this after soring out the locking.

This patch does several things.
- The variables used are moved into a structure and declared in vt_kern.h
- A spinlock is added so we don't have SMP races updating the values.
- Instead of raw pid_t value a struct_pid is used to guard against
  pid wrap around issues, if the daemon to spawn a new console dies.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman 5feb8f5f84 [PATCH] pid: implement pid_nr
As we stop storing pid_t's and move to storing struct pid *.  We need a way to
get the pid_t from the struct pid to report to user space what we have stored.

Having a clean well defined way to do this is especially important as we move
to multiple pid spaces as may need to report a different value to the caller
depending on which pid space the caller is in.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman c4b92fc112 [PATCH] pid: implement signal functions that take a struct pid *
Currently the signal functions all either take a task or a pid_t argument.
This patch implements variants that take a struct pid *.  After all of the
users have been update it is my intention to remove the variants that take a
pid_t as using pid_t can be more work (an extra hash table lookup) and
difficult to get right in the presence of multiple pid namespaces.

There are two kinds of functions introduced in this patch.  The are the
general use functions kill_pgrp and kill_pid which take a priv argument that
is ultimately used to create the appropriate siginfo information, Then there
are _kill_pgrp_info, kill_pgrp_info, kill_pid_info the internal implementation
helpers that take an explicit siginfo.

The distinction is made because filling out an explcit siginfo is tricky, and
will be even more tricky when pid namespaces are introduced.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman 558cb32548 [PATCH] pid: add do_each_pid_task
To avoid pid rollover confusion the kernel needs to work with struct pid *
instead of pid_t.  Currently there is not an iterator that walks through all
of the tasks of a given pid type starting with a struct pid.  This prevents us
replacing some pid_t instances with struct pid.  So this patch adds
do_each_pid_task which walks through the set of task for a given pid type
starting with a struct pid.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman 22c935f47c [PATCH] pid: implement access helpers for a tacks various process groups
In the last round of cleaning up the pid hash table a more general struct pid
was introduced, that can be referenced counted.

With the more general struct pid most if not all places where we store a pid_t
we can now store a struct pid * and remove the need for a hash table lookup,
and avoid any possible problems with pid roll over.

Looking forward to the pid namespaces struct pid * gives us an absolute form a
pid so we can compare and use them without caring which pid namespace we are
in.

This patchset introduces the infrastructure needed to use struct pid instead
of pid_t, and then it goes on to convert two different kernel users that
currently store a pid_t value.

There are a lot more places to go but this is enough to get the basic idea.

Before we can merge a pid namespace patch all of the kernel pid_t users need
to be examined.  Those that deal with user space processes need to be
converted to using a struct pid *.  Those that deal with kernel processes need
to converted to using the kthread api.  A rare few that only use their current
processes pid values get to be left alone.

This patch:

task_session returns the struct pid of a tasks session.
task_pgrp    returns the struct pid of a tasks process group.
task_tgid    returns the struct pid of a tasks thread group.
task_pid     returns the struct pid of a tasks process id.

These can be used to avoid unnecessary hash table lookups, and to implement
safe pid comparisions in the face of a pid namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman 20cdc894c4 [PATCH] proc: modify proc_pident_lookup to be completely table driven
Currently proc_pident_lookup gets the names and types from a table and then
has a huge switch statement to get the inode and file operations it needs.
That is silly and is becoming increasingly hard to maintain so I just put all
of the information in the table.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:13 -07:00
Eric W. Biederman 0804ef4b0d [PATCH] proc: readdir race fix (take 3)
The problem: An opendir, readdir, closedir sequence can fail to report
process ids that are continually in use throughout the sequence of system
calls.  For this race to trigger the process that proc_pid_readdir stops at
must exit before readdir is called again.

This can cause ps to fail to report processes, and it is in violation of
posix guarantees and normal application expectations with respect to
readdir.

Currently there is no way to work around this problem in user space short
of providing a gargantuan buffer to user space so the directory read all
happens in on system call.

This patch implements the normal directory semantics for proc, that
guarantee that a directory entry that is neither created nor destroyed
while reading the directory entry will be returned.  For directory that are
either created or destroyed during the readdir you may or may not see them.
 Furthermore you may seek to a directory offset you have previously seen.

These are the guarantee that ext[23] provides and that posix requires, and
more importantly that user space expects.  Plus it is a simple semantic to
implement reliable service.  It is just a matter of calling readdir a
second time if you are wondering if something new has show up.

These better semantics are implemented by scanning through the pids in
numerical order and by making the file offset a pid plus a fixed offset.

The pid scan happens on the pid bitmap, which when you look at it is
remarkably efficient for a brute force algorithm.  Given that a typical
cache line is 64 bytes and thus covers space for 64*8 == 200 pids.  There
are only 40 cache lines for the entire 32K pid space.  A typical system
will have 100 pids or more so this is actually fewer cache lines we have to
look at to scan a linked list, and the worst case of having to scan the
entire pid bitmap is pretty reasonable.

If we need something more efficient we can go to a more efficient data
structure for indexing the pids, but for now what we have should be
sufficient.

In addition this takes no additional locks and is actually less code than
what we are doing now.

Also another very subtle bug in this area has been fixed.  It is possible
to catch a task in the middle of de_thread where a thread is assuming the
thread of it's thread group leader.  This patch carefully handles that case
so if we hit it we don't fail to return the pid, that is undergoing the
de_thread dance.

Thanks to KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> for
providing the first fix, pointing this out and working on it.

[oleg@tv-sign.ru: fix it]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:12 -07:00
Randy Dunlap 2bc2d61a96 [PATCH] list module taint flags in Oops/panic
When listing loaded modules during an oops or panic, also list each
module's Tainted flags if non-zero (P: Proprietary or F: Forced load only).

If a module is did not taint the kernel, it is just listed like
	usbcore
but if it did taint the kernel, it is listed like
	wizmodem(PF)

Example:
[ 3260.121718] Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[ 3260.121729]  [<ffffffff8804c099>] :dump_test:proc_dump_test+0x99/0xc8
[ 3260.121742] PGD fe8d067 PUD 264a6067 PMD 0
[ 3260.121748] Oops: 0002 [1] SMP
[ 3260.121753] CPU 1
[ 3260.121756] Modules linked in: dump_test(P) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device ide_cd generic ohci1394 snd_hda_intel snd_hda_codec snd_pcm snd_timer snd ieee1394 snd_page_alloc piix ide_core arcmsr aic79xx scsi_transport_spi usblp
[ 3260.121785] Pid: 5556, comm: bash Tainted: P      2.6.18-git10 #1

[Alternatively, I can look into listing tainted flags with 'lsmod',
but that won't help in oopsen/panics so much.]

[akpm@osdl.org: cleanup]
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:12 -07:00
Steve Wise 322acc96d4 [PATCH] LIB: add gen_pool_destroy()
Modules using the genpool allocator need to be able to destroy the data
structure when unloading.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Dean Nelson <dcn@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:12 -07:00
Zachary Amsden 5a73fdc5ea [PATCH] Some config.h removals
During tracking down a PAE compile failure, I found that config.h was being
included in a bunch of places in i386 code.  It is no longer necessary, so
drop it.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:34 -07:00
Andi Kleen e239ca5405 [PATCH] Create call_usermodehelper_pipe()
A new member in the ever growing family of call_usermode* functions is
born.  The new call_usermodehelper_pipe() function allows to pipe data to
the stdin of the called user mode progam and behaves otherwise like the
normal call_usermodehelp() (except that it always waits for the child to
finish)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:33 -07:00
Andi Kleen d6cbd281d1 [PATCH] Some cleanup in the pipe code
Split the big and hard to read do_pipe function into smaller pieces.

This creates new create_write_pipe/free_write_pipe/create_read_pipe
functions.  These functions are made global so that they can be used by
other parts of the kernel.

The resulting code is more generic and easier to read and has cleaner error
handling and less gotos.

[akpm@osdl.org: cleanup]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:33 -07:00
Haavard Skinnemoen 74588d8ba3 [PATCH] Generic ioremap_page_range: implementation
This patch adds a generic implementation of ioremap_page_range() in
lib/ioremap.c based on the i386 implementation. It differs from the
i386 version in the following ways:

  * The PTE flags are passed as a pgprot_t argument and must be
    determined up front by the arch-specific code. No additional
    PTE flags are added.
  * Uses set_pte_at() instead of set_pte()

[bunk@stusta.de: warning fix]
]dhowells@redhat.com: nommu build fix]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-m32r@ml.linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:31 -07:00
Dave Hansen ce71ec3684 [PATCH] r/o bind mounts: monitor zeroing of i_nlink
Some filesystems, instead of simply decrementing i_nlink, simply zero it
during an unlink operation.  We need to catch these in addition to the
decrement operations.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:30 -07:00
Dave Hansen d8c76e6f45 [PATCH] r/o bind mount prepwork: inc_nlink() helper
This is mostly included for parity with dec_nlink(), where we will have some
more hooks.  This one should stay pretty darn straightforward for now.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:30 -07:00