Commit graph

17676 commits

Author SHA1 Message Date
Linus Torvalds
b8ae30ee26 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits)
  stop_machine: Move local variable closer to the usage site in cpu_stop_cpu_callback()
  sched, wait: Use wrapper functions
  sched: Remove a stale comment
  ondemand: Make the iowait-is-busy time a sysfs tunable
  ondemand: Solve a big performance issue by counting IOWAIT time as busy
  sched: Intoduce get_cpu_iowait_time_us()
  sched: Eliminate the ts->idle_lastupdate field
  sched: Fold updating of the last_update_time_info into update_ts_time_stats()
  sched: Update the idle statistics in get_cpu_idle_time_us()
  sched: Introduce a function to update the idle statistics
  sched: Add a comment to get_cpu_idle_time_us()
  cpu_stop: add dummy implementation for UP
  sched: Remove rq argument to the tracepoints
  rcu: need barrier() in UP synchronize_sched_expedited()
  sched: correctly place paranioa memory barriers in synchronize_sched_expedited()
  sched: kill paranoia check in synchronize_sched_expedited()
  sched: replace migration_thread with cpu_stop
  stop_machine: reimplement using cpu_stop
  cpu_stop: implement stop_cpu[s]()
  sched: Fix select_idle_sibling() logic in select_task_rq_fair()
  ...
2010-05-18 08:27:54 -07:00
Linus Torvalds
fd25a1f556 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (23 commits)
  cifs: fix noserverino handling when unix extensions are enabled
  cifs: don't update uniqueid in cifs_fattr_to_inode
  cifs: always revalidate hardlinked inodes when using noserverino
  [CIFS] drop quota operation stubs
  cifs: propagate cifs_new_fileinfo() error back to the caller
  cifs: add comments explaining cifs_new_fileinfo behavior
  cifs: remove unused parameter from cifs_posix_open_inode_helper()
  [CIFS] Remove unused cifs_oplock_cachep
  cifs: have decode_negTokenInit set flags in server struct
  cifs: break negotiate protocol calls out of cifs_setup_session
  cifs: eliminate "first_time" parm to CIFS_SessSetup
  [CIFS] Fix lease break for writes
  cifs: save the dialect chosen by server
  cifs: change && to ||
  cifs: rename "extended_security" to "global_secflags"
  cifs: move tcon find/create into separate function
  cifs: move SMB session creation code into separate function
  cifs: track local_nls in volume info
  [CIFS] Allow null nd (as nfs server uses) on create
  [CIFS] Fix losing locks during fork()
  ...
2010-05-18 07:18:38 -07:00
Jeff Layton
4065c802da cifs: fix noserverino handling when unix extensions are enabled
The uniqueid field sent by the server when unix extensions are enabled
is currently used sometimes when it shouldn't be. The readdir codepath
is correct, but most others are not. Fix it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:59:21 +00:00
Jeff Layton
84f30c66c3 cifs: don't update uniqueid in cifs_fattr_to_inode
We use this value to find an inode within the hash bucket, so we can't
change this without re-hashing the inode. For now, treat this value
as immutable.

Eventually, we should probably use an inode number change on a path
based operation to indicate that the lookup cache is invalid, but that's
a bit more code to deal with.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:57:27 +00:00
Jeff Layton
db19272edc cifs: always revalidate hardlinked inodes when using noserverino
The old cifs_revalidate logic always revalidated hardlinked inodes.
This hack allowed CIFS to pass some connectathon tests when server inode
numbers aren't used (basic test7, in particular).

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:55:58 +00:00
Linus Torvalds
7d32c0aca4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs
* git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
  logfs: handle powerfail on NAND flash
  logfs: handle errors from get_mtd_device()
  logfs: remove unused variable
  logfs: fix sync
  logfs: fix compile failure
  logfs: initialize li->li_refcount
  logfs: commit reservations under space pressure
  logfs: survive logfs_buf_recover read errors
  logfs: Close i_ino reuse race
  logfs: fix logfs_seek_hole()
  logfs: Return -EINVAL if filesystem image doesn't match
  LogFS: Fix typo in b6349ac8
  logfs: testing the wrong variable
2010-05-17 13:53:35 -07:00
Linus Torvalds
18e41da89d Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: check for read permission on src file in the clone ioctl
2010-05-15 12:55:31 -07:00
Dan Rosenberg
5dc6416414 Btrfs: check for read permission on src file in the clone ioctl
The existing code would have allowed you to clone a file that was
only open for writing

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-05-15 12:05:50 -04:00
Linus Torvalds
3f8bf8f0fd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  JFS: Free sbi memory in error path
  fs/sysv: dereferencing ERR_PTR()
  Fix double-free in logfs
  Fix the regression created by "set S_DEAD on unlink()..." commit
2010-05-15 09:03:15 -07:00
Jan Blunck
684bdc7ff9 JFS: Free sbi memory in error path
I spotted the missing kfree() while removing the BKL.

[akpm@linux-foundation.org: avoid multiple returns so it doesn't happen again]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:34 -04:00
Dan Carpenter
404e781249 fs/sysv: dereferencing ERR_PTR()
I moved the dir_put_page() inside the if condition so we don't dereference
"page", if it's an ERR_PTR().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Al Viro
265624495f Fix double-free in logfs
iput() is needed *until* we'd done successful d_alloc_root()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Al Viro
d83c49f3e3 Fix the regression created by "set S_DEAD on unlink()..." commit
1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on.  To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question.  Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD.  Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.

2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint.  Fixed.

3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one.  Noticed and fixed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Linus Torvalds
4fc4c3ce0d Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  inotify: don't leak user struct on inotify release
  inotify: race use after free/double free in inotify inode marks
  inotify: clean up the inotify_add_watch out path
  Inotify: undefined reference to `anon_inode_getfd'

Manual merge to remove duplicate "select ANON_INODES" from Kconfig file
2010-05-14 11:49:42 -07:00
Pavel Emelyanov
b3b38d842f inotify: don't leak user struct on inotify release
inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user.  The problem is that free_uid() is
never called on it.

Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14 11:53:36 -04:00
Eric Paris
e08733446e inotify: race use after free/double free in inotify inode marks
There is a race in the inotify add/rm watch code.  A task can find and
remove a mark which doesn't have all of it's references.  This can
result in a use after free/double free situation.

Task A					Task B
------------				-----------
inotify_new_watch()
 allocate a mark (refcnt == 1)
 add it to the idr
					inotify_rm_watch()
					 inotify_remove_from_idr()
					  fsnotify_put_mark()
					      refcnt hits 0, free
 take reference because we are on idr
 [at this point it is a use after free]
 [time goes on]
 refcnt may hit 0 again, double free

The fix is to take the reference BEFORE the object can be found in the
idr.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: <stable@kernel.org>
2010-05-14 11:52:57 -04:00
Eric Paris
3dbc6fb6a3 inotify: clean up the inotify_add_watch out path
inotify_add_watch explictly frees the unused inode mark, but it can just
use the generic code.  Just do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14 11:51:07 -04:00
Steve French
baa4563317 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	fs/cifs/inode.c
2010-05-13 22:19:32 +00:00
Linus Torvalds
4462dc0284 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: guard against hardlinking directories
2010-05-13 10:36:16 -07:00
Jan Kara
002baeecf5 vfs: Fix O_NOFOLLOW behavior for paths with trailing slashes
According to specification

	mkdir d; ln -s d a; open("a/", O_NOFOLLOW | O_RDONLY)

should return success but currently it returns ELOOP.  This is a
regression caused by path lookup cleanup patch series.

Fix the code to ignore O_NOFOLLOW in case the provided path has trailing
slashes.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-13 08:46:04 -07:00
Linus Torvalds
cdf5f61ed1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: preserve seq # on requeued messages after transient transport errors
  ceph: fix cap removal races
  ceph: zero unused message header, footer fields
  ceph: fix locking for waking session requests after reconnect
  ceph: resubmit requests on pg mapping change (not just primary change)
  ceph: fix open file counting on snapped inodes when mds returns no caps
  ceph: unregister osd request on failure
  ceph: don't use writeback_control in writepages completion
  ceph: unregister bdi before kill_anon_super releases device name
2010-05-12 18:47:29 -07:00
David Howells
7ac512aa82 CacheFiles: Fix error handling in cachefiles_determine_cache_security()
cachefiles_determine_cache_security() is expected to return with a
security override in place.  However, if set_create_files_as() fails, we
fail to do this.  In this case, we should just reinstate the security
override that was set by the caller.

Furthermore, if set_create_files_as() fails, we should dispose of the
new credentials we were in the process of creating.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-12 18:23:58 -07:00
Russell King
e7b702b1a8 Inotify: undefined reference to `anon_inode_getfd'
Fix:

fs/built-in.o: In function `sys_inotify_init1':
summary.c:(.text+0x347a4): undefined reference to `anon_inode_getfd'

found by kautobuild with arms bcmring_defconfig, which ends up with
INOTIFY_USER enabled (through the 'default y') but leaves ANON_INODES
unset.  However, inotify_user.c uses anon_inode_getfd().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-12 11:03:40 -04:00
Sage Weil
e84346b726 ceph: preserve seq # on requeued messages after transient transport errors
If the tcp connection drops and we reconnect to reestablish a stateful
session (with the mds), we need to resend previously sent (and possibly
received) messages with the _same_ seq # so that they can be dropped on
the other end if needed.  Only assign a new seq once after the message is
queued.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 21:20:38 -07:00
Sage Weil
f818a73674 ceph: fix cap removal races
The iterate_session_caps helper traverses the session caps list and tries
to grab an inode reference.  However, the __ceph_remove_cap was clearing
the inode backpointer _before_ removing itself from the session list,
causing a null pointer dereference.

Clear cap->ci under protection of s_cap_lock to avoid the race, and to
tightly couple the list and backpointer state.  Use a local flag to
indicate whether we are releasing the cap, as cap->session may be modified
by a racing thread in iterate_session_caps.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 20:56:31 -07:00
Steve French
aa3e5572c5 [CIFS] drop quota operation stubs
CIFS has stubs for XFS-style quotas without an actual implementation backing
them, hidden behind a config option not visible in Kconfig.  Remove these
stubs for now as the quota operations will see some major changes and this
code simply gets in the way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-12 02:24:12 +00:00
Robin Holt
34441427aa revert "procfs: provide stack information for threads" and its fixup commits
Originally, commit d899bf7b ("procfs: provide stack information for
threads") attempted to introduce a new feature for showing where the
threadstack was located and how many pages are being utilized by the
stack.

Commit c44972f1 ("procfs: disable per-task stack usage on NOMMU") was
applied to fix the NO_MMU case.

Commit 89240ba0 ("x86, fs: Fix x86 procfs stack information for threads on
64-bit") was applied to fix a bug in ia32 executables being loaded.

Commit 9ebd4eba7 ("procfs: fix /proc/<pid>/stat stack pointer for kernel
threads") was applied to fix a bug which had kernel threads printing a
userland stack address.

Commit 1306d603f ('proc: partially revert "procfs: provide stack
information for threads"') was then applied to revert the stack pages
being used to solve a significant performance regression.

This patch nearly undoes the effect of all these patches.

The reason for reverting these is it provides an unusable value in
field 28.  For x86_64, a fork will result in the task->stack_start
value being updated to the current user top of stack and not the stack
start address.  This unpredictability of the stack_start value makes
it worthless.  That includes the intended use of showing how much stack
space a thread has.

Other architectures will get different values.  As an example, ia64
gets 0.  The do_fork() and copy_process() functions appear to treat the
stack_start and stack_size parameters as architecture specific.

I only partially reverted c44972f1 ("procfs: disable per-task stack usage
on NOMMU") .  If I had completely reverted it, I would have had to change
mm/Makefile only build pagewalk.o when CONFIG_PROC_PAGE_MONITOR is
configured.  Since I could not test the builds without significant effort,
I decided to not change mm/Makefile.

I only partially reverted 89240ba0 ("x86, fs: Fix x86 procfs stack
information for threads on 64-bit") .  I left the KSTK_ESP() change in
place as that seemed worthwhile.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 17:33:41 -07:00
Sage Weil
45c6ceb547 ceph: zero unused message header, footer fields
We shouldn't leak any prior memory contents to other parties.  And random
data, particularly in the 'version' field, can cause problems down the
line.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 15:17:40 -07:00
Jeff Layton
3d69438031 cifs: guard against hardlinking directories
When we made serverino the default, we trusted that the field sent by the
server in the "uniqueid" field was actually unique. It turns out that it
isn't reliably so.

Samba, in particular, will just put the st_ino in the uniqueid field when
unix extensions are enabled. When a share spans multiple filesystems, it's
quite possible that there will be collisions. This is a server bug, but
when the inodes in question are a directory (as is often the case) and
there is a collision with the root inode of the mount, the result is a
kernel panic on umount.

Fix this by checking explicitly for directory inodes with the same
uniqueid. If that is the case, then we can assume that using server inode
numbers will be a problem and that they should be disabled.

Fixes Samba bugzilla 7407

Signed-off-by: Jeff Layton <jlayton@redhat.com>
CC: Stable <stable@kernel.org>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-11 20:57:50 +00:00
David Howells
c61ea31dac CacheFiles: Fix occasional EIO on call to vfs_unlink()
Fix an occasional EIO returned by a call to vfs_unlink():

	[ 4868.465413] CacheFiles: I/O Error: Unlink failed
	[ 4868.465444] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 4947.320011] CacheFiles: File cache on md3 unregistering
	[ 4947.320041] FS-Cache: Withdrawing cache "mycache"
	[ 5127.348683] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 5127.348716] CacheFiles: File cache on md3 registered
	[ 7076.871081] CacheFiles: I/O Error: Unlink failed
	[ 7076.871130] FS-Cache: Cache cachefiles stopped due to I/O error
	[ 7116.780891] CacheFiles: File cache on md3 unregistering
	[ 7116.780937] FS-Cache: Withdrawing cache "mycache"
	[ 7296.813394] FS-Cache: Cache "mycache" added (type cachefiles)
	[ 7296.813432] CacheFiles: File cache on md3 registered

What happens is this:

 (1) A cached NFS file is seen to have become out of date, so NFS retires the
     object and immediately acquires a new object with the same key.

 (2) Retirement of the old object is done asynchronously - so the lookup/create
     to generate the new object may be done first.

     This can be a problem as the old object and the new object must exist at
     the same point in the backing filesystem (i.e. they must have the same
     pathname).

 (3) The lookup for the new object sees that a backing file already exists,
     checks to see whether it is valid and sees that it isn't.  It then deletes
     that file and creates a new one on disk.

 (4) The retirement phase for the old file is then performed.  It tries to
     delete the dentry it has, but ext4_unlink() returns -EIO because the inode
     attached to that dentry no longer matches the inode number associated with
     the filename in the parent directory.

The trace below shows this quite well.

	[md5sum] ==> __fscache_relinquish_cookie(ffff88002d12fb58{NFS.fh,ffff88002ce62100},1)
	[md5sum] ==> __fscache_acquire_cookie({NFS.server},{NFS.fh},ffff88002ce62100)

NFS has retired the old cookie and asked for a new one.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_ACTIVE,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DYING]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_INIT,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_LOOKING_UP]
	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_DYING,24})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_RECYCLING]

The old object (OBJ52) is going through the terminal states to get rid of it,
whilst the new object - (OBJ53) - is coming into being.

	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_LOOKING_UP,0})
	[kslowd] ==> cachefiles_walk_to_object({ffff88003029d8b8},OBJ53,@68,)
	[kslowd] lookup '@68'
	[kslowd] next -> ffff88002ce41bd0 positive
	[kslowd] advance
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff8800369faac8 positive

The new object has looked up the subdir in which the file would be in (getting
dentry ffff88002ce41bd0) and then looked up the file itself (getting dentry
ffff8800369faac8).

	[kslowd] validate 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	[kslowd] <== cachefiles_bury_object() = 0

It then checks the file's xattrs to see if it's valid.  NFS says that the
auxiliary data indicate the file is out of date (obvious to us - that's why NFS
ditched the old version and got a new one).  CacheFiles then deletes the old
file (dentry ffff8800369faac8).

	[kslowd] redo lookup
	[kslowd] lookup 'Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA'
	[kslowd] next -> ffff88002cd94288 negative
	[kslowd] create -> ffff88002cd94288{ffff88002cdaf238{ino=148247}}

CacheFiles then redoes the lookup and gets a negative result in a new dentry
(ffff88002cd94288) which it then creates a file for.

	[kslowd] ==> cachefiles_mark_object_active(,OBJ53)
	[kslowd] <== cachefiles_mark_object_active() = 0
	[kslowd] === OBTAINED_OBJECT ===
	[kslowd] <== cachefiles_walk_to_object() = 0 [148247]
	[kslowd] <== fscache_object_state_machine() [->OBJECT_AVAILABLE]

The new object is then marked active and the state machine moves to the
available state - at which point NFS can start filling the object.

	[kslowd] ==> fscache_object_state_machine({OBJ52,OBJECT_RECYCLING,20})
	[kslowd] ==> fscache_release_object()
	[kslowd] ==> cachefiles_drop_object({OBJ52,2})
	[kslowd] ==> cachefiles_delete_object(,OBJ52{ffff8800369faac8})

The old object, meanwhile, goes on with being retired.  If allocation occurs
first, cachefiles_delete_object() has to wait for dir->d_inode->i_mutex to
become available before it can continue.

	[kslowd] ==> cachefiles_bury_object(,'@68','Es0g00og0_Nd_XCYe3BOzvXrsBLMlN6aw16M1htaA')
	[kslowd] remove ffff8800369faac8 from ffff88002ce41bd0
	[kslowd] unlink stale object
	EXT4-fs warning (device sda6): ext4_unlink: Inode number mismatch in unlink (148247!=148193)
	CacheFiles: I/O Error: Unlink failed
	FS-Cache: Cache cachefiles stopped due to I/O error

CacheFiles then tries to delete the file for the old object, but the dentry it
has (ffff8800369faac8) no longer points to a valid inode for that directory
entry, and so ext4_unlink() returns -EIO when de->inode does not match i_ino.

	[kslowd] <== cachefiles_bury_object() = -5
	[kslowd] <== cachefiles_delete_object() = -5
	[kslowd] <== fscache_object_state_machine() [->OBJECT_DEAD]
	[kslowd] ==> fscache_object_state_machine({OBJ53,OBJECT_AVAILABLE,0})
	[kslowd] <== fscache_object_state_machine() [->OBJECT_ACTIVE]

(Note that the above trace includes extra information beyond that produced by
the upstream code).

The fix is to note when an object that is being retired has had its object
deleted preemptively by a replacement object that is being created, and to
skip the second removal attempt in such a case.

Reported-by: Greg M <gregm@servu.net.au>
Reported-by: Mark Moseley <moseleymark@gmail.com>
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-11 10:07:53 -07:00
Sage Weil
9abf82b8bc ceph: fix locking for waking session requests after reconnect
The session->s_waiting list is protected by mdsc->mutex, not s_mutex.  This
was causing (rare) s_waiting list corruption.

Fix errors paths too, while we're here.  A more thorough cleanup of this
function is coming soon.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:57 -07:00
Sage Weil
d85b705663 ceph: resubmit requests on pg mapping change (not just primary change)
OSD requests need to be resubmitted on any pg mapping change, not just when
the pg primary changes.  Resending only when the primary changes results in
occasional 'hung' requests during osd cluster recovery or rebalancing.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:56 -07:00
Sage Weil
04d000eb35 ceph: fix open file counting on snapped inodes when mds returns no caps
It's possible the MDS will not issue caps on a snapped inode, in which case
an open request may not __ceph_get_fmode(), botching the open file
counting.  (This is actually a server bug, but the client shouldn't BUG out
in this case.)

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:55 -07:00
Sage Weil
0ceed5db32 ceph: unregister osd request on failure
The osd request wasn't being unregistered when the osd returned a failure
code, even though the result was returned to the caller.  This would cause
it to eventually time out, and then crash the kernel when it tried to
resend the request using a stale page vector.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11 09:53:18 -07:00
Changli Gao
a93d2f1744 sched, wait: Use wrapper functions
epoll should not touch flags in wait_queue_t. This patch introduces a new
function __add_wait_queue_exclusive(), for the users, who use wait queue as a
LIFO queue.

__add_wait_queue_tail_exclusive() is introduced too instead of
add_wait_queue_exclusive_locked(). remove_wait_queue_locked() is removed, as
it is a duplicate of __remove_wait_queue(), disliked by users, and with less
users.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: <containers@lists.linux-foundation.org>
LKML-Reference: <1273214006-2979-1-git-send-email-xiaosuo@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 17:43:58 +02:00
Suresh Jayaraman
fdb3603800 cifs: propagate cifs_new_fileinfo() error back to the caller
..otherwise memory allocation errors go undetected.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-11 15:42:21 +00:00
Suresh Jayaraman
fae683f764 cifs: add comments explaining cifs_new_fileinfo behavior
The comments make it clear the otherwise subtle behavior of cifs_new_fileinfo().

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
--
 fs/cifs/dir.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-10 17:59:51 +00:00
Ian Kent
f7422464b5 autofs4-2.6.34-rc1 - fix link_count usage
After commit 1f36f774b2 ("Switch !O_CREAT case to use of do_last()") in
2.6.34-rc1 autofs direct mounts stopped working.  This is caused by
current->link_count being 0 when ->follow_link() is called from
do_filp_open().

I can't work out why this hasn't been seen before Als patch series.

This patch removes the autofs dependence on current->link_count.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-10 09:48:10 -07:00
Suresh Jayaraman
51c8176472 cifs: remove unused parameter from cifs_posix_open_inode_helper()
..a left over from the commit 3321b791b2.

Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-10 13:52:00 +00:00
Linus Torvalds
9167746716 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix RCU issues in the NFSv4 delegation code
  NFSv4: Fix the locking in nfs_inode_reclaim_delegation()
2010-05-07 13:59:48 -07:00
Joern Engel
6f485b4187 logfs: handle powerfail on NAND flash
The write buffer may not have been written and may no longer be written
due to an interrupted write in the affected page.

Signed-off-by: Joern Engel <joern@logfs.org>
2010-05-07 19:38:40 +02:00
Dan Carpenter
ccf31c10f1 logfs: handle errors from get_mtd_device()
The get_mtd_device() function returns error pointers on failure and if we
don't handle it, it leads to a crash.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Joern Engel <joern@logfs.org>
2010-05-07 14:16:09 +02:00
Joern Engel
58e323cf5e logfs: remove unused variable
Signed-off-by: Joern Engel <joern@logfs.org>
2010-05-07 14:15:04 +02:00
Sage Weil
54ad023ba8 ceph: don't use writeback_control in writepages completion
The ->writepages writeback_control is not still valid in the writepages
completion.  We were touching it solely to adjust pages_skipped when there
was a writeback error (EIO, ENOSPC, EPERM due to bad osd credentials),
causing an oops in the writeback code shortly thereafter.  Updating
pages_skipped on error isn't correct anyway, so let's just rip out this
(clearly broken) code to pass the wbc to the completion.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-05 21:31:40 -07:00
Steve French
bdfae149c5 [CIFS] Remove unused cifs_oplock_cachep
CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-06 00:38:16 +00:00
Jeff Layton
26efa0bac9 cifs: have decode_negTokenInit set flags in server struct
...rather than the secType. This allows us to get rid of the MSKerberos
securityEnum. The client just makes a decision at upcall time.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-05 23:24:11 +00:00
Jeff Layton
198b568278 cifs: break negotiate protocol calls out of cifs_setup_session
So that we can reasonably set up the secType based on both the
NegotiateProtocol response and the parsed mount options.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-05 23:18:27 +00:00
Joern Engel
c0c79c31c9 logfs: fix sync
Rather self-explanatory.

Signed-off-by: Joern Engel <joern@logfs.org>
2010-05-05 22:33:36 +02:00
Joern Engel
bba0b5c2c2 logfs: fix compile failure
When CONFIG_BLOCK is not enabled:

fs/logfs/super.c:142: error: implicit declaration of function 'bdev_get_queue'
fs/logfs/super.c:142: error: invalid type argument of '->' (have 'int')

Found by Randy Dunlap <randy.dunlap@oracle.com>

Signed-off-by: Joern Engel <joern@logfs.org>
2010-05-05 22:32:52 +02:00
Linus Torvalds
7572e56314 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2: Avoid a gcc warning in ocfs2_wipe_inode().
  ocfs2: Avoid direct write if we fall back to buffered I/O
  ocfs2_dlmfs: Fix math error when reading LVB.
  ocfs2: Update VFS inode's id info after reflink.
  ocfs2: potential ERR_PTR dereference on error paths
  ocfs2: Add directory entry later in ocfs2_symlink() and ocfs2_mknod()
  ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_mknod error path
  ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_symlink error path
  ocfs2: add OCFS2_INODE_SKIP_ORPHAN_DIR flag and honor it in the inode wipe code
  ocfs2: Reset status if we want to restart file extension.
  ocfs2: Compute metaecc for superblocks during online resize.
  ocfs2: Check the owner of a lockres inside the spinlock
  ocfs2: one more warning fix in ocfs2_file_aio_write(), v2
  ocfs2_dlmfs: User DLM_* when decoding file open flags.
2010-05-04 16:33:18 -07:00