linux-stable/fs
Dave Chinner 1316d4da3f xfs: unpin stale inodes directly in IOP_COMMITTED
When inodes are marked stale in a transaction, they are treated
specially when the inode log item is being inserted into the AIL.
It tries to avoid moving the log item forward in the AIL due to a
race condition with the writing the underlying buffer back to disk.
The was "fixed" in commit de25c18 ("xfs: avoid moving stale inodes
in the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing a AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-06 15:44:40 -05:00
..
9p 9p: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:53 -04:00
adfs Fix common misspellings 2011-03-31 11:26:23 -03:00
affs affs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:53 -04:00
afs afs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:53 -04:00
autofs4 autofs4: bogus dentry_unhash() added in ->unlink() 2011-05-30 01:50:53 -04:00
befs Fix common misspellings 2011-03-31 11:26:23 -03:00
bfs bfs: remove unnecessary dentry_unhash on dir rename 2011-05-28 01:02:50 -04:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2011-06-05 06:17:23 +09:00
cachefiles Fix common misspellings 2011-03-31 11:26:23 -03:00
ceph ceph: remove unnecessary dentry_unhash calls 2011-05-26 07:26:53 -04:00
cifs cifs/ubifs: Fix shrinker API change fallout 2011-05-29 11:17:34 -07:00
coda coda: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:53 -04:00
configfs configfs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:54 -04:00
cramfs cramfs: generate unique inode number for better inode cache usage 2011-01-13 08:03:23 -08:00
debugfs debugfs: move to new strtobool 2011-05-19 16:55:28 +09:30
devpts fs/devpts/inode.c: correctly check d_alloc_name() return code in devpts_pty_new() 2011-03-22 17:44:17 -07:00
dlm Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 2011-05-26 13:19:00 -07:00
ecryptfs eCryptfs: Remove ecryptfs_header_cache_2 2011-05-29 14:24:25 -05:00
efs block: remove per-queue plugging 2011-03-10 08:52:07 +01:00
exofs exofs: remove unnecessary dentry_unhash on rmdir/rename_dir 2011-05-26 07:26:57 -04:00
exportfs vfs: Add open by file handle support 2011-03-15 02:21:44 -04:00
ext2 ext2: remove unnecessary dentry_unhash on rmdir/rename_dir 2011-05-26 07:26:56 -04:00
ext3 fs: pass exact type of data dirties to ->dirty_inode 2011-05-27 07:04:40 -04:00
ext4 fs: pass exact type of data dirties to ->dirty_inode 2011-05-27 07:04:40 -04:00
fat fat: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:54 -04:00
freevxfs treewide: fix a few typos in comments 2011-05-10 10:16:21 +02:00
fscache fscache: remove dead code under CONFIG_WORKQUEUE_DEBUGFS 2011-05-25 08:39:44 -07:00
fuse fuse: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:53 -04:00
gfs2 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 2011-05-26 13:19:00 -07:00
hfs hfs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:52 -04:00
hfsplus hfsplus: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:52 -04:00
hostfs hostfs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:52 -04:00
hpfs hpfs: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:54 -04:00
hppfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
hugetlbfs mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
isofs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
jbd jbd: Fix comment to match the code in journal_start() 2011-05-24 00:27:53 +02:00
jbd2 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-05-26 09:53:20 -07:00
jffs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-05-28 13:03:41 -07:00
jfs jfs: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:51 -04:00
lockd NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!" 2011-01-25 15:24:47 -05:00
logfs logfs: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:51 -04:00
minix minix: remove unnecessary dentry_unhash on rmdir, dir rename 2011-05-28 01:02:54 -04:00
ncpfs ncpfs: fix rename over directory with dangling references 2011-05-28 01:02:53 -04:00
nfs Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd 2011-05-29 14:10:13 -07:00
nfs_common Fix common misspellings 2011-03-31 11:26:23 -03:00
nfsd Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux 2011-05-29 11:21:12 -07:00
nilfs2 nilfs2: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:51 -04:00
nls
notify Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 2011-04-07 11:14:49 -07:00
ntfs Fix common misspellings 2011-03-31 11:26:23 -03:00
ocfs2 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 2011-05-27 10:23:10 -07:00
omfs omfs: remove unnecessary dentry_unhash on rmdir, dir rneame 2011-05-28 01:02:52 -04:00
openpromfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
partitions Revert "block: Remove extra discard_alignment from hd_struct." 2011-05-30 07:42:51 +02:00
proc Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2011-05-29 11:29:28 -07:00
pstore pstore: fix pstore filesystem mount/remount issue 2011-05-16 11:05:00 -07:00
qnx4 block: remove per-queue plugging 2011-03-10 08:52:07 +01:00
quota vmscan: change shrinker API by passing shrink_control struct 2011-05-25 08:39:26 -07:00
ramfs ramfs: fix memleak on no-mmu arch 2011-04-14 16:06:56 -07:00
reiserfs reiserfs: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:51 -04:00
romfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
squashfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus 2011-05-29 11:19:45 -07:00
sysfs sysfs: remove "last sysfs file:" line from the oops messages 2011-05-13 16:05:51 -07:00
sysv sysv: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:50 -04:00
ubifs UBIFS: fix-up free space earlier 2011-06-03 18:12:31 +03:00
udf udf: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:52 -04:00
ufs ufs: remove unnecessary dentry_unhash from rmdir, dir rename 2011-05-28 01:02:51 -04:00
xfs xfs: unpin stale inodes directly in IOP_COMMITTED 2011-07-06 15:44:40 -05:00
aio.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
anon_inodes.c sanitize vfsmount refcounting changes 2011-01-16 13:47:07 -05:00
attr.c Cache xattr security drop check for write v2 2011-05-28 12:02:09 -04:00
bad_inode.c fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
binfmt_aout.c
binfmt_elf.c brk: COMPAT_BRK: fix detection of randomized brk 2011-04-14 16:06:55 -07:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c CRED: Fix load_flat_shared_library() to initialise bprm correctly 2011-05-03 10:10:51 +10:00
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c block: Require subsystems to explicitly allocate bio_set integrity mempool 2011-03-17 11:11:05 +01:00
bio.c block: improve the bio_add_page() and bio_add_pc_page() descriptions 2011-05-28 14:44:46 +02:00
block_dev.c block: blkdev_get() should access ->bd_disk only after success 2011-06-01 08:28:47 +02:00
buffer.c fs: block_page_mkwrite should wait for writeback to finish 2011-05-28 01:03:21 -04:00
char_dev.c Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block 2011-01-13 10:45:01 -08:00
compat.c exec: unify do_execve/compat_do_execve code 2011-04-09 15:53:56 +02:00
compat_binfmt_elf.c
compat_ioctl.c Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 2011-01-07 14:39:20 -08:00
dcache.c vmscan: change shrinker API by passing shrink_control struct 2011-05-25 08:39:26 -07:00
dcookies.c
direct-io.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
drop_caches.c vmscan: change shrinker API by passing shrink_control struct 2011-05-25 08:39:26 -07:00
eventfd.c Docbook: add fs/eventfd.c and fix typos in it 2011-02-21 15:07:04 -08:00
eventpoll.c Fix common misspellings 2011-03-31 11:26:23 -03:00
exec.c coredump: add support for exe_file in core name 2011-05-26 17:12:36 -07:00
fcntl.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
fhandle.c fs/fhandle.c: add <linux/personality.h> for ia64 2011-04-14 16:06:56 -07:00
fifo.c Filesystem: fifo: Fixed coding style issue. 2011-03-21 00:16:09 -04:00
file.c vfs: avoid large kmalloc()s for the fdtable 2011-04-28 11:28:20 -07:00
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-03-16 13:26:17 -07:00
filesystems.c fs: synchronize_rcu when unregister_filesystem success not failure 2011-04-17 10:42:01 -07:00
fs-writeback.c fs: pass exact type of data dirties to ->dirty_inode 2011-05-27 07:04:40 -04:00
fs_struct.c sanitize vfsmount refcounting changes 2011-01-16 13:47:07 -05:00
generic_acl.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
inode.c fs: cosmetic inode.c cleanups 2011-05-27 09:43:00 -04:00
internal.h fs: move i_wb_list out from under inode_lock 2011-03-24 21:17:51 -04:00
ioctl.c vfs: cleanup do_vfs_ioctl() 2011-03-21 00:16:08 -04:00
ioprio.c
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-05-26 09:52:14 -07:00
Kconfig.binfmt
libfs.c libfs: drop unneeded dentry_unhash 2011-05-26 07:26:50 -04:00
locks.c Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux 2011-03-24 08:20:39 -07:00
Makefile Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2011-03-16 19:01:29 -07:00
mbcache.c vmscan: change shrinker API by passing shrink_control struct 2011-05-25 08:39:26 -07:00
mpage.c mm/fs: add hooks to support cleancache 2011-05-26 10:01:43 -06:00
namei.c vfs: shrink_dcache_parent before rmdir, dir rename 2011-05-30 01:48:27 -04:00
namespace.c fs/namespace.c: bound mount propagation fix 2011-05-26 07:26:44 -04:00
nfsctl.c open-style analog of vfs_path_lookup() 2011-03-14 09:15:28 -04:00
no-block.c
open.c fs: Use BUG_ON(!mnt) at dentry_open(). 2011-03-21 01:10:41 -04:00
pipe.c Fix broken "pipe: use event aware wakeups" optimization 2011-01-20 16:21:59 -08:00
pnode.c fs: scale mntget/mntput 2011-01-07 17:50:33 +11:00
pnode.h
posix_acl.c NFS: Prevent memory allocation failure in nfsacl_encode() 2011-01-25 15:24:47 -05:00
read_write.c fix signedness mess in rw_verify_area() on 64bit architectures 2011-01-12 20:06:58 -05:00
read_write.h
readdir.c
select.c select: remove unused MAX_SELECT_SECONDS 2011-03-21 00:16:08 -04:00
seq_file.c
signalfd.c
splice.c splice: add wakeup_pipe_readers() 2011-05-23 19:58:53 +02:00
stack.c
stat.c readlinkat(), fchownat() and fstatat() with empty relative pathnames 2011-03-15 02:21:45 -04:00
statfs.c clean statfs-like syscalls up 2011-03-14 09:15:28 -04:00
super.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem 2011-05-26 10:50:56 -07:00
sync.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
timerfd.c timerfd: Manage cancelable timers in timerfd 2011-05-23 13:59:53 +02:00
utimes.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
xattr.c Cache xattr security drop check for write v2 2011-05-28 12:02:09 -04:00
xattr_acl.c