linux-stable/fs
Lukas Czerner bfff68738f ext4: add support for lazy inode table initialization
When the lazy_itable_init extended option is passed to mke2fs, it
considerably speeds up filesystem creation because inode tables are
not zeroed out.  The fact that parts of the inode table are
uninitialized is not a problem so long as the block group descriptors,
which contain information regarding how much of the inode table has
been initialized, has not been corrupted However, if the block group
checksums are not valid, e2fsck must scan the entire inode table, and
the the old, uninitialized data could potentially cause e2fsck to
report false problems.

Hence, it is important for the inode tables to be initialized as soon
as possble.  This commit adds this feature so that mke2fs can safely
use the lazy inode table initialization feature to speed up formatting
file systems.

This is done via a new new kernel thread called ext4lazyinit, which is
created on demand and destroyed, when it is no longer needed.  There
is only one thread for all ext4 filesystems in the system. When the
first filesystem with inititable mount option is mounted, ext4lazyinit
thread is created, then the filesystem can register its request in the
request list.

This thread then walks through the list of requests picking up
scheduled requests and invoking ext4_init_inode_table(). Next schedule
time for the request is computed by multiplying the time it took to
zero out last inode table with wait multiplier, which can be set with
the (init_itable=n) mount option (default is 10).  We are doing
this so we do not take the whole I/O bandwidth. When the thread is no
longer necessary (request list is empty) it frees the appropriate
structures and exits (and can be created later later by another
filesystem).

We do not disturb regular inode allocations in any way, it just do not
care whether the inode table is, or is not zeroed. But when zeroing, we
have to skip used inodes, obviously. Also we should prevent new inode
allocations from the group, while zeroing is on the way. For that we
take write alloc_sem lock in ext4_init_inode_table() and read alloc_sem
in the ext4_claim_inode, so when we are unlucky and allocator hits the
group which is currently being zeroed, it just has to wait.

This can be suppresed using the mount option no_init_itable.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:05 -04:00
..
9p fs/9p: Don't use dotl version of mknod for dotu inode operations 2010-09-13 08:13:03 -05:00
adfs check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
affs AFFS: wait for sb synchronization when needed 2010-08-09 16:48:51 -04:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 2010-08-13 10:37:30 -07:00
autofs autofs/autofs4: Move compat_ioctl handling into fs 2010-08-09 00:13:34 +02:00
autofs4 autofs4: remove unneeded null check in try_to_fill_dentry() 2010-08-11 08:59:06 -07:00
befs fix typos concerning "initiali[zs]e" 2010-06-16 18:05:05 +02:00
bfs BFS: clean up the superblock usage 2010-08-09 16:48:53 -04:00
btrfs Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block 2010-08-10 15:22:42 -07:00
cachefiles Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
ceph ceph: select CRYPTO 2010-09-17 12:30:31 -07:00
cifs cifs: fix potential double put of TCP session reference 2010-09-14 23:21:03 +00:00
coda Coda: mount hangs because of missed REQ_WRITE rename 2010-09-19 11:03:09 -07:00
configfs fix setattr error handling in sysfs, configfs 2010-06-04 17:16:29 -04:00
cramfs cramfs: only unlock new inodes 2010-08-18 01:01:33 -04:00
debugfs Add x64 support to debugfs 2010-05-19 22:41:57 -04:00
devpts Simplify devpts_get_sb() failure exits 2010-05-21 18:31:12 -04:00
dlm fs/dlm: Drop unnecessary null test 2010-08-05 14:23:45 -05:00
ecryptfs eCryptfs: Fix encrypted file name lookup regression 2010-08-27 10:50:53 -05:00
efs
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2010-08-11 09:19:43 -07:00
exportfs
ext2 mbcache: Remove unused features 2010-08-09 16:48:45 -04:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ext4 ext4: add support for lazy inode table initialization 2010-10-27 21:30:05 -04:00
fat remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
freevxfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
fscache Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
fuse fuse: fix lock annotations 2010-09-07 13:42:41 +02:00
gfs2 GFS2: gfs2_logd should be using interruptible waits 2010-09-17 14:00:10 +01:00
hfs convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
hfsplus convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
hostfs hostfs ->follow_link() braino 2010-08-18 06:21:10 -04:00
hpfs switch hpfs to ->evict_inode() 2010-08-09 16:48:17 -04:00
hppfs switch hppfs to ->evict_inode() 2010-08-09 16:48:16 -04:00
hugetlbfs new helper: end_writeback() 2010-08-09 16:47:49 -04:00
isofs isofs: Fix lseek() to position beyond 4 GB 2010-08-11 00:29:47 -04:00
jbd remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
jbd2 jbd2: Add sanity check for attempts to start handle during umount 2010-10-27 21:30:04 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2010-08-10 11:49:21 -07:00
jfs jfs: don't allow os2 xattr namespace overlap with others 2010-08-10 15:33:09 -07:00
lockd
logfs logfs: kill BKL 2010-08-14 00:24:24 +02:00
minix minix: fix regression in minix_mkdir() 2010-09-09 18:57:25 -07:00
ncpfs Merge branch 'bkl/ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing 2010-08-10 13:58:28 -07:00
nfs SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies 2010-09-12 19:57:50 -04:00
nfs_common
nfsd SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies 2010-09-12 19:57:50 -04:00
nilfs2 nilfs2: fix leak of shadow dat inode in error path of load_nilfs 2010-08-30 10:18:03 +09:00
nls
notify fsnotify: drop two useless bools in the fnsotify main loop 2010-08-27 21:42:11 -04:00
ntfs convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
ocfs2 o2dlm: force free mles during dlm exit 2010-09-23 14:16:53 -07:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bcopeland/omfs 2010-08-10 11:47:36 -07:00
openpromfs
partitions [S390] partitions: fix build error in ibm partition detection code 2010-08-13 10:06:55 +02:00
proc /proc/pid/smaps: fix dirty pages accounting 2010-09-22 17:22:39 -07:00
qnx4 get rid of cont_write_begin_newtrunc 2010-08-09 16:47:31 -04:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ramfs check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
reiserfs remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
romfs
smbfs switch smbfs to evict_inode() 2010-08-09 16:48:00 -04:00
squashfs Squashfs: fix checkpatch.pl warnings 2010-08-08 22:29:33 +00:00
sysfs sysfs: checking for NULL instead of ERR_PTR 2010-09-03 17:26:28 -07:00
sysv fs/sysv/super.c: add support for non-PDP11 v7 filesystems 2010-08-11 08:59:23 -07:00
ubifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
udf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ufs remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
xfs xfs: log IO completion workqueue is a high priority queue 2010-09-10 10:16:54 -05:00
Kconfig fs/Kconfig: Fix typo Userpace -> Userspace 2010-07-20 17:30:22 +02:00
Kconfig.binfmt
Makefile Take statfs variants to fs/statfs.c 2010-05-21 18:31:17 -04:00
aio.c aio: do not return ERESTARTSYS as a result of AIO 2010-09-22 17:22:39 -07:00
anon_inodes.c Revert "anon_inode: set S_IFREG on the anon_inode" 2010-05-27 22:03:05 -04:00
attr.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
bad_inode.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
binfmt_aout.c
binfmt_elf.c
binfmt_elf_fdpic.c binfmt_elf_fdpic: Fix clear_user() error handling 2010-06-01 08:11:06 -07:00
binfmt_em86.c
binfmt_flat.c flat: tweak default stack alignment 2010-06-29 15:29:31 -07:00
binfmt_misc.c binfmt_misc: fix binfmt_misc priority 2010-09-09 18:57:24 -07:00
binfmt_script.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
binfmt_som.c
bio-integrity.c fs/bio-integrity.c: return -ENOMEM on kmalloc failure 2010-08-23 13:36:59 +02:00
bio.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
block_dev.c blkdev: cgroup whitelist permission fix 2010-08-11 08:59:18 -07:00
buffer.c remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
char_dev.c char: Mark /dev/zero and /dev/kmem as not capable of writeback 2010-09-22 09:48:47 +02:00
compat.c Prevent freeing uninitialized pointer in compat_do_readv_writev 2010-09-22 17:22:38 -07:00
compat_binfmt_elf.c
compat_ioctl.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
dcache.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
dcookies.c
direct-io.c O_DIRECT: fix the splitting up of contiguous I/O 2010-09-09 18:57:22 -07:00
drop_caches.c simplify checks for I_CLEAR/I_FREEING 2010-08-09 16:47:44 -04:00
eventfd.c
eventpoll.c
exec.c execve: make responsive to SIGKILL with large arguments 2010-09-10 08:10:26 -07:00
fcntl.c vfs: take O_NONBLOCK out of the O_* uniqueness test 2010-09-09 18:57:25 -07:00
fifo.c
file.c vfs: use kmalloc() to allocate fdmem if possible 2010-08-11 08:59:02 -07:00
file_table.c fs: scale files_lock 2010-08-18 08:35:48 -04:00
filesystems.c
fs-writeback.c bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends 2010-09-22 09:48:47 +02:00
fs_struct.c fs: fs_struct rwlock to spinlock 2010-08-18 08:35:46 -04:00
generic_acl.c vfs: update ctime when changing the file's permission by setfacl 2010-08-18 01:04:22 -04:00
inode.c Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify 2010-08-10 11:39:13 -07:00
internal.h fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
ioctl.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
ioprio.c
libfs.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
locks.c
mbcache.c mbcache: Limit the maximum number of cache entries 2010-08-18 06:24:41 -04:00
mpage.c
namei.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
namespace.c VFS: Sanity check mount flags passed to change_mnt_propagation() 2010-09-07 13:46:20 -07:00
nfsctl.c
no-block.c
open.c fs: cleanup files_lock locking 2010-08-18 08:35:47 -04:00
pipe.c pipe: fix check in "set size" fcntl 2010-06-10 19:08:34 +02:00
pnode.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
pnode.h
posix_acl.c
read_write.c fsnotify: pass a file instead of an inode to open, read, and write 2010-07-28 09:58:32 -04:00
read_write.h
readdir.c vfs: fix warning: 'dirent' is used uninitialized in this function 2010-08-09 20:45:05 -07:00
select.c
seq_file.c
signalfd.c signalfd: fill in ssi_int for posix timers and message queues 2010-08-11 08:59:20 -07:00
splice.c splice: fix misuse of SPLICE_F_NONBLOCK 2010-08-07 18:52:56 +02:00
stack.c
stat.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
statfs.c add f_flags to struct statfs(64) 2010-08-09 16:48:44 -04:00
super.c fs: scale files_lock 2010-08-18 08:35:48 -04:00
sync.c get rid of file_fsync() 2010-08-09 16:47:43 -04:00
timerfd.c fs/timerfd.c: make use of wait_event_interruptible_locked_irq() 2010-05-20 13:21:42 -07:00
utimes.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
xattr.c fs: xattr_handler table should be const 2010-05-21 18:31:18 -04:00
xattr_acl.c