linux-stable/fs
Ethan Lien 3cd24c6980 btrfs: use tagged writepage to mitigate livelock of snapshot
Snapshot is expected to be fast. But if there are writers steadily
creating dirty pages in our subvolume, the snapshot may take a very long
time to complete. To fix the problem, we use tagged writepage for
snapshot flusher as we do in the generic write_cache_pages(), so we can
omit pages dirtied after the snapshot command.

This does not change the semantics regarding which data get to the
snapshot, if there are pages being dirtied during the snapshotting
operation.  There's a sync called before snapshot is taken in old/new
case, any IO in flight just after that may be in the snapshot but this
depends on other system effects that might still sync the IO.

We do a simple snapshot speed test on a Intel D-1531 box:

fio --ioengine=libaio --iodepth=32 --bs=4k --rw=write --size=64G
--direct=0 --thread=1 --numjobs=1 --time_based --runtime=120
--filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5;
time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio

original: 1m58sec
patched:  6.54sec

This is the best case for this patch since for a sequential write case,
we omit nearly all pages dirtied after the snapshot command.

For a multi writers, random write test:

fio --ioengine=libaio --iodepth=32 --bs=4k --rw=randwrite --size=64G
--direct=0 --thread=1 --numjobs=4 --time_based --runtime=120
--filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5;
time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio

original: 15.83sec
patched:  10.35sec

The improvement is smaller compared to the sequential write case,
since we omit only half of the pages dirtied after snapshot command.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Ethan Lien <ethanlien@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17 14:51:33 +01:00
..
9p Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
adfs adfs: use timespec64 for time conversion 2018-08-22 10:52:51 -07:00
affs
afs Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-30 10:47:50 -08:00
autofs
befs fix a series of Documentation/ broken file name references 2018-06-15 18:10:01 -03:00
bfs bfs: add sanity check at bfs_fill_super() 2018-11-03 10:09:38 -07:00
btrfs btrfs: use tagged writepage to mitigate livelock of snapshot 2018-12-17 14:51:33 +01:00
cachefiles fscache, cachefiles: remove redundant variable 'cache' 2018-11-30 16:00:58 +00:00
ceph ceph: make 'nocopyfrom' a default mount option 2018-12-11 18:22:17 +01:00
cifs CIFS: Avoid returning EBUSY to upper layer VFS 2018-12-07 00:59:23 -06:00
coda
configfs configfs: fix registered group removal 2018-07-17 06:14:07 -07:00
cramfs Make the Cramfs code more robust against filesystem corruptions, 2018-10-30 12:46:25 -07:00
crypto crypto: speck - remove Speck 2018-09-04 11:35:03 +08:00
debugfs Revert "debugfs: inode: debugfs_create_dir uses mode permission from parent" 2018-06-12 20:52:16 -07:00
devpts
dlm
ecryptfs ecryptfs_rename(): verify that lower dentries are still OK after lock_rename() 2018-10-09 23:33:17 -04:00
efivarfs
efs
exofs
exportfs exportfs: do not read dentry after free 2018-11-23 09:08:17 -05:00
ext2 ext2: fix potential use after free 2018-11-27 10:21:15 +01:00
ext4 A large number of ext4 bug fixes, mostly buffer and memory leaks on 2018-11-11 16:53:02 -06:00
f2fs Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
fat fat: truncate inode timestamp updates in setattr 2018-10-31 08:54:14 -07:00
freevxfs
fscache fscache: fix race between enablement and dropping of object 2018-11-30 15:57:31 +00:00
fuse fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS 2018-12-11 21:47:28 +01:00
gfs2 Merge tag 'gfs2-4.20.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 2018-11-16 11:38:14 -06:00
hfs hfs: do not free node before using 2018-11-30 14:56:14 -08:00
hfsplus hfsplus: do not free node before using 2018-11-30 14:56:14 -08:00
hostfs
hpfs
hugetlbfs mm: zero out the vma in vma_init() 2018-08-22 10:52:44 -07:00
isofs
jbd2 jbd2: fix use after free in jbd2_log_do_checkpoint() 2018-10-05 18:44:40 -04:00
jffs2 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-10-24 11:22:39 +01:00
jfs
kernfs mm: zero-seek shrinkers 2018-10-26 16:26:33 -07:00
lockd lockd: fix access beyond unterminated strings in prints 2018-10-29 16:58:04 -04:00
minix minix_lookup: use d_splice_alias() 2018-05-22 14:27:52 -04:00
nfs nfs: don't dirty kernel pages read by direct-io 2018-12-02 09:43:56 -05:00
nfs_common
nfsd nfsd: COPY and CLONE operations require the saved filehandle to be set 2018-11-08 12:11:45 -05:00
nilfs2 nilfs2: Use xa_erase_irq 2018-11-05 14:57:05 -05:00
nls
notify fanotify: fix handling of events on child sub-directory 2018-11-08 15:43:48 +01:00
ntfs
ocfs2 ocfs2: fix potential use after free 2018-11-30 14:56:15 -08:00
omfs
openpromfs
orangefs Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
overlayfs Revert "ovl: relax permission checking on underlying layers" 2018-12-04 11:31:30 +01:00
proc New gcc plugin: stackleak 2018-11-01 11:46:27 -07:00
pstore pstore fix: 2018-11-30 09:03:15 -08:00
qnx4
qnx6 qnx6_lookup: switch to d_splice_alias() 2018-05-22 14:27:54 -04:00
quota fs/quota: Fix spectre gadget in do_quotactl 2018-08-22 18:17:48 +02:00
ramfs
reiserfs reiserfs: remove workaround code for GCC 3.x 2018-10-31 08:54:14 -07:00
romfs
squashfs Squashfs: Compute expected length from inode size rather than block length 2018-08-02 09:34:02 -07:00
sysfs
sysv sysv: return 'err' instead of 0 in __sysv_write_inode 2018-11-10 08:02:40 -05:00
tracefs
ubifs
udf udf: Allow mounting volumes with incorrect identification strings 2018-11-19 10:27:59 +01:00
ufs
xfs xfs: fix inverted return from xfs_btree_sblock_verify_crc 2018-12-04 08:50:49 -08:00
aio.c for-linus-20181214 2018-12-14 12:18:30 -08:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c
buffer.c for-linus-20181102 2018-11-02 11:25:48 -07:00
char_dev.c
compat.c
compat_binfmt_elf.c
compat_ioctl.c media updates for v4.20-rc1 2018-10-29 14:29:58 -07:00
coredump.c
d_path.c
dax.c dax: Fix unlock mismatch with updated API 2018-12-04 21:32:00 -08:00
dcache.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
dcookies.c
direct-io.c fs: fix lost error code in dio_complete 2018-11-30 08:35:14 -07:00
drop_caches.c
eventfd.c
eventpoll.c
exec.c Revert "exec: make de_thread() freezable" 2018-12-04 16:04:20 +01:00
fcntl.c
fhandle.c
file.c fs: add ksys_close() wrapper; remove in-kernel calls to sys_close() 2018-04-02 20:16:00 +02:00
file_table.c
filesystems.c
fs-writeback.c
fs_pin.c
fs_struct.c
inode.c mm: don't reclaim inodes with many attached pages 2018-11-18 10:15:09 -08:00
internal.h
ioctl.c vfs: rework data cloning infrastructure 2018-11-02 09:33:08 -07:00
iomap.c fs/iomap.c: get/put the page in iomap_page_create/release() 2018-12-14 15:05:45 -08:00
Kconfig
Kconfig.binfmt
libfs.c
locks.c
Makefile autofs: remove left-over autofs4 stubs 2018-06-11 08:22:34 -07:00
mbcache.c
mount.h
mpage.c
namei.c
namespace.c mnt: fix __detach_mounts infinite loop 2018-11-12 01:02:34 -06:00
no-block.c
nsfs.c
open.c
pipe.c Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 19:58:36 -07:00
pnode.c mnt: Make propagate_umount less slow for overlapping mount propagation trees 2017-05-23 08:41:17 -05:00
pnode.h
posix_acl.c
proc_namespace.c
read_write.c vfs: allow some remap flags to be passed to vfs_clone_file_range 2018-12-04 08:50:49 -08:00
readdir.c
select.c y2038: globally rename compat_time to old_time32 2018-08-27 14:48:48 +02:00
seq_file.c
signalfd.c signal: Distinguish between kernel_siginfo and siginfo 2018-10-03 16:47:43 +02:00
splice.c splice: don't read more than available pipe space 2018-12-04 08:50:49 -08:00
stack.c fs: fix comment for 'CONFIG_LBADF' 2014-08-26 09:35:56 +02:00
stat.c
statfs.c
super.c
sync.c
timerfd.c
userfaultfd.c userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered 2018-12-14 15:05:45 -08:00
utimes.c
xattr.c