linux-stable/fs/btrfs
Filipe Manana 531a140e26 btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
commit a4527e1853 upstream.

When doing a direct IO read or write, we always return -ENOTBLK when we
find a compressed extent (or an inline extent) so that we fallback to
buffered IO. This however is not ideal in case we are in a NOWAIT context
(io_uring for example), because buffered IO can block and we currently
have no support for NOWAIT semantics for buffered IO, so if we need to
fallback to buffered IO we should first signal the caller that we may
need to block by returning -EAGAIN instead.

This behaviour can also result in short reads being returned to user
space, which although it's not incorrect and user space should be able
to deal with partial reads, it's somewhat surprising and even some popular
applications like QEMU (Link tag #1) and MariaDB (Link tag #2) don't
deal with short reads properly (or at all).

The short read case happens when we try to read from a range that has a
non-compressed and non-inline extent followed by a compressed extent.
After having read the first extent, when we find the compressed extent we
return -ENOTBLK from btrfs_dio_iomap_begin(), which results in iomap to
treat the request as a short read, returning 0 (success) and waiting for
previously submitted bios to complete (this happens at
fs/iomap/direct-io.c:__iomap_dio_rw()). After that, and while at
btrfs_file_read_iter(), we call filemap_read() to use buffered IO to
read the remaining data, and pass it the number of bytes we were able to
read with direct IO. Than at filemap_read() if we get a page fault error
when accessing the read buffer, we return a partial read instead of an
-EFAULT error, because the number of bytes previously read is greater
than zero.

So fix this by returning -EAGAIN for NOWAIT direct IO when we find a
compressed or an inline extent.

Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Link: https://lore.kernel.org/linux-btrfs/YrrFGO4A1jS0GI0G@atmark-techno.com/
Link: https://jira.mariadb.org/browse/MDEV-27900?focusedCommentId=216582&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-216582
Tested-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-21 21:24:13 +02:00
..
tests btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
acl.c overlayfs update for 5.15 2021-09-02 09:21:27 -07:00
async-thread.c btrfs: fix memory ordering between normal and ordered work functions 2021-11-25 09:48:46 +01:00
async-thread.h
backref.c btrfs: remove BUG_ON(!eie) in find_parent_nodes 2022-01-27 11:04:52 +01:00
backref.h btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
block-group.c btrfs: fix deadlock between chunk allocation and chunk btree modifications 2022-07-12 16:34:52 +02:00
block-group.h btrfs: fix deadlock between chunk allocation and chunk btree modifications 2022-07-12 16:34:52 +02:00
block-rsv.c btrfs: introduce mount option rescue=ignorebadroots 2020-12-08 15:53:41 +01:00
block-rsv.h
btrfs_inode.h btrfs: initial fsverity support 2021-08-23 13:19:09 +02:00
check-integrity.c btrfs: check-integrity: drop kmap/kunmap for block pages 2021-08-23 13:19:00 +02:00
check-integrity.h
compression.c btrfs: remove unused parameter nr_pages in add_ra_bio_pages() 2022-04-20 09:34:04 +02:00
compression.h btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00
ctree.c btrfs: fix invalid delayed ref after subvolume creation failure 2022-07-12 16:34:50 +02:00
ctree.h btrfs: zoned: use dedicated lock for data relocation 2022-07-12 16:35:05 +02:00
delalloc-space.c btrfs: free exchange changeset on failures 2021-12-14 10:57:13 +01:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: add ro compat flags to inodes 2021-08-23 13:19:09 +02:00
delayed-inode.h btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-ref.c btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
delayed-ref.h btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref 2022-07-12 16:34:50 +02:00
dev-replace.c btrfs: handle device lookup with btrfs_dev_lookup_args 2022-07-12 16:34:59 +02:00
dev-replace.h btrfs: zoned: mark block groups to copy for device-replace 2021-02-09 02:46:07 +01:00
dir-item.c btrfs: unify lookup return value when dir entry is missing 2021-10-07 22:06:32 +02:00
discard.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: zoned: use dedicated lock for data relocation 2022-07-12 16:35:05 +02:00
disk-io.h btrfs: split alloc_log_tree() 2021-02-09 02:46:07 +01:00
export.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
export.h
extent-io-tree.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent-tree.c btrfs: fix invalid delayed ref after subvolume creation failure 2022-07-12 16:34:50 +02:00
extent_io.c btrfs: zoned: encapsulate inode locking for zoned relocation 2022-07-12 16:35:05 +02:00
extent_io.h btrfs: fix qgroup reserve overflow the qgroup limit 2022-04-13 20:59:23 +02:00
extent_map.c btrfs: fix parameter description of btrfs_add_extent_mapping 2021-02-08 22:58:53 +01:00
extent_map.h
file-item.c btrfs: make search_csum_tree return 0 if we get -EFBIG 2022-04-08 14:23:58 +02:00
file.c btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref 2022-07-12 16:34:50 +02:00
free-space-cache.c btrfs: zoned: fix block group alloc_offset calculation 2021-08-23 13:19:11 +02:00
free-space-cache.h btrfs: zoned: track unusable bytes for zones 2021-02-09 02:46:03 +01:00
free-space-tree.c btrfs: fix invalid delayed ref after subvolume creation failure 2022-07-12 16:34:50 +02:00
free-space-tree.h
inode-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
inode.c btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents 2022-07-21 21:24:13 +02:00
ioctl.c btrfs: fix use of uninitialized variable at rm device ioctl 2022-07-12 16:35:11 +02:00
Kconfig btrfs: disable build on platforms having page size 256K 2021-06-22 14:11:57 +02:00
locking.c btrfs: don't set lock_owner when locking extent buffer for reading 2022-06-29 09:03:27 +02:00
locking.h btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
lzo.c btrfs: prevent copying too big compressed lzo segment 2022-03-02 11:48:07 +01:00
Makefile btrfs: initial fsverity support 2021-08-23 13:19:09 +02:00
misc.h btrfs: use correct header for div_u64 in misc.h 2021-09-07 14:29:50 +02:00
ordered-data.c btrfs: zoned: fix double counting of split ordered extent 2021-09-07 14:30:41 +02:00
ordered-data.h btrfs: remove uptodate parameter from btrfs_dec_test_first_ordered_pending 2021-08-23 13:19:02 +02:00
orphan.c
print-tree.c btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c btrfs: props: change how empty value is interpreted 2021-06-22 14:11:58 +02:00
props.h
qgroup.c btrfs: fix invalid delayed ref after subvolume creation failure 2022-07-12 16:34:50 +02:00
qgroup.h btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
raid56.c btrfs: constify and cleanup variables in comparators 2021-08-23 13:19:03 +02:00
raid56.h
rcu-string.h
reada.c btrfs: subpage: make readahead work properly 2021-03-16 11:06:21 +01:00
ref-verify.c btrfs: stop doing GFP_KERNEL memory allocations in the ref verify tool 2021-08-23 13:19:00 +02:00
ref-verify.h
reflink.c btrfs: fix unexpected error path when reflinking an inline extent 2022-04-08 14:23:11 +02:00
reflink.h
relocation.c btrfs: fix deadlock between chunk allocation and chunk btree modifications 2022-07-12 16:34:52 +02:00
root-tree.c btrfs: do not start relocation until in progress drops are done 2022-03-08 19:12:54 +01:00
scrub.c btrfs: handle device lookup with btrfs_dev_lookup_args 2022-07-12 16:34:59 +02:00
send.c btrfs: make send work with concurrent block group relocation 2022-03-16 14:23:46 +01:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: extend locking to all space_info members accesses 2022-04-08 14:23:02 +02:00
space-info.h btrfs: rip out btrfs_space_info::total_bytes_pinned 2021-06-22 14:55:25 +02:00
struct-funcs.c btrfs: add special case to setget helpers for 64k pages 2021-08-23 13:18:58 +02:00
subpage.c btrfs: subpage: fix a potential use-after-free in writeback helper 2021-08-23 13:19:05 +02:00
subpage.h btrfs: subpage: fix writeback which does not have ordered extent 2021-08-23 13:19:04 +02:00
super.c btrfs: add error messages to all unrecognized mount options 2022-06-29 09:03:19 +02:00
sysfs.c btrfs: sysfs: document structures and their associated files 2021-08-23 13:19:12 +02:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: make send work with concurrent block group relocation 2022-03-16 14:23:46 +01:00
transaction.h btrfs: do not start relocation until in progress drops are done 2022-03-08 19:12:54 +01:00
tree-checker.c btrfs: tree-checker: check item_size for dev_item 2022-03-02 11:47:48 +01:00
tree-checker.h
tree-defrag.c btrfs: locking: remove all the blocking helpers 2020-12-08 15:54:01 +01:00
tree-log.c btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref 2022-07-12 16:34:50 +02:00
tree-log.h btrfs: make fast fsyncs wait only for writeback 2020-10-07 12:06:56 +02:00
tree-mod-log.c btrfs: fix race when picking most recent mod log operation for an old root 2021-04-20 19:27:17 +02:00
tree-mod-log.h btrfs: add and use helper to get lowest sequence number for the tree mod log 2021-04-19 17:25:17 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unnecessary casts in printk 2020-12-08 15:53:52 +01:00
verity.c btrfs: fix transaction handle leak after verity rollback failure 2021-09-17 19:29:41 +02:00
volumes.c btrfs: don't access possibly stale fs_info data in device_list_add 2022-07-12 16:35:02 +02:00
volumes.h btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls 2022-07-12 16:35:00 +02:00
xattr.c btrfs: do not BUG_ON() on failure to update inode when setting xattr 2022-05-12 12:30:18 +02:00
xattr.h
zlib.c Revert "btrfs: compression: drop kmap/kunmap from zlib" 2021-10-29 13:03:05 +02:00
zoned.c btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk 2022-07-12 16:34:50 +02:00
zoned.h btrfs: zoned: use dedicated lock for data relocation 2022-07-12 16:35:05 +02:00
zstd.c Revert "btrfs: compression: drop kmap/kunmap from zstd" 2021-10-29 13:02:50 +02:00