Commit graph

227 commits

Author SHA1 Message Date
Kent Overstreet
a023127a28 bcachefs: Eliminate function calls in DIO fastpaths
We can assume that usually buffered and O_DIRECT IO won't be mixed, and
the calls to flush the page cache won't be needed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:31 -04:00
Kent Overstreet
54847d253a bcachefs: DIO write path only needs to shoot down pagecache once, not twice
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:31 -04:00
Kent Overstreet
1b783a690d bcachefs: Add pagecache_add lock to buffered IO path, fault path
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:31 -04:00
Kent Overstreet
7edcfbfefe bcachefs: Don't hold inode lock longer than necessary in dio write path
In theory we should be able to do (non appending/extending) dio writes
without taking the inode lock at all - but this gets us most of the way
there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:31 -04:00
Kent Overstreet
f8f3086338 bcachefs: Avoid atomics in write fast path
This adds some horrible hacks, but the atomic ops for closures were
getting to be a pretty expensive part of the write path. We don't want
to rip out closures entirely from the write path, because they're used
for e.g. waiting on the allocator, or waiting on the journal flush, and
that stuff would get really ugly without closures.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:31 -04:00
Kent Overstreet
406d6d5a07 bcachefs: Fix an error path race
On IO error, bch2_writepages_io_done() will set the page state to
indicate nothing's already reserved (since the write didn't happen, we
don't know what's already reserved). This can race with the buffered IO
path, in between getting a disk reservation and calling
bch2_set_page_dirty().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:30 -04:00
Kent Overstreet
2a9101a989 bcachefs: Refactor bch2_trans_commit() path
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:30 -04:00
Kent Overstreet
a94407434b bcachefs: Limit bios in writepages path to 256M
This works around a bug where bio_full() doesn't check for
bio->bi_iter.bi_size overflowing - and, we don't really want to build
bios that are that big anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:30 -04:00
Kent Overstreet
9a3df993e1 bcachefs: Kill bchfs_extent_update()
The generic IO path now handles inode updates for i_size and i_sectors -
this means we can drop a fair amount of code from fs-io.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
2e87eae1fb bcachefs: Convert bch2_fpunch to bch2_extent_update()
As before - we're moving non Linux specific code out of fs-io.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
2925fc49b3 bcachefs: Split out bchfs_extent_update()
The next few patches are going to be more moving the logic around
i_size/i_sectors updates to io.c, and better separating the Linux VFS
specific code from core bcachefs code, to better support the fuse port.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
e0541a9346 bcachefs: Kill some dependencies on ei_inode
Moving bch2_extent_update() to io.c will be greatly simplified if we
no longer have to keep ei_inode.bi_size/bi_sectors up to date.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
daf3fe502a bcachefs: Check if extending inode differently
In bch2_extent_update(), we have to update the inode if i_size is
changing (the file is being extend) or if i_sectors is changing, but we
want to avoid touching the inode if it's not necessary.

Change sum_sector_overwrites() to also check if there's already data
above where we're writing to - this means we're definitely not extending
the file.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
3826ee0b17 bcachefs: Add a lock to bch_page_state
We can't use the page lock to protect it, because on writeback IO error
we need to access the page state before calling end_page_writeback() and
the page lock semantics are completely insane so that deadlocks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
137b0ed907 bcachefs: bch2_extent_atomic_end() now traverses iter
This fixes a bug in io.c bch2_write_index_default() - it was missing the
traverse call, but bch2_extent_atomic_end returns an error now and can
just call it itself.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
58677a1d40 bcachefs: bch2_inode_peek()/bch2_inode_write()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
8de819f834 bcachefs: Fix __bch2_buffered_write() returning -ENOMEM
When grab_cache_page_write_begin() fails but we did pin some pages, we
shouldn't return -ENOMEM, we should do a partial write.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
64bc001153 bcachefs: Rework btree iterator lifetimes
The btree_trans struct needs to memoize/cache btree iterators, so that
on transaction restart we don't have to completely redo btree lookups,
and so that we can do them all at once in the correct order when the
transaction had to restart to avoid a deadlock.

This switches the btree iterator lookups to work based on iterator
position, instead of trying to match them up based on the stack trace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
a7199432c3 bcachefs: Kill deferred btree updates
Will be replaced by cached btree iterators

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
877dfb348d bcachefs: Fix for partial buffered writes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
bbd8d2038b bcachefs: BTREE_ITER_SLOTS isn't a type of btree iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
d55460bb09 bcachefs: Trivial cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
fb472ac528 bcachefs: Convert a BUG_ON() to a warning
We shouldn't ever be writing past i_size - but, apparently there's still
a bug to track down.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
0a426c3239 bcachefs: Handle bio_iov_iter_get_pages() returning unaligned bio
If the user buffer isn't aligned to the filesystem block size, on a
large enough IO - where it won't fit into a single bio -
bio_iov_iter_get_pages() won't necessarily return a bio with the proper
alignment.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
5f786787ad bcachefs: Add support for FALLOC_FL_INSERT_RANGE
Somewhat tricky and ugly, because iterating over extents backwards is a
pain.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
6cc3535dcb bcachefs: Don't write past eof
When converting from PAGE_SIZE to block_size, the .mkwrite path was
missed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
6309589468 bcachefs: Improved bch2_fcollapse()
Move extents instead of copying them - this way, we can iterate over
only live extents, not the entire keyspace. Also, this means we can
mostly skip running triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:26 -04:00
Kent Overstreet
3fb5ebcdd4 bcachefs: Inline some fast paths
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
4b0a66d508 bcachefs: Check alignment in write path
Also - fix alignment in bch2_set_page_dirty()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
76426098e4 bcachefs: Reflink
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
3c7f3b7aeb bcachefs: Refactor bch2_extent_trim_atomic() for reflink
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
b3fce09cd3 bcachefs: Mark space as unallocated on write failure
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
a99b1caf47 bcachefs: Truncate/fpunch now works on block boundaries, not page
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
2ba5d38b50 bcachefs: Count reserved extents as holes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
543ef2ebcd bcachefs: Handle partial pages in seek data/hole
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
d1542e0362 bcachefs: Change buffered write path to write to partial pages
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
7f5e31e1a4 bcachefs: Change __bch2_writepage() to not write to holes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
e10d309471 bcachefs: Fix bch2_seek_data()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
99aaf57000 bcachefs: Refactor various code to not be extent specific
With reflink, various code now has to handle both KEY_TYPE_extent
or KEY_TYPE_reflink_v - so, convert it to be generic across all keys
with pointers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
b17657d0cf bcachefs: Dont't call bch2_trans_begin_updates() in bch2_extent_update()
Prep work for reflink - for reflink, we're going to be using
bch2_extent_update() with other updates in the same transaction.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
06ed855862 bcachefs: Add offset_into_extent param to bch2_read_extent()
With reflink, we'll no longer be able to calculate the offset of the
data we want into the extent we're reading from from the extent pos and
the iter pos - we'll have to pass it in separately.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
f57a6a5d41 bcachefs: Track dirtyness at sector level, not page
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
adfcfaf068 bcachefs: Kill page_state_cmpxchg
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
e1036a2a71 bcachefs: Always touch page state with page locked
This will mean we don't have to use cmpxchg for modifying page state,
which will simplify a fair amount of code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:23 -04:00
Kent Overstreet
885678f68d bcachefs: Kill direct access to bi_io_vec
Switch to always using bio_add_page(), which merges contiguous pages now
that we have multipage bvecs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:23 -04:00
Kent Overstreet
20bceecb31 bcachefs: More work to avoid transaction restarts
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:22 -04:00
Kent Overstreet
58fbf80834 bcachefs: Delete duplicate code
Also rename for consistency

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:22 -04:00
Kent Overstreet
75812e70d9 bcachefs: Fix fsync error reporting
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:21 -04:00
Kent Overstreet
94f651e2c7 bcachefs: Return errors from for_each_btree_key()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:20 -04:00
Kent Overstreet
a6d90385e6 bcachefs: (invalidate|release)_folio fixes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:19 -04:00
Kent Overstreet
0f23836771 bcachefs: trans_for_each_iter()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:18 -04:00
Kent Overstreet
424eb88130 bcachefs: Only get btree iters from btree transactions
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:18 -04:00
Kent Overstreet
61f321fc8b bcachefs: Make deferred inode updates a mount option
Journal reclaim may still need performance tuning

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:17 -04:00
Kent Overstreet
446c562c2c bcachefs: Remove direct use of bch2_btree_iter_link()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:17 -04:00
Kent Overstreet
5154704b29 bcachefs: Use deferred btree updates for inode updates
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:17 -04:00
Kent Overstreet
7ef2a73a58 bcachefs: Fix check for if extent update is allocating
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:14 -04:00
Kent Overstreet
919dbbd18b bcachefs: dio arithmetic improvements
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:14 -04:00
Kent Overstreet
ed4840308c bcachefs: Fix a dio bug
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:14 -04:00
Kent Overstreet
26609b619f bcachefs: Make bkey types globally unique
this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:12 -04:00
Kent Overstreet
01a0108f01 bcachefs: Fix a btree iter usage error
previously, if the code traversed to the next btree node, that could
return an error (due to lock restarts) - which was not being checked
for.

fix is to rework it so it never iterates past the current leaf node, and
pops an assertion if it ever sees an error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:12 -04:00
Kent Overstreet
f81b648d1f bcachefs: Clean up, possixly fix page disk reservation accounting
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:11 -04:00
Kent Overstreet
b1ba2359fb bcachefs: Fix an error path
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:11 -04:00
Kent Overstreet
1742237ba1 bcachefs: extent_for_each_ptr_decode()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:10 -04:00
Kent Overstreet
7b3f84ea7d bcachefs: Split out alloc_background.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:10 -04:00
Kent Overstreet
fc3268c13c bcachefs: kill extent_insert_hook
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
190fa7af39 bcachefs: kill i_sectors_hook
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
8ef231bd51 bcachefs: convert fcollapse to bch2_extent_update()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
5f461e01b8 bcachefs: convert fpunch to bch2_extent_update()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
54e2264e17 bcachefs: convert truncate to bch2_extent_update()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
08af47dfc2 bcachefs: convert bchfs_write_index_update() to bch2_extent_update()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
e2d9912c6f bcachefs: bch2_extent_trim_atomic()
Prep work for extents insert hook removal

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:09 -04:00
Kent Overstreet
bb1b3658aa bcachefs: minor fsync fix
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:08 -04:00
Kent Overstreet
658971f276 bcachefs: fix mtime/ctime update on truncate
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:08 -04:00
Kent Overstreet
2ea9004864 bcachefs: Fix mtime/ctime updates
Also make inode flags consistent with how the rest of the inode is
updated

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00
Kent Overstreet
4e1ec2cc0d bcachefs: Simplify bch2_write_inode_trans, fix lockdep splat
ei_update_lock isn't currently needed for write inode (but it will be
needed again when deferred btree updates are used for inode updates)

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00
Kent Overstreet
d69f41d6bb bcachefs: Convert raw uses of bch2_btree_iter_link() to new transactions
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00
Kent Overstreet
1c6fdbd8f2 bcachefs: Initial commit
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00