summaryrefslogtreecommitdiff
path: root/fs/bcachefs/alloc_background.c
AgeCommit message (Collapse)AuthorFilesLines
2025-04-03bcachefs: Fix null ptr deref in invalidate_one_bucket()Kent Overstreet1-0/+3
bch2_backpointer_get_key() returns bkey_s_c_null when the target isn't found. backpointer_get_key() flags the error, so there's nothing else to do here - just skip it and move on. Link: https://github.com/koverstreet/bcachefs/issues/847 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02bcachefs: Kill btree_iter.transKent Overstreet1-38/+40
This was planned to be done ages ago, now finally completed; there are places where we have quite a few btree_trans objects on the stack, so this reduces stack usage somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02bcachefs: Split up bch_dev.io_refKent Overstreet1-7/+7
We now have separate per device io_refs for read and write access. This fixes a device removal bug where the discard workers were still running while we're removing alloc info for that device. It's also a bit of hardening; we no longer allow writes to devices that are read-only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Consistent indentation of multiline fsck errorsKent Overstreet1-11/+8
Add the new helper printbuf_indent_add_nextline(), and use it in __bch2_fsck_err() to centralize setting the indentation of multiline fsck errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Recovery no longer holds state_lockKent Overstreet1-0/+3
state_lock guards against devices coming or leaving, changing state, or the filesystem changing between ro <-> rw. But it's not necessary for running recovery passes, and holding it blocks asynchronous events that would cause us to go RO or kick out devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24bcachefs: bch2_disk_accounting_mod2()Kent Overstreet1-6/+4
We're hitting some issues with uninitialized struct padding, flagged by kmsan. They appear to be falso positives, otherwise bch2_accounting_validate() would have flagged them as "junk at end". But for now, we'll need to initialize disk_accounting_pos with memset(). This adds a new helper, bch2_disk_accounting_mod2(), that initializes a disk_accounting_pos and does the accounting mod all at once - so overall things actually get slightly more ergonomic. BCH_DISK_ACCOUNTING_replicas keys are left for now; KMSAN isn't warning about them and they're a bit special. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24bcachefs: EIO cleanupKent Overstreet1-2/+2
Replace these with proper private error codes, so that when we get an error message we're not sifting through the entire codebase to see where it came from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24bcachefs: Filesystem discard option now propagates to devicesKent Overstreet1-1/+14
the discard option is special, because it's both a filesystem and a device option. When set at the filesytsem level, it's supposed to propagate to (if set persistently via sysfs) or override (if non persistently as a mount option) the devices - that now works correctly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: Fix error type in bch2_alloc_v3_validate()Thorsten Blum1-1/+1
Use error type alloc_v3_unpack_error in bch2_alloc_v3_validate(). Fixes: b65db750e2bb ("bcachefs: Enumerate fsck errors") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: bcachefs_metadata_version_stripe_lruKent Overstreet1-1/+2
Add a persistent LRU for stripes, ordered by "number of empty blocks", i.e. order in which we wish to reuse them. This will replace the in-memory stripes heap, so we can kill off reading stripes into memory at startup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: Advance bch_alloc.oldest_gen if no stale pointersKent Overstreet1-0/+3
Now that we've got cached backpointers and aren't leaving around stale pointers on bucket invalidation, we no longer need the periodic (rare) gc_gens - which recalculates each bucket's oldest gen to avoid wraparound. We can't delete that code because we've got to support existing filesystems that will still have stale pointers, but this gets rid of another scalability limit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: Invalidate cached data by backpointersKent Overstreet1-23/+79
If we don't leave stale pointers around, we won't have to deal with bucket gen wraparound. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: decouple bch2_lru_check_set() from alloc btreeKent Overstreet1-1/+4
Pass in the backpointer explicitly, instead of assuming 'referring_k' is an alloc key and calculating it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: s/BCH_LRU_FRAGMENTATION_START/BCH_LRU_BUCKET_FRAGMENTATION/Kent Overstreet1-2/+2
FRAGMENTATION_START was incorrect, there's currently only one fragmentation LRU (at the end of the reserved bits for LRU type), and we're getting ready to add a stripe fragmentation lru - so give it a better name. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: bch2_lru_change() checks for no-opKent Overstreet1-19/+13
Minor cleanup, no reason for the caller to have to this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: Add comment explaining why asserts in invalidate_one_bucket() are ↵Kent Overstreet1-0/+7
impossible Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: BCH_COUNTER_bucket_discard_fastKent Overstreet1-1/+4
Add a separate counter for fastpath bucket discards, which don't require a journal flush. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: Fix discard path journal flushingKent Overstreet1-22/+25
The discard path is supposed to issue journal flushes when there's too many buckets empty buckets that need a journal commit before they can be written to again, but at some point this code seems to have been lost. Bring it back with a new optimization to make sure we don't issue too many journal flushes: the journal now tracks the sequence number of the most recent flush in progress, which the discard path uses when deciding which buckets need a journal flush. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Don't use BTREE_ITER_cached when walking alloc btree during fsckKent Overstreet1-1/+2
No need to pull the whole alloc btree into the btree key cache. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Fix reuse of bucket before journal flush on multiple empty -> ↵Kent Overstreet1-38/+40
nonempty transition For each bucket we track when the bucket became nonempty and when it became empty again: if we can ensure that there will be no journal flushes in the range [nonempty, empty) (possibly because they occured at the same journal sequence number), then it's safe to reuse the bucket without waiting for a journal commit. This is a major performance optimization for erasure coding, where writes are initially replicated, but the extra replicas are quickly dropped: if those buckets are reused and overwritten without issuing a cache flush to the underlying device, then they only cost bus bandwidth. But there's a tricky corner case when there's multiple empty -> nonempty -> empty transitions in quick succession, i.e. when data is getting overwritten immediately as it's being written. If this happens and the previous empty transition hasn't been flushed, we need to continue tracking the previous nonempty transition - not start a new one. Fixing this means we now need to track both the nonempty and empty transitions in bch_alloc_v4. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: bch2_journal_noflush_seq() now takes [start, end)Kent Overstreet1-1/+3
Harder to screw up if we're explicit about the range, and more correct as journal reservations can be outstanding on multiple journal entries simultaneously. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Set bucket needs discard, inc gen on empty -> nonempty transitionKent Overstreet1-1/+4
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Don't recurse in check_discard_freespace_keyKent Overstreet1-8/+64
When calling check_discard_freeespace_key from the allocator, we can't repair without recursing - run it asynchronously instead. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Check for bucket journal seq in the futureKent Overstreet1-28/+35
This fixes an assertion pop in bch2_journal_noflush_seq() - log the error to the superblock and continue instead. Reported-by: syzbot+85700120f75fc10d4e18@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Don't error out when logging fsck errorKent Overstreet1-3/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Issue a transaction restart after commit in repairKent Overstreet1-1/+1
transaction commits invalidate pointers to btree values, and they also downgrade intent locks. This breaks the interior btree update path, which takes intent locks and then calls into the allocator. This isn't an ideal solution: we can't unconditionally issue a restart after a transaction commit, because that would break other codepaths. Reported-by: syzbot+78d82470c16a49702682@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: struct bkey_validate_contextKent Overstreet1-5/+5
Add a new parameter to bkey validate functions, and use it to improve invalid bkey error messages: we can now print the btree and depth it came from, or if it came from the journal, or is a btree root. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: discard fastpath now uses bch2_discard_one_bucket()Kent Overstreet1-34/+41
The discard bucket fastpath previously was using its own code for discarding buckets and clearing them in the need_discard btree, which didn't have any of the consistency checks of the main discard path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Drop swab code for backpointers in alloc keysKent Overstreet1-8/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: discard_one_bucket() now uses need_discard_or_freespace_err()Kent Overstreet1-9/+15
More conversion of inconsistent errors to fsck errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: bch2_bucket_do_index(): inconsistent_err -> fsck_errKent Overstreet1-40/+43
Factor out a common helper, need_discard_or_freespace_err(), which is now used by both fsck and the runtime checks, and can repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: try_alloc_bucket() now uses bch2_check_discard_freespace_key()Kent Overstreet1-35/+45
check_discard_freespace_key() was doing all the same checks as try_alloc_bucket(), but with repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: kill inconsistent err in invalidate_one_bucket()Kent Overstreet1-22/+6
Change it to a normal fsck_err() - meaning it'll get repaired at runtime when that's flipped on. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Delete dead code from bch2_discard_one_bucket()Kent Overstreet1-16/+0
alloc key validation ensures that if a bucket is in need_discard state the sector counts are all zero - we don't have to check for that. The NEED_INC_GEN check appears to be dead code, as well: we only see buckets in the need_discard btree, and it's an error if they aren't in the need_discard state. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: bch2_btree_bit_mod_iter()Kent Overstreet1-47/+10
factor out a new helper, make it handle extents bitset btrees (freespace). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max()Kent Overstreet1-3/+3
We'll be introducing btree_iter_peek_prev_min(), so rename for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18bcachefs: Don't use commit_do() unnecessarilyKent Overstreet1-1/+1
Using commit_do() to call alloc_sectors_start_trans() breaks when we're randomly injecting transaction restarts - the restart in the commit causes us to leak the lock that alloc_sectorS_start_trans() takes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18bcachefs: handle restarts in bch2_bucket_io_time_reset()Kent Overstreet1-12/+16
bch2_bucket_io_time_reset() doesn't need to succeed, which is why it didn't previously retry on transaction restart - but we're now treating these as errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18bcachefs: fix restart handling in bch2_do_invalidates_work()Kent Overstreet1-3/+4
this one is fairly harmless since the invalidate worker will just run again later if it needs to, but still worth fixing Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-13bcachefs: Fix missing bounds checks in bch2_alloc_read()Kent Overstreet1-0/+10
We were checking that the alloc key was for a valid device, but not a valid bucket. This is the upgrade path from versions prior to bcachefs being mainlined. Reported-by: syzbot+a1b59c8e1a3f022fd301@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04bcachefs: Kill alloc_v4.fragmentation_lruKent Overstreet1-10/+20
The fragmentation_lru field hasn't been needed since we reworked the LRU btrees to use the btree write buffer; previously it was used to resolve collisions, but the revised LRU btree uses the backpointer (the bucket) as part of the key. It should have been deleted at the time of the LRU rework; since it wasn't, that left places for bugs to hide, in check/repair. This fixes LRU fsck on a filesystem image helpfully provided by a user who disappeared before I could get his name for the reported-by. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21bcachefs: bch_fs.rw_devs_change_countKent Overstreet1-4/+8
Add a counter that's incremented whenever rw devices change; this will be used for erasure coding so that it can keep ec_stripe_head in sync and not deadlock on a new stripe when a device it wants goes away. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21bcachefs: bch2_dev_remove_stripes()Kent Overstreet1-3/+4
We can now correctly force-remove a device that has stripes on it; this uses the new BCH_SB_MEMBER_INVALID sentinal value. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21bcachefs: bch2_dev_remove_alloc() -> alloc_background.cKent Overstreet1-0/+29
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09bcachefs: Convert to use jiffies macrosChen Yufan1-1/+2
Use jiffies macros instead of using jiffies directly to handle wraparound. Signed-off-by: Chen Yufan <chenyufan@vivo.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09bcachefs: Fix ca->io_ref usageKent Overstreet1-12/+12
ca->io_ref does not protect against the filesystem going way, c->write_ref does. Much like 0b50b7313ef2 bcachefs: Fix refcounting in discard path the other async paths need fixing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-22bcachefs: Fix refcounting in discard pathKent Overstreet1-6/+6
bch_dev->io_ref does not protect against the filesystem going away; bch_fs->writes does. Thus the filesystem write ref needs to be the last ref we release. Reported-by: syzbot+9e0404b505e604f67e41@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-22bcachefs: Fix compat issue with old alloc_v4 keysKent Overstreet1-24/+26
we allow new fields to be added to existing key types, and new versions should treat them as being zeroed; this was not handled in alloc_v4_validate. Reported-by: syzbot+3b2968fa4953885dd66a@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-22bcachefs: Fix bch2_bucket_gens_init()Kent Overstreet1-1/+1
Comparing the wrong bpos - this was missed because normally bucket_gens_init() runs on brand new filesystems, but this bug caused it to overwrite bucket_gens keys with 0s when upgrading ancient filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-22bcachefs: Fix bch2_trigger_alloc assertKent Overstreet1-1/+1
On testing on an old mangled filesystem, we missed a case. Fixes: bd864bc2d907 ("bcachefs: Fix bch2_trigger_alloc when upgrading from old versions") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>