|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was planned to be done ages ago, now finally completed; there are
places where we have quite a few btree_trans objects on the stack, so
this reduces stack usage somewhat.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We now have separate per device io_refs for read and write access.
This fixes a device removal bug where the discard workers were still
running while we're removing alloc info for that device.
It's also a bit of hardening; we no longer allow writes to devices that
are read-only.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was previously hard to hit since it requires racing with device
removal, but splitting up io_ref uncovered it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There are various checks for "are we going to compress this" - but we're
not going to compress if we know it's incompressible.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Also, improve the message in prep_encoded_data() - it now prints
good/bad checksums, and checksum type.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we're moving an extent that was partially overwritten,
bch2_write_rechecksum() will trim it to the currently live range.
If we then also want to compress it, it'll be decrypted - but the nonce
has been advanced for the overwritten start of the extent that we
dropped, and we were using the nonce we calculated before rechecksum().
Reported-by: Gabriel de Perthuis <g2p.code@gmail.com>
Fixes: 127d90d2823e ("bcachefs: bch2_write_prep_encoded_data() now returns errcode")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Replace these with proper private error codes, so that when we get an
error message we're not sifting through the entire codebase to see where
it came from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for killing off EIO and replacing them with proper private
error codes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's no reason for the caller to do the actual logging; every caller
does it the same way.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
dm-flakey is busted, and this is simpler anyways - this lets us test the
checksum error retry paths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More prep work for automatically kicking devices out after too many IO
errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
A user has been seeing the "error verifying existing checksum while
rewriting existing data (memory corruption?)" error.
This generally indicates a hardware issue (and that may be the case
here), but it might also indicate a bug, in which case we need more
information to look for patterns.
Reported-by: Roland Vet <vet.roland@protonmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The uppercase/lowercase style is nice for making the namespace explicit.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We've got per-writepoint statistics to see how well the writepoint index
update threads are pipelining; this separates running vs. runnable so we
can see at a glance if they're blocking.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This makes 'bcachefs fs top' more useful; we can now see at a glance
whether the IO to the device is being done for user reads/writes, or
copygc/rebalance.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_nocow_write_convert_unwritten is already in transaction context:
00191 ========= TEST generic/648
00242 kernel BUG at fs/bcachefs/btree_iter.c:3332!
00242 Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
00242 Modules linked in:
00242 CPU: 4 UID: 0 PID: 2593 Comm: fsstress Not tainted 6.13.0-rc3-ktest-g345af8f855b7 #14403
00242 Hardware name: linux,dummy-virt (DT)
00242 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00242 pc : __bch2_trans_get+0x120/0x410
00242 lr : __bch2_trans_get+0xcc/0x410
00242 sp : ffffff80d89af600
00242 x29: ffffff80d89af600 x28: ffffff80ddb23000 x27: 00000000fffff705
00242 x26: ffffff80ddb23028 x25: ffffff80d8903fe0 x24: ffffff80ebb30168
00242 x23: ffffff80c8aeb500 x22: 000000000000005d x21: ffffff80d8904078
00242 x20: ffffff80d8900000 x19: ffffff80da9e8000 x18: 0000000000000000
00242 x17: 64747568735f6c61 x16: 6e72756f6a20726f x15: 0000000000000028
00242 x14: 0000000000000004 x13: 000000000000f787 x12: ffffffc081bbcdc8
00242 x11: 0000000000000000 x10: 0000000000000003 x9 : ffffffc08094efbc
00242 x8 : 000000001092c111 x7 : 000000000000000c x6 : ffffffc083c31fc4
00242 x5 : ffffffc083c31f28 x4 : ffffff80c8aeb500 x3 : ffffff80ebb30000
00242 x2 : 0000000000000001 x1 : 0000000000000a21 x0 : 000000000000028e
00242 Call trace:
00242 __bch2_trans_get+0x120/0x410 (P)
00242 bch2_inum_offset_err_msg+0x48/0xb0
00242 bch2_nocow_write_convert_unwritten+0x3d0/0x530
00242 bch2_nocow_write+0xeb0/0x1000
00242 __bch2_write+0x330/0x4e8
00242 bch2_write+0x1f0/0x530
00242 bch2_direct_write+0x530/0xc00
00242 bch2_write_iter+0x160/0xbe0
00242 vfs_write+0x1cc/0x360
00242 ksys_write+0x5c/0xf0
00242 __arm64_sys_write+0x20/0x30
00242 invoke_syscall.constprop.0+0x54/0xe8
00242 do_el0_svc+0x44/0xc0
00242 el0_svc+0x34/0xa0
00242 el0t_64_sync_handler+0x104/0x130
00242 el0t_64_sync+0x154/0x158
00242 Code: 6b01001f 54ffff01 79408460 3617fec0 (d4210000)
00242 ---[ end trace 0000000000000000 ]---
00242 Kernel panic - not syncing: Oops - BUG: Fatal exception
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Ratelimit them, and use the new bch2_write_op_error() helper that prints
path and file offset.
Reported-by: https://github.com/koverstreet/bcachefs/issues/819
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes excessive transaction restarts due to trans_commit having to
upgrade.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We'll be introducing btree_iter_peek_prev_min(), so rename for
consistency.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helper to simplify bch2_bkey_set_needs_rebalance()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bucket_gen() checks if we're looking up a valid bucket and returns NULL
otherwise, but bucket_gen_get() was failing to check; other callers were
correct.
Also do a bit of cleanup on callers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
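
The bounds check described above can be sketched in userspace C. This is
only an illustration: struct example_dev, example_bucket_gen() and
example_bucket_gen_get() are hypothetical stand-ins, not the bcachefs
definitions, and the -1 return for an invalid bucket is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-device bucket generation table. */
struct example_dev {
	uint64_t	first_bucket;	/* buckets below this are not valid */
	uint64_t	nbuckets;	/* one past the last valid bucket */
	uint8_t		*gens;		/* one generation number per bucket */
};

/* Returns a pointer to the bucket's generation, or NULL if b is out of range. */
static uint8_t *example_bucket_gen(struct example_dev *ca, uint64_t b)
{
	if (b < ca->first_bucket || b >= ca->nbuckets)
		return NULL;
	return ca->gens + b;
}

/*
 * Returns the generation, or -1 for an invalid bucket. The NULL check here
 * is the kind of check the commit message says was missing in one caller.
 */
static int example_bucket_gen_get(struct example_dev *ca, uint64_t b)
{
	uint8_t *gen = example_bucket_gen(ca, b);

	return gen ? *gen : -1;
}
```

Keeping the range check in a single lookup helper means every caller gets
the same validation, and only the NULL case needs handling at each call site.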
|
|
Using commit_do() to call alloc_sectors_start_trans() breaks when we're
randomly injecting transaction restarts - the restart in the commit
causes us to leak the lock that alloc_sectors_start_trans() takes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Give bversions a more distinct name, to aid in grepping.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Rebalance writes must be BCH_WRITE_ALLOC_NOWAIT because they don't
allocate from the full filesystem - but we don't want spurious
allocation failures due to open buckets.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Limit these messages to once every 2 minutes to avoid spamming logs;
with multiple devices the output can be quite significant.
Also, up the default timeout to 30 seconds from 10 seconds.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
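
A minimal sketch of interval-based message limiting, assuming a caller
that supplies the current time in seconds. struct example_ratelimit and
example_ratelimit_ok() are hypothetical names used for illustration; this
is not the kernel's ratelimit API, nor the code this commit touches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: allow at most one message per interval. */
struct example_ratelimit {
	uint64_t	interval_sec;	/* e.g. 120 for "once every 2 minutes" */
	uint64_t	last_sec;	/* time of the last allowed message */
	bool		printed_once;	/* false until the first message */
};

/* Returns true if a message may be printed at time now_sec. */
static bool example_ratelimit_ok(struct example_ratelimit *rl, uint64_t now_sec)
{
	if (rl->printed_once && now_sec - rl->last_sec < rl->interval_sec)
		return false;

	rl->printed_once = true;
	rl->last_sec = now_sec;
	return true;
}
```

The first message always goes through; later ones are dropped until the
full interval has elapsed since the last one that was allowed.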
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Use try_cmpxchg() family of functions instead of
cmpxchg(*ptr, old, new) == old. The x86 CMPXCHG instruction returns
success in the ZF flag, so this change saves a compare after cmpxchg
(and the related move instruction in front of cmpxchg).
Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when
cmpxchg fails. There is no need to re-read the value in the loop.
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
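
The loop shape this enables can be illustrated in userspace with C11
atomics, whose atomic_compare_exchange_weak() likewise stores the observed
value into its "expected" argument on failure. This is an analogy, not the
kernel's try_cmpxchg(); atomic_max_u() is a hypothetical helper.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper: atomically raise *ptr to at least val and return
 * the value that was observed before any update.
 */
static unsigned atomic_max_u(_Atomic unsigned *ptr, unsigned val)
{
	unsigned old = atomic_load(ptr);

	/*
	 * On failure, atomic_compare_exchange_weak() writes the value it
	 * observed into 'old', so the loop never re-reads *ptr explicitly.
	 */
	while (old < val &&
	       !atomic_compare_exchange_weak(ptr, &old, val))
		;

	return old;
}
```

The loop exits either because the stored value is already at least val, or
because the exchange succeeded; in both cases 'old' holds the prior value.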
|
|
Turn more asserts into proper recoverable error paths.
Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes a lifetime issue; bch2_nocow_write_unlock() uses
PTR_BUCKET_POS(), which needs the device - but we drop our ref to the
device in bch2_write_endio().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_dev_bkey_exists() is going away; bch2_dev_have_ref() documents that
we're looking up a device without checking if it's present because we
have a reference to it already.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This is needed for the next patch - the write submit path has to be able
to allocate a replica bio even when we weren't able to get a ref on the
device.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we block on the allocator for more than 10 seconds, print out some
useful debugging info.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're about to add new asserts for btree_trans locking consistency, and
part of that requires that we aren't using the btree_trans while it's
unlocked.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Some renaming for better consistency:
  bch2_member_exists  -> bch2_member_alive
  bch2_dev_exists     -> bch2_member_exists
  bch2_dev_exists2    -> bch2_dev_exists
  bch_dev_locked      -> bch2_dev_locked
  bch_dev_bkey_exists -> bch2_dev_bkey_exists
New helper: bch2_dev_safe
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Combine iter/update/trigger/str_hash flags into a single enum, and
x-macroize them for a to_text() function later.
These flags are all for a specific iter/key/update context, so it makes
sense to group them together - iter/update/trigger flags were already
given distinct bits, this cleans up and unifies that handling.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
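
A minimal sketch of the x-macro pattern, assuming made-up flag names. The
EXAMPLE_* identifiers and example_flags_to_text() are hypothetical and are
not the enum or helper introduced by this commit; the point is that one
list generates the bit numbers, the mask values and the name table.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag list, for illustration only. */
#define EXAMPLE_FLAGS()		\
	x(slots)		\
	x(intent)		\
	x(prefetch)		\
	x(cached)

/* Bit numbers, one per flag, in list order. */
enum example_flag_bits {
#define x(n)	EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
	EXAMPLE_FLAG_NR,
};

/* Mask values derived from the bit numbers. */
enum example_flags {
#define x(n)	EXAMPLE_FLAG_##n = 1U << EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
};

/* Name table generated from the same list, indexed by bit number. */
static const char * const example_flag_strs[] = {
#define x(n)	#n,
	EXAMPLE_FLAGS()
#undef x
};

/* Writes a comma-separated list of the set flags into buf. */
static void example_flags_to_text(char *buf, size_t len, unsigned flags)
{
	size_t pos = 0;

	if (!len)
		return;
	buf[0] = '\0';

	for (unsigned i = 0; i < EXAMPLE_FLAG_NR; i++) {
		int n;

		if (!(flags & (1U << i)))
			continue;

		n = snprintf(buf + pos, len - pos, "%s%s",
			     pos ? "," : "", example_flag_strs[i]);
		if (n < 0 || (size_t) n >= len - pos)
			break;	/* output truncated; stop here */
		pos += n;
	}
}
```

Adding a flag then means adding a single x() line, and the enum and the
string table cannot drift out of sync.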
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Normally this is initialized in __bch2_write(), which is executed in a
loop, but the inline data path skips this.
Reported-by: syzbot+fd3ccb331eb21f05d13b@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Reported-by: syzbot+66b9b74f6520068596a9@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Currently, struct time_stats has the optional ability to quantize the
information that it collects. This is /probably/ useful for callers who
want to see quantized information, but it more than doubles the size of
the structure from 224 bytes to 464. For users who don't care about
that (e.g. upcoming xfs patches) and want to avoid wasting 240 bytes per
counter, split the two into separate pieces.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|