path: root/fs/bcachefs/io_write.c
Age  Commit message  [Author, Files, Lines]
2025-05-01  bcachefs: check for inode.bi_sectors underflow  [Kent Overstreet, 1 file, -0/+21]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
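The commit above gives no detail beyond its title, so the following is only a minimal sketch of what an underflow check on an unsigned sector count can look like. The helper name sectors_add_checked() and its signature are hypothetical and are not taken from the bcachefs source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the bcachefs code: apply a signed sector delta to
 * an unsigned sector count, refusing any update that would wrap around.
 * Returns true if the update was applied, false if it was rejected.
 */
static bool sectors_add_checked(uint64_t *sectors, int64_t delta)
{
	if (delta < 0) {
		/* Magnitude of delta, computed so that INT64_MIN does not overflow */
		uint64_t dec = (uint64_t) -(delta + 1) + 1;

		if (dec > *sectors)
			return false;	/* would underflow: leave the count untouched */

		*sectors -= dec;
		return true;
	}

	uint64_t inc = (uint64_t) delta;

	if (inc > UINT64_MAX - *sectors)
		return false;		/* would overflow */

	*sectors += inc;
	return true;
}
```

A caller would treat a false return as an inconsistency to report rather than silently storing a wrapped value.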
2025-04-02  bcachefs: Kill btree_iter.trans  [Kent Overstreet, 1 file, -6/+6]
This was planned to be done ages ago, now finally completed; there are places where we have quite a few btree_trans objects on the stack, so this reduces stack usage somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02  bcachefs: Split up bch_dev.io_ref  [Kent Overstreet, 1 file, -2/+7]
We now have separate per device io_refs for read and write access. This fixes a device removal bug where the discard workers were still running while we're removing alloc info for that device. It's also a bit of hardening; we no longer allow writes to devices that are read-only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-31  bcachefs: Fix null ptr deref in bch2_write_endio()  [Kent Overstreet, 1 file, -6/+13]
This was previously hard to hit since it requires racing with device removal, but splitting up io_ref uncovered it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-26  bcachefs: Don't unnecessarily decrypt data when moving  [Kent Overstreet, 1 file, -0/+3]
There's various checks for "are we going to compress this" - but we're not going to compress if we know it's incompressible. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-25  bcachefs: Fix duplicate checksum error messages in write path  [Kent Overstreet, 1 file, -18/+20]
Also, improve the message in prep_encoded_data() - it now prints good/bad checksums, and checksum type. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-25  bcachefs: Fix nonce inconsistency in bch2_write_prep_encoded_data()  [Kent Overstreet, 1 file, -1/+2]
If we're moving an extent that was partially overwritten, bch2_write_rechecksum() will trim it to the currently live range. If we then also want to compress it, it'll be decrypted - but the nonce has been advanced for the overwritten start of the extent that we dropped, and we were using the nonce we calculated before rechecksum(). Reported-by: Gabriel de Perthuis <g2p.code@gmail.com> Fixes: 127d90d2823e ("bcachefs: bch2_write_prep_encoded_data() now returns errcode") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  bcachefs: EIO cleanup  [Kent Overstreet, 1 file, -7/+7]
Replace these with proper private error codes, so that when we get an error message we're not sifting through the entire codebase to see where it came from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  bcachefs: bch2_write_prep_encoded_data() now returns errcode  [Kent Overstreet, 1 file, -88/+71]
Prep work for killing off EIO and replacing them with proper private error codes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  bcachefs: Simplify bch2_write_op_error()  [Kent Overstreet, 1 file, -78/+32]
There's no reason for the caller to do the actual logging, it's all done the same. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  bcachefs: trace_io_move_write_fail  [Kent Overstreet, 1 file, -4/+10]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-16  bcachefs: Debug params for data corruption injection  [Kent Overstreet, 1 file, -0/+24]
dm-flakey is busted, and this is simpler anyways - this lets us test the checksum error retry paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: Finish bch2_account_io_completion() conversions  [Kent Overstreet, 1 file, -5/+7]
More prep work for automatically kicking devices out after too many IO errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: bch2_write_op_error() now prints info about data update  [Kent Overstreet, 1 file, -30/+62]
A user has been seeing the "error verifying existing checksum while rewriting existing data (memory corruption?)" error. This generally indicates a hardware issue (and that may be the case here), but it might also indicate a bug, in which case we need more information to look for patterns. Reported-by: Roland Vet <vet.roland@protonmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: Rename BCH_WRITE flags fer consistency with other x-macros enums  [Kent Overstreet, 1 file, -43/+43]
The uppercase/lowercase style is nice for making the namespace explicit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: Separate running/runnable in wp stats  [Kent Overstreet, 1 file, -1/+9]
We've got per-writepoint statistics to see how well the writepoint index update threads are pipelining; this separates running vs. runnable so we can see at a glance if they're blocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: Don't inc io_(read|write) counters for moves  [Kent Overstreet, 1 file, -1/+2]
This makes 'bcachefs fs top' more useful; we can now see at a glance whether the IO to the device is being done for user reads/writes, or copygc/rebalance. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-12  bcachefs: Reuse transaction  [Alan Huang, 1 file, -1/+11]
bch2_nocow_write_convert_unwritten is already in transaction context:

00191 ========= TEST generic/648
00242 kernel BUG at fs/bcachefs/btree_iter.c:3332!
00242 Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
00242 Modules linked in:
00242 CPU: 4 UID: 0 PID: 2593 Comm: fsstress Not tainted 6.13.0-rc3-ktest-g345af8f855b7 #14403
00242 Hardware name: linux,dummy-virt (DT)
00242 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00242 pc : __bch2_trans_get+0x120/0x410
00242 lr : __bch2_trans_get+0xcc/0x410
00242 sp : ffffff80d89af600
00242 x29: ffffff80d89af600 x28: ffffff80ddb23000 x27: 00000000fffff705
00242 x26: ffffff80ddb23028 x25: ffffff80d8903fe0 x24: ffffff80ebb30168
00242 x23: ffffff80c8aeb500 x22: 000000000000005d x21: ffffff80d8904078
00242 x20: ffffff80d8900000 x19: ffffff80da9e8000 x18: 0000000000000000
00242 x17: 64747568735f6c61 x16: 6e72756f6a20726f x15: 0000000000000028
00242 x14: 0000000000000004 x13: 000000000000f787 x12: ffffffc081bbcdc8
00242 x11: 0000000000000000 x10: 0000000000000003 x9 : ffffffc08094efbc
00242 x8 : 000000001092c111 x7 : 000000000000000c x6 : ffffffc083c31fc4
00242 x5 : ffffffc083c31f28 x4 : ffffff80c8aeb500 x3 : ffffff80ebb30000
00242 x2 : 0000000000000001 x1 : 0000000000000a21 x0 : 000000000000028e
00242 Call trace:
00242 __bch2_trans_get+0x120/0x410 (P)
00242 bch2_inum_offset_err_msg+0x48/0xb0
00242 bch2_nocow_write_convert_unwritten+0x3d0/0x530
00242 bch2_nocow_write+0xeb0/0x1000
00242 __bch2_write+0x330/0x4e8
00242 bch2_write+0x1f0/0x530
00242 bch2_direct_write+0x530/0xc00
00242 bch2_write_iter+0x160/0xbe0
00242 vfs_write+0x1cc/0x360
00242 ksys_write+0x5c/0xf0
00242 __arm64_sys_write+0x20/0x30
00242 invoke_syscall.constprop.0+0x54/0xe8
00242 do_el0_svc+0x44/0xc0
00242 el0_svc+0x34/0xa0
00242 el0t_64_sync_handler+0x104/0x130
00242 el0t_64_sync+0x154/0x158
00242 Code: 6b01001f 54ffff01 79408460 3617fec0 (d4210000)
00242 ---[ end trace 0000000000000000 ]---
00242 Kernel panic - not syncing: Oops - BUG: Fatal exception

Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-25  bcachefs: Improve decompression error messages  [Kent Overstreet, 1 file, -2/+2]
Ratelimit them, and use the new bch2_write_op_error() helper that prints path and file offset. Reported-by: https://github.com/koverstreet/bcachefs/issues/819 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: list_pop_entry()  [Kent Overstreet, 1 file, -3/+1]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Convert write path errors to inum_to_path()  [Kent Overstreet, 1 file, -36/+55]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: add missing BTREE_ITER_intent  [Kent Overstreet, 1 file, -0/+1]
this fixes excessive transaction restarts due to trans_commit having to upgrade Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max()  [Kent Overstreet, 1 file, -2/+2]
We'll be introducing btree_iter_peek_prev_min(), so rename for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: io_opts_to_rebalance_opts()  [Kent Overstreet, 1 file, -1/+1]
New helper to simplify bch2_bkey_set_needs_rebalance() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-11-07  bcachefs: Fix null ptr deref in bucket_gen_get()  [Kent Overstreet, 1 file, -5/+2]
bucket_gen() checks if we're looking up a valid bucket and returns NULL otherwise, but bucket_gen_get() was failing to check; other callers were correct. Also do a bit of cleanup on callers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18  bcachefs: Don't use commit_do() unnecessarily  [Kent Overstreet, 1 file, -2/+2]
Using commit_do() to call alloc_sectors_start_trans() breaks when we're randomly injecting transaction restarts - the restart in the commit causes us to leak the lock that alloc_sectors_start_trans() takes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27  bcachefs: rename version -> bversion  [Kent Overstreet, 1 file, -2/+2]
give bversions a more distinct name, to aid in grepping Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21  bcachefs: BCH_WRITE_ALLOC_NOWAIT no longer applies to open bucket allocation  [Kent Overstreet, 1 file, -3/+4]
rebalance writes must be BCH_WRITE_ALLOC_NOWAIT because they don't allocate from the full filesystem - but we don't want spurious allocation failures due to open buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07  bcachefs: Make allocator stuck timeout configurable, ratelimit messages  [Kent Overstreet, 1 file, -4/+1]
Limit these messages to once every 2 minutes to avoid spamming logs; with multiple devices the output can be quite significant. Also, up the default timeout to 30 seconds from 10 seconds. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
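The "once every 2 minutes" limit described above can be illustrated with a small interval-based limiter. This is a userspace sketch under stated assumptions: struct msg_ratelimit and msg_ratelimit_ok() are hypothetical names, and the kernel has its own ratelimit facilities that this does not reproduce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: allow at most one message per interval. */
struct msg_ratelimit {
	uint64_t interval_secs;	/* e.g. 120 for "once every 2 minutes" */
	uint64_t next_allowed;	/* earliest time the next message may print */
};

/*
 * Returns true if a message may be printed at time now_secs, and if so
 * pushes the next permitted time one interval into the future.
 */
static bool msg_ratelimit_ok(struct msg_ratelimit *rl, uint64_t now_secs)
{
	if (now_secs < rl->next_allowed)
		return false;	/* still inside the quiet period: suppress */

	rl->next_allowed = now_secs + rl->interval_secs;
	return true;
}
```

With several devices stuck at once, each call site would consult the same limiter before printing, so the log receives one report per interval instead of one per device per wakeup.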
2024-07-14  bcachefs: Rename BCH_WRITE_DONE -> BCH_WRITE_SUBMITTED  [Kent Overstreet, 1 file, -12/+12]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Ratelimit checksum error messages  [Kent Overstreet, 1 file, -1/+4]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: spelling fix  [Kent Overstreet, 1 file, -1/+1]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Use try_cmpxchg() family of functions instead of cmpxchg()  [Uros Bizjak, 1 file, -4/+3]
Use try_cmpxchg() family of functions instead of cmpxchg (*ptr, old, new) == old. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
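The difference between the two loop shapes can be shown with a userspace analog built on C11 <stdatomic.h>. This is a sketch, not the kernel API: cmpxchg_u32() below is a hypothetical stand-in that returns the previous value the way the kernel's cmpxchg() does, and atomic_compare_exchange_weak() plays the role of try_cmpxchg() because it also writes the current value back into "old" on failure.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a cmpxchg() that returns the previous value. */
static uint32_t cmpxchg_u32(_Atomic uint32_t *ptr, uint32_t old, uint32_t new_val)
{
	/* On failure the C11 call stores the current value into old. */
	atomic_compare_exchange_strong(ptr, &old, new_val);
	return old;
}

/* Old shape: compare the returned value against old on every iteration. */
static void set_flag_cmpxchg_style(_Atomic uint32_t *ptr, uint32_t flag)
{
	uint32_t old, v = atomic_load(ptr);

	do {
		old = v;
	} while ((v = cmpxchg_u32(ptr, old, old | flag)) != old);
}

/* New shape: the boolean result drives the loop and old is refreshed for us. */
static void set_flag_try_cmpxchg_style(_Atomic uint32_t *ptr, uint32_t flag)
{
	uint32_t old = atomic_load(ptr);

	while (!atomic_compare_exchange_weak(ptr, &old, old | flag))
		;	/* old now holds the current value; just retry */
}
```

Both functions leave the same value in *ptr; the second simply drops the explicit comparison and the extra copy inside the loop.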
2024-06-10  bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()  [Kent Overstreet, 1 file, -4/+15]
Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09  bcachefs: bch2_dev_get_ioref() checks for device not present  [Kent Overstreet, 1 file, -2/+2]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09  bcachefs: bch2_dev_get_ioref2(); io_write.c  [Kent Overstreet, 1 file, -10/+11]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: Move nocow unlock to bch2_write_endio()  [Kent Overstreet, 1 file, -19/+7]
This fixes a lifetime issue; bch2_nocow_write_unlock() uses PTR_BUCKET_POS(), which needs the device - but we drop our ref to the device in bch2_write_endio(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: bch2_dev_have_ref()  [Kent Overstreet, 1 file, -2/+2]
bch2_dev_bkey_exists() is going away; bch2_dev_have_ref() documents that we're looking up a device without checking if it's present because we have a reference to it already. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: extent_ptr_durability() -> bch2_dev_rcu()  [Kent Overstreet, 1 file, -1/+6]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: move replica_set from bch_dev to bch_fs  [Kent Overstreet, 1 file, -4/+4]
This is needed for the next patch - the write submit path has to be able to allocate a replica bio even when we weren't able to get a ref on the device. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: PTR_BUCKET_POS() now takes bch_dev  [Kent Overstreet, 1 file, -4/+7]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: bch2_print_allocator_stuck()  [Kent Overstreet, 1 file, -1/+5]
If we block on the allocator for more than 10 seconds, print out some useful debugging info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: bch2_bkey_drop_ptrs() declares loop iter  [Kent Overstreet, 1 file, -1/+0]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: bch2_trans_unlock() must always be followed by relock() or begin()  [Kent Overstreet, 1 file, -0/+4]
We're about to add new asserts for btree_trans locking consistency, and part of that requires that we aren't using the btree_trans while it's unlocked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: member helper cleanups  [Kent Overstreet, 1 file, -6/+6]
Some renaming for better consistency:
  bch2_member_exists -> bch2_member_alive
  bch2_dev_exists -> bch2_member_exists
  bch2_dev_exists2 -> bch2_dev_exists
  bch_dev_locked -> bch2_dev_locked
  bch_dev_bkey_exists -> bch2_dev_bkey_exists
New helper: bch2_dev_safe
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: iter/update/trigger/str_hash flag cleanup  [Kent Overstreet, 1 file, -8/+8]
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
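For readers unfamiliar with the x-macro pattern mentioned above, here is a minimal self-contained sketch of a single flag list that generates bit numbers, flag values, and a name table for a to_text()-style function. Every identifier in it (EXAMPLE_FLAGS, example_flags_to_text(), the flag names) is hypothetical and chosen for illustration; none is the actual bcachefs definition.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag list: one line per flag, expanded several ways below. */
#define EXAMPLE_FLAGS()		\
	x(intent)		\
	x(prefetch)		\
	x(cached)

/* First expansion: bit numbers. */
enum example_flag_bits {
#define x(n)	EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
	EXAMPLE_FLAG_NR
};

/* Second expansion: flag values, one distinct bit each. */
enum example_flags {
#define x(n)	EXAMPLE_FLAG_##n = 1U << EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
};

/* Third expansion: names, indexed by bit number. */
static const char * const example_flag_strs[] = {
#define x(n)	#n,
	EXAMPLE_FLAGS()
#undef x
};

/* Write the names of the set flags into buf as a comma-separated list. */
static void example_flags_to_text(char *buf, size_t len, unsigned flags)
{
	size_t pos = 0;

	if (len)
		buf[0] = '\0';

	for (unsigned i = 0; i < EXAMPLE_FLAG_NR; i++) {
		if (!(flags & (1U << i)))
			continue;

		int n = snprintf(buf + pos, len - pos, "%s%s",
				 pos ? "," : "", example_flag_strs[i]);
		if (n < 0 || (size_t) n >= len - pos)
			break;	/* output truncated or failed: stop here */
		pos += (size_t) n;
	}
}
```

Adding a flag means adding one x() line; the enum values and the string table stay in sync automatically, which is the main appeal of the pattern.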
2024-05-08  bcachefs: prt_printf() now respects \r\n\t  [Kent Overstreet, 1 file, -2/+1]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06  bcachefs: Initialize bch_write_op->failed in inline data path  [Kent Overstreet, 1 file, -0/+2]
Normally this is initialized in __bch2_write(), which is executed in a loop, but the inline data path skips this. Reported-by: syzbot+fd3ccb331eb21f05d13b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-06  bcachefs: Inodes need extra padding for varint_decode_fast()  [Kent Overstreet, 1 file, -10/+18]
Reported-by: syzbot+66b9b74f6520068596a9@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13  bcachefs: time_stats: split stats-with-quantiles into a separate structure  [Darrick J. Wong, 1 file, -1/+1]
Currently, struct time_stats has the optional ability to quantize the information that it collects. This is /probably/ useful for callers who want to see quantized information, but it more than doubles the size of the structure from 224 bytes to 464. For users who don't care about that (e.g. upcoming xfs patches) and want to avoid wasting 240 bytes per counter, split the two into separate pieces. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
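The shape of such a split can be sketched as a small base structure plus a wrapper that embeds it and adds the optional state. The structures and functions below are hypothetical, their sizes do not match the 224 and 464 bytes quoted above, and a simple power-of-two histogram stands in for the real quantile state.

```c
#include <stdint.h>

#define EXAMPLE_NR_BUCKETS 32	/* hypothetical size of the optional state */

/* Base statistics: what every user pays for. */
struct example_time_stats {
	uint64_t count;
	uint64_t total_ns;
	uint64_t min_ns;
	uint64_t max_ns;
};

/* Opt-in wrapper: embeds the base and adds the larger per-bucket state. */
struct example_time_stats_quantiles {
	struct example_time_stats stats;
	uint64_t buckets[EXAMPLE_NR_BUCKETS];	/* stand-in for quantile state */
};

static void example_time_stats_update(struct example_time_stats *s, uint64_t ns)
{
	if (!s->count || ns < s->min_ns)
		s->min_ns = ns;
	if (ns > s->max_ns)
		s->max_ns = ns;
	s->total_ns += ns;
	s->count++;
}

static void example_time_stats_quantiles_update(struct example_time_stats_quantiles *q,
						 uint64_t ns)
{
	unsigned b = 0;

	/* Shared bookkeeping goes through the embedded base structure. */
	example_time_stats_update(&q->stats, ns);

	/* Bucket index is floor(log2(ns)), clamped to the last bucket. */
	while (b + 1 < EXAMPLE_NR_BUCKETS && (ns >> (b + 1)))
		b++;
	q->buckets[b]++;
}
```

Callers that only need counts and extremes declare the base structure and never pay for the bucket array; callers that want the distribution declare the wrapper and use the second update function.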