| author | Enzo Matsumiya <ematsumiya@suse.de> | 2025-03-26 17:48:27 -0300 |
|---|---|---|
| committer | Enzo Matsumiya <ematsumiya@suse.de> | 2025-03-26 17:48:27 -0300 |
| commit | 8d4c40e084f3d132434d5d3d068175c8db59ce65 (patch) | |
| tree | 34d7b026eaed6ab0d5958389afd31a2bf01813db /fs/lockd/netns.h | |
| parent | 843e64492a7ed11436cc5c9bbfba46835939071a (diff) | |
smb: client: fix corruption in cifs_extend_writeback
The cifs.ko writepages implementation tries to extend the write buffer size in order to issue
fewer, but bigger, write requests over the wire.
The function responsible for doing so, cifs_extend_writeback, however, did not account for some
important factors, and mishandling some of them led to data corruption on writes coming through
writepages.
Such corrupt writes are very subtle and show no errors whatsoever on dmesg -- they can only be
observed by comparing expected vs actual outputs. Easy reproducer:
done | dd ibs=4194304 iflag=fullblock count=10240000 of=remotefile
8999946
<corrupt lines shows here>
'wc -l' is not really reliable as we've seen files with corrupt lines, but no missing ones.
Of course, the corruption doesn't happen with cache=none mount option.
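Since nothing is logged, the check has to be a byte-level comparison of expected vs actual data.
A minimal userspace sketch of such a comparison follows; first_mismatch() is a hypothetical
helper written for illustration and is not part of the patch or of the reproducer above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the first byte where the two
 * buffers differ, or -1 if they are identical. If one buffer is a strict
 * prefix of the other, the length of the shorter one is returned.
 */
ptrdiff_t first_mismatch(const unsigned char *expected, size_t expected_len,
                         const unsigned char *actual, size_t actual_len)
{
    size_t common = expected_len < actual_len ? expected_len : actual_len;

    assert(expected != NULL && actual != NULL);

    for (size_t i = 0; i < common; i++) {
        if (expected[i] != actual[i])
            return (ptrdiff_t)i;
    }

    /* Same content up to the shorter length: differ only if lengths differ. */
    if (expected_len != actual_len)
        return (ptrdiff_t)common;

    return -1;
}
```

Reporting the offset rather than a line count matters here because, as noted above, a corrupt
file can still have the expected number of lines.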
Bug explanation:
- Pointer arguments are updated before bound checking (actual root cause)
@_len and @_count are updated with the current folio values before actually checking whether the
updated values fit within their boundaries. By the time the function returns to its caller (only
cifs_write_back_from_locked_folio(), which BTW doesn't do any further checks), those arguments
might have crossed bounds, and extra data (zeroes) is added as padding.
Later, with those offsets marked as 'done', the actual data that should've been written to
those offsets is skipped, making the final file corrupt.
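The update-before-check ordering can be illustrated with a small userspace model. This is only a
sketch under assumptions: struct toy_folio, extend_buggy() and extend_checked() are hypothetical
names, and the real cifs_extend_writeback() operates on folios and an xa_state, not on a plain
array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only its size in bytes matters here. */
struct toy_folio {
    size_t size;
};

/*
 * Buggy ordering: the caller-visible *len and *count are updated first and
 * the limit is checked afterwards, so *len can end up larger than max_len.
 */
void extend_buggy(const struct toy_folio *folios, size_t n, size_t max_len,
                  size_t *len, size_t *count)
{
    for (size_t i = 0; i < n; i++) {
        *len += folios[i].size;
        (*count)++;
        if (*len >= max_len)
            break;
    }
}

/*
 * Checked ordering: verify that the next folio still fits before touching
 * the caller-visible values, so *len never exceeds max_len.
 */
void extend_checked(const struct toy_folio *folios, size_t n, size_t max_len,
                    size_t *len, size_t *count)
{
    assert(*len <= max_len); /* precondition, keeps the subtraction safe */

    for (size_t i = 0; i < n; i++) {
        if (folios[i].size > max_len - *len)
            break;
        *len += folios[i].size;
        (*count)++;
    }
}
```

With three 4096-byte folios, a 10000-byte limit and 4096 bytes already batched, the buggy
variant leaves *len at 12288, while the checked variant stops at 8192.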
- Sync calls with ongoing writeback aren't sync
Folios are tested for ongoing writeback (folio_test_writeback), but data-integrity sync syscalls
(e.g. fsync() or msync()) are not handled specially. When called from those and the folio *is*
under writeback, we MUST wait for the writeback to complete, because those calls must guarantee
the write went through.
By simply bailing out of the function, the implementation relies on timing/luck that no further
errors happen later, and that the writeback indeed finished before returning.
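The decision described above can be sketched as a small userspace model. All names below
(toy_sync_mode, toy_action, toy_folio_state, toy_decide) are hypothetical; in the kernel the
corresponding pieces would be wbc->sync_mode == WB_SYNC_ALL and folio_wait_writeback(), but
whether the patch uses exactly that shape is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the writeback mode requested by the caller. */
enum toy_sync_mode {
    TOY_SYNC_NONE, /* background writeback, no integrity guarantee needed */
    TOY_SYNC_ALL   /* data-integrity sync, e.g. fsync() or msync() */
};

/* What the batching code should do with a candidate folio. */
enum toy_action {
    TOY_ADD,     /* take the folio into the current batch */
    TOY_WAIT,    /* wait for the in-flight writeback, then re-check the folio */
    TOY_LEAVE_OUT /* do not take this folio into the batch */
};

/* Hypothetical stand-in for the folio flags that matter here. */
struct toy_folio_state {
    bool dirty;
    bool under_writeback;
};

enum toy_action toy_decide(const struct toy_folio_state *folio,
                           enum toy_sync_mode mode)
{
    assert(folio != NULL);

    if (folio->under_writeback) {
        /* Integrity syncs must not return before the write went through. */
        if (mode == TOY_SYNC_ALL)
            return TOY_WAIT;
        return TOY_LEAVE_OUT;
    }

    /* Nothing to write for a clean folio. */
    if (!folio->dirty)
        return TOY_LEAVE_OUT;

    return TOY_ADD;
}
```

The point of the model is only the first branch: under a data-integrity sync, a folio already
under writeback leads to a wait rather than an early exit.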
- Any failed checks to the folios in @xas would call xas_reset
This means that whenever any folios were added to the batch and processed, they are processed
again in further write calls because @xas was reset, making upper layers do double work on them.
This patch fixes the cases above, and also relaxes the 'hard stop' conditions for cases where only
a single folio is affected but others in @xas can still be processed (more of a performance
improvement).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
