linux.git - Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

author	Enzo Matsumiya <ematsumiya@suse.de>	2025-03-26 17:48:27 -0300
committer	Enzo Matsumiya <ematsumiya@suse.de>	2025-03-26 17:48:27 -0300
commit	8d4c40e084f3d132434d5d3d068175c8db59ce65 (patch)
tree	34d7b026eaed6ab0d5958389afd31a2bf01813db /fs/lockd/netns.h
parent	843e64492a7ed11436cc5c9bbfba46835939071a (diff)
download	linux-data_corruption_v6.x.tar.gz linux-data_corruption_v6.x.tar.bz2 linux-data_corruption_v6.x.zip

smb: client: fix corruption in cifs_extend_writeback

The cifs.ko writepages implementation tries to extend the write buffer
size in order to issue fewer, but bigger, write requests over the wire.

The function responsible for doing so, cifs_extend_writeback, however,
did not account for some important factors, and not handling some of
those factors correctly led to data corruption on writes coming
through writepages.

Such corrupt writes are very subtle and show no errors whatsoever in
dmesg -- they can only be observed by comparing expected vs actual
outputs.

Easy reproducer:
  done | dd ibs=4194304 iflag=fullblock count=10240000 of=remotefile
  8999946 <corrupt lines shows here>

'wc -l' is not really reliable as we've seen files with corrupt lines,
but no missing ones.

Of course, the corruption doesn't happen with the cache=none mount
option.

Bug explanation:

- Pointer arguments are updated before bound checking (actual root
  cause)
  @_len and @_count are updated with the current folio values before
  actually checking if the current values fit in their boundaries, so
  by the time the function returns to its caller (only
  cifs_write_back_from_locked_folio(), which BTW doesn't do any further
  checks), those arguments might have crossed bounds and extra data
  (zeroes) is added as padding.
  Later, with those offsets marked as 'done', the actual data that
  should've been written into those offsets is skipped, making the
  final file corrupt.

- Sync calls with ongoing writeback aren't sync
  Folios are tested for ongoing writeback (folio_test_writeback), but
  not handled directly for data-integrity sync syscalls (e.g. fsync()
  or msync()).
  When being called from those, and the folio *is* under writeback, we
  MUST wait for the writeback to complete because those calls must
  guarantee the write went through.
  By simply bailing out of the function, the implementation relies on
  the timing/luck that no further errors happen later, and that the
  writeback indeed finished before returning.

- Any failed checks to the folios in @xas would call xas_reset
  This means that whenever some/any folios were added to the batch and
  processed, they are processed again in further write calls because
  @xas was reset, making upper layers do double work on them.

This patch fixes the cases above, and also relaxes the 'hard stop'
conditions for cases where only a single folio is affected, but others
in @xas can still be processed (more of a performance improvement).

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
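
For illustration only, the root cause described in the first item (out
parameters updated before they are bounds checked) can be sketched in
plain userspace C. This is not the kernel code and not the patch
itself: struct fake_folio, extend_batch() and max_len are hypothetical
names made up for this sketch, and a real folio carries far more state.
The point is only the ordering: compute the candidate values, check
them, and write through the pointers only once they are known to fit.

```c
#include <stddef.h>

/* Hypothetical stand-in for a folio: only its size in bytes matters here. */
struct fake_folio {
	size_t size;
};

/*
 * Hypothetical helper: extend a batch with as many folios as fit.
 *
 * @len:     bytes accumulated so far (in/out)
 * @count:   number of folios the caller still allows (in/out)
 * @max_len: upper bound for *len
 *
 * Returns the number of folios added to the batch.
 */
static size_t extend_batch(const struct fake_folio *folios, size_t nr,
			   size_t max_len, size_t *len, long *count)
{
	size_t added = 0;

	for (size_t i = 0; i < nr; i++) {
		size_t new_len = *len + folios[i].size;

		/* Check the candidate values first... */
		if (new_len > max_len || *count <= 0)
			break;

		/* ...and commit them only once they are known to fit. */
		*len = new_len;
		(*count)--;
		added++;
	}

	return added;
}
```

With three 4096-byte folios, *len starting at 4096, *count at 8 and
max_len at 10000, only the first folio is added: the second one would
push the length to 12288, so the loop stops and the caller sees
*len == 8192 and *count == 7, both still within bounds.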

Diffstat (limited to 'fs/lockd/netns.h')

0 files changed, 0 insertions, 0 deletions

