summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-05-19f2fs: fix compat F2FS_IOC_{MOVE,GARBAGE_COLLECT}_RANGEChao Yu1-33/+104
[ Upstream commit 34178b1bc4b5c936eab3adb4835578093095a571 ] Eric reported a ioctl bug in below link: https://lore.kernel.org/linux-f2fs-devel/20201103032234.GB2875@sol.localdomain/ That said, on some 32-bit architectures, u64 has only 32-bit alignment, notably i386 and x86_32, so that size of struct f2fs_gc_range compiled in x86_32 is 20 bytes, however the size in x86_64 is 24 bytes, binary compiled in x86_32 can not call F2FS_IOC_GARBAGE_COLLECT_RANGE successfully due to mismatched value of ioctl command in between binary and f2fs module, similarly, F2FS_IOC_MOVE_RANGE will fail too. In this patch we introduce two ioctls for compatibility of above special 32-bit binary: - F2FS_IOC32_GARBAGE_COLLECT_RANGE - F2FS_IOC32_MOVE_RANGE Reported-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19f2fs: move ioctl interface definitions to separated fileChao Yu2-79/+1
[ Upstream commit fa4320cefb8537a70cc28c55d311a1f569697cd3 ] Like other filesystem does, we introduce a new file f2fs.h in path of include/uapi/linux/, and move f2fs-specified ioctl interface definitions to that file, after then, in order to use those definitions, userspace developer only need to include the new header file rather than copy & paste definitions from fs/f2fs/f2fs.h. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19cuse: prevent cloneMiklos Szeredi1-0/+2
[ Upstream commit 8217673d07256b22881127bf50dce874d0e51653 ] For cloned connections cuse_channel_release() will be called more than once, resulting in use after free. Prevent device cloning for CUSE, which does not make sense at this point, and highly unlikely to be used in real life. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19virtiofs: fix usernsMiklos Szeredi1-2/+1
[ Upstream commit 0a7419c68a45d2d066b996be5087aa2d07ce80eb ] get_user_ns() is done twice (once in virtio_fs_get_tree() and once in fuse_conn_init()), resulting in a reference leak. Also looks better to use fsc->user_ns (which *should* be the current_user_ns() at this point). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fuse: invalidate attrs when page writeback completesVivek Goyal1-0/+9
[ Upstream commit 3466958beb31a8e9d3a1441a34228ed088b84f3e ] In fuse when a direct/write-through write happens we invalidate attrs because that might have updated mtime/ctime on server and cached mtime/ctime will be stale. What about page writeback path. Looks like we don't invalidate attrs there. To be consistent, invalidate attrs in writeback path as well. Only exception is when writeback_cache is enabled. In that case we strust local mtime/ctime and there is no need to invalidate attrs. Recently users started experiencing failure of xfstests generic/080, geneirc/215 and generic/614 on virtiofs. This happened only newer "stat" utility and not older one. This patch fixes the issue. So what's the root cause of the issue. Here is detailed explanation. generic/080 test does mmap write to a file, closes the file and then checks if mtime has been updated or not. When file is closed, it leads to flushing of dirty pages (and that should update mtime/ctime on server). But we did not explicitly invalidate attrs after writeback finished. Still generic/080 passed so far and reason being that we invalidated atime in fuse_readpages_end(). This is called in fuse_readahead() path and always seems to trigger before mmaped write. So after mmaped write when lstat() is called, it sees that atleast one of the fields being asked for is invalid (atime) and that results in generating GETATTR to server and mtime/ctime also get updated and test passes. But newer /usr/bin/stat seems to have moved to using statx() syscall now (instead of using lstat()). And statx() allows it to query only ctime or mtime (and not rest of the basic stat fields). That means when querying for mtime, fuse_update_get_attr() sees that mtime is not invalid (only atime is invalid). So it does not generate a new GETATTR and fill stat with cached mtime/ctime. And that means updated mtime is not seen by xfstest and tests start failing. Invalidating attrs after writeback completion should solve this problem in a generic manner. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fs: dlm: flush swork on shutdownAlexander Aring1-4/+1
[ Upstream commit eec054b5a7cfe6d1f1598a323b05771ee99857b5 ] This patch fixes the flushing of send work before shutdown. The function cancel_work_sync() is not the right workqueue functionality to use here as it would cancel the work if the work queues itself. In cases of EAGAIN in send() for dlm message we need to be sure that everything is send out before. The function flush_work() will ensure that every send work is be done inclusive in EAGAIN cases. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fs: dlm: check on minimum msglen sizeAlexander Aring1-3/+4
[ Upstream commit 710176e8363f269c6ecd73d203973b31ace119d3 ] This patch adds an additional check for minimum dlm header size which is an invalid dlm message and signals a broken stream. A msglen field cannot be less than the dlm header size because the field is inclusive header lengths. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fs: dlm: add errno handling to check callbackAlexander Aring1-7/+16
[ Upstream commit 8aa9540b49e0833feba75dbf4f45babadd0ed215 ] This allows to return individual errno values for the config attribute check callback instead of returning invalid argument only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fs: dlm: fix debugfs dumpAlexander Aring1-0/+1
[ Upstream commit 92c48950b43f4a767388cf87709d8687151a641f ] This patch fixes the following message which randomly pops up during glocktop call: seq_file: buggy .next function table_seq_next did not update position index The issue is that seq_read_iter() in fs/seq_file.c also needs an increment of the index in an non next record case as well which this patch fixes otherwise seq_read_iter() will print out the above message. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14afs: Fix speculative status fetchesDavid Howells6-2/+23
[ Upstream commit 22650f148126571be1098d34160eb4931fc77241 ] The generic/464 xfstest causes kAFS to emit occasional warnings of the form: kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015) This indicates that the data version received back from the server did not match the expected value (the DV should be incremented monotonically for each individual modification op committed to a vnode). What is happening is that a lookup call is doing a bulk status fetch speculatively on a bunch of vnodes in a directory besides getting the status of the vnode it's actually interested in. This is racing with a StoreData operation (though it could also occur with, say, a MakeDir op). On the client, a modification operation locks the vnode, but the bulk status fetch only locks the parent directory, so no ordering is imposed there (thereby avoiding an avenue to deadlock). On the server, the StoreData op handler doesn't lock the vnode until it's received all the request data, and downgrades the lock after committing the data until it has finished sending change notifications to other clients - which allows the status fetch to occur before it has finished. This means that: - a status fetch can access the target vnode either side of the exclusive section of the modification - the status fetch could start before the modification, yet finish after, and vice-versa. - the status fetch and the modification RPCs can complete in either order. - the status fetch can return either the before or the after DV from the modification. - the status fetch might regress the locally cached DV. Some of these are handled by the previous fix[1], but that's not sufficient because it checks the DV it received against the DV it cached at the start of the op, but the DV might've been updated in the meantime by a locally generated modification op. Fix this by the following means: (1) Keep track of when we're performing a modification operation on a vnode. This is done by marking vnode parameters with a 'modification' note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode for the duration. (2) Alter the speculation race detection to ignore speculative status fetches if either the vnode is marked as being modified or the data version number is not what we expected. Note that whilst the "vnode modified" warning does get recovered from as it causes the client to refetch the status at the next opportunity, it will also invalidate the pagecache, so changes might get lost. Fixes: a9e5c87ca744 ("afs: Fix speculative status fetch going out of order wrt to modifications") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14ovl: invalidate readdir cache on changes to dir with originAmir Goldstein3-37/+36
[ Upstream commit 65cd913ec9d9d71529665924c81015b7ab7d9381 ] The test in ovl_dentry_version_inc() was out-dated and did not include the case where readdir cache is used on a non-merge dir that has origin xattr, indicating that it may contain leftover whiteouts. To make the code more robust, use the same helper ovl_dir_is_real() to determine if readdir cache should be used and if readdir cache should be invalidated. Fixes: b79e05aaa166 ("ovl: no direct iteration for dir with origin xattr") Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxht70nODhNHNwGFMSqDyOKLXOKrY0H6g849os4BQ7cokA@mail.gmail.com/ Cc: Chris Murphy <lists@colorremedies.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14xfs: fix return of uninitialized value in variable errorColin Ian King1-0/+1
[ Upstream commit 3b6dd9a9aeeada19d0c820ff68e979243a888bb6 ] A previous commit removed a call to xfs_attr3_leaf_read that assigned an error return code to variable error. We now have a few early error return paths to label 'out' that return error if error is set; however error now is uninitialized so potentially garbage is being returned. Fix this by setting error to zero to restore the original behaviour where error was zero at the label 'restart'. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14io_uring: fix overflows checks in provide buffersPavel Begunkov1-2/+8
[ Upstream commit 38134ada0ceea3e848fe993263c0ff6207fd46e7 ] Colin reported before possible overflow and sign extension problems in io_provide_buffers_prep(). As Linus pointed out previous attempt did nothing useful, see d81269fecb8ce ("io_uring: fix provide_buffers sign extension"). Do that with help of check_<op>_overflow helpers. And fix struct io_provide_buf::len type, as it doesn't make much sense to keep it signed. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: efe68c1ca8f49 ("io_uring: validate the full range of provided buffers for access") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/46538827e70fce5f6cdb50897cff4cacc490f380.1618488258.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14seccomp: Fix CONFIG tests for Seccomp_filtersKenta.Tada@sony.com1-0/+2
[ Upstream commit 64bdc0244054f7d4bb621c8b4455e292f4e421bc ] Strictly speaking, seccomp filters are only used when CONFIG_SECCOMP_FILTER. This patch fixes the condition to enable "Seccomp_filters" in /proc/$pid/status. Signed-off-by: Kenta Tada <Kenta.Tada@sony.com> Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/OSBPR01MB26772D245E2CF4F26B76A989F5669@OSBPR01MB2677.jpnprd01.prod.outlook.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14afs: Fix updating of i_mode due to 3rd party changeDavid Howells1-3/+3
[ Upstream commit 6e1eb04a87f954eb06a89ee6034c166351dfff6e ] Fix afs_apply_status() to mask off the irrelevant bits from status->mode when OR'ing them into i_mode. This can happen when a 3rd party chmod occurs. Also fix afs_inode_init_from_status() to mask off the mode bits when initialising i_mode. Fixes: 260a980317da ("[AFS]: Add "directory write" support.") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14NFSv4.2: fix copy stateid copying for the async copyOlga Kornievskaia1-2/+2
[ Upstream commit e739b12042b6b079a397a3c234f96c09d1de0b40 ] This patch fixes Dan Carpenter's report that the static checker found a problem where memcpy() was copying into too small of a buffer. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: e0639dc5805a ("NFSD introduce async copy feature") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14NFSD: Fix sparse warning in nfs4proc.cChuck Lever1-6/+2
[ Upstream commit eb162e1772f85231dabc789fb4bfea63d2d9df79 ] linux/fs/nfsd/nfs4proc.c:1542:24: warning: incorrect type in assignment (different base types) linux/fs/nfsd/nfs4proc.c:1542:24: expected restricted __be32 [assigned] [usertype] status linux/fs/nfsd/nfs4proc.c:1542:24: got int Clean-up: The dup_copy_fields() function returns only zero, so make it return void for now, and get rid of the return code check. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14ovl: fix missing revert_creds() on error pathDan Carpenter1-1/+2
commit 7b279bbfd2b230c7a210ff8f405799c7e46bbf48 upstream. Smatch complains about missing that the ovl_override_creds() doesn't have a matching revert_creds() if the dentry is disconnected. Fix this by moving the ovl_override_creds() until after the disconnected check. Fixes: aa3ff3c152ff ("ovl: copy up of disconnected dentries") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffersThadeu Lima de Souza Cascardo1-2/+2
commit d1f82808877bb10d3deee7cf3374a4eb3fb582db upstream. Read and write operations are capped to MAX_RW_COUNT. Some read ops rely on that limit, and that is not guaranteed by the IORING_OP_PROVIDE_BUFFERS. Truncate those lengths when doing io_add_buffers, so buffer addresses still use the uncapped length. Also, take the chance and change struct io_buffer len member to __u32, so it matches struct io_provide_buffer len member. This fixes CVE-2021-3491, also reported as ZDI-CAN-13546. Fixes: ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS") Reported-by: Billy Jheng Bing-Jhong (@st424204) Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: Fix occasional generic/418 failureJan Kara1-4/+21
commit 5899593f51e63dde2f07c67358bd65a641585abb upstream. Eric has noticed that after pagecache read rework, generic/418 is occasionally failing for ext4 when blocksize < pagesize. In fact, the pagecache rework just made hard to hit race in ext4 more likely. The problem is that since ext4 conversion of direct IO writes to iomap framework (commit 378f32bab371), we update inode size after direct IO write only after invalidating page cache. Thus if buffered read sneaks at unfortunate moment like: CPU1 - write at offset 1k CPU2 - read from offset 0 iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT); ext4_readpage(); ext4_handle_inode_extension() the read will zero out tail of the page as it still sees smaller inode size and thus page cache becomes inconsistent with on-disk contents with all the consequences. Fix the problem by moving inode size update into end_io handler which gets called before the page cache is invalidated. Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com> Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: allow the dax flag to be set and cleared on inline directoriesTheodore Ts'o2-1/+8
commit 4811d9929cdae4238baf5b2522247bd2f9fa7b50 upstream. This is needed to allow generic/607 to pass for file systems with the inline data_feature enabled, and it allows the use of file systems where the directories use inline_data, while the files are accessed via DAX. Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: fix error return code in ext4_fc_perform_commit()Xu Yihang1-1/+3
commit e1262cd2e68a0870fb9fc95eb202d22e8f0074b7 upstream. In case of if not ext4_fc_add_tlv branch, an error return code is missing. Cc: stable@kernel.org Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xu Yihang <xuyihang@huawei.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20210408070033.123047-1-xuyihang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: fix ext4_error_err save negative errno into superblockYe Bin1-1/+1
commit 6810fad956df9e5467e8e8a5ac66fda0836c71fa upstream. Fix As write_mmp_block() so that it returns -EIO instead of 1, so that the correct error gets saved into the superblock. Cc: stable@kernel.org Fixes: 54d3adbc29f0 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Liu Zhi Qiang <liuzhiqiang26@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20210406025331.148343-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: fix error code in ext4_commit_superFengnan Chang1-2/+4
commit f88f1466e2a2e5ca17dfada436d3efa1b03a3972 upstream. We should set the error code when ext4_commit_super check argument failed. Found in code review. Fixes: c4be0c1dc4cdc ("filesystem freeze: add error handling of write_super_lockfs/unlockfs"). Cc: stable@kernel.org Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20210402101631.561-1-changfengnan@vivo.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()Zhang Yi1-3/+0
commit 72ffb49a7b623c92a37657eda7cc46a06d3e8398 upstream. When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due to some error happens behind ext4_orphan_cleanup(), it will end up triggering a after free issue of super_block. The problem is that ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is enabled, after we cleanup the truncated inodes, the last iput() will put them into the lru list, and these inodes' pages may probably dirty and will be write back by the writeback thread, so it could be raced by freeing super_block in the error path of mount_bdev(). After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it was used to ensure updating the quota file properly, but evict inode and trash data immediately in the last iput does not affect the quotafile, so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by just remove the SB_ACTIVE setting. [1] https://lore.kernel.org/linux-ext4/99cce8ca-e4a0-7301-840f-2ace67c551f3@huawei.com/T/#m04990cfbc4f44592421736b504afcc346b2a7c00 Cc: stable@kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Tested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210331033138.918975-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: fix check to prevent false positive report of incorrect used inodesZhang Yi1-16/+32
commit a149d2a5cabbf6507a7832a1c4fd2593c55fd450 upstream. Commit <50122847007> ("ext4: fix check to prevent initializing reserved inodes") check the block group zero and prevent initializing reserved inodes. But in some special cases, the reserved inode may not all belong to the group zero, it may exist into the second group if we format filesystem below. mkfs.ext4 -b 4096 -g 8192 -N 1024 -I 4096 /dev/sda So, it will end up triggering a false positive report of a corrupted file system. This patch fix it by avoid check reserved inodes if no free inode blocks will be zeroed. Cc: stable@kernel.org Fixes: 50122847007 ("ext4: fix check to prevent initializing reserved inodes") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Suggested-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210331121516.2243099-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: annotate data race in jbd2_journal_dirty_metadata()Jan Kara1-4/+4
commit 83fe6b18b8d04c6c849379005e1679bac9752466 upstream. Assertion checks in jbd2_journal_dirty_metadata() are known to be racy but we don't want to be grabbing locks just for them. We thus recheck them under b_state_lock only if it looks like they would fail. Annotate the checks with data_race(). Cc: stable@kernel.org Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210406161804.20150-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11ext4: annotate data race in start_this_handle()Jan Kara1-1/+6
commit 3b1833e92baba135923af4a07e73fe6e54be5a2f upstream. Access to journal->j_running_transaction is not protected by appropriate lock and thus is racy. We are well aware of that and the code handles the race properly. Just add a comment and data_race() annotation. Cc: stable@kernel.org Reported-by: syzbot+30774a6acf6a2cf6d535@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210406161804.20150-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11smb3: do not attempt multichannel to server which does not support itSteve French1-0/+6
commit 9c2dc11df50d1c8537075ff6b98472198e24438e upstream. We were ignoring CAP_MULTI_CHANNEL in the server response - if the server doesn't support multichannel we should not be attempting it. See MS-SMB2 section 3.2.5.2 Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-By: Tom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11smb3: when mounting with multichannel include it in requested capabilitiesSteve French1-0/+5
commit 679971e7213174efb56abc8fab1299d0a88db0e8 upstream. In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to advertise CAP_MULTICHANNEL capability when establishing multiple channels has been requested by the user doing the mount. See MS-SMB2 sections 2.2.3 and 3.2.5.2 Without setting it there is some risk that multichannel could fail if the server interpreted the field strictly. Reviewed-By: Tom Talpey <tom@talpey.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11jffs2: check the validity of dstlen in jffs2_zlib_compress()Yang Yang1-0/+3
commit 90ada91f4610c5ef11bc52576516d96c496fc3f1 upstream. KASAN reports a BUG when download file in jffs2 filesystem.It is because when dstlen == 1, cpage_out will write array out of bounds. Actually, data will not be compressed in jffs2_zlib_compress() if data's length less than 4. [ 393.799778] BUG: KASAN: slab-out-of-bounds in jffs2_rtime_compress+0x214/0x2f0 at addr ffff800062e3b281 [ 393.809166] Write of size 1 by task tftp/2918 [ 393.813526] CPU: 3 PID: 2918 Comm: tftp Tainted: G B 4.9.115-rt93-EMBSYS-CGEL-6.1.R6-dirty #1 [ 393.823173] Hardware name: LS1043A RDB Board (DT) [ 393.827870] Call trace: [ 393.830322] [<ffff20000808c700>] dump_backtrace+0x0/0x2f0 [ 393.835721] [<ffff20000808ca04>] show_stack+0x14/0x20 [ 393.840774] [<ffff2000086ef700>] dump_stack+0x90/0xb0 [ 393.845829] [<ffff20000827b19c>] kasan_object_err+0x24/0x80 [ 393.851402] [<ffff20000827b404>] kasan_report_error+0x1b4/0x4d8 [ 393.857323] [<ffff20000827bae8>] kasan_report+0x38/0x40 [ 393.862548] [<ffff200008279d44>] __asan_store1+0x4c/0x58 [ 393.867859] [<ffff2000084ce2ec>] jffs2_rtime_compress+0x214/0x2f0 [ 393.873955] [<ffff2000084bb3b0>] jffs2_selected_compress+0x178/0x2a0 [ 393.880308] [<ffff2000084bb530>] jffs2_compress+0x58/0x478 [ 393.885796] [<ffff2000084c5b34>] jffs2_write_inode_range+0x13c/0x450 [ 393.892150] [<ffff2000084be0b8>] jffs2_write_end+0x2a8/0x4a0 [ 393.897811] [<ffff2000081f3008>] generic_perform_write+0x1c0/0x280 [ 393.903990] [<ffff2000081f5074>] __generic_file_write_iter+0x1c4/0x228 [ 393.910517] [<ffff2000081f5210>] generic_file_write_iter+0x138/0x288 [ 393.916870] [<ffff20000829ec1c>] __vfs_write+0x1b4/0x238 [ 393.922181] [<ffff20000829ff00>] vfs_write+0xd0/0x238 [ 393.927232] [<ffff2000082a1ba8>] SyS_write+0xa0/0x110 [ 393.932283] [<ffff20000808429c>] __sys_trace_return+0x0/0x4 [ 393.937851] Object at ffff800062e3b280, in cache kmalloc-64 size: 64 [ 393.944197] Allocated: [ 393.946552] PID = 2918 [ 393.948913] save_stack_trace_tsk+0x0/0x220 [ 393.953096] save_stack_trace+0x18/0x20 [ 393.956932] kasan_kmalloc+0xd8/0x188 [ 393.960594] __kmalloc+0x144/0x238 [ 393.963994] jffs2_selected_compress+0x48/0x2a0 [ 393.968524] jffs2_compress+0x58/0x478 [ 393.972273] jffs2_write_inode_range+0x13c/0x450 [ 393.976889] jffs2_write_end+0x2a8/0x4a0 [ 393.980810] generic_perform_write+0x1c0/0x280 [ 393.985251] __generic_file_write_iter+0x1c4/0x228 [ 393.990040] generic_file_write_iter+0x138/0x288 [ 393.994655] __vfs_write+0x1b4/0x238 [ 393.998228] vfs_write+0xd0/0x238 [ 394.001543] SyS_write+0xa0/0x110 [ 394.004856] __sys_trace_return+0x0/0x4 [ 394.008684] Freed: [ 394.010691] PID = 2918 [ 394.013051] save_stack_trace_tsk+0x0/0x220 [ 394.017233] save_stack_trace+0x18/0x20 [ 394.021069] kasan_slab_free+0x88/0x188 [ 394.024902] kfree+0x6c/0x1d8 [ 394.027868] jffs2_sum_write_sumnode+0x2c4/0x880 [ 394.032486] jffs2_do_reserve_space+0x198/0x598 [ 394.037016] jffs2_reserve_space+0x3f8/0x4d8 [ 394.041286] jffs2_write_inode_range+0xf0/0x450 [ 394.045816] jffs2_write_end+0x2a8/0x4a0 [ 394.049737] generic_perform_write+0x1c0/0x280 [ 394.054179] __generic_file_write_iter+0x1c4/0x228 [ 394.058968] generic_file_write_iter+0x138/0x288 [ 394.063583] __vfs_write+0x1b4/0x238 [ 394.067157] vfs_write+0xd0/0x238 [ 394.070470] SyS_write+0xa0/0x110 [ 394.073783] __sys_trace_return+0x0/0x4 [ 394.077612] Memory state around the buggy address: [ 394.082404] ffff800062e3b180: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc [ 394.089623] ffff800062e3b200: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc [ 394.096842] >ffff800062e3b280: 01 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 394.104056] ^ [ 394.107283] ffff800062e3b300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 394.114502] ffff800062e3b380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 394.121718] ================================================================== Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Cc: Joel Stanley <joel@jms.id.au> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11exfat: fix erroneous discard when clear cluster bitHyeongseok Kim1-10/+1
commit 77edfc6e51055b61cae2f54c8e6c3bb7c762e4fe upstream. If mounted with discard option, exFAT issues discard command when clear cluster bit to remove file. But the input parameter of cluster-to-sector calculation is abnormally added by reserved cluster size which is 2, leading to discard unrelated sectors included in target+2 cluster. With fixing this, remove the wrong comments in set/clear/find bitmap functions. Fixes: 1e49a94cf707 ("exfat: add bitmap operations") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com> Acked-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11fuse: fix write deadlockVivek Goyal2-12/+30
commit 4f06dd92b5d0a6f8eec6a34b8d6ef3e1f4ac1e10 upstream. There are two modes for write(2) and friends in fuse: a) write through (update page cache, send sync WRITE request to userspace) b) buffered write (update page cache, async writeout later) The write through method kept all the page cache pages locked that were used for the request. Keeping more than one page locked is deadlock prone and Qian Cai demonstrated this with trinity fuzzing. The reason for keeping the pages locked is that concurrent mapped reads shouldn't try to pull possibly stale data into the page cache. For full page writes, the easy way to fix this is to make the cached page be the authoritative source by marking the page PG_uptodate immediately. After this the page can be safely unlocked, since mapped/cached reads will take the written data from the cache. Concurrent mapped writes will now cause data in the original WRITE request to be updated; this however doesn't cause any data inconsistency and this scenario should be exceedingly rare anyway. If the WRITE request returns with an error in the above case, currently the page is not marked uptodate; this means that a concurrent read will always read consistent data. After this patch the page is uptodate between writing to the cache and receiving the error: there's window where a cached read will read the wrong data. While theoretically this could be a regression, it is unlikely to be one in practice, since this is normal for buffered writes. In case of a partial page write to an already uptodate page the locking is also unnecessary, with the above caveats. Partial write of a not uptodate page still needs to be handled. One way would be to read the complete page before doing the write. This is not possible, since it might break filesystems that don't expect any READ requests when the file was opened O_WRONLY. The other solution is to serialize the synchronous write with reads from the partial pages. The easiest way to do this is to keep the partial pages locked. The problem is that a write() may involve two such pages (one head and one tail). This patch fixes it by only locking the partial tail page. If there's a partial head page as well, then split that off as a separate WRITE request. Reported-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/linux-fsdevel/4794a3fa3742a5e84fb0f934944204b55730829b.camel@lca.pw/ Fixes: ea9b9907b82a ("fuse: implement perform_write") Cc: <stable@vger.kernel.org> # v2.6.26 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11jffs2: Hook up splice_write callbackJoel Stanley1-0/+1
commit 42984af09afc414d540fcc8247f42894b0378a91 upstream. overlayfs using jffs2 as the upper filesystem would fail in some cases since moving to v5.10. The test case used was to run 'touch' on a file that exists in the lower fs, causing the modification time to be updated. It returns EINVAL when the bug is triggered. A bisection showed this was introduced in v5.9-rc1, with commit 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops"). Reverting that commit restores the expected behaviour. Some digging showed that this was due to jffs2 lacking an implementation of splice_write. (For unknown reasons the warn_unsupported that should trigger was not displaying any output). Adding this patch resolved the issue and the test now passes. Cc: stable@vger.kernel.org Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Lei YU <yulei.sh@bytedance.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11jffs2: Fix kasan slab-out-of-bounds problemlizhe1-1/+1
commit 960b9a8a7676b9054d8b46a2c7db52a0c8766b56 upstream. KASAN report a slab-out-of-bounds problem. The logs are listed below. It is because in function jffs2_scan_dirent_node, we alloc "checkedlen+1" bytes for fd->name and we check crc with length rd->nsize. If checkedlen is less than rd->nsize, it will cause the slab-out-of-bounds problem. jffs2: Dirent at *** has zeroes in name. Truncating to %d char ================================================================== BUG: KASAN: slab-out-of-bounds in crc32_le+0x1ce/0x260 at addr ffff8800842cf2d1 Read of size 1 by task test_JFFS2/915 ============================================================================= BUG kmalloc-64 (Tainted: G B O ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in jffs2_alloc_full_dirent+0x2a/0x40 age=0 cpu=1 pid=915 ___slab_alloc+0x580/0x5f0 __slab_alloc.isra.24+0x4e/0x64 __kmalloc+0x170/0x300 jffs2_alloc_full_dirent+0x2a/0x40 jffs2_scan_eraseblock+0x1ca4/0x3b64 jffs2_scan_medium+0x285/0xfe0 jffs2_do_mount_fs+0x5fb/0x1bbc jffs2_do_fill_super+0x245/0x6f0 jffs2_fill_super+0x287/0x2e0 mount_mtd_aux.isra.0+0x9a/0x144 mount_mtd+0x222/0x2f0 jffs2_mount+0x41/0x60 mount_fs+0x63/0x230 vfs_kern_mount.part.6+0x6c/0x1f4 do_mount+0xae8/0x1940 SyS_mount+0x105/0x1d0 INFO: Freed in jffs2_free_full_dirent+0x22/0x40 age=27 cpu=1 pid=915 __slab_free+0x372/0x4e4 kfree+0x1d4/0x20c jffs2_free_full_dirent+0x22/0x40 jffs2_build_remove_unlinked_inode+0x17a/0x1e4 jffs2_do_mount_fs+0x1646/0x1bbc jffs2_do_fill_super+0x245/0x6f0 jffs2_fill_super+0x287/0x2e0 mount_mtd_aux.isra.0+0x9a/0x144 mount_mtd+0x222/0x2f0 jffs2_mount+0x41/0x60 mount_fs+0x63/0x230 vfs_kern_mount.part.6+0x6c/0x1f4 do_mount+0xae8/0x1940 SyS_mount+0x105/0x1d0 entry_SYSCALL_64_fastpath+0x1e/0x97 Call Trace: [<ffffffff815befef>] dump_stack+0x59/0x7e [<ffffffff812d1d65>] print_trailer+0x125/0x1b0 [<ffffffff812d82c8>] object_err+0x34/0x40 [<ffffffff812dadef>] kasan_report.part.1+0x21f/0x534 [<ffffffff81132401>] ? vprintk+0x2d/0x40 [<ffffffff815f1ee2>] ? crc32_le+0x1ce/0x260 [<ffffffff812db41a>] kasan_report+0x26/0x30 [<ffffffff812d9fc1>] __asan_load1+0x3d/0x50 [<ffffffff815f1ee2>] crc32_le+0x1ce/0x260 [<ffffffff814764ae>] ? jffs2_alloc_full_dirent+0x2a/0x40 [<ffffffff81485cec>] jffs2_scan_eraseblock+0x1d0c/0x3b64 [<ffffffff81488813>] ? jffs2_scan_medium+0xccf/0xfe0 [<ffffffff81483fe0>] ? jffs2_scan_make_ino_cache+0x14c/0x14c [<ffffffff812da3e9>] ? kasan_unpoison_shadow+0x35/0x50 [<ffffffff812da3e9>] ? kasan_unpoison_shadow+0x35/0x50 [<ffffffff812da462>] ? kasan_kmalloc+0x5e/0x70 [<ffffffff812d5d90>] ? kmem_cache_alloc_trace+0x10c/0x2cc [<ffffffff818169fb>] ? mtd_point+0xf7/0x130 [<ffffffff81487dc9>] jffs2_scan_medium+0x285/0xfe0 [<ffffffff81487b44>] ? jffs2_scan_eraseblock+0x3b64/0x3b64 [<ffffffff812da3e9>] ? kasan_unpoison_shadow+0x35/0x50 [<ffffffff812da3e9>] ? kasan_unpoison_shadow+0x35/0x50 [<ffffffff812da462>] ? kasan_kmalloc+0x5e/0x70 [<ffffffff812d57df>] ? __kmalloc+0x12b/0x300 [<ffffffff812da462>] ? kasan_kmalloc+0x5e/0x70 [<ffffffff814a2753>] ? jffs2_sum_init+0x9f/0x240 [<ffffffff8148b2ff>] jffs2_do_mount_fs+0x5fb/0x1bbc [<ffffffff8148ad04>] ? jffs2_del_noinode_dirent+0x640/0x640 [<ffffffff812da462>] ? kasan_kmalloc+0x5e/0x70 [<ffffffff81127c5b>] ? __init_rwsem+0x97/0xac [<ffffffff81492349>] jffs2_do_fill_super+0x245/0x6f0 [<ffffffff81493c5b>] jffs2_fill_super+0x287/0x2e0 [<ffffffff814939d4>] ? jffs2_parse_options+0x594/0x594 [<ffffffff81819bea>] mount_mtd_aux.isra.0+0x9a/0x144 [<ffffffff81819eb6>] mount_mtd+0x222/0x2f0 [<ffffffff814939d4>] ? jffs2_parse_options+0x594/0x594 [<ffffffff81819c94>] ? mount_mtd_aux.isra.0+0x144/0x144 [<ffffffff81258757>] ? free_pages+0x13/0x1c [<ffffffff814fa0ac>] ? selinux_sb_copy_data+0x278/0x2e0 [<ffffffff81492b35>] jffs2_mount+0x41/0x60 [<ffffffff81302fb7>] mount_fs+0x63/0x230 [<ffffffff8133755f>] ? alloc_vfsmnt+0x32f/0x3b0 [<ffffffff81337f2c>] vfs_kern_mount.part.6+0x6c/0x1f4 [<ffffffff8133ceec>] do_mount+0xae8/0x1940 [<ffffffff811b94e0>] ? audit_filter_rules.constprop.6+0x1d10/0x1d10 [<ffffffff8133c404>] ? copy_mount_string+0x40/0x40 [<ffffffff812cbf78>] ? alloc_pages_current+0xa4/0x1bc [<ffffffff81253a89>] ? __get_free_pages+0x25/0x50 [<ffffffff81338993>] ? copy_mount_options.part.17+0x183/0x264 [<ffffffff8133e3a9>] SyS_mount+0x105/0x1d0 [<ffffffff8133e2a4>] ? copy_mnt_ns+0x560/0x560 [<ffffffff810e8391>] ? msa_space_switch_handler+0x13d/0x190 [<ffffffff81be184a>] entry_SYSCALL_64_fastpath+0x1e/0x97 [<ffffffff810e9274>] ? msa_space_switch+0xb0/0xe0 Memory state around the buggy address: ffff8800842cf180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800842cf200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8800842cf280: fc fc fc fc fc fc 00 00 00 00 01 fc fc fc fc fc ^ ffff8800842cf300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800842cf380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Cc: stable@vger.kernel.org Reported-by: Kunkun Xu <xukunkun1@huawei.com> Signed-off-by: lizhe <lizhe67@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11NFSv4: Don't discard segments marked for return in _pnfs_return_layout()Trond Myklebust1-1/+1
commit de144ff4234f935bd2150108019b5d87a90a8a96 upstream. If the pNFS layout segment is marked with the NFS_LSEG_LAYOUTRETURN flag, then the assumption is that it has some reporting requirement to perform through a layoutreturn (e.g. flexfiles layout stats or error information). Fixes: 6d597e175012 ("pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11NFS: Don't discard pNFS layout segments that are marked for returnTrond Myklebust1-0/+5
commit 39fd01863616964f009599e50ca5c6ea9ebf88d6 upstream. If the pNFS layout segment is marked with the NFS_LSEG_LAYOUTRETURN flag, then the assumption is that it has some reporting requirement to perform through a layoutreturn (e.g. flexfiles layout stats or error information). Fixes: e0b7d420f72a ("pNFS: Don't discard layout segments that are marked for return") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11NFS: fs_context: validate UDP retrans to prevent shift out-of-boundsRandy Dunlap1-0/+12
commit c09f11ef35955785f92369e25819bf0629df2e59 upstream. Fix shift out-of-bounds in xprt_calc_majortimeo(). This is caused by a garbage timeout (retrans) mount option being passed to nfs mount, in this case from syzkaller. If the protocol is XPRT_TRANSPORT_UDP, then 'retrans' is a shift value for a 64-bit long integer, so 'retrans' cannot be >= 64. If it is >= 64, fail the mount and return an error. Fixes: 9954bf92c0cd ("NFS: Move mount parameterisation bits into their own file") Reported-by: syzbot+ba2e91df8f74809417fa@syzkaller.appspotmail.com Reported-by: syzbot+f3a0fa110fd630ab56c8@syzkaller.appspotmail.com Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11f2fs: fix to avoid out-of-bounds memory accessChao Yu1-0/+3
commit b862676e371715456c9dade7990c8004996d0d9e upstream. butt3rflyh4ck <butterflyhuangxx@gmail.com> reported a bug found by syzkaller fuzzer with custom modifications in 5.12.0-rc3+ [1]: dump_stack+0xfa/0x151 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x82/0x32c mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 f2fs_test_bit fs/f2fs/f2fs.h:2572 [inline] current_nat_addr fs/f2fs/node.h:213 [inline] get_next_nat_page fs/f2fs/node.c:123 [inline] __flush_nat_entry_set fs/f2fs/node.c:2888 [inline] f2fs_flush_nat_entries+0x258e/0x2960 fs/f2fs/node.c:2991 f2fs_write_checkpoint+0x1372/0x6a70 fs/f2fs/checkpoint.c:1640 f2fs_issue_checkpoint+0x149/0x410 fs/f2fs/checkpoint.c:1807 f2fs_sync_fs+0x20f/0x420 fs/f2fs/super.c:1454 __sync_filesystem fs/sync.c:39 [inline] sync_filesystem fs/sync.c:67 [inline] sync_filesystem+0x1b5/0x260 fs/sync.c:48 generic_shutdown_super+0x70/0x370 fs/super.c:448 kill_block_super+0x97/0xf0 fs/super.c:1394 The root cause is, if nat entry in checkpoint journal area is corrupted, e.g. nid of journalled nat entry exceeds max nid value, during checkpoint, once it tries to flush nat journal to NAT area, get_next_nat_page() may access out-of-bounds memory on nat_bitmap due to it uses wrong nid value as bitmap offset. [1] https://lore.kernel.org/lkml/CAFcO6XOMWdr8pObek6eN6-fs58KG9doRFadgJj-FnF-1x43s2g@mail.gmail.com/T/#u Reported-and-tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11f2fs: fix error handling in f2fs_end_enable_verity()Eric Biggers1-21/+54
commit 3c0315424f5e3d2a4113c7272367bee1e8e6a174 upstream. f2fs didn't properly clean up if verity failed to be enabled on a file: - It left verity metadata (pages past EOF) in the page cache, which would be exposed to userspace if the file was later extended. - It didn't truncate the verity metadata at all (either from cache or from disk) if an error occurred while setting the verity bit. Fix these bugs by adding a call to truncate_inode_pages() and ensuring that we truncate the verity metadata (both from cache and from disk) in all error paths. Also rework