<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/mm/shmem.c, branch v5.10.258</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>mm: update memfd seal write check to include F_SEAL_WRITE</title>
<updated>2025-08-28T14:22:55+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lstoakes@gmail.com</email>
</author>
<published>2025-07-30T01:54:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c002f70d442bc24bb4a0ead23a723eaeac797539'/>
<id>c002f70d442bc24bb4a0ead23a723eaeac797539</id>
<content type='text'>
[ Upstream commit 28464bbb2ddc199433383994bcb9600c8034afa1 ]

The seal_check_future_write() function is called by shmem_mmap() or
hugetlbfs_file_mmap() to disallow any future writable mappings of an memfd
sealed this way.

The F_SEAL_WRITE flag is not checked here, as that is handled via the
mapping-&gt;i_mmap_writable mechanism and so any attempt at a mapping would
fail before this could be run.

However we intend to change this, meaning this check can be performed for
F_SEAL_WRITE mappings also.

The logic here is equally applicable to both flags, so update this
function to accommodate both and rename it accordingly.

Link: https://lkml.kernel.org/r/913628168ce6cce77df7d13a63970bae06a526e0.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes &lt;lstoakes@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 28464bbb2ddc199433383994bcb9600c8034afa1 ]

The seal_check_future_write() function is called by shmem_mmap() or
hugetlbfs_file_mmap() to disallow any future writable mappings of an memfd
sealed this way.

The F_SEAL_WRITE flag is not checked here, as that is handled via the
mapping-&gt;i_mmap_writable mechanism and so any attempt at a mapping would
fail before this could be run.

However we intend to change this, meaning this check can be performed for
F_SEAL_WRITE mappings also.

The logic here is equally applicable to both flags, so update this
function to accommodate both and rename it accordingly.

Link: https://lkml.kernel.org/r/913628168ce6cce77df7d13a63970bae06a526e0.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes &lt;lstoakes@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling</title>
<updated>2024-12-14T18:47:40+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2024-11-15T12:36:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a32712d54c61b4359e4bf00f6f805e301a89c534'/>
<id>a32712d54c61b4359e4bf00f6f805e301a89c534</id>
<content type='text'>
[ Upstream commit 5baf8b037debf4ec60108ccfeccb8636d1dbad81 ]

Currently MTE is permitted in two circumstances (desiring to use MTE
having been specified by the VM_MTE flag) - where MAP_ANONYMOUS is
specified, as checked by arch_calc_vm_flag_bits() and actualised by
setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is
shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap
hook is activated in mmap_region().

The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also
set is the arm64 implementation of arch_validate_flags().

Unfortunately, we intend to refactor mmap_region() to perform this check
earlier, meaning that in the case of a shmem backing we will not have
invoked shmem_mmap() yet, causing the mapping to fail spuriously.

It is inappropriate to set this architecture-specific flag in general mm
code anyway, so a sensible resolution of this issue is to instead move the
check somewhere else.

We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
the arch_calc_vm_flag_bits() call.

This is an appropriate place to do this as we already check for the
MAP_ANONYMOUS case here, and the shmem file case is simply a variant of
the same idea - we permit RAM-backed memory.

This requires a modification to the arch_calc_vm_flag_bits() signature to
pass in a pointer to the struct file associated with the mapping, however
this is not too egregious as this is only used by two architectures anyway
- arm64 and parisc.

So this patch performs this adjustment and removes the unnecessary
assignment of VM_MTE_ALLOWED in shmem_mmap().

[akpm@linux-foundation.org: fix whitespace, per Catalin]
Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo -&gt;mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Suggested-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James E.J. Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5baf8b037debf4ec60108ccfeccb8636d1dbad81 ]

Currently MTE is permitted in two circumstances (desiring to use MTE
having been specified by the VM_MTE flag) - where MAP_ANONYMOUS is
specified, as checked by arch_calc_vm_flag_bits() and actualised by
setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is
shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap
hook is activated in mmap_region().

The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also
set is the arm64 implementation of arch_validate_flags().

Unfortunately, we intend to refactor mmap_region() to perform this check
earlier, meaning that in the case of a shmem backing we will not have
invoked shmem_mmap() yet, causing the mapping to fail spuriously.

It is inappropriate to set this architecture-specific flag in general mm
code anyway, so a sensible resolution of this issue is to instead move the
check somewhere else.

We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
the arch_calc_vm_flag_bits() call.

This is an appropriate place to do this as we already check for the
MAP_ANONYMOUS case here, and the shmem file case is simply a variant of
the same idea - we permit RAM-backed memory.

This requires a modification to the arch_calc_vm_flag_bits() signature to
pass in a pointer to the struct file associated with the mapping, however
this is not too egregious as this is only used by two architectures anyway
- arm64 and parisc.

So this patch performs this adjustment and removes the unnecessary
assignment of VM_MTE_ALLOWED in shmem_mmap().

[akpm@linux-foundation.org: fix whitespace, per Catalin]
Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo -&gt;mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Suggested-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James E.J. Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: revert "mm: shmem: fix data-race in shmem_getattr()"</title>
<updated>2024-12-14T18:47:40+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2024-11-16T00:57:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=57cc8d253099d1b8627f0fb487ee011d9158ccc9'/>
<id>57cc8d253099d1b8627f0fb487ee011d9158ccc9</id>
<content type='text'>
commit d1aa0c04294e29883d65eac6c2f72fe95cc7c049 upstream.

Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as
suggested by Chuck [1].  It is causing deadlocks when accessing tmpfs over
NFS.

As Hugh commented, "added just to silence a syzbot sanitizer splat: added
where there has never been any practical problem".

Link: https://lkml.kernel.org/r/ZzdxKF39VEmXSSyN@tissot.1015granger.net [1]
Fixes: d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()")
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Cc: Jeongjun Park &lt;aha310510@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d1aa0c04294e29883d65eac6c2f72fe95cc7c049 upstream.

Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as
suggested by Chuck [1].  It is causing deadlocks when accessing tmpfs over
NFS.

As Hugh commented, "added just to silence a syzbot sanitizer splat: added
where there has never been any practical problem".

Link: https://lkml.kernel.org/r/ZzdxKF39VEmXSSyN@tissot.1015granger.net [1]
Fixes: d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()")
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Cc: Jeongjun Park &lt;aha310510@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: shmem: fix data-race in shmem_getattr()</title>
<updated>2024-11-08T15:22:04+00:00</updated>
<author>
<name>Jeongjun Park</name>
<email>aha310510@gmail.com</email>
</author>
<published>2024-09-09T12:35:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=bda1a99a0dd644f31a87d636ac624eeb975cb65a'/>
<id>bda1a99a0dd644f31a87d636ac624eeb975cb65a</id>
<content type='text'>
commit d949d1d14fa281ace388b1de978e8f2cd52875cf upstream.

I got the following KCSAN report during syzbot testing:

==================================================================
BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current

write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1:
 inode_set_ctime_to_ts include/linux/fs.h:1638 [inline]
 inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626
 shmem_mknod+0x117/0x180 mm/shmem.c:3443
 shmem_create+0x34/0x40 mm/shmem.c:3497
 lookup_open fs/namei.c:3578 [inline]
 open_last_lookups fs/namei.c:3647 [inline]
 path_openat+0xdbc/0x1f00 fs/namei.c:3883
 do_filp_open+0xf7/0x200 fs/namei.c:3913
 do_sys_openat2+0xab/0x120 fs/open.c:1416
 do_sys_open fs/open.c:1431 [inline]
 __do_sys_openat fs/open.c:1447 [inline]
 __se_sys_openat fs/open.c:1442 [inline]
 __x64_sys_openat+0xf3/0x120 fs/open.c:1442
 x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0:
 inode_get_ctime_nsec include/linux/fs.h:1623 [inline]
 inode_get_ctime include/linux/fs.h:1629 [inline]
 generic_fillattr+0x1dd/0x2f0 fs/stat.c:62
 shmem_getattr+0x17b/0x200 mm/shmem.c:1157
 vfs_getattr_nosec fs/stat.c:166 [inline]
 vfs_getattr+0x19b/0x1e0 fs/stat.c:207
 vfs_statx_path fs/stat.c:251 [inline]
 vfs_statx+0x134/0x2f0 fs/stat.c:315
 vfs_fstatat+0xec/0x110 fs/stat.c:341
 __do_sys_newfstatat fs/stat.c:505 [inline]
 __se_sys_newfstatat+0x58/0x260 fs/stat.c:499
 __x64_sys_newfstatat+0x55/0x70 fs/stat.c:499
 x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

value changed: 0x2755ae53 -&gt; 0x27ee44d3

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
==================================================================

When calling generic_fillattr(), if you don't hold read lock, data-race
will occur in inode member variables, which can cause unexpected
behavior.

Since there is no special protection when shmem_getattr() calls
generic_fillattr(), data-race occurs by functions such as shmem_unlink()
or shmem_mknod(). This can cause unexpected results, so commenting it out
is not enough.

Therefore, when calling generic_fillattr() from shmem_getattr(), it is
appropriate to protect the inode using inode_lock_shared() and
inode_unlock_shared() to prevent data-race.

Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com
Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
Signed-off-by: Jeongjun Park &lt;aha310510@gmail.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroup.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d949d1d14fa281ace388b1de978e8f2cd52875cf upstream.

I got the following KCSAN report during syzbot testing:

==================================================================
BUG: KCSAN: data-race in generic_fillattr / inode_set_ctime_current

write to 0xffff888102eb3260 of 4 bytes by task 6565 on cpu 1:
 inode_set_ctime_to_ts include/linux/fs.h:1638 [inline]
 inode_set_ctime_current+0x169/0x1d0 fs/inode.c:2626
 shmem_mknod+0x117/0x180 mm/shmem.c:3443
 shmem_create+0x34/0x40 mm/shmem.c:3497
 lookup_open fs/namei.c:3578 [inline]
 open_last_lookups fs/namei.c:3647 [inline]
 path_openat+0xdbc/0x1f00 fs/namei.c:3883
 do_filp_open+0xf7/0x200 fs/namei.c:3913
 do_sys_openat2+0xab/0x120 fs/open.c:1416
 do_sys_open fs/open.c:1431 [inline]
 __do_sys_openat fs/open.c:1447 [inline]
 __se_sys_openat fs/open.c:1442 [inline]
 __x64_sys_openat+0xf3/0x120 fs/open.c:1442
 x64_sys_call+0x1025/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:258
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

read to 0xffff888102eb3260 of 4 bytes by task 3498 on cpu 0:
 inode_get_ctime_nsec include/linux/fs.h:1623 [inline]
 inode_get_ctime include/linux/fs.h:1629 [inline]
 generic_fillattr+0x1dd/0x2f0 fs/stat.c:62
 shmem_getattr+0x17b/0x200 mm/shmem.c:1157
 vfs_getattr_nosec fs/stat.c:166 [inline]
 vfs_getattr+0x19b/0x1e0 fs/stat.c:207
 vfs_statx_path fs/stat.c:251 [inline]
 vfs_statx+0x134/0x2f0 fs/stat.c:315
 vfs_fstatat+0xec/0x110 fs/stat.c:341
 __do_sys_newfstatat fs/stat.c:505 [inline]
 __se_sys_newfstatat+0x58/0x260 fs/stat.c:499
 __x64_sys_newfstatat+0x55/0x70 fs/stat.c:499
 x64_sys_call+0x141f/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:263
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

value changed: 0x2755ae53 -&gt; 0x27ee44d3

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 3498 Comm: udevd Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
==================================================================

When calling generic_fillattr(), if you don't hold read lock, data-race
will occur in inode member variables, which can cause unexpected
behavior.

Since there is no special protection when shmem_getattr() calls
generic_fillattr(), data-race occurs by functions such as shmem_unlink()
or shmem_mknod(). This can cause unexpected results, so commenting it out
is not enough.

Therefore, when calling generic_fillattr() from shmem_getattr(), it is
appropriate to protect the inode using inode_lock_shared() and
inode_unlock_shared() to prevent data-race.

Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com
Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
Signed-off-by: Jeongjun Park &lt;aha310510@gmail.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroup.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tmpfs: verify {g,u}id mount options correctly</title>
<updated>2023-09-19T10:20:06+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-08-01T16:17:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=ac5a73b5d9338c10cc50e2cbc96fad9018bbca2f'/>
<id>ac5a73b5d9338c10cc50e2cbc96fad9018bbca2f</id>
<content type='text'>
[ Upstream commit 0200679fc7953177941e41c2a4241d0b6c2c5de8 ]

A while ago we received the following report:

"The other outstanding issue I noticed comes from the fact that
fsconfig syscalls may occur in a different userns than that which
called fsopen. That means that resolving the uid/gid via
current_user_ns() can save a kuid that isn't mapped in the associated
namespace when the filesystem is finally mounted. This means that it
is possible for an unprivileged user to create files owned by any
group in a tmpfs mount (since we can set the SUID bit on the tmpfs
directory), or a tmpfs that is owned by any user, including the root
group/user."

The contract for {g,u}id mount options and {g,u}id values in general set
from userspace has always been that they are translated according to the
caller's idmapping. In so far, tmpfs has been doing the correct thing.
But since tmpfs is mountable in unprivileged contexts it is also
necessary to verify that the resulting {k,g}uid is representable in the
namespace of the superblock to avoid such bugs as above.

The new mount api's cross-namespace delegation abilities are already
widely used. After having talked to a bunch of userspace this is the
most faithful solution with minimal regression risks. I know of one
users - systemd - that makes use of the new mount api in this way and
they don't set unresolable {g,u}ids. So the regression risk is minimal.

Link: https://lore.kernel.org/lkml/CALxfFW4BXhEwxR0Q5LSkg-8Vb4r2MONKCcUCVioehXQKr35eHg@mail.gmail.com
Fixes: f32356261d44 ("vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API")
Reviewed-by: "Seth Forshee (DigitalOcean)" &lt;sforshee@kernel.org&gt;
Reported-by: Seth Jenkins &lt;sethjenkins@google.com&gt;
Message-Id: &lt;20230801-vfs-fs_context-uidgid-v1-1-daf46a050bbf@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0200679fc7953177941e41c2a4241d0b6c2c5de8 ]

A while ago we received the following report:

"The other outstanding issue I noticed comes from the fact that
fsconfig syscalls may occur in a different userns than that which
called fsopen. That means that resolving the uid/gid via
current_user_ns() can save a kuid that isn't mapped in the associated
namespace when the filesystem is finally mounted. This means that it
is possible for an unprivileged user to create files owned by any
group in a tmpfs mount (since we can set the SUID bit on the tmpfs
directory), or a tmpfs that is owned by any user, including the root
group/user."

The contract for {g,u}id mount options and {g,u}id values in general set
from userspace has always been that they are translated according to the
caller's idmapping. In so far, tmpfs has been doing the correct thing.
But since tmpfs is mountable in unprivileged contexts it is also
necessary to verify that the resulting {k,g}uid is representable in the
namespace of the superblock to avoid such bugs as above.

The new mount api's cross-namespace delegation abilities are already
widely used. After having talked to a bunch of userspace this is the
most faithful solution with minimal regression risks. I know of one
users - systemd - that makes use of the new mount api in this way and
they don't set unresolable {g,u}ids. So the regression risk is minimal.

Link: https://lore.kernel.org/lkml/CALxfFW4BXhEwxR0Q5LSkg-8Vb4r2MONKCcUCVioehXQKr35eHg@mail.gmail.com
Fixes: f32356261d44 ("vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API")
Reviewed-by: "Seth Forshee (DigitalOcean)" &lt;sforshee@kernel.org&gt;
Reported-by: Seth Jenkins &lt;sethjenkins@google.com&gt;
Message-Id: &lt;20230801-vfs-fs_context-uidgid-v1-1-daf46a050bbf@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs</title>
<updated>2023-07-27T06:44:13+00:00</updated>
<author>
<name>Roberto Sassu</name>
<email>roberto.sassu@huawei.com</email>
</author>
<published>2023-06-07T16:15:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=5fada375113767b3b57f1b04f7a4fe64ffaa626f'/>
<id>5fada375113767b3b57f1b04f7a4fe64ffaa626f</id>
<content type='text'>
commit 36ce9d76b0a93bae799e27e4f5ac35478c676592 upstream.

As the ramfs-based tmpfs uses ramfs_init_fs_context() for the
init_fs_context method, which allocates fc-&gt;s_fs_info, use ramfs_kill_sb()
to free it and avoid a memory leak.

Link: https://lkml.kernel.org/r/20230607161523.2876433-1-roberto.sassu@huaweicloud.com
Fixes: c3b1b1cbf002 ("ramfs: add support for "mode=" mount option")
Signed-off-by: Roberto Sassu &lt;roberto.sassu@huawei.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 36ce9d76b0a93bae799e27e4f5ac35478c676592 upstream.

As the ramfs-based tmpfs uses ramfs_init_fs_context() for the
init_fs_context method, which allocates fc-&gt;s_fs_info, use ramfs_kill_sb()
to free it and avoid a memory leak.

Link: https://lkml.kernel.org/r/20230607161523.2876433-1-roberto.sassu@huaweicloud.com
Fixes: c3b1b1cbf002 ("ramfs: add support for "mode=" mount option")
Signed-off-by: Roberto Sassu &lt;roberto.sassu@huawei.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode</title>
<updated>2022-01-27T09:53:44+00:00</updated>
<author>
<name>Gang Li</name>
<email>ligang.bdlg@bytedance.com</email>
</author>
<published>2022-01-14T22:05:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7b9fa915a58d441437f13724fabb51c605a49fbd'/>
<id>7b9fa915a58d441437f13724fabb51c605a49fbd</id>
<content type='text'>
commit 62c9827cbb996c2c04f615ecd783ce28bcea894b upstream.

Fix a data race in commit 779750d20b93 ("shmem: split huge pages beyond
i_size under memory pressure").

Here are call traces causing race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Image there are 3 items in the local list (@list).  In the first
traversal, A is not deleted from @list.

  1)    A-&gt;B-&gt;C
        ^
        |
        pos (leave)

In the second traversal, B is deleted from @list.  Concurrently, A is
deleted from @list through shmem_evict_inode() since last reference
counter of inode is dropped by other thread.  Then the @list is corrupted.

  2)    A-&gt;B-&gt;C
        ^  ^
        |  |
     evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fixed by moving inodes back to global list before we put them.

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20211125064502.99983-1-ligang.bdlg@bytedance.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Gang Li &lt;ligang.bdlg@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 62c9827cbb996c2c04f615ecd783ce28bcea894b upstream.

Fix a data race in commit 779750d20b93 ("shmem: split huge pages beyond
i_size under memory pressure").

Here are call traces causing race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Image there are 3 items in the local list (@list).  In the first
traversal, A is not deleted from @list.

  1)    A-&gt;B-&gt;C
        ^
        |
        pos (leave)

In the second traversal, B is deleted from @list.  Concurrently, A is
deleted from @list through shmem_evict_inode() since last reference
counter of inode is dropped by other thread.  Then the @list is corrupted.

  2)    A-&gt;B-&gt;C
        ^  ^
        |  |
     evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fixed by moving inodes back to global list before we put them.

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20211125064502.99983-1-ligang.bdlg@bytedance.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Gang Li &lt;ligang.bdlg@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "mm/shmem: fix shmem_swapin() race with swapoff"</title>
<updated>2021-07-25T12:36:20+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2021-07-22T13:46:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7d7d0e84ac0d80815abfb7db3ba3e683830b310d'/>
<id>7d7d0e84ac0d80815abfb7db3ba3e683830b310d</id>
<content type='text'>
This reverts commit a533a21b692fc15a6aadfa827b29c7d9989109ca which is
commit 2efa33fc7f6ec94a3a538c1a264273c889be2b36 upstream.

It should not have been added to the stable trees, sorry about that.

Link: https://lore.kernel.org/r/YPVgaY6uw59Fqg5x@casper.infradead.org
Reported-by: From: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Ying Huang &lt;ying.huang@intel.com&gt;
Cc: Alex Shi &lt;alexs@kernel.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit a533a21b692fc15a6aadfa827b29c7d9989109ca which is
commit 2efa33fc7f6ec94a3a538c1a264273c889be2b36 upstream.

It should not have been added to the stable trees, sorry about that.

Link: https://lore.kernel.org/r/YPVgaY6uw59Fqg5x@casper.infradead.org
Reported-by: From: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Ying Huang &lt;ying.huang@intel.com&gt;
Cc: Alex Shi &lt;alexs@kernel.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()</title>
<updated>2021-07-14T14:56:50+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2021-07-01T01:47:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b65597377b7b690efaaa1c9f4875ded8b4c46212'/>
<id>b65597377b7b690efaaa1c9f4875ded8b4c46212</id>
<content type='text'>
[ Upstream commit e6be37b2e7bddfe0c76585ee7c7eee5acc8efeab ]

Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
(non-shmem) FS"), read-only THP file mapping is supported.  But it forgot
to add checking for it in transparent_hugepage_enabled().  To fix it, we
add checking for read-only THP file mapping and also introduce helper
transhuge_vma_enabled() to check whether thp is enabled for specified vma
to reduce duplicated code.  We rename transparent_hugepage_enabled to
transparent_hugepage_active to make the code easier to follow as suggested
by David Hildenbrand.

[linmiaohe@huawei.com: define transhuge_vma_enabled next to transhuge_vma_suitable]
  Link: https://lkml.kernel.org/r/20210514093007.4117906-1-linmiaohe@huawei.com

Link: https://lkml.kernel.org/r/20210511134857.1581273-4-linmiaohe@huawei.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: "Aneesh Kumar K . V" &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e6be37b2e7bddfe0c76585ee7c7eee5acc8efeab ]

Since commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for
(non-shmem) FS"), read-only THP file mapping is supported.  But it forgot
to add checking for it in transparent_hugepage_enabled().  To fix it, we
add checking for read-only THP file mapping and also introduce helper
transhuge_vma_enabled() to check whether thp is enabled for specified vma
to reduce duplicated code.  We rename transparent_hugepage_enabled to
transparent_hugepage_active to make the code easier to follow as suggested
by David Hildenbrand.

[linmiaohe@huawei.com: define transhuge_vma_enabled next to transhuge_vma_suitable]
  Link: https://lkml.kernel.org/r/20210514093007.4117906-1-linmiaohe@huawei.com

Link: https://lkml.kernel.org/r/20210511134857.1581273-4-linmiaohe@huawei.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: "Aneesh Kumar K . V" &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/shmem: fix shmem_swapin() race with swapoff</title>
<updated>2021-07-14T14:56:14+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2021-06-29T02:36:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=8e4af3917bfc5e82f8010417c12b755ef256fa5e'/>
<id>8e4af3917bfc5e82f8010417c12b755ef256fa5e</id>
<content type='text'>
[ Upstream commit 2efa33fc7f6ec94a3a538c1a264273c889be2b36 ]

When I was investigating the swap code, I found the below possible race
window:

CPU 1                                         CPU 2
-----                                         -----
shmem_swapin
  swap_cluster_readahead
    if (likely(si-&gt;flags &amp; (SWP_BLKDEV | SWP_FS_OPS))) {
                                              swapoff
                                                ..
                                                si-&gt;swap_file = NULL;
                                                ..
    struct inode *inode = si-&gt;swap_file-&gt;f_mapping-&gt;host;[oops!]

Close this race window by using get/put_swap_device() to guard against
concurrent swapoff.

Link: https://lkml.kernel.org/r/20210426123316.806267-5-linmiaohe@huawei.com
Fixes: 8fd2e0b505d1 ("mm: swap: check if swap backing device is congested or not")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Alex Shi &lt;alexs@kernel.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2efa33fc7f6ec94a3a538c1a264273c889be2b36 ]

When I was investigating the swap code, I found the below possible race
window:

CPU 1                                         CPU 2
-----                                         -----
shmem_swapin
  swap_cluster_readahead
    if (likely(si-&gt;flags &amp; (SWP_BLKDEV | SWP_FS_OPS))) {
                                              swapoff
                                                ..
                                                si-&gt;swap_file = NULL;
                                                ..
    struct inode *inode = si-&gt;swap_file-&gt;f_mapping-&gt;host;[oops!]

Close this race window by using get/put_swap_device() to guard against
concurrent swapoff.

Link: https://lkml.kernel.org/r/20210426123316.806267-5-linmiaohe@huawei.com
Fixes: 8fd2e0b505d1 ("mm: swap: check if swap backing device is congested or not")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Alex Shi &lt;alexs@kernel.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
