<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/ucount.c, branch v6.18.21</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>ucount: check for CAP_SYS_RESOURCE using ns_capable_noaudit()</title>
<updated>2026-02-26T22:59:20+00:00</updated>
<author>
<name>Ondrej Mosnacek</name>
<email>omosnace@redhat.com</email>
</author>
<published>2026-01-22T14:07:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b4374b0bed103fc37be64a8f9a344f8c93eb5599'/>
<id>b4374b0bed103fc37be64a8f9a344f8c93eb5599</id>
<content type='text'>
[ Upstream commit 0895a000e4fff9e950a7894210db45973e485c35 ]

The user.* sysctls implement the ctl_table_root::permissions hook and they
override the file access mode based on the CAP_SYS_RESOURCE capability (at
most rwx if capable, at most r-- if not).  The capability is being checked
unconditionally, so if an LSM denies the capability, an audit record may
be logged even when access is in fact granted.

Given the logic in the set_permissions() function in kernel/ucount.c and
the unfortunate way the permission checking is implemented, it doesn't
seem viable to avoid false positive denials by deferring the capability
check.  Thus, do the same as in net_ctl_permissions() (net/sysctl_net.c) -
switch from ns_capable() to ns_capable_noaudit(), so that the check never
logs an audit record.

Link: https://lkml.kernel.org/r/20260122140745.239428-1-omosnace@redhat.com
Fixes: dbec28460a89 ("userns: Add per user namespace sysctls.")
Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Reviewed-by: Paul Moore &lt;paul@paul-moore.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0895a000e4fff9e950a7894210db45973e485c35 ]

The user.* sysctls implement the ctl_table_root::permissions hook and they
override the file access mode based on the CAP_SYS_RESOURCE capability (at
most rwx if capable, at most r-- if not).  The capability is being checked
unconditionally, so if an LSM denies the capability, an audit record may
be logged even when access is in fact granted.

Given the logic in the set_permissions() function in kernel/ucount.c and
the unfortunate way the permission checking is implemented, it doesn't
seem viable to avoid false positive denials by deferring the capability
check.  Thus, do the same as in net_ctl_permissions() (net/sysctl_net.c) -
switch from ns_capable() to ns_capable_noaudit(), so that the check never
logs an audit record.

Link: https://lkml.kernel.org/r/20260122140745.239428-1-omosnace@redhat.com
Fixes: dbec28460a89 ("userns: Add per user namespace sysctls.")
Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Reviewed-by: Paul Moore &lt;paul@paul-moore.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()</title>
<updated>2025-08-02T19:01:38+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2025-07-21T17:45:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=58b4fba81a2e400a47ddbe7c1dc0a2bc038313b7'/>
<id>58b4fba81a2e400a47ddbe7c1dc0a2bc038313b7</id>
<content type='text'>
Use atomic_long_try_cmpxchg() instead of
atomic_long_cmpxchg (*ptr, old, new) == old in atomic_long_inc_below().
x86 CMPXCHG instruction returns success in ZF flag, so this change saves
a compare after cmpxchg (and related move instruction in front of cmpxchg).

Also, atomic_long_try_cmpxchg implicitly assigns old *ptr value to "old"
when cmpxchg fails, enabling further code simplifications.

No functional change intended.

Link: https://lkml.kernel.org/r/20250721174610.28361-2-ubizjak@gmail.com
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Reviewed-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: MengEn Sun &lt;mengensun@tencent.com&gt;
Cc: "Thomas Weißschuh" &lt;linux@weissschuh.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use atomic_long_try_cmpxchg() instead of
atomic_long_cmpxchg (*ptr, old, new) == old in atomic_long_inc_below().
x86 CMPXCHG instruction returns success in ZF flag, so this change saves
a compare after cmpxchg (and related move instruction in front of cmpxchg).

Also, atomic_long_try_cmpxchg implicitly assigns old *ptr value to "old"
when cmpxchg fails, enabling further code simplifications.

No functional change intended.

Link: https://lkml.kernel.org/r/20250721174610.28361-2-ubizjak@gmail.com
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Reviewed-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: MengEn Sun &lt;mengensun@tencent.com&gt;
Cc: "Thomas Weißschuh" &lt;linux@weissschuh.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucount: fix atomic_long_inc_below() argument type</title>
<updated>2025-08-02T19:01:38+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2025-07-21T17:45:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=f8cd9193b62e92ad25def5370ca8ea2bc7585381'/>
<id>f8cd9193b62e92ad25def5370ca8ea2bc7585381</id>
<content type='text'>
The type of u argument of atomic_long_inc_below() should be long to avoid
unwanted truncation to int.

The patch fixes the wrong argument type of an internal function to
prevent unwanted argument truncation.  It fixes an internal locking
primitive; it should not have any direct effect on userspace.

Mark said

: AFAICT there's no problem in practice because atomic_long_inc_below()
: is only used by inc_ucount(), and it looks like the value is
: constrained between 0 and INT_MAX.
: 
: In inc_ucount() the limit value is taken from
: user_namespace::ucount_max[], and AFAICT that's only written by
: sysctls, to the table setup by setup_userns_sysctls(), where
: UCOUNT_ENTRY() limits the value between 0 and INT_MAX.
: 
: This is certainly a cleanup, but there might be no functional issue in
: practice as above.

Link: https://lkml.kernel.org/r/20250721174610.28361-1-ubizjak@gmail.com
Fixes: f9c82a4ea89c ("Increase size of ucounts to atomic_long_t")
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: MengEn Sun &lt;mengensun@tencent.com&gt;
Cc: "Thomas Weißschuh" &lt;linux@weissschuh.net&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The type of u argument of atomic_long_inc_below() should be long to avoid
unwanted truncation to int.

The patch fixes the wrong argument type of an internal function to
prevent unwanted argument truncation.  It fixes an internal locking
primitive; it should not have any direct effect on userspace.

Mark said

: AFAICT there's no problem in practice because atomic_long_inc_below()
: is only used by inc_ucount(), and it looks like the value is
: constrained between 0 and INT_MAX.
: 
: In inc_ucount() the limit value is taken from
: user_namespace::ucount_max[], and AFAICT that's only written by
: sysctls, to the table setup by setup_userns_sysctls(), where
: UCOUNT_ENTRY() limits the value between 0 and INT_MAX.
: 
: This is certainly a cleanup, but there might be no functional issue in
: practice as above.

Link: https://lkml.kernel.org/r/20250721174610.28361-1-ubizjak@gmail.com
Fixes: f9c82a4ea89c ("Increase size of ucounts to atomic_long_t")
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: MengEn Sun &lt;mengensun@tencent.com&gt;
Cc: "Thomas Weißschuh" &lt;linux@weissschuh.net&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucount: use rcuref_t for reference counting</title>
<updated>2025-03-17T05:30:50+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-02-03T15:05:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b4dc0bee2a749083028afba346910e198653f42a'/>
<id>b4dc0bee2a749083028afba346910e198653f42a</id>
<content type='text'>
Use rcuref_t for reference counting.  This eliminates the cmpxchg loop in
the get and put path.  This also eliminates the need to acquire the lock
in the put path because once the final user returns the reference, it can
no longer be obtained anymore.

Use rcuref_t for reference counting.

Link: https://lkml.kernel.org/r/20250203150525.456525-5-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use rcuref_t for reference counting.  This eliminates the cmpxchg loop in
the get and put path.  This also eliminates the need to acquire the lock
in the put path because once the final user returns the reference, it can
no longer be obtained anymore.

Use rcuref_t for reference counting.

Link: https://lkml.kernel.org/r/20250203150525.456525-5-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucount: use RCU for ucounts lookups</title>
<updated>2025-03-17T05:30:50+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-02-03T15:05:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=5f01a22c5b231dd590f61a2591b3090665733bcb'/>
<id>5f01a22c5b231dd590f61a2591b3090665733bcb</id>
<content type='text'>
The ucounts element is looked up under ucounts_lock.  This can be
optimized by using RCU for a lockless lookup and return and element if the
reference can be obtained.

Replace hlist_head with hlist_nulls_head which is RCU compatible.  Let
find_ucounts() search for the required item within a RCU section and
return the item if a reference could be obtained.  This means
alloc_ucounts() will always return an element (unless the memory
allocation failed).  Let put_ucounts() RCU free the element if the
reference counter dropped to zero.

Link: https://lkml.kernel.org/r/20250203150525.456525-4-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The ucounts element is looked up under ucounts_lock.  This can be
optimized by using RCU for a lockless lookup and return and element if the
reference can be obtained.

Replace hlist_head with hlist_nulls_head which is RCU compatible.  Let
find_ucounts() search for the required item within a RCU section and
return the item if a reference could be obtained.  This means
alloc_ucounts() will always return an element (unless the memory
allocation failed).  Let put_ucounts() RCU free the element if the
reference counter dropped to zero.

Link: https://lkml.kernel.org/r/20250203150525.456525-4-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucount: replace get_ucounts_or_wrap() with atomic_inc_not_zero()</title>
<updated>2025-03-17T05:30:50+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-02-03T15:05:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=328152e6774d9d801ad1d90af557b9113647b379'/>
<id>328152e6774d9d801ad1d90af557b9113647b379</id>
<content type='text'>
get_ucounts_or_wrap() increments the counter and if the counter is
negative then it decrements it again in order to reset the previous
increment.  This statement can be replaced with atomic_inc_not_zero() to
only increment the counter if it is not yet 0.

This simplifies the get function because the put (if the get failed) can
be removed.  atomic_inc_not_zero() is implement as a cmpxchg() loop which
can be repeated several times if another get/put is performed in parallel.
This will be optimized later.

Increment the reference counter only if not yet dropped to zero.

Link: https://lkml.kernel.org/r/20250203150525.456525-3-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
get_ucounts_or_wrap() increments the counter and if the counter is
negative then it decrements it again in order to reset the previous
increment.  This statement can be replaced with atomic_inc_not_zero() to
only increment the counter if it is not yet 0.

This simplifies the get function because the put (if the get failed) can
be removed.  atomic_inc_not_zero() is implement as a cmpxchg() loop which
can be repeated several times if another get/put is performed in parallel.
This will be optimized later.

Increment the reference counter only if not yet dropped to zero.

Link: https://lkml.kernel.org/r/20250203150525.456525-3-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Lai jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Mengen Sun &lt;mengensun@tencent.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Cc: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucounts: move kfree() out of critical zone protected by ucounts_lock</title>
<updated>2025-01-13T04:21:00+00:00</updated>
<author>
<name>MengEn Sun</name>
<email>mengensun@tencent.com</email>
</author>
<published>2024-12-06T04:13:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=f49b42d415a32faee6bc08923821f432f64a4e90'/>
<id>f49b42d415a32faee6bc08923821f432f64a4e90</id>
<content type='text'>
Although kfree is a non-sleep function, it is possible to enter a long
chain of calls probabilistically, so it looks better to move kfree from
alloc_ucounts() out of the critical zone of ucounts_lock.

Link: https://lkml.kernel.org/r/1733458427-11794-1-git-send-email-mengensun@tencent.com
Signed-off-by: MengEn Sun &lt;mengensun@tencent.com&gt;
Reviewed-by: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Andrei Vagin &lt;avagin@google.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Although kfree is a non-sleep function, it is possible to enter a long
chain of calls probabilistically, so it looks better to move kfree from
alloc_ucounts() out of the critical zone of ucounts_lock.

Link: https://lkml.kernel.org/r/1733458427-11794-1-git-send-email-mengensun@tencent.com
Signed-off-by: MengEn Sun &lt;mengensun@tencent.com&gt;
Reviewed-by: YueHong Wu &lt;yuehongwu@tencent.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Andrei Vagin &lt;avagin@google.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sysctl-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl</title>
<updated>2024-11-23T04:36:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-23T04:36:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=980f8f8fd4228839f3d811521a27d4f9c2ead123'/>
<id>980f8f8fd4228839f3d811521a27d4f9c2ead123</id>
<content type='text'>
Pull sysctl updates from Joel Granados:
 "sysctl ctl_table constification:

   - Constifying ctl_table structs prevents the modification of
     proc_handler function pointers. All ctl_table struct arguments are
     const qualified in the sysctl API in such a way that the ctl_table
     arrays being defined elsewhere and passed through sysctl can be
     constified one-by-one.

     We kick the constification off by qualifying user_table in
     kernel/ucount.c and expect all the ctl_tables to be constified in
     the coming releases.

  Misc fixes:

   - Adjust comments in two places to better reflect the code

   - Remove superfluous dput calls

   - Remove Luis from sysctl maintainership

   - Replace comments about holding a lock with calls to
     lockdep_assert_held"

* tag 'sysctl-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Reduce dput(child) calls in proc_sys_fill_cache()
  sysctl: Reorganize kerneldoc parameter names
  ucounts: constify sysctl table user_table
  sysctl: update comments to new registration APIs
  MAINTAINERS: remove me from sysctl
  sysctl: Convert locking comments to lockdep assertions
  const_structs.checkpatch: add ctl_table
  sysctl: make internal ctl_tables const
  sysctl: allow registration of const struct ctl_table
  sysctl: move internal interfaces to const struct ctl_table
  bpf: Constify ctl_table argument of filter function
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull sysctl updates from Joel Granados:
 "sysctl ctl_table constification:

   - Constifying ctl_table structs prevents the modification of
     proc_handler function pointers. All ctl_table struct arguments are
     const qualified in the sysctl API in such a way that the ctl_table
     arrays being defined elsewhere and passed through sysctl can be
     constified one-by-one.

     We kick the constification off by qualifying user_table in
     kernel/ucount.c and expect all the ctl_tables to be constified in
     the coming releases.

  Misc fixes:

   - Adjust comments in two places to better reflect the code

   - Remove superfluous dput calls

   - Remove Luis from sysctl maintainership

   - Replace comments about holding a lock with calls to
     lockdep_assert_held"

* tag 'sysctl-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Reduce dput(child) calls in proc_sys_fill_cache()
  sysctl: Reorganize kerneldoc parameter names
  ucounts: constify sysctl table user_table
  sysctl: update comments to new registration APIs
  MAINTAINERS: remove me from sysctl
  sysctl: Convert locking comments to lockdep assertions
  const_structs.checkpatch: add ctl_table
  sysctl: make internal ctl_tables const
  sysctl: allow registration of const struct ctl_table
  sysctl: move internal interfaces to const struct ctl_table
  bpf: Constify ctl_table argument of filter function
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: restore the override_rlimit logic</title>
<updated>2024-11-07T22:14:59+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>roman.gushchin@linux.dev</email>
</author>
<published>2024-11-04T19:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=9e05e5c7ee8758141d2db7e8fea2cab34500c6ed'/>
<id>9e05e5c7ee8758141d2db7e8fea2cab34500c6ed</id>
<content type='text'>
Prior to commit d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of
ucounts") UCOUNT_RLIMIT_SIGPENDING rlimit was not enforced for a class of
signals.  However now it's enforced unconditionally, even if
override_rlimit is set.  This behavior change caused production issues.  

For example, if the limit is reached and a process receives a SIGSEGV
signal, sigqueue_alloc fails to allocate the necessary resources for the
signal delivery, preventing the signal from being delivered with siginfo. 
This prevents the process from correctly identifying the fault address and
handling the error.  From the user-space perspective, applications are
unaware that the limit has been reached and that the siginfo is
effectively 'corrupted'.  This can lead to unpredictable behavior and
crashes, as we observed with java applications.

Fix this by passing override_rlimit into inc_rlimit_get_ucounts() and skip
the comparison to max there if override_rlimit is set.  This effectively
restores the old behavior.

Link: https://lkml.kernel.org/r/20241104195419.3962584-1-roman.gushchin@linux.dev
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Co-developed-by: Andrei Vagin &lt;avagin@google.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Prior to commit d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of
ucounts") UCOUNT_RLIMIT_SIGPENDING rlimit was not enforced for a class of
signals.  However now it's enforced unconditionally, even if
override_rlimit is set.  This behavior change caused production issues.  

For example, if the limit is reached and a process receives a SIGSEGV
signal, sigqueue_alloc fails to allocate the necessary resources for the
signal delivery, preventing the signal from being delivered with siginfo. 
This prevents the process from correctly identifying the fault address and
handling the error.  From the user-space perspective, applications are
unaware that the limit has been reached and that the siginfo is
effectively 'corrupted'.  This can lead to unpredictable behavior and
crashes, as we observed with java applications.

Fix this by passing override_rlimit into inc_rlimit_get_ucounts() and skip
the comparison to max there if override_rlimit is set.  This effectively
restores the old behavior.

Link: https://lkml.kernel.org/r/20241104195419.3962584-1-roman.gushchin@linux.dev
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Co-developed-by: Andrei Vagin &lt;avagin@google.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ucounts: fix counter leak in inc_rlimit_get_ucounts()</title>
<updated>2024-11-07T22:14:59+00:00</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@google.com</email>
</author>
<published>2024-11-01T19:19:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=432dc0654c612457285a5dcf9bb13968ac6f0804'/>
<id>432dc0654c612457285a5dcf9bb13968ac6f0804</id>
<content type='text'>
The inc_rlimit_get_ucounts() increments the specified rlimit counter and
then checks its limit.  If the value exceeds the limit, the function
returns an error without decrementing the counter.

Link: https://lkml.kernel.org/r/20241101191940.3211128-1-roman.gushchin@linux.dev
Fixes: 15bc01effefe ("ucounts: Fix signal ucount refcounting")
Signed-off-by: Andrei Vagin &lt;avagin@google.com&gt;
Co-developed-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Tested-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Andrei Vagin &lt;avagin@google.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The inc_rlimit_get_ucounts() increments the specified rlimit counter and
then checks its limit.  If the value exceeds the limit, the function
returns an error without decrementing the counter.

Link: https://lkml.kernel.org/r/20241101191940.3211128-1-roman.gushchin@linux.dev
Fixes: 15bc01effefe ("ucounts: Fix signal ucount refcounting")
Signed-off-by: Andrei Vagin &lt;avagin@google.com&gt;
Co-developed-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Tested-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Andrei Vagin &lt;avagin@google.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
