<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/watchdog.c, branch v6.18.21</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()</title>
<updated>2026-03-04T12:21:25+00:00</updated>
<author>
<name>Shengming Hu</name>
<email>hu.shengming@zte.com.cn</email>
</author>
<published>2026-01-19T13:59:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=5e227857fcd942a1ea439272e636b126a97f593c'/>
<id>5e227857fcd942a1ea439272e636b126a97f593c</id>
<content type='text'>
[ Upstream commit cafe4074a7221dca2fa954dd1ab0cf99b6318e23 ]

cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
ring buffer. need_counting_irqs() currently wraps the index using
NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.

Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if
the NUM_HARDIRQ_REPORT or  NUM_SAMPLE_PERIODS changes.

Link: https://lkml.kernel.org/r/tencent_7068189CB6D6689EB353F3D17BF5A5311A07@qq.com
Fixes: e9a9292e2368 ("watchdog/softlockup: Report the most frequent interrupts")
Signed-off-by: Shengming Hu &lt;hu.shengming@zte.com.cn&gt;
Reviewed-by: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Zhang Run &lt;zhang.run@zte.com.cn&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cafe4074a7221dca2fa954dd1ab0cf99b6318e23 ]

cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
ring buffer. need_counting_irqs() currently wraps the index using
NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.

Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if
the NUM_HARDIRQ_REPORT or  NUM_SAMPLE_PERIODS changes.

Link: https://lkml.kernel.org/r/tencent_7068189CB6D6689EB353F3D17BF5A5311A07@qq.com
Fixes: e9a9292e2368 ("watchdog/softlockup: Report the most frequent interrupts")
Signed-off-by: Shengming Hu &lt;hu.shengming@zte.com.cn&gt;
Reviewed-by: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Zhang Run &lt;zhang.run@zte.com.cn&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog: skip checks when panic is in progress</title>
<updated>2025-09-14T00:32:53+00:00</updated>
<author>
<name>Jinchao Wang</name>
<email>wangjinchao600@gmail.com</email>
</author>
<published>2025-08-25T02:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=3d5f4f15b778d6da9760d54455cc256ecf924c0a'/>
<id>3d5f4f15b778d6da9760d54455cc256ecf924c0a</id>
<content type='text'>
This issue was found when an EFI pstore was configured for kdump logging
with the NMI hard lockup detector enabled.  The efi-pstore write operation
was slow, and with a large number of logs, the pstore dump callback within
kmsg_dump() took a long time.

This delay triggered the NMI watchdog, leading to a nested panic.  The
call flow demonstrates how the secondary panic caused an
emergency_restart() to be triggered before the initial pstore operation
could finish, leading to a failure to dump the logs:

  real panic() {
	kmsg_dump() {
		...
		pstore_dump() {
			start_dump();
			... // long time operation triggers NMI watchdog
			nmi panic() {
				...
				emergency_restart(); // pstore unfinished
			}
			...
			finish_dump(); // never reached
		}
	}
  }

Both watchdog_buddy_check_hardlockup() and watchdog_overflow_callback()
may trigger during a panic.  This can lead to recursive panic handling.

Add panic_in_progress() checks so watchdog activity is skipped once a
panic has begun.

This prevents recursive panic and keeps the panic path more reliable.

Link: https://lkml.kernel.org/r/20250825022947.1596226-10-wangjinchao600@gmail.com
Signed-off-by: Jinchao Wang &lt;wangjinchao600@gmail.com&gt;
Reviewed-by: Yury Norov (NVIDIA) &lt;yury.norov@gmail.com&gt;
Cc: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Young &lt;dyoung@redhat.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: "Guilherme G. Piccoli" &lt;gpiccoli@igalia.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Joanthan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: John Ogness &lt;john.ogness@linutronix.de&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Luo Gengkun &lt;luogengkun@huaweicloud.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Nam Cao &lt;namcao@linutronix.de&gt;
Cc: oushixiong &lt;oushixiong@kylinos.cn&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Qianqiang Liu &lt;qianqiang.liu@163.com&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Zimemrmann &lt;tzimmermann@suse.de&gt;
Cc: Thorsten Blum &lt;thorsten.blum@linux.dev&gt;
Cc: Ville Syrjala &lt;ville.syrjala@linux.intel.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Cc: Yunhui Cui &lt;cuiyunhui@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This issue was found when an EFI pstore was configured for kdump logging
with the NMI hard lockup detector enabled.  The efi-pstore write operation
was slow, and with a large number of logs, the pstore dump callback within
kmsg_dump() took a long time.

This delay triggered the NMI watchdog, leading to a nested panic.  The
call flow demonstrates how the secondary panic caused an
emergency_restart() to be triggered before the initial pstore operation
could finish, leading to a failure to dump the logs:

  real panic() {
	kmsg_dump() {
		...
		pstore_dump() {
			start_dump();
			... // long time operation triggers NMI watchdog
			nmi panic() {
				...
				emergency_restart(); // pstore unfinished
			}
			...
			finish_dump(); // never reached
		}
	}
  }

Both watchdog_buddy_check_hardlockup() and watchdog_overflow_callback()
may trigger during a panic.  This can lead to recursive panic handling.

Add panic_in_progress() checks so watchdog activity is skipped once a
panic has begun.

This prevents recursive panic and keeps the panic path more reliable.

Link: https://lkml.kernel.org/r/20250825022947.1596226-10-wangjinchao600@gmail.com
Signed-off-by: Jinchao Wang &lt;wangjinchao600@gmail.com&gt;
Reviewed-by: Yury Norov (NVIDIA) &lt;yury.norov@gmail.com&gt;
Cc: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Young &lt;dyoung@redhat.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: "Guilherme G. Piccoli" &lt;gpiccoli@igalia.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Joanthan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: John Ogness &lt;john.ogness@linutronix.de&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Luo Gengkun &lt;luogengkun@huaweicloud.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Nam Cao &lt;namcao@linutronix.de&gt;
Cc: oushixiong &lt;oushixiong@kylinos.cn&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Qianqiang Liu &lt;qianqiang.liu@163.com&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Sohil Mehta &lt;sohil.mehta@intel.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: Thomas Zimemrmann &lt;tzimmermann@suse.de&gt;
Cc: Thorsten Blum &lt;thorsten.blum@linux.dev&gt;
Cc: Ville Syrjala &lt;ville.syrjala@linux.intel.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Cc: Yunhui Cui &lt;cuiyunhui@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog/softlockup: fix incorrect CPU utilization output during softlockup</title>
<updated>2025-09-14T00:32:47+00:00</updated>
<author>
<name>ZhenguoYao</name>
<email>yaozhenguo1@gmail.com</email>
</author>
<published>2025-08-12T08:25:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=95f091274f3db39493e8b5c44671b9f1e02c0c25'/>
<id>95f091274f3db39493e8b5c44671b9f1e02c0c25</id>
<content type='text'>
Since we use 16-bit precision, the raw data will undergo integer division,
which may sometimes result in data loss.  This can lead to slightly
inaccurate CPU utilization calculations.  Under normal circumstances, this
isn't an issue.  However, when CPU utilization reaches 100%, the
calculated result might exceed 100%.  For example, with raw data like the
following:

sample_period 400000134 new_stat 83648414036 old_stat 83247417494

sample_period=400000134/2^24=23
new_stat=83648414036/2^24=4985
old_stat=83247417494/2^24=4961
util=105%

Below log will output：

CPU#3 Utilization every 0s during lockup:
    #1:   0% system,          0% softirq,   105% hardirq,     0% idle
    #2:   0% system,          0% softirq,   105% hardirq,     0% idle
    #3:   0% system,          0% softirq,   100% hardirq,     0% idle
    #4:   0% system,          0% softirq,   105% hardirq,     0% idle
    #5:   0% system,          0% softirq,   105% hardirq,     0% idle

To avoid confusion, we enforce a 100% display cap when calculations exceed
this threshold.

We also round to the nearest multiple of 16.8 milliseconds to improve the
accuracy.

[yaozhenguo1@gmail.com: make get_16bit_precision() more accurate, fix comment layout]
  Link: https://lkml.kernel.org/r/20250818081438.40540-1-yaozhenguo@jd.com
Link: https://lkml.kernel.org/r/20250812082510.32291-1-yaozhenguo@jd.com
Signed-off-by: ZhenguoYao &lt;yaozhenguo1@gmail.com&gt;
Cc: Bitao Hu &lt;yaoma@linux.alibaba.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since we use 16-bit precision, the raw data will undergo integer division,
which may sometimes result in data loss.  This can lead to slightly
inaccurate CPU utilization calculations.  Under normal circumstances, this
isn't an issue.  However, when CPU utilization reaches 100%, the
calculated result might exceed 100%.  For example, with raw data like the
following:

sample_period 400000134 new_stat 83648414036 old_stat 83247417494

sample_period=400000134/2^24=23
new_stat=83648414036/2^24=4985
old_stat=83247417494/2^24=4961
util=105%

Below log will output：

CPU#3 Utilization every 0s during lockup:
    #1:   0% system,          0% softirq,   105% hardirq,     0% idle
    #2:   0% system,          0% softirq,   105% hardirq,     0% idle
    #3:   0% system,          0% softirq,   100% hardirq,     0% idle
    #4:   0% system,          0% softirq,   105% hardirq,     0% idle
    #5:   0% system,          0% softirq,   105% hardirq,     0% idle

To avoid confusion, we enforce a 100% display cap when calculations exceed
this threshold.

We also round to the nearest multiple of 16.8 milliseconds to improve the
accuracy.

[yaozhenguo1@gmail.com: make get_16bit_precision() more accurate, fix comment layout]
  Link: https://lkml.kernel.org/r/20250818081438.40540-1-yaozhenguo@jd.com
Link: https://lkml.kernel.org/r/20250812082510.32291-1-yaozhenguo@jd.com
Signed-off-by: ZhenguoYao &lt;yaozhenguo1@gmail.com&gt;
Cc: Bitao Hu &lt;yaoma@linux.alibaba.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog/softlockup: fix wrong output when watchdog_thresh &lt; 3</title>
<updated>2025-09-14T00:32:47+00:00</updated>
<author>
<name>ZhenguoYao</name>
<email>yaozhenguo1@gmail.com</email>
</author>
<published>2025-08-12T07:41:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=41f88ddfd453fe894678e1f6909b9fb9e08e8c3d'/>
<id>41f88ddfd453fe894678e1f6909b9fb9e08e8c3d</id>
<content type='text'>
When watchdog_thresh is below 3, sample_period will be less than 1 second.
So the following output will print when softlockup:

CPU#3 Utilization every 0s during lockup

Fix this by changing time unit from seconds to milliseconds.

Link: https://lkml.kernel.org/r/20250812074132.27810-1-yaozhenguo@jd.com
Signed-off-by: ZhenguoYao &lt;yaozhenguo1@gmail.com&gt;
Cc: Bitao Hu &lt;yaoma@linux.alibaba.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When watchdog_thresh is below 3, sample_period will be less than 1 second.
So the following output will print when softlockup:

CPU#3 Utilization every 0s during lockup

Fix this by changing time unit from seconds to milliseconds.

Link: https://lkml.kernel.org/r/20250812074132.27810-1-yaozhenguo@jd.com
Signed-off-by: ZhenguoYao &lt;yaozhenguo1@gmail.com&gt;
Cc: Bitao Hu &lt;yaoma@linux.alibaba.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count</title>
<updated>2025-05-21T17:48:22+00:00</updated>
<author>
<name>Max Kellermann</name>
<email>max.kellermann@ionos.com</email>
</author>
<published>2025-05-04T18:08:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=aaf05e96e93cb9c8fc8d35fc4525715530397655'/>
<id>aaf05e96e93cb9c8fc8d35fc4525715530397655</id>
<content type='text'>
Patch series "sysfs: add counters for lockups and stalls", v2.

Commits 9db89b411170 ("exit: Expose "oops_count" to sysfs") and
8b05aa263361 ("panic: Expose "warn_count" to sysfs") added counters for
oopses and warnings to sysfs, and these two patches do the same for
hard/soft lockups and RCU stalls.

All of these counters are useful for monitoring tools to detect whether
the machine is healthy.  If the kernel has experienced a lockup or a
stall, it's probably due to a kernel bug, and I'd like to detect that
quickly and easily.  There is currently no way to detect that, other than
parsing dmesg.  Or observing indirect effects: such as certain tasks not
responding, but then I need to observe all tasks, and it may take a while
until these effects become visible/measurable.  I'd rather be able to
detect the primary cause more quickly, possibly before everything falls
apart.


This patch (of 2):

There is /proc/sys/kernel/hung_task_detect_count, /sys/kernel/warn_count
and /sys/kernel/oops_count but there is no userspace-accessible counter
for hard/soft lockups.  Having this is useful for monitoring tools.

Link: https://lkml.kernel.org/r/20250504180831.4190860-1-max.kellermann@ionos.com
Link: https://lkml.kernel.org/r/20250504180831.4190860-2-max.kellermann@ionos.com
Signed-off-by: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc:
Cc: Core Minyard &lt;cminyard@mvista.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "sysfs: add counters for lockups and stalls", v2.

Commits 9db89b411170 ("exit: Expose "oops_count" to sysfs") and
8b05aa263361 ("panic: Expose "warn_count" to sysfs") added counters for
oopses and warnings to sysfs, and these two patches do the same for
hard/soft lockups and RCU stalls.

All of these counters are useful for monitoring tools to detect whether
the machine is healthy.  If the kernel has experienced a lockup or a
stall, it's probably due to a kernel bug, and I'd like to detect that
quickly and easily.  There is currently no way to detect that, other than
parsing dmesg.  Or observing indirect effects: such as certain tasks not
responding, but then I need to observe all tasks, and it may take a while
until these effects become visible/measurable.  I'd rather be able to
detect the primary cause more quickly, possibly before everything falls
apart.


This patch (of 2):

There is /proc/sys/kernel/hung_task_detect_count, /sys/kernel/warn_count
and /sys/kernel/oops_count but there is no userspace-accessible counter
for hard/soft lockups.  Having this is useful for monitoring tools.

Link: https://lkml.kernel.org/r/20250504180831.4190860-1-max.kellermann@ionos.com
Link: https://lkml.kernel.org/r/20250504180831.4190860-2-max.kellermann@ionos.com
Signed-off-by: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Cc:
Cc: Core Minyard &lt;cminyard@mvista.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog: fix watchdog may detect false positive of softlockup</title>
<updated>2025-05-12T00:54:10+00:00</updated>
<author>
<name>Luo Gengkun</name>
<email>luogengkun@huaweicloud.com</email>
</author>
<published>2025-04-21T03:50:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7123dbbef88cfd9f09e8a7899b0911834600cfa3'/>
<id>7123dbbef88cfd9f09e8a7899b0911834600cfa3</id>
<content type='text'>
When updating `watchdog_thresh`, there is a race condition between writing
the new `watchdog_thresh` value and stopping the old watchdog timer.  If
the old timer triggers during this window, it may falsely detect a
softlockup due to the old interval and the new `watchdog_thresh` value
being used.  The problem can be described as follow:

 # We asuume previous watchdog_thresh is 60, so the watchdog timer is
 # coming every 24s.
echo 10 &gt; /proc/sys/kernel/watchdog_thresh (User space)
|
+------&gt;+ update watchdog_thresh (We are in kernel now)
	|
	|	  # using old interval and new `watchdog_thresh`
	+------&gt;+ watchdog hrtimer (irq context: detect softlockup)
		|
		|
	+-------+
	|
	|
	+ softlockup_stop_all

To fix this problem, introduce a shadow variable for `watchdog_thresh`. 
The update to the actual `watchdog_thresh` is delayed until after the old
timer is stopped, preventing false positives.

The following testcase may help to understand this problem.

---------------------------------------------
echo RT_RUNTIME_SHARE &gt; /sys/kernel/debug/sched/features
echo -1 &gt; /proc/sys/kernel/sched_rt_runtime_us
echo 0 &gt; /sys/kernel/debug/sched/fair_server/cpu3/runtime
echo 60 &gt; /proc/sys/kernel/watchdog_thresh
taskset -c 3 chrt -r 99 /bin/bash -c "while true;do true; done" &amp;
echo 10 &gt; /proc/sys/kernel/watchdog_thresh &amp;
---------------------------------------------

The test case above first removes the throttling restrictions for
real-time tasks.  It then sets watchdog_thresh to 60 and executes a
real-time task ,a simple while(1) loop, on cpu3.  Consequently, the final
command gets blocked because the presence of this real-time thread
prevents kworker:3 from being selected by the scheduler.  This eventually
triggers a softlockup detection on cpu3 due to watchdog_timer_fn operating
with inconsistent variable - using both the old interval and the updated
watchdog_thresh simultaneously.

[nysal@linux.ibm.com: fix the SOFTLOCKUP_DETECTOR=n case]
  Link: https://lkml.kernel.org/r/20250502111120.282690-1-nysal@linux.ibm.com
Link: https://lkml.kernel.org/r/20250421035021.3507649-1-luogengkun@huaweicloud.com
Signed-off-by: Luo Gengkun &lt;luogengkun@huaweicloud.com&gt;
Signed-off-by: Nysal Jan K.A. &lt;nysal@linux.ibm.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: "Nysal Jan K.A." &lt;nysal@linux.ibm.com&gt;
Cc: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When updating `watchdog_thresh`, there is a race condition between writing
the new `watchdog_thresh` value and stopping the old watchdog timer.  If
the old timer triggers during this window, it may falsely detect a
softlockup due to the old interval and the new `watchdog_thresh` value
being used.  The problem can be described as follow:

 # We asuume previous watchdog_thresh is 60, so the watchdog timer is
 # coming every 24s.
echo 10 &gt; /proc/sys/kernel/watchdog_thresh (User space)
|
+------&gt;+ update watchdog_thresh (We are in kernel now)
	|
	|	  # using old interval and new `watchdog_thresh`
	+------&gt;+ watchdog hrtimer (irq context: detect softlockup)
		|
		|
	+-------+
	|
	|
	+ softlockup_stop_all

To fix this problem, introduce a shadow variable for `watchdog_thresh`. 
The update to the actual `watchdog_thresh` is delayed until after the old
timer is stopped, preventing false positives.

The following testcase may help to understand this problem.

---------------------------------------------
echo RT_RUNTIME_SHARE &gt; /sys/kernel/debug/sched/features
echo -1 &gt; /proc/sys/kernel/sched_rt_runtime_us
echo 0 &gt; /sys/kernel/debug/sched/fair_server/cpu3/runtime
echo 60 &gt; /proc/sys/kernel/watchdog_thresh
taskset -c 3 chrt -r 99 /bin/bash -c "while true;do true; done" &amp;
echo 10 &gt; /proc/sys/kernel/watchdog_thresh &amp;
---------------------------------------------

The test case above first removes the throttling restrictions for
real-time tasks.  It then sets watchdog_thresh to 60 and executes a
real-time task ,a simple while(1) loop, on cpu3.  Consequently, the final
command gets blocked because the presence of this real-time thread
prevents kworker:3 from being selected by the scheduler.  This eventually
triggers a softlockup detection on cpu3 due to watchdog_timer_fn operating
with inconsistent variable - using both the old interval and the updated
watchdog_thresh simultaneously.

[nysal@linux.ibm.com: fix the SOFTLOCKUP_DETECTOR=n case]
  Link: https://lkml.kernel.org/r/20250502111120.282690-1-nysal@linux.ibm.com
Link: https://lkml.kernel.org/r/20250421035021.3507649-1-luogengkun@huaweicloud.com
Signed-off-by: Luo Gengkun &lt;luogengkun@huaweicloud.com&gt;
Signed-off-by: Nysal Jan K.A. &lt;nysal@linux.ibm.com&gt;
Cc: Doug Anderson &lt;dianders@chromium.org&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Cc: "Nysal Jan K.A." &lt;nysal@linux.ibm.com&gt;
Cc: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-03-25T17:54:15+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-03-25T17:54:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a50b4fe095fb98e0b7da03b0a42fd1247284868e'/>
<id>a50b4fe095fb98e0b7da03b0a42fd1247284868e</id>
<content type='text'>
Pull timer cleanups from Thomas Gleixner:
 "A treewide hrtimer timer cleanup

  hrtimers are initialized with hrtimer_init() and a subsequent store to
  the callback pointer. This turned out to be suboptimal for the
  upcoming Rust integration and is obviously a silly implementation to
  begin with.

  This cleanup replaces the hrtimer_init(T); T-&gt;function = cb; sequence
  with hrtimer_setup(T, cb);

  The conversion was done with Coccinelle and a few manual fixups.

  Once the conversion has completely landed in mainline, hrtimer_init()
  will be removed and the hrtimer::function becomes a private member"

* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
  wifi: rt2x00: Switch to use hrtimer_update_function()
  io_uring: Use helper function hrtimer_update_function()
  serial: xilinx_uartps: Use helper function hrtimer_update_function()
  ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
  RDMA: Switch to use hrtimer_setup()
  virtio: mem: Switch to use hrtimer_setup()
  drm/vmwgfx: Switch to use hrtimer_setup()
  drm/xe/oa: Switch to use hrtimer_setup()
  drm/vkms: Switch to use hrtimer_setup()
  drm/msm: Switch to use hrtimer_setup()
  drm/i915/request: Switch to use hrtimer_setup()
  drm/i915/uncore: Switch to use hrtimer_setup()
  drm/i915/pmu: Switch to use hrtimer_setup()
  drm/i915/perf: Switch to use hrtimer_setup()
  drm/i915/gvt: Switch to use hrtimer_setup()
  drm/i915/huc: Switch to use hrtimer_setup()
  drm/amdgpu: Switch to use hrtimer_setup()
  stm class: heartbeat: Switch to use hrtimer_setup()
  i2c: Switch to use hrtimer_setup()
  iio: Switch to use hrtimer_setup()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull timer cleanups from Thomas Gleixner:
 "A treewide hrtimer timer cleanup

  hrtimers are initialized with hrtimer_init() and a subsequent store to
  the callback pointer. This turned out to be suboptimal for the
  upcoming Rust integration and is obviously a silly implementation to
  begin with.

  This cleanup replaces the hrtimer_init(T); T-&gt;function = cb; sequence
  with hrtimer_setup(T, cb);

  The conversion was done with Coccinelle and a few manual fixups.

  Once the conversion has completely landed in mainline, hrtimer_init()
  will be removed and the hrtimer::function becomes a private member"

* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
  wifi: rt2x00: Switch to use hrtimer_update_function()
  io_uring: Use helper function hrtimer_update_function()
  serial: xilinx_uartps: Use helper function hrtimer_update_function()
  ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
  RDMA: Switch to use hrtimer_setup()
  virtio: mem: Switch to use hrtimer_setup()
  drm/vmwgfx: Switch to use hrtimer_setup()
  drm/xe/oa: Switch to use hrtimer_setup()
  drm/vkms: Switch to use hrtimer_setup()
  drm/msm: Switch to use hrtimer_setup()
  drm/i915/request: Switch to use hrtimer_setup()
  drm/i915/uncore: Switch to use hrtimer_setup()
  drm/i915/pmu: Switch to use hrtimer_setup()
  drm/i915/perf: Switch to use hrtimer_setup()
  drm/i915/gvt: Switch to use hrtimer_setup()
  drm/i915/huc: Switch to use hrtimer_setup()
  drm/amdgpu: Switch to use hrtimer_setup()
  stm class: heartbeat: Switch to use hrtimer_setup()
  i2c: Switch to use hrtimer_setup()
  iio: Switch to use hrtimer_setup()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog/hardlockup/perf: Fix perf_event memory leak</title>
<updated>2025-03-06T11:05:33+00:00</updated>
<author>
<name>Li Huafei</name>
<email>lihuafei1@huawei.com</email>
</author>
<published>2024-10-21T19:30:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d6834d9c990333bfa433bc1816e2417f268eebbe'/>
<id>d6834d9c990333bfa433bc1816e2417f268eebbe</id>
<content type='text'>
During stress-testing, we found a kmemleak report for perf_event:

  unreferenced object 0xff110001410a33e0 (size 1328):
    comm "kworker/4:11", pid 288, jiffies 4294916004
    hex dump (first 32 bytes):
      b8 be c2 3b 02 00 11 ff 22 01 00 00 00 00 ad de  ...;....".......
      f0 33 0a 41 01 00 11 ff f0 33 0a 41 01 00 11 ff  .3.A.....3.A....
    backtrace (crc 24eb7b3a):
      [&lt;00000000e211b653&gt;] kmem_cache_alloc_node_noprof+0x269/0x2e0
      [&lt;000000009d0985fa&gt;] perf_event_alloc+0x5f/0xcf0
      [&lt;00000000084ad4a2&gt;] perf_event_create_kernel_counter+0x38/0x1b0
      [&lt;00000000fde96401&gt;] hardlockup_detector_event_create+0x50/0xe0
      [&lt;0000000051183158&gt;] watchdog_hardlockup_enable+0x17/0x70
      [&lt;00000000ac89727f&gt;] softlockup_start_fn+0x15/0x40
      ...

Our stress test includes CPU online and offline cycles, and updating the
watchdog configuration.

After reading the code, I found that there may be a race between cleaning up
perf_event after updating watchdog and disabling event when the CPU goes offline:

  CPU0                          CPU1                           CPU2
  (update watchdog)                                            (hotplug offline CPU1)

  ...                                                          _cpu_down(CPU1)
  cpus_read_lock()                                             // waiting for cpu lock
    softlockup_start_all
      smp_call_on_cpu(CPU1)
                                softlockup_start_fn
                                ...
                                  watchdog_hardlockup_enable(CPU1)
                                    perf create E1
                                    watchdog_ev[CPU1] = E1
  cpus_read_unlock()
                                                               cpus_write_lock()
                                                               cpuhp_kick_ap_work(CPU1)
                                cpuhp_thread_fun
                                ...
                                  watchdog_hardlockup_disable(CPU1)
                                    watchdog_ev[CPU1] = NULL
                                    dead_event[CPU1] = E1
  __lockup_detector_cleanup
    for each dead_events_mask
      release each dead_event
      /*
       * CPU1 has not been added to
       * dead_events_mask, then E1
       * will not be released
       */
                                    CPU1 -&gt; dead_events_mask
    cpumask_clear(&amp;dead_events_mask)
    // dead_events_mask is cleared, E1 is leaked

In this case, the leaked perf_event E1 matches the perf_event leak
reported by kmemleak. Due to the low probability of problem recurrence
(only reported once), I added some hack delays in the code:

  static void __lockup_detector_reconfigure(void)
  {
    ...
          watchdog_hardlockup_start();
          cpus_read_unlock();
  +       mdelay(100);
          /*
           * Must be called outside the cpus locked section to prevent
           * recursive locking in the perf code.
    ...
  }

  void watchdog_hardlockup_disable(unsigned int cpu)
  {
    ...
                  perf_event_disable(event);
                  this_cpu_write(watchdog_ev, NULL);
                  this_cpu_write(dead_event, event);
  +               mdelay(100);
                  cpumask_set_cpu(smp_processor_id(), &amp;dead_events_mask);
                  atomic_dec(&amp;watchdog_cpus);
    ...
  }

  void hardlockup_detector_perf_cleanup(void)
  {
    ...
                          perf_event_release_kernel(event);
                  per_cpu(dead_event, cpu) = NULL;
          }
  +       mdelay(100);
          cpumask_clear(&amp;dead_events_mask);
  }

Then, simultaneously performing CPU on/off and switching watchdog, it is
almost certain to reproduce this leak.

The problem here is that releasing perf_event is not within the CPU
hotplug read-write lock. Commit:

  941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")

introduced deferred release to solve the deadlock caused by calling
get_online_cpus() when releasing perf_event. Later, commit:

  efe951d3de91 ("perf/x86: Fix perf,x86,cpuhp deadlock")

removed the get_online_cpus() call on the perf_event release path to solve
another deadlock problem.

Therefore, it is now possible to move the release of perf_event back
into the CPU hotplug read-write lock, and release the event immediately
after disabling it.

Fixes: 941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")
Signed-off-by: Li Huafei &lt;lihuafei1@huawei.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241021193004.308303-1-lihuafei1@huawei.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During stress-testing, we found a kmemleak report for perf_event:

  unreferenced object 0xff110001410a33e0 (size 1328):
    comm "kworker/4:11", pid 288, jiffies 4294916004
    hex dump (first 32 bytes):
      b8 be c2 3b 02 00 11 ff 22 01 00 00 00 00 ad de  ...;....".......
      f0 33 0a 41 01 00 11 ff f0 33 0a 41 01 00 11 ff  .3.A.....3.A....
    backtrace (crc 24eb7b3a):
      [&lt;00000000e211b653&gt;] kmem_cache_alloc_node_noprof+0x269/0x2e0
      [&lt;000000009d0985fa&gt;] perf_event_alloc+0x5f/0xcf0
      [&lt;00000000084ad4a2&gt;] perf_event_create_kernel_counter+0x38/0x1b0
      [&lt;00000000fde96401&gt;] hardlockup_detector_event_create+0x50/0xe0
      [&lt;0000000051183158&gt;] watchdog_hardlockup_enable+0x17/0x70
      [&lt;00000000ac89727f&gt;] softlockup_start_fn+0x15/0x40
      ...

Our stress test includes CPU online and offline cycles, and updating the
watchdog configuration.

After reading the code, I found that there may be a race between cleaning up
perf_event after updating watchdog and disabling event when the CPU goes offline:

  CPU0                          CPU1                           CPU2
  (update watchdog)                                            (hotplug offline CPU1)

  ...                                                          _cpu_down(CPU1)
  cpus_read_lock()                                             // waiting for cpu lock
    softlockup_start_all
      smp_call_on_cpu(CPU1)
                                softlockup_start_fn
                                ...
                                  watchdog_hardlockup_enable(CPU1)
                                    perf create E1
                                    watchdog_ev[CPU1] = E1
  cpus_read_unlock()
                                                               cpus_write_lock()
                                                               cpuhp_kick_ap_work(CPU1)
                                cpuhp_thread_fun
                                ...
                                  watchdog_hardlockup_disable(CPU1)
                                    watchdog_ev[CPU1] = NULL
                                    dead_event[CPU1] = E1
  __lockup_detector_cleanup
    for each dead_events_mask
      release each dead_event
      /*
       * CPU1 has not been added to
       * dead_events_mask, then E1
       * will not be released
       */
                                    CPU1 -&gt; dead_events_mask
    cpumask_clear(&amp;dead_events_mask)
    // dead_events_mask is cleared, E1 is leaked

In this case, the leaked perf_event E1 matches the perf_event leak
reported by kmemleak. Due to the low probability of problem recurrence
(only reported once), I added some hack delays in the code:

  static void __lockup_detector_reconfigure(void)
  {
    ...
          watchdog_hardlockup_start();
          cpus_read_unlock();
  +       mdelay(100);
          /*
           * Must be called outside the cpus locked section to prevent
           * recursive locking in the perf code.
    ...
  }

  void watchdog_hardlockup_disable(unsigned int cpu)
  {
    ...
                  perf_event_disable(event);
                  this_cpu_write(watchdog_ev, NULL);
                  this_cpu_write(dead_event, event);
  +               mdelay(100);
                  cpumask_set_cpu(smp_processor_id(), &amp;dead_events_mask);
                  atomic_dec(&amp;watchdog_cpus);
    ...
  }

  void hardlockup_detector_perf_cleanup(void)
  {
    ...
                          perf_event_release_kernel(event);
                  per_cpu(dead_event, cpu) = NULL;
          }
  +       mdelay(100);
          cpumask_clear(&amp;dead_events_mask);
  }

Then, simultaneously performing CPU on/off and switching watchdog, it is
almost certain to reproduce this leak.

The problem here is that releasing perf_event is not within the CPU
hotplug read-write lock. Commit:

  941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")

introduced deferred release to solve the deadlock caused by calling
get_online_cpus() when releasing perf_event. Later, commit:

  efe951d3de91 ("perf/x86: Fix perf,x86,cpuhp deadlock")

removed the get_online_cpus() call on the perf_event release path to solve
another deadlock problem.

Therefore, it is now possible to move the release of perf_event back
into the CPU hotplug read-write lock, and release the event immediately
after disabling it.

Fixes: 941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")
Signed-off-by: Li Huafei &lt;lihuafei1@huawei.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241021193004.308303-1-lihuafei1@huawei.com
</pre>
</div>
</content>
</entry>
<entry>
<title>watchdog: Switch to use hrtimer_setup()</title>
<updated>2025-02-18T09:32:33+00:00</updated>
<author>
<name>Nam Cao</name>
<email>namcao@linutronix.de</email>
</author>
<published>2025-02-05T10:39:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d2254b0643222ffa7e4e7ac0ae8aa0e4cbf6d09a'/>
<id>d2254b0643222ffa7e4e7ac0ae8aa0e4cbf6d09a</id>
<content type='text'>
hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Patch was created by using Coccinelle.

Signed-off-by: Nam Cao &lt;namcao@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/a5c62f2b5e1ea1cf4d32f37bc2d21a8eeab2f875.1738746821.git.namcao@linutronix.de

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Patch was created by using Coccinelle.

Signed-off-by: Nam Cao &lt;namcao@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/a5c62f2b5e1ea1cf4d32f37bc2d21a8eeab2f875.1738746821.git.namcao@linutronix.de

</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: const qualify ctl_tables where applicable</title>
<updated>2025-01-28T12:48:37+00:00</updated>
<author>
<name>Joel Granados</name>
<email>joel.granados@kernel.org</email>
</author>
<published>2025-01-28T12:48:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1751f872cc97f992ed5c4c72c55588db1f0021e1'/>
<id>1751f872cc97f992ed5c4c72c55588db1f0021e1</id>
<content type='text'>
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &amp;uts_kern/const struct ctl_table *table = \&amp;uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu &lt;song@kernel.org&gt;
Acked-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt; # for kernel/trace/
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt; # SCSI
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # xfs
Acked-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Acked-by: Corey Minyard &lt;cminyard@mvista.com&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Bill O'Donnell &lt;bodonnel@redhat.com&gt;
Acked-by: Baoquan He &lt;bhe@redhat.com&gt;
Acked-by: Ashutosh Dixit &lt;ashutosh.dixit@intel.com&gt;
Acked-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Signed-off-by: Joel Granados &lt;joel.granados@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &amp;uts_kern/const struct ctl_table *table = \&amp;uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu &lt;song@kernel.org&gt;
Acked-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt; # for kernel/trace/
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt; # SCSI
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt; # xfs
Acked-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Acked-by: Corey Minyard &lt;cminyard@mvista.com&gt;
Acked-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Bill O'Donnell &lt;bodonnel@redhat.com&gt;
Acked-by: Baoquan He &lt;bhe@redhat.com&gt;
Acked-by: Ashutosh Dixit &lt;ashutosh.dixit@intel.com&gt;
Acked-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Signed-off-by: Joel Granados &lt;joel.granados@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
