linux.git/kernel/power/process.c, branch v6.18.21

Revert "PM: sleep: Make pm_wakeup_clear() call more clear"

2025-10-23T10:48:04+00:00

This reverts commit 56a232d93cea0ba14da5e3157830330756a45b4c.

The above commit changed the position of pm_wakeup_clear() for the
suspend call path, but other call paths with references to
freeze_processes() were not updated. This means that other call
paths, such as hibernate(), will not have pm_wakeup_clear() called.

Suggested-by: Saravana Kannan 
Signed-off-by: Samuel Wu 
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251022222830.634086-1-wusamuel@google.com
Signed-off-by: Rafael J. Wysocki

PM: sleep: Make pm_wakeup_clear() call more clear

2025-09-04T19:05:14+00:00

Move pm_wakeup_clear() to the same location as other functions that do
bookkeeping prior to suspend_prepare().

Since calling pm_wakeup_clear() is a prerequisite to setting up for
suspend and enabling functionalities of suspend (like aborting during
suspend), moving pm_wakeup_clear() higher up the call stack makes its
intent more clear and obvious that it is called prior to
suspend_prepare().

After this change, there is a slightly larger window when abort events
can be registered, but otherwise suspend functionality is the same.

Suggested-by: Saravana Kannan 
Signed-off-by: Samuel Wu 
Link: https://patch.msgid.link/20250821004237.2712312-2-wusamuel@google.com
Reviewed-by: Saravana Kannan 
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki

PM: freezer: Rewrite restarting tasks log to remove stray done.

2025-05-16T20:22:10+00:00

`pr_cont()` unfortunately does not work here, as other parts of the
Linux kernel log between the two log lines:

    [18445.295056] r8152-cfgselector 4-1.1.3: USB disconnect, device number 5
    [18445.295112] OOM killer enabled.
    [18445.295115] Restarting tasks ...
    [18445.295185] usb 3-1: USB disconnect, device number 2
    [18445.295193] usb 3-1.1: USB disconnect, device number 3
    [18445.296262] usb 3-1.5: USB disconnect, device number 4
    [18445.297017] done.
    [18445.297029] random: crng reseeded on system resumption

`pr_cont()` also uses the default log level, normally warning, if the
corresponding log line is interrupted.

Therefore, replace the `pr_cont()`, and explicitly log it as a separate
line with log level info:

    Restarting tasks: Starting
    […]
    Restarting tasks: Done

Signed-off-by: Paul Menzel 
Link: https://patch.msgid.link/20250511174648.950430-1-pmenzel@molgen.mpg.de
[ rjw: Rebase on top of an earlier analogous change ]
Signed-off-by: Rafael J. Wysocki

PM: sleep: Use two lines for "Restarting..." / "done" messages

2025-04-22T11:58:30+00:00

Other messages are occasionally printed between these two, for example:

    [203104.106534] Restarting tasks ...
    [203104.106559] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
    [203104.112354] done.

This seems to be a timing issue, seen in two of the eleven
hibernation exits in my current `dmesg` output.

When printed on its own, the "done" message has the default log level.
This makes the output of `dmesg --level=warn` quite misleading.

Add enough context for the "done" messages to make sense on their own,
and use the same log level for all messages.

Change the messages to "..." / "Done .", unlike a449dfbfc089
which uses "..." / " completed.".  Front-loading the unique
part of the message makes it easier to scan the log, and reduces ambiguity
for users who aren't confident in their English comprehension.

Reviewed-by: Lucas De Marchi 
Signed-off-by: Andrew Sayers 
Link: https://patch.msgid.link/20250411152632.2806038-1-kernel.org@pileofstuff.org
Signed-off-by: Rafael J. Wysocki

cgroup/cpuset: Make cpuset hotplug processing synchronous

2024-04-08T17:39:16+00:00

Since commit 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside
get_online_cpus()"), cpuset hotplug was done asynchronously via a work
function. This is to avoid recursive locking of cgroup_mutex.

Since then, the cgroup locking scheme has changed quite a bit. A
cpuset_mutex was introduced to protect cpuset specific operations.
The cpuset_mutex is then replaced by a cpuset_rwsem. With commit
d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock
order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on,
cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes
allow the hotplug code to call into cpuset core directly.

The following commits were also merged due to the asynchronous nature
of cpuset hotplug processing.

  - commit b22afcdf04c9 ("cpu/hotplug: Cure the cpusets trainwreck")
  - commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume
    bugs")
  - commit 28b89b9e6f7b ("cpuset: handle race between CPU hotplug and
    cpuset_hotplug_work")

Clean up all these bandages by making cpuset hotplug
processing synchronous again with the exception that the call to
cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1
cpuset, if necessary, will still be done via a work function due to the
existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible
to reverse that dependency, but that will require updating a number of
different cgroup controllers. This special hotplug code path should be
rarely taken anyway.

As all the cpuset states will be updated by the end of the hotplug
operation, we can revert most the above commits except commit
50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs")
which is partially reverted.  Also removing some cpus_read_lock trylock
attempts in the cpuset partition code as they are no longer necessary
since the cpu_hotplug_lock is now held for the whole duration of the
cpuset hotplug code path.

Signed-off-by: Waiman Long 
Tested-by: Valentin Schneider 
Signed-off-by: Tejun Heo

workqueue: Introduce show_freezable_workqueues

2023-03-24T01:55:38+00:00

Currently show_all_workqueue is called if freeze fails at the time of
freeze the workqueues, which shows the status of all workqueues and of
all worker pools. In this cases we may only need to dump state of only
workqueues that are freezable and busy.

This patch defines show_freezable_workqueues, which uses
show_one_workqueue, a granular function that shows the state of individual
workqueues, so that dump only the state of freezable workqueues
at that time.

tj: Minor message adjustment.

Signed-off-by: Jungseung Lee 
Signed-off-by: Tejun Heo

PM: sleep: Refine error message in try_to_freeze_tasks()

2022-12-06T11:04:34+00:00

A previous change amended try_to_freeze_tasks() with the "what"
variable pointing to a string describing the group of tasks subject to
the freezing which may be used in the error message in there too, so
make that happen.

Accordingly, update sleepgraph.py to catch the modified error message
as appropriate.

Signed-off-by: Rafael J. Wysocki 
Reviewed-by: Petr Mladek

PM: sleep: Avoid using pr_cont() in the tasks freezing code

2022-12-06T11:04:10+00:00

Using pr_cont() in the tasks freezing code related to system-wide
suspend and hibernation is problematic, because the continuation
messages printed there are susceptible to interspersing with other
unrelated messages which results in output that is hard to
understand.

Address this issue by modifying try_to_freeze_tasks() to print
messages that don't require continuations and adjusting its
callers accordingly.

Reported-by: Thomas Weißschuh 
Signed-off-by: Rafael J. Wysocki 
Reviewed-by: Petr Mladek

freezer,sched: Rewrite core freezer logic

2022-09-07T19:53:50+00:00

Rewrite the core freezer to behave better wrt thawing and be simpler
in general.

By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.

As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).

Specifically; the current scheme works a little like:

	freezer_do_not_count();
	schedule();
	freezer_count();

And either the task is blocked, or it lands in try_to_freezer()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.

However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.

That is, thawing tries to thaw things in explicit order; kernel
threads and workqueues before doing bringing SMP back before userspace
etc.. However due to the above mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.

This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.

As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:

	TASK_FREEZABLE	-> TASK_FROZEN
	__TASK_STOPPED	-> TASK_FROZEN
	__TASK_TRACED	-> TASK_FROZEN

The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).

The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.

With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.

Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Ingo Molnar 
Acked-by: Rafael J. Wysocki 
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org

PM: sleep: Narrow down -DDEBUG on kernel/power/ files

2022-04-13T14:34:01+00:00

The macro -DDEBUG is broadly enabled on kernel/power/ directory if
CONFIG_DYNAMIC_DEBUG is enabled. As side effect all debug messages using
pr_debug() and dev_dbg() are enabled by default on dynamic debug.
We're reworking pm_pr_dbg() to support dynamic debug, where pm_pr_dbg()
will print message if either pm_debug_messages_on flag is set or if it's
explicitly enabled on dynamic debug's control. That means if we let
-DDEBUG broadly set, the pm_debug_messages_on flag will be bypassed by
default on pm_pr_dbg() if dynamic debug is also enabled.

The files that directly use pr_debug() and dev_dbg() on kernel/power/ are:
 - swap.c
 - snapshot.c
 - energy_model.c

And those files do not use pm_pr_dbg(). So if we limit -DDEBUG to them,
we keep the same functional behavior while allowing the pm_pr_dbg()
refactor.

Signed-off-by: David Cohen 
Signed-off-by: Rafael J. Wysocki