From 6b93bb41f6eaa1cc5c5f30ec3b687b380f116cd0 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:00 +0206 Subject: printk: Add non-BKL (nbcon) console basic infrastructure The current console/printk subsystem is protected by a Big Kernel Lock, (aka console_lock) which has ill defined semantics and is more or less stateless. This puts severe limitations on the console subsystem and makes forced takeover and output in emergency and panic situations a fragile endeavour that is based on try and pray. The goal of non-BKL (nbcon) consoles is to break out of the console lock jail and to provide a new infrastructure that avoids the pitfalls and also allows console drivers to be gradually converted over. The proposed infrastructure aims for the following properties: - Per console locking instead of global locking - Per console state that allows to make informed decisions - Stateful handover and takeover As a first step, state is added to struct console. The per console state is an atomic_t using a 32bit bit field. Reserve state bits, which will be populated later in the series. Wire it up into the console register/unregister functionality. It was decided to use a bitfield because using a plain u32 with mask/shift operations resulted in uncomprehensible code. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Reviewed-by: Petr Mladek Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-2-john.ogness@linutronix.de --- include/linux/console.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index 7de11c763eb3..a2d37a7a98a8 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -156,6 +156,8 @@ static inline int con_debug_leave(void) * /dev/kmesg which requires a larger output buffer. * @CON_SUSPENDED: Indicates if a console is suspended. If true, the * printing callbacks must not be called. + * @CON_NBCON: Console can operate outside of the legacy style console_lock + * constraints. */ enum cons_flags { CON_PRINTBUFFER = BIT(0), @@ -166,8 +168,32 @@ enum cons_flags { CON_BRL = BIT(5), CON_EXTENDED = BIT(6), CON_SUSPENDED = BIT(7), + CON_NBCON = BIT(8), }; +/** + * struct nbcon_state - console state for nbcon consoles + * @atom: Compound of the state fields for atomic operations + * + * To be used for reading and preparing of the value stored in the nbcon + * state variable @console::nbcon_state. + */ +struct nbcon_state { + union { + unsigned int atom; + struct { + }; + }; +}; + +/* + * The nbcon_state struct is used to easily create and interpret values that + * are stored in the @console::nbcon_state variable. Ensure this struct stays + * within the size boundaries of the atomic variable's underlying type in + * order to avoid any accidental truncation. + */ +static_assert(sizeof(struct nbcon_state) <= sizeof(int)); + /** * struct console - The console descriptor structure * @name: The name of the console driver @@ -187,6 +213,8 @@ enum cons_flags { * @dropped: Number of unreported dropped ringbuffer records * @data: Driver private data * @node: hlist node for the console list + * + * @nbcon_state: State for nbcon consoles */ struct console { char name[16]; @@ -206,6 +234,9 @@ struct console { unsigned long dropped; void *data; struct hlist_node node; + + /* nbcon console specific members */ + atomic_t __private nbcon_state; }; #ifdef CONFIG_LOCKDEP -- cgit v1.2.3 From 3a5bb25162b880da749f7cdf281c78dbade4164b Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:01 +0206 Subject: printk: nbcon: Add acquire/release logic Add per console acquire/release functionality. The state of the console is maintained in the "nbcon_state" atomic variable. The console is locked when: - The 'prio' field contains the priority of the context that owns the console. Only higher priority contexts are allowed to take over the lock. A value of 0 (NBCON_PRIO_NONE) means the console is not locked. - The 'cpu' field denotes on which CPU the console is locked. It is used to prevent busy waiting on the same CPU. Also it informs the lock owner that it has lost the lock in a more complex scenario when the lock was taken over by a higher priority context, released, and taken on another CPU with the same priority as the interrupted owner. The acquire mechanism uses a few more fields: - The 'req_prio' field is used by the handover approach to make the current owner aware that there is a context with a higher priority waiting for the friendly handover. - The 'unsafe' field allows to take over the console in a safe way in the middle of emitting a message. The field is set only when accessing some shared resources or when the console device is manipulated. It can be cleared, for example, after emitting one character when the console device is in a consistent state. - The 'unsafe_takeover' field is set when a hostile takeover took the console in an unsafe state. The console will stay in the unsafe state until re-initialized. The acquire mechanism uses three approaches: 1) Direct acquire when the console is not owned or is owned by a lower priority context and is in a safe state. 2) Friendly handover mechanism uses a request/grant handshake. It is used when the current owner has lower priority and the console is in an unsafe state. The requesting context: a) Sets its priority into the 'req_prio' field. b) Waits (with a timeout) for the owning context to unlock the console. c) Takes the lock and clears the 'req_prio' field. The owning context: a) Observes the 'req_prio' field set on exit from the unsafe console state. b) Gives up console ownership by clearing the 'prio' field. 3) Unsafe hostile takeover allows to take over the lock even when the console is an unsafe state. It is used only in panic() by the final attempt to flush consoles in a try and hope mode. Note that separate record buffers are used in panic(). As a result, the messages can be read and formatted without any risk even after using the hostile takeover in unsafe state. The release function simply clears the 'prio' field. All operations on @console::nbcon_state are atomic cmpxchg based to handle concurrency. The acquire/release functions implement only minimal policies: - Preference for higher priority contexts. - Protection of the panic CPU. All other policy decisions must be made at the call sites: - What is marked as an unsafe section. - Whether to spin-wait if there is already an owner and the console is in an unsafe state. - Whether to attempt an unsafe hostile takeover. The design allows to implement the well known: acquire() output_one_printk_record() release() The output of one printk record might be interrupted with a higher priority context. The new owner is supposed to reprint the entire interrupted record from scratch. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-3-john.ogness@linutronix.de --- include/linux/console.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index a2d37a7a98a8..98210fd01f18 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -175,13 +175,29 @@ enum cons_flags { * struct nbcon_state - console state for nbcon consoles * @atom: Compound of the state fields for atomic operations * + * @req_prio: The priority of a handover request + * @prio: The priority of the current owner + * @unsafe: Console is busy in a non takeover region + * @unsafe_takeover: A hostile takeover in an unsafe state happened in the + * past. The console cannot be safe until re-initialized. + * @cpu: The CPU on which the owner runs + * * To be used for reading and preparing of the value stored in the nbcon * state variable @console::nbcon_state. + * + * The @prio and @req_prio fields are particularly important to allow + * spin-waiting to timeout and give up without the risk of a waiter being + * assigned the lock after giving up. */ struct nbcon_state { union { unsigned int atom; struct { + unsigned int prio : 2; + unsigned int req_prio : 2; + unsigned int unsafe : 1; + unsigned int unsafe_takeover : 1; + unsigned int cpu : 24; }; }; }; @@ -194,6 +210,46 @@ struct nbcon_state { */ static_assert(sizeof(struct nbcon_state) <= sizeof(int)); +/** + * nbcon_prio - console owner priority for nbcon consoles + * @NBCON_PRIO_NONE: Unused + * @NBCON_PRIO_NORMAL: Normal (non-emergency) usage + * @NBCON_PRIO_EMERGENCY: Emergency output (WARN/OOPS...) + * @NBCON_PRIO_PANIC: Panic output + * @NBCON_PRIO_MAX: The number of priority levels + * + * A higher priority context can takeover the console when it is + * in the safe state. The final attempt to flush consoles in panic() + * can be allowed to do so even in an unsafe state (Hope and pray). + */ +enum nbcon_prio { + NBCON_PRIO_NONE = 0, + NBCON_PRIO_NORMAL, + NBCON_PRIO_EMERGENCY, + NBCON_PRIO_PANIC, + NBCON_PRIO_MAX, +}; + +struct console; + +/** + * struct nbcon_context - Context for console acquire/release + * @console: The associated console + * @spinwait_max_us: Limit for spin-wait acquire + * @prio: Priority of the context + * @allow_unsafe_takeover: Allow performing takeover even if unsafe. Can + * be used only with NBCON_PRIO_PANIC @prio. It + * might cause a system freeze when the console + * is used later. + */ +struct nbcon_context { + /* members set by caller */ + struct console *console; + unsigned int spinwait_max_us; + enum nbcon_prio prio; + unsigned int allow_unsafe_takeover : 1; +}; + /** * struct console - The console descriptor structure * @name: The name of the console driver -- cgit v1.2.3 From 5634c90fd8553de7cbafecd048d0273690a2e84e Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:03 +0206 Subject: printk: nbcon: Add buffer management In case of hostile takeovers it must be ensured that the previous owner cannot scribble over the output buffer of the emergency/panic context. This is achieved by: - Adding a global output buffer instance for the panic context. This is the only situation where hostile takeovers can occur and there is always at most 1 panic context. - Allocating an output buffer per non-boot console upon console registration. This buffer is used by the console owner when not in panic context. (For boot consoles, the existing shared global legacy output buffer is used instead. Boot console printing will be synchronized with legacy console printing.) - Choosing the appropriate buffer is handled in the acquire/release functions. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Reviewed-by: Petr Mladek Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-5-john.ogness@linutronix.de --- include/linux/console.h | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index 98210fd01f18..ca1ef8700e55 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -231,6 +231,7 @@ enum nbcon_prio { }; struct console; +struct printk_buffers; /** * struct nbcon_context - Context for console acquire/release @@ -241,6 +242,7 @@ struct console; * be used only with NBCON_PRIO_PANIC @prio. It * might cause a system freeze when the console * is used later. + * @pbufs: Pointer to the text buffer for this context */ struct nbcon_context { /* members set by caller */ @@ -248,6 +250,9 @@ struct nbcon_context { unsigned int spinwait_max_us; enum nbcon_prio prio; unsigned int allow_unsafe_takeover : 1; + + /* members set by acquire */ + struct printk_buffers *pbufs; }; /** @@ -271,6 +276,7 @@ struct nbcon_context { * @node: hlist node for the console list * * @nbcon_state: State for nbcon consoles + * @pbufs: Pointer to nbcon private buffer */ struct console { char name[16]; @@ -293,6 +299,7 @@ struct console { /* nbcon console specific members */ atomic_t __private nbcon_state; + struct printk_buffers *pbufs; }; #ifdef CONFIG_LOCKDEP -- cgit v1.2.3 From ad56ebd1d79b216dc147474fac89a11daf6b10df Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:05 +0206 Subject: printk: nbcon: Add sequence handling Add an atomic_long_t field @nbcon_seq to the console struct to store the sequence number for nbcon consoles. For nbcon consoles this will be used instead of the non-atomic @seq field. The new field allows for safe atomic sequence number updates without requiring any locking. On 64bit systems the new field stores the full sequence number. On 32bit systems the new field stores the lower 32 bits of the sequence number, which are expanded to 64bit as needed by folding the values based on the sequence numbers available in the ringbuffer. For 32bit systems, having a 32bit representation in the console is sufficient. If a console ever gets more than 2^31 records behind the ringbuffer then this is the least of the problems. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-7-john.ogness@linutronix.de --- include/linux/console.h | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index ca1ef8700e55..20cd486b76ad 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -243,6 +243,7 @@ struct printk_buffers; * might cause a system freeze when the console * is used later. * @pbufs: Pointer to the text buffer for this context + * @seq: The sequence number to print for this context */ struct nbcon_context { /* members set by caller */ @@ -253,6 +254,7 @@ struct nbcon_context { /* members set by acquire */ struct printk_buffers *pbufs; + u64 seq; }; /** @@ -276,6 +278,7 @@ struct nbcon_context { * @node: hlist node for the console list * * @nbcon_state: State for nbcon consoles + * @nbcon_seq: Sequence number of the next record for nbcon to print * @pbufs: Pointer to nbcon private buffer */ struct console { @@ -299,6 +302,7 @@ struct console { /* nbcon console specific members */ atomic_t __private nbcon_state; + atomic_long_t __private nbcon_seq; struct printk_buffers *pbufs; }; -- cgit v1.2.3 From 06653d57ff283be627a2c769139d73ecc487810f Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:06 +0206 Subject: printk: nbcon: Add emit function and callback function for atomic printing Implement an emit function for nbcon consoles to output printk messages. It utilizes the lockless printk_get_next_message() and console_prepend_dropped() functions to retrieve/build the output message. The emit function includes the required safety points to check for handover/takeover and calls a new write_atomic callback of the console driver to output the message. It also includes proper handling for updating the nbcon console sequence number. A new nbcon_write_context struct is introduced. This is provided to the write_atomic callback and includes only the information necessary for performing atomic writes. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Reviewed-by: Petr Mladek Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-8-john.ogness@linutronix.de --- include/linux/console.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index 20cd486b76ad..14563dcb34b1 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -242,6 +242,7 @@ struct printk_buffers; * be used only with NBCON_PRIO_PANIC @prio. It * might cause a system freeze when the console * is used later. + * @backlog: Ringbuffer has pending records * @pbufs: Pointer to the text buffer for this context * @seq: The sequence number to print for this context */ @@ -252,11 +253,28 @@ struct nbcon_context { enum nbcon_prio prio; unsigned int allow_unsafe_takeover : 1; + /* members set by emit */ + unsigned int backlog : 1; + /* members set by acquire */ struct printk_buffers *pbufs; u64 seq; }; +/** + * struct nbcon_write_context - Context handed to the nbcon write callbacks + * @ctxt: The core console context + * @outbuf: Pointer to the text buffer for output + * @len: Length to write + * @unsafe_takeover: If a hostile takeover in an unsafe state has occurred + */ +struct nbcon_write_context { + struct nbcon_context __private ctxt; + char *outbuf; + unsigned int len; + bool unsafe_takeover; +}; + /** * struct console - The console descriptor structure * @name: The name of the console driver @@ -277,6 +295,7 @@ struct nbcon_context { * @data: Driver private data * @node: hlist node for the console list * + * @write_atomic: Write callback for atomic context * @nbcon_state: State for nbcon consoles * @nbcon_seq: Sequence number of the next record for nbcon to print * @pbufs: Pointer to nbcon private buffer @@ -301,6 +320,8 @@ struct console { struct hlist_node node; /* nbcon console specific members */ + bool (*write_atomic)(struct console *con, + struct nbcon_write_context *wctxt); atomic_t __private nbcon_state; atomic_long_t __private nbcon_seq; struct printk_buffers *pbufs; -- cgit v1.2.3 From 9757acd0a700ba4a0d16dde4ba820eb052aba1a7 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 16 Sep 2023 21:26:07 +0206 Subject: printk: nbcon: Allow drivers to mark unsafe regions and check state For the write_atomic callback, the console driver may have unsafe regions that need to be appropriately marked. Provide functions that accept the nbcon_write_context struct to allow for the driver to enter and exit unsafe regions. Also provide a function for drivers to check if they are still the owner of the console. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Reviewed-by: Petr Mladek Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20230916192007.608398-9-john.ogness@linutronix.de --- include/linux/console.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include') diff --git a/include/linux/console.h b/include/linux/console.h index 14563dcb34b1..e4fc6f7c1496 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -451,6 +451,16 @@ static inline bool console_is_registered(const struct console *con) lockdep_assert_console_list_lock_held(); \ hlist_for_each_entry(con, &console_list, node) +#ifdef CONFIG_PRINTK +extern bool nbcon_can_proceed(struct nbcon_write_context *wctxt); +extern bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt); +extern bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt); +#else +static inline bool nbcon_can_proceed(struct nbcon_write_context *wctxt) { return false; } +static inline bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt) { return false; } +static inline bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt) { return false; } +#endif + extern int console_set_on_cmdline; extern struct console *early_console; -- cgit v1.2.3