summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2024-03-11 12:02:50 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2024-03-11 12:02:50 -0700
commite5a3878c947ceef7b6ab68fdc093f3848059842c (patch)
tree0ad530cfbc2f6c1c2b6e98a08988a8c48a522e86
parent1ddeeb2a058d7b2a58ed9e820396b4ceb715d529 (diff)
parent3add00be5fe5810d7aa5ec3af8b6a245ef33144b (diff)
downloadlinux-e5a3878c947ceef7b6ab68fdc093f3848059842c.tar.gz
linux-e5a3878c947ceef7b6ab68fdc093f3848059842c.tar.bz2
linux-e5a3878c947ceef7b6ab68fdc093f3848059842c.zip
Merge tag 'rcu.next.v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux
Pull RCU updates from Boqun Feng: - Eliminate deadlocks involving do_exit() and RCU tasks, by Paul: Instead of SRCU read side critical sections, now a percpu list is used in do_exit() for scaning yet-to-exit tasks - Fix a deadlock due to the dependency between workqueue and RCU expedited grace period, reported by Anna-Maria Behnsen and Thomas Gleixner and fixed by Frederic: Now RCU expedited always uses its own kthread worker instead of a workqueue - RCU NOCB updates, code cleanups, unnecessary barrier removals and minor bug fixes - Maintain real-time response in rcu_tasks_postscan() and a minor fix for tasks trace quiescence check - Misc updates, comments and readibility improvement, boot time parameter for lazy RCU and rcutorture improvement - Documentation updates * tag 'rcu.next.v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux: (34 commits) rcu-tasks: Maintain real-time response in rcu_tasks_postscan() rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks rcu-tasks: Initialize callback lists at rcu_init() time rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks rcu-tasks: Repair RCU Tasks Trace quiescence check rcu/sync: remove un-used rcu_sync_enter_start function rcutorture: Suppress rtort_pipe_count warnings until after stalls srcu: Improve comments about acceleration leak rcu: Provide a boot time parameter to control lazy RCU rcu: Rename jiffies_till_flush to jiffies_lazy_flush doc: Update checklist.rst discussion of callback execution doc: Clarify use of slab constructors and SLAB_TYPESAFE_BY_RCU context_tracking: Fix kerneldoc headers for __ct_user_{enter,exit}() doc: Add EARLY flag to early-parsed kernel boot parameters doc: Add CONFIG_RCU_STRICT_GRACE_PERIOD to checklist.rst doc: Make checklist.rst note that spinlocks are implied RCU readers doc: Make whatisRCU.rst note that spinlocks are RCU readers doc: Spinlocks are implied RCU readers ...
-rw-r--r--Documentation/RCU/checklist.rst32
-rw-r--r--Documentation/RCU/rcu_dereference.rst5
-rw-r--r--Documentation/RCU/whatisRCU.rst19
-rw-r--r--Documentation/admin-guide/kernel-parameters.rst1
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt489
-rw-r--r--include/linux/rcu_sync.h1
-rw-r--r--include/linux/rcupdate.h4
-rw-r--r--include/linux/sched.h2
-rw-r--r--init/init_task.c1
-rw-r--r--kernel/context_tracking.c4
-rw-r--r--kernel/fork.c1
-rw-r--r--kernel/rcu/Kconfig13
-rw-r--r--kernel/rcu/rcu.h19
-rw-r--r--kernel/rcu/rcuscale.c6
-rw-r--r--kernel/rcu/rcutorture.c13
-rw-r--r--kernel/rcu/srcutree.c24
-rw-r--r--kernel/rcu/sync.c16
-rw-r--r--kernel/rcu/tasks.h135
-rw-r--r--kernel/rcu/tiny.c1
-rw-r--r--kernel/rcu/tree.c237
-rw-r--r--kernel/rcu/tree.h20
-rw-r--r--kernel/rcu/tree_exp.h83
-rw-r--r--kernel/rcu/tree_nocb.h69
-rw-r--r--kernel/rcu/tree_plugin.h52
24 files changed, 687 insertions, 560 deletions
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index 2d42998a89a6..3e6407de231c 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -68,7 +68,8 @@ over a rather long period of time, but improvements are always welcome!
rcu_read_lock_sched(), or by the appropriate update-side lock.
Explicit disabling of preemption (preempt_disable(), for example)
can serve as rcu_read_lock_sched(), but is less readable and
- prevents lockdep from detecting locking issues.
+ prevents lockdep from detecting locking issues. Acquiring a
+ spinlock also enters an RCU read-side critical section.
Please note that you *cannot* rely on code known to be built
only in non-preemptible kernels. Such code can and will break,
@@ -382,16 +383,17 @@ over a rather long period of time, but improvements are always welcome!
must use whatever locking or other synchronization is required
to safely access and/or modify that data structure.
- Do not assume that RCU callbacks will be executed on the same
- CPU that executed the corresponding call_rcu() or call_srcu().
- For example, if a given CPU goes offline while having an RCU
- callback pending, then that RCU callback will execute on some
- surviving CPU. (If this was not the case, a self-spawning RCU
- callback would prevent the victim CPU from ever going offline.)
- Furthermore, CPUs designated by rcu_nocbs= might well *always*
- have their RCU callbacks executed on some other CPUs, in fact,
- for some real-time workloads, this is the whole point of using
- the rcu_nocbs= kernel boot parameter.
+ Do not assume that RCU callbacks will be executed on
+ the same CPU that executed the corresponding call_rcu(),
+ call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or
+ call_rcu_tasks_trace(). For example, if a given CPU goes offline
+ while having an RCU callback pending, then that RCU callback
+ will execute on some surviving CPU. (If this was not the case,
+ a self-spawning RCU callback would prevent the victim CPU from
+ ever going offline.) Furthermore, CPUs designated by rcu_nocbs=
+ might well *always* have their RCU callbacks executed on some
+ other CPUs, in fact, for some real-time workloads, this is the
+ whole point of using the rcu_nocbs= kernel boot parameter.
In addition, do not assume that callbacks queued in a given order
will be invoked in that order, even if they all are queued on the
@@ -444,7 +446,7 @@ over a rather long period of time, but improvements are always welcome!
real-time workloads than is synchronize_rcu_expedited().
It is also permissible to sleep in RCU Tasks Trace read-side
- critical, which are delimited by rcu_read_lock_trace() and
+ critical section, which are delimited by rcu_read_lock_trace() and
rcu_read_unlock_trace(). However, this is a specialized flavor
of RCU, and you should not use it without first checking with
its current users. In most cases, you should instead use SRCU.
@@ -490,6 +492,12 @@ over a rather long period of time, but improvements are always welcome!
since the last time that you passed that same object to
call_rcu() (or friends).
+ CONFIG_RCU_STRICT_GRACE_PERIOD:
+ combine with KASAN to check for pointers leaked out
+ of RCU read-side critical sections. This Kconfig
+ option is tough on both performance and scalability,
+ and so is limited to four-CPU systems.
+
__rcu sparse checks:
tag the pointer to the RCU-protected data structure
with __rcu, and sparse will warn you if you access that
diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index 659d5913784d..2524dcdadde2 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -408,7 +408,10 @@ member of the rcu_dereference() to use in various situations:
RCU flavors, an RCU read-side critical section is entered
using rcu_read_lock(), anything that disables bottom halves,
anything that disables interrupts, or anything that disables
- preemption.
+ preemption. Please note that spinlock critical sections
+ are also implied RCU read-side critical sections, even when
+ they are preemptible, as they are in kernels built with
+ CONFIG_PREEMPT_RT=y.
2. If the access might be within an RCU read-side critical section
on the one hand, or protected by (say) my_lock on the other,
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index 60ce02475142..872ac665223f 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -172,14 +172,25 @@ rcu_read_lock()
critical section. Reference counts may be used in conjunction
with RCU to maintain longer-term references to data structures.
+ Note that anything that disables bottom halves, preemption,
+ or interrupts also enters an RCU read-side critical section.
+ Acquiring a spinlock also enters an RCU read-side critical
+ sections, even for spinlocks that do not disable preemption,
+ as is the case in kernels built with CONFIG_PREEMPT_RT=y.
+ Sleeplocks do *not* enter RCU read-side critical sections.
+
rcu_read_unlock()
^^^^^^^^^^^^^^^^^
void rcu_read_unlock(void);
This temporal primitives is used by a reader to inform the
reclaimer that the reader is exiting an RCU read-side critical
- section. Note that RCU read-side critical sections may be nested
- and/or overlapping.
+ section. Anything that enables bottom halves, preemption,
+ or interrupts also exits an RCU read-side critical section.
+ Releasing a spinlock also exits an RCU read-side critical section.
+
+ Note that RCU read-side critical sections may be nested and/or
+ overlapping.
synchronize_rcu()
^^^^^^^^^^^^^^^^^
@@ -952,8 +963,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
initialized after each and every call to kmem_cache_alloc(), which renders
reference-free spinlock acquisition completely unsafe. Therefore, when
using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
-(Those willing to use a kmem_cache constructor may also use locking,
-including cache-friendly sequence locking.)
+(Those willing to initialize their locks in a kmem_cache constructor
+may also use locking, including cache-friendly sequence locking.)
With traditional reference counting -- such as that implemented by the
kref library in Linux -- there is typically code that runs when the last
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index 4410384596a9..e8bdf5e86a9b 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -108,6 +108,7 @@ is applicable::
CMA Contiguous Memory Area support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
+ EARLY Parameter processed too early to be embedded in initrd.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EVM Extended Verification Module
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25680d0..94314d0eb301 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -9,7 +9,7 @@
accept_memory=eager can be used to accept all memory
at once during boot.
- acpi= [HW,ACPI,X86,ARM64,RISCV64]
+ acpi= [HW,ACPI,X86,ARM64,RISCV64,EARLY]
Advanced Configuration and Power Interface
Format: { force | on | off | strict | noirq | rsdt |
copy_dsdt }
@@ -26,7 +26,7 @@
See also Documentation/power/runtime_pm.rst, pci=noacpi
- acpi_apic_instance= [ACPI, IOAPIC]
+ acpi_apic_instance= [ACPI,IOAPIC,EARLY]
Format: <int>
2: use 2nd APIC table, if available
1,0: use 1st APIC table
@@ -41,7 +41,7 @@
If set to native, use the device's native backlight mode.
If set to none, disable the ACPI backlight interface.
- acpi_force_32bit_fadt_addr
+ acpi_force_32bit_fadt_addr [ACPI,EARLY]
force FADT to use 32 bit addresses rather than the
64 bit X_* addresses. Some firmware have broken 64
bit addresses for force ACPI ignore these and use
@@ -97,7 +97,7 @@
no: ACPI OperationRegions are not marked as reserved,
no further checks are performed.
- acpi_force_table_verification [HW,ACPI]
+ acpi_force_table_verification [HW,ACPI,EARLY]
Enable table checksum verification during early stage.
By default, this is disabled due to x86 early mapping
size limitation.
@@ -137,7 +137,7 @@
acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kdump
kernels.
- acpi_no_static_ssdt [HW,ACPI]
+ acpi_no_static_ssdt [HW,ACPI,EARLY]
Disable installation of static SSDTs at early boot time
By default, SSDTs contained in the RSDT/XSDT will be
installed automatically and they will appear under
@@ -151,7 +151,7 @@
Ignore the ACPI-based watchdog interface (WDAT) and let
a native driver control the watchdog device instead.
- acpi_rsdp= [ACPI,EFI,KEXEC]
+ acpi_rsdp= [ACPI,EFI,KEXEC,EARLY]
Pass the RSDP address to the kernel, mostly used
on machines running EFI runtime service to boot the
second kernel for kdump.
@@ -228,10 +228,10 @@
to assume that this machine's pmtimer latches its value
and always returns good values.
- acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
+ acpi_sci= [HW,ACPI,EARLY] ACPI System Control Interrupt trigger mode
Format: { level | edge | high | low }
- acpi_skip_timer_override [HW,ACPI]
+ acpi_skip_timer_override [HW,ACPI,EARLY]
Recognize and ignore IRQ0/pin2 Interrupt Override.
For broken nForce2 BIOS resulting in XT-PIC timer.
@@ -266,11 +266,11 @@
behave incorrectly in some ways with respect to system
suspend and resume to be ignored (use wisely).
- acpi_use_timer_override [HW,ACPI]
+ acpi_use_timer_override [HW,ACPI,EARLY]
Use timer override. For some broken Nvidia NF5 boards
that require a timer override, but don't have HPET
- add_efi_memmap [EFI; X86] Include EFI memory map in
+ add_efi_memmap [EFI,X86,EARLY] Include EFI memory map in
kernel's map of available physical RAM.
agp= [AGP]
@@ -307,7 +307,7 @@
do not want to use tracing_snapshot_alloc() as it needs
to be done where GFP_KERNEL allocations are allowed.
- allow_mismatched_32bit_el0 [ARM64]
+ allow_mismatched_32bit_el0 [ARM64,EARLY]
Allow execve() of 32-bit applications and setting of the
PER_LINUX32 personality on systems where only a strict
subset of the CPUs support 32-bit EL0. When this
@@ -351,7 +351,7 @@
This mode requires kvm-amd.avic=1.
(Default when IOMMU HW support is present.)
- amd_pstate= [X86]
+ amd_pstate= [X86,EARLY]
disable
Do not enable amd_pstate as the default
scaling driver for the supported processors
@@ -391,7 +391,7 @@
not play well with APC CPU idle - disable it if you have
APC and your system crashes randomly.
- apic= [APIC,X86] Advanced Programmable Interrupt Controller
+ apic= [APIC,X86,EARLY] Advanced Programmable Interrupt Controller
Change the output verbosity while booting
Format: { quiet (default) | verbose | debug }
Change the amount of debugging information output
@@ -401,7 +401,7 @@
Format: apic=driver_name
Examples: apic=bigsmp
- apic_extnmi= [APIC,X86] External NMI delivery setting
+ apic_extnmi= [APIC,X86,EARLY] External NMI delivery setting
Format: { bsp (default) | all | none }
bsp: External NMI is delivered only to CPU 0
all: External NMIs are broadcast to all CPUs as a
@@ -508,21 +508,22 @@
bert_disable [ACPI]
Disable BERT OS support on buggy BIOSes.
- bgrt_disable [ACPI][X86]
+ bgrt_disable [ACPI,X86,EARLY]
Disable BGRT to avoid flickering OEM logo.
blkdevparts= Manual partition parsing of block device(s) for
embedded devices based on command line input.
See Documentation/block/cmdline-partition.rst
- boot_delay= Milliseconds to delay each printk during boot.
+ boot_delay= [KNL,EARLY]
+ Milliseconds to delay each printk during boot.
Only works if CONFIG_BOOT_PRINTK_DELAY is enabled,
and you may also have to specify "lpj=". Boot_delay
values larger than 10 seconds (10000) are assumed
erroneous and ignored.
Format: integer
- bootconfig [KNL]
+ bootconfig [KNL,EARLY]
Extended command line options can be added to an initrd
and this will cause the kernel to look for it.
@@ -557,7 +558,7 @@
trust validation.
format: { id:<keyid> | builtin }
- cca= [MIPS] Override the kernel pages' cache coherency
+ cca= [MIPS,EARLY] Override the kernel pages' cache coherency
algorithm. Accepted values range from 0 to 7
inclusive. See arch/mips/include/asm/pgtable-bits.h
for platform specific values (SB1, Loongson3 and
@@ -672,7 +673,7 @@
[X86-64] hpet,tsc
clocksource.arm_arch_timer.evtstrm=
- [ARM,ARM64]
+ [ARM,ARM64,EARLY]
Format: <bool>
Enable/disable the eventstream feature of the ARM
architected timer so that code using WFE-based polling
@@ -702,7 +703,7 @@
10 seconds when built into the kernel.
cma=nn[MG]@[start[MG][-end[MG]]]
- [KNL,CMA]
+ [KNL,CMA,EARLY]
Sets the size of kernel global memory area for
contiguous memory allocations and optionally the
placement constraint by the physical address range of
@@ -711,7 +712,7 @@
kernel/dma/contiguous.c
cma_pernuma=nn[MG]
- [KNL,CMA]
+ [KNL,CMA,EARLY]
Sets the size of kernel per-numa memory area for
contiguous memory allocations. A value of 0 disables
per-numa CMA altogether. And If this option is not
@@ -722,7 +723,7 @@
they will fallback to the global default memory area.
numa_cma=<node>:nn[MG][,<node>:nn[MG]]
- [KNL,CMA]
+ [KNL,CMA,EARLY]
Sets the size of kernel numa memory area for
contiguous memory allocations. It will reserve CMA
area for the specified node.
@@ -739,7 +740,7 @@
a hypervisor.
Default: yes
- coherent_pool=nn[KMG] [ARM,KNL]
+ coherent_pool=nn[KMG] [ARM,KNL,EARLY]
Sets the size of memory pool for coherent, atomic dma
allocations, by default set to 256K.
@@ -757,7 +758,7 @@
condev= [HW,S390] console device
conmode=
- con3215_drop= [S390] 3215 console drop mode.
+ con3215_drop= [S390,EARLY] 3215 console drop mode.
Format: y|n|Y|N|1|0
When set to true, drop data on the 3215 console when
the console buffer is full. In this case the
@@ -863,7 +864,7 @@
kernel before the cpufreq driver probes.
cpu_init_udelay=N
- [X86] Delay for N microsec between assert and de-assert
+ [X86,EARLY] Delay for N microsec between assert and de-assert
of APIC INIT to start processors. This delay occurs
on every CPU online, such as boot, and resume from suspend.
Default: 10000
@@ -883,7 +884,7 @@
kernel more unstable.
crashkernel=size[KMG][@offset[KMG]]
- [KNL] Using kexec, Linux can switch to a 'crash kernel'
+ [KNL,EARLY] Using kexec, Linux can switch to a 'crash kernel'
upon panic. This parameter reserves the physical
memory region [offset, offset + size] for that kernel
image. If '@offset' is omitted, then a suitable offset
@@ -954,10 +955,10 @@
Format: <port#>,<type>
See also Documentation/input/devices/joystick-parport.rst
- debug [KNL] Enable kernel debugging (events log level).
+ debug [KNL,EARLY] Enable kernel debugging (events log level).
debug_boot_weak_hash
- [KNL] Enable printing [hashed] pointers early in the
+ [KNL,EARLY] Enable printing [hashed] pointers early in the
boot sequence. If enabled, we use a weak hash instead
of siphash to hash pointers. Use this option if you are
seeing instances of '(___ptrval___)') and need to see a
@@ -974,10 +975,10 @@
will print _a_lot_ more information - normally only
useful to lockdep developers.
- debug_objects [KNL] Enable object debugging
+ debug_objects [KNL,EARLY] Enable object debugging
debug_guardpage_minorder=
- [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
+ [KNL,EARLY] When CONFIG_DEBUG_PAGEALLOC is set, this
parameter allows control of the order of pages that will
be intentionally kept free (and hence protected) by the
buddy allocator. Bigger value increase the probability
@@ -996,7 +997,7 @@
help tracking down these problems.
debug_pagealloc=
- [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
+ [KNL,EARLY] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
enables the feature at boot time. By default, it is
disabled and the system will work mostly the same as a
kernel built without CONFIG_DEBUG_PAGEALLOC.
@@ -1004,8 +1005,8 @@
useful to also enable the page_owner functionality.
on: enable the feature
- debugfs= [KNL] This parameter enables what is exposed to userspace
- and debugfs internal clients.
+ debugfs= [KNL,EARLY] This parameter enables what is exposed to
+ userspace and debugfs internal clients.
Format: { on, no-mount, off }
on: All functions are enabled.
no-mount:
@@ -1084,7 +1085,7 @@
dhash_entries= [KNL]
Set number of hash buckets for dentry cache.
- disable_1tb_segments [PPC]
+ disable_1tb_segments [PPC,EARLY]
Disables the use of 1TB hash page table segments. This
causes the kernel to fall back to 256MB segments which
can be useful when debugging issues that require an SLB
@@ -1093,7 +1094,7 @@
disable= [IPV6]
See Documentation/networking/ipv6.rst.
- disable_radix [PPC]
+ disable_radix [PPC,EARLY]
Disable RADIX MMU mode on POWER9
disable_tlbie [PPC]
@@ -1109,25 +1110,25 @@
causing system reset or hang due to sending
INIT from AP to BSP.
- disable_ddw [PPC/PSERIES]
+ disable_ddw [PPC/PSERIES,EARLY]
Disable Dynamic DMA Window support. Use this
to workaround buggy firmware.
disable_ipv6= [IPV6]
See Documentation/networking/ipv6.rst.
- disable_mtrr_cleanup [X86]
+ disable_mtrr_cleanup [X86,EARLY]
The kernel tries to adjust MTRR layout from continuous
to discrete, to make X server driver able to add WB
entry later. This parameter disables that.
- disable_mtrr_trim [X86, Intel and AMD only]
+ disable_mtrr_trim [X86, Intel and AMD only,EARLY]
By default the kernel will trim any uncacheable
memory out of your available memory pool based on
MTRR settings. This parameter disables that behavior,
possibly causing your machine to run very slowly.
- disable_timer_pin_1 [X86]
+ disable_timer_pin_1 [X86,EARLY]
Disable PIN 1 of APIC timer
Can be useful to work around chipset bugs.
@@ -1177,7 +1178,7 @@
dscc4.setup= [NET]
- dt_cpu_ftrs= [PPC]
+ dt_cpu_ftrs= [PPC,EARLY]
Format: {"off" | "known"}
Control how the dt_cpu_ftrs device-tree binding is
used for CPU feature discovery and setup (if it
@@ -1197,12 +1198,12 @@
Documentation/admin-guide/dynamic-debug-howto.rst
for details.
- early_ioremap_debug [KNL]
+ early_ioremap_debug [KNL,EARLY]
Enable debug messages in early_ioremap support. This
is useful for tracking down temporary early mappings
which are not unmapped.
- earlycon= [KNL] Output early console device and options.
+ earlycon= [KNL,EARLY] Output early console device and options.
When used with no options, the early console is
determined by stdout-path property in device tree's
@@ -1338,7 +1339,7 @@
address must be provided, and the serial port must
already be setup and configured.
- earlyprintk= [X86,SH,ARM,M68k,S390]
+ earlyprintk= [X86,SH,ARM,M68k,S390,UM,EARLY]
earlyprintk=vga
earlyprintk=sclp
earlyprintk=xen
@@ -1396,7 +1397,7 @@
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
- efi= [EFI]
+ efi= [EFI,EARLY]
Format: { "debug", "disable_early_pci_dma",
"nochunk", "noruntime", "nosoftreserve",
"novamap", "no_disable_early_pci_dma" }
@@ -1417,13 +1418,13 @@
no_disable_early_pci_dma: Leave the busmaster bit set
on all PCI bridges while in the EFI boot stub
- efi_no_storage_paranoia [EFI; X86]
+ efi_no_storage_paranoia [EFI,X86,EARLY]
Using this parameter you can use more than 50% of
your efi variable storage. Use this parameter only if
you are really sure that your UEFI does sane gc and
fulfills the spec otherwise your board may brick.
- efi_fake_mem= nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
+ efi_fake_mem= nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI,X86,EARLY]
Add arbitrary attribute to specific memory range by
updating original EFI memory map.
Region of memory which aa attribute is added to is
@@ -1454,7 +1455,7 @@
eisa_irq_edge= [PARISC,HW]
See header of drivers/parisc/eisa.c.
- ekgdboc= [X86,KGDB] Allow early kernel console debugging
+ ekgdboc= [X86,KGDB,EARLY] Allow early kernel console debugging
Format: ekgdboc=kbd
This is designed to be used in conjunction with
@@ -1469,13 +1470,13 @@
See comment before function elanfreq_setup() in
arch/x86/kernel/cpu/cpufreq/elanfreq.c.
- elfcorehdr=[size[KMG]@]offset[KMG] [PPC,SH,X86,S390]
+ elfcorehdr=[size[KMG]@]offset[KMG] [PPC,SH,X86,S390,EARLY]
Specifies physical address of start of kernel core
image elf header and optionally the size. Generally
kexec loader will pass this option to capture kernel.
See Documentation/admin-guide/kdump/kdump.rst for details.
- enable_mtrr_cleanup [X86]
+ enable_mtrr_cleanup [X86,EARLY]
The kernel tries to adjust MTRR layout from continuous
to discrete, to make X server driver able to add WB
entry later. This parameter enables that.
@@ -1508,7 +1509,7 @@
Permit 'security.evm' to be updated regardless of
current integrity status.
- early_page_ext [KNL] Enforces page_ext initialization to earlier
+ early_page_ext [KNL,EARLY] Enforces page_ext initialization to earlier
stages so cover more early boot allocations.
Please note that as side effect some optimizations
might be disabled to achieve that (e.g. parallelized
@@ -1600,7 +1601,7 @@
can be changed at run time by the max_graph_depth file
in the tracefs tracing directory. default: 0 (no limit)
- fw_devlink= [KNL] Create device links between consumer and supplier
+ fw_devlink= [KNL,EARLY] Create device links between consumer and supplier
devices by scanning the firmware to infer the
consumer/supplier relationships. This feature is
especially useful when drivers are loaded as modules as
@@ -1619,12 +1620,12 @@
rpm -- Like "on", but also use to order runtime PM.
fw_devlink.strict=<bool>
- [KNL] Treat all inferred dependencies as mandatory
+ [KNL,EARLY] Treat all inferred dependencies as mandatory
dependencies. This only applies for fw_devlink=on|rpm.
Format: <bool>
fw_devlink.sync_state =
- [KNL] When all devices that could probe have finished
+ [KNL,EARLY] When all devices that could probe have finished
probing, this parameter controls what to do with
devices that haven't yet received their sync_state()
calls.
@@ -1645,12 +1646,12 @@
gamma= [HW,DRM]
- gart_fix_e820= [X86-64] disable the fix e820 for K8 GART
+ gart_fix_e820= [X86-64,EARLY] disable the fix e820 for K8 GART
Format: off | on
default: on
gather_data_sampling=
- [X86,INTEL] Control the Gather Data Sampling (GDS)
+ [X86,INTEL,EARLY] Control the Gather Data Sampling (GDS)
mitigation.
Gather Data Sampling is a hardware vulnerability which
@@ -1748,7 +1749,7 @@
(that will set all pages holding image data
during restoration read-only).
- highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact
+ highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact
size of <nn>. This works even on boxes that have no
highmem otherwise. This also works to reduce highmem
size on bigger boxes.
@@ -1759,7 +1760,7 @@
hlt [BUGS=ARM,SH]
- hostname= [KNL] Set the hostname (aka UTS nodename).
+ hostname= [KNL,EARLY] Set the hostname (aka UTS nodename).
Format: <string>
This allows setting the system's hostname during early
startup. This sets the name returned by gethostname.
@@ -1804,7 +1805,7 @@
Documentation/admin-guide/mm/hugetlbpage.rst.
Format: size[KMG]
- hugetlb_cma= [HW,CMA] The size of a CMA area used for allocation
+ hugetlb_cma= [HW,CMA,EARLY] The size of a CMA area used for allocation
of gigantic hugepages. Or using node format, the size
of a CMA area per node can be specified.
Format: nn[KMGTPE] or (node format)
@@ -1850,9 +1851,10 @@
If specified, z/VM IUCV HVC accepts connections
from listed z/VM user IDs only.
- hv_nopvspin [X86,HYPER_V] Disables the paravirt spinlock optimizations
- which allow the hypervisor to 'idle' the
- guest on lock contention.
+ hv_nopvspin [X86,HYPER_V,EARLY]
+ Disables the paravirt spinlock optimizations
+ which allow the hypervisor to 'idle' the guest
+ on lock contention.
i2c_bus= [HW] Override the default board specific I2C bus speed
or register an additional I2C bus that is not
@@ -1917,7 +1919,7 @@
Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]]
- idle= [X86]
+ idle= [X86,EARLY]
Format: idle=poll, idle=halt, idle=nomwait
Poll forces a polling idle loop that can slightly
improve the performance of waking up a idle CPU, but
@@ -1973,7 +1975,7 @@
mode generally follows that for the NaN encoding,
except where unsupported by hardware.
- ignore_loglevel [KNL]
+ ignore_loglevel [KNL,EARLY]
Ignore loglevel setting - this will print /all/
kernel messages to the console. Useful for debugging.
We also add it as printk module parameter, so users
@@ -2091,21 +2093,21 @@
unpacking being completed before device_ and
late_ initcalls.
- initrd= [BOOT] Specify the location of the initial ramdisk
+ initrd= [BOOT,EARLY] Specify the location of the initial ramdisk
- initrdmem= [KNL] Specify a physical address and size from which to
+ initrdmem= [KNL,EARLY] Specify a physical address and size from which to
load the initrd. If an initrd is compiled in or
specified in the bootparams, it takes priority over this
setting.
Format: ss[KMG],nn[KMG]
Default is 0, 0
- init_on_alloc= [MM] Fill newly allocated pages and heap objects with
+ init_on_alloc= [MM,EARLY] Fill newly allocated pages and heap objects with
zeroes.
Format: 0 | 1
Default set by CONFIG_INIT_ON_ALLOC_DEFAULT_ON.
- init_on_free= [MM] Fill freed pages and heap objects with zeroes.
+ init_on_free= [MM,EARLY] Fill freed pages and heap objects with zeroes.
Format: 0 | 1
Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
@@ -2161,7 +2163,7 @@
0 disables intel_idle and fall back on acpi_idle.
1 to 9 specify maximum depth of C-state.
- intel_pstate= [X86]
+ intel_pstate= [X86,EARLY]
disable
Do not enable intel_pstate as the default
scaling driver for the supported processors
@@ -2205,7 +2207,7 @@
Allow per-logical-CPU P-State performance control limits using
cpufreq sysfs interface
- intremap= [X86-64, Intel-IOMMU]
+ intremap= [X86-64,Intel-IOMMU,EARLY]
on enable Interrupt Remapping (default)
off disable Interrupt Remapping
nosid disable Source ID checking
@@ -2217,7 +2219,7 @@
strict regions from userspace.
relaxed
- iommu= [X86]
+ iommu= [X86,EARLY]
off
force
noforce
@@ -2232,7 +2234,7 @@
nobypass [PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.
- iommu.forcedac= [ARM64, X86] Control IOVA allocation for PCI devices.
+ iommu.forcedac= [ARM64,X86,EARLY] Control IOVA allocation for PCI devices.
Format: { "0" | "1" }
0 - Try to allocate a 32-bit DMA address first, before
falling back to the full range if needed.
@@ -2240,7 +2242,7 @@
forcing Dual Address Cycle for PCI cards supporting
greater than 32-bit addressing.
- iommu.strict= [ARM64, X86, S390] Configure TLB invalidation behaviour
+ iommu.strict= [ARM64,X86,S390,EARLY] Configure TLB invalidation behaviour
Format: { "0" | "1" }
0 - Lazy mode.
Request that DMA unmap operations use deferred
@@ -2256,7 +2258,7 @@
legacy driver-specific options takes precedence.
iommu.passthrough=
- [ARM64, X86] Configure DMA to bypass the IOMMU by default.
+ [ARM64,X86,EARLY] Configure DMA to bypass the IOMMU by default.
Format: { "0" | "1" }
0 - Use IOMMU translation for DMA.
1 - Bypass the IOMMU for DMA.
@@ -2266,7 +2268,7 @@
See comment before marvel_specify_io7 in
arch/alpha/kernel/core_marvel.c.
- io_delay= [X86] I/O delay method
+ io_delay= [X86,EARLY] I/O delay method
0x80
Standard port 0x80 based delay
0x