summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2021-02-21 12:04:41 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2021-02-21 12:04:41 -0800
commitd089f48fba28db14d0fe7753248f2575a9ddfc73 (patch)
treea3821c02dd38342193459e41ba453c058f75e3d2
parent3f6ec19f2d05d800bbc42d95dece433da7697864 (diff)
parent2b392cb11c0db645ba81a08b6a2e96c56ec1fc64 (diff)
downloadlinux-d089f48fba28db14d0fe7753248f2575a9ddfc73.tar.gz
linux-d089f48fba28db14d0fe7753248f2575a9ddfc73.tar.bz2
linux-d089f48fba28db14d0fe7753248f2575a9ddfc73.zip
Merge tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar: "These are the latest RCU updates for v5.12: - Documentation updates. - Miscellaneous fixes. - kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return addresses to more easily locate bugs. This has a couple of RCU-related commits, but is mostly MM. Was pulled in with akpm's agreement. - Per-callback-batch tracking of numbers of callbacks, which enables better debugging information and smarter reactions to large numbers of callbacks. - The first round of changes to allow CPUs to be runtime switched from and to callback-offloaded state. - CONFIG_PREEMPT_RT-related changes. - RCU CPU stall warning updates. - Addition of polling grace-period APIs for SRCU. - Torture-test and torture-test scripting updates, including a "torture everything" script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale. Plus does an allmodconfig build. - nolibc fixes for the torture tests" * tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits) percpu_ref: Dump mem_dump_obj() info upon reference-count underflow rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback mm: Make mem_obj_dump() vmalloc() dumps include start and length mm: Make mem_dump_obj() handle vmalloc() memory mm: Make mem_dump_obj() handle NULL and zero-sized pointers mm: Add mem_dump_obj() to print source of memory block tools/rcutorture: Fix position of -lgcc in mkinitrd.sh tools/nolibc: Fix position of -lgcc in the documented example tools/nolibc: Emit detailed error for missing alternate syscall number definitions tools/nolibc: Remove incorrect definitions of __ARCH_WANT_* tools/nolibc: Get timeval, timespec and timezone from linux/time.h tools/nolibc: Implement poll() based on ppoll() tools/nolibc: Implement fork() based on clone() tools/nolibc: Make getpgrp() fall back to getpgid(0) tools/nolibc: Make dup2() rely on dup3() when available tools/nolibc: Add the definition for dup() rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01 torture: Maintain torture-specific set of CPUs-online books torture: Clean up after torture-test CPU hotplugging rcutorture: Make object_debug also double call_rcu() heap object ...
-rw-r--r--Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst4
-rw-r--r--Documentation/RCU/Design/Requirements/Requirements.rst732
-rw-r--r--Documentation/RCU/checklist.rst10
-rw-r--r--Documentation/RCU/rcubarrier.rst6
-rw-r--r--Documentation/RCU/stallwarn.rst27
-rw-r--r--Documentation/RCU/whatisRCU.rst10
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt39
-rw-r--r--include/linux/cpu.h2
-rw-r--r--include/linux/list.h2
-rw-r--r--include/linux/mm.h2
-rw-r--r--include/linux/rcu_segcblist.h120
-rw-r--r--include/linux/rcupdate.h42
-rw-r--r--include/linux/slab.h2
-rw-r--r--include/linux/srcu.h3
-rw-r--r--include/linux/srcutiny.h7
-rw-r--r--include/linux/timer.h2
-rw-r--r--include/linux/torture.h27
-rw-r--r--include/linux/vmalloc.h6
-rw-r--r--include/trace/events/rcu.h26
-rw-r--r--kernel/cpu.c7
-rw-r--r--kernel/locking/locktorture.c1
-rw-r--r--kernel/rcu/Kconfig5
-rw-r--r--kernel/rcu/rcu.h16
-rw-r--r--kernel/rcu/rcu_segcblist.c216
-rw-r--r--kernel/rcu/rcu_segcblist.h57
-rw-r--r--kernel/rcu/rcutorture.c395
-rw-r--r--kernel/rcu/refscale.c23
-rw-r--r--kernel/rcu/srcutiny.c77
-rw-r--r--kernel/rcu/srcutree.c147
-rw-r--r--kernel/rcu/tasks.h79
-rw-r--r--kernel/rcu/tree.c101
-rw-r--r--kernel/rcu/tree.h2
-rw-r--r--kernel/rcu/tree_exp.h2
-rw-r--r--kernel/rcu/tree_plugin.h367
-rw-r--r--kernel/rcu/tree_stall.h60
-rw-r--r--kernel/rcu/update.c4
-rw-r--r--kernel/scftorture.c6
-rw-r--r--kernel/sched/core.c9
-rw-r--r--kernel/time/timer.c14
-rw-r--r--kernel/torture.c167
-rw-r--r--lib/percpu-refcount.c12
-rw-r--r--mm/slab.c20
-rw-r--r--mm/slab.h12
-rw-r--r--mm/slab_common.c75
-rw-r--r--mm/slob.c6
-rw-r--r--mm/slub.c40
-rw-r--r--mm/util.c31
-rw-r--r--mm/vmalloc.c13
-rw-r--r--tools/include/nolibc/nolibc.h153
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/config2csv.sh67
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/console-badness.sh1
-rw-r--r--tools/testing/selftests/rcutorture/bin/functions.sh36
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-find-errors.sh9
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-recheck.sh3
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh12
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm.sh103
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/mkinitrd.sh2
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/parse-build.sh2
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/parse-console.sh2
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/torture.sh442
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/RUDE01.boot1
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/TASKS01.boot1
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot4
63 files changed, 3108 insertions, 763 deletions
diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
index 72f0f6fbd53c..6f89cf1e567d 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
@@ -38,7 +38,7 @@ sections.
RCU-preempt Expedited Grace Periods
===================================
-``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
+``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
The overall flow of the handling of a given CPU by an RCU-preempt
expedited grace period is shown in the following diagram:
@@ -112,7 +112,7 @@ things.
RCU-sched Expedited Grace Periods
---------------------------------
-``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
+``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
the handling of a given CPU by an RCU-sched expedited grace period is
shown in the following diagram:
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index d4c9a016074b..38a39476fc24 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -72,13 +72,13 @@ understanding of this guarantee.
RCU's grace-period guarantee allows updaters to wait for the completion
of all pre-existing RCU read-side critical sections. An RCU read-side
-critical section begins with the marker ``rcu_read_lock()`` and ends
-with the marker ``rcu_read_unlock()``. These markers may be nested, and
+critical section begins with the marker rcu_read_lock() and ends
+with the marker rcu_read_unlock(). These markers may be nested, and
RCU treats a nested set as one big RCU read-side critical section.
-Production-quality implementations of ``rcu_read_lock()`` and
-``rcu_read_unlock()`` are extremely lightweight, and in fact have
+Production-quality implementations of rcu_read_lock() and
+rcu_read_unlock() are extremely lightweight, and in fact have
exactly zero overhead in Linux kernels built for production use with
-``CONFIG_PREEMPT=n``.
+``CONFIG_PREEMPTION=n``.
This guarantee allows ordering to be enforced with extremely low
overhead to readers, for example:
@@ -102,12 +102,12 @@ overhead to readers, for example:
15 WRITE_ONCE(y, 1);
16 }
-Because the ``synchronize_rcu()`` on line 14 waits for all pre-existing
-readers, any instance of ``thread0()`` that loads a value of zero from
-``x`` must complete before ``thread1()`` stores to ``y``, so that
+Because the synchronize_rcu() on line 14 waits for all pre-existing
+readers, any instance of thread0() that loads a value of zero from
+``x`` must complete before thread1() stores to ``y``, so that
instance must also load a value of zero from ``y``. Similarly, any
-instance of ``thread0()`` that loads a value of one from ``y`` must have
-started after the ``synchronize_rcu()`` started, and must therefore also
+instance of thread0() that loads a value of one from ``y`` must have
+started after the synchronize_rcu() started, and must therefore also
load a value of one from ``x``. Therefore, the outcome:
::
@@ -121,14 +121,14 @@ cannot happen.
+-----------------------------------------------------------------------+
| Wait a minute! You said that updaters can make useful forward |
| progress concurrently with readers, but pre-existing readers will |
-| block ``synchronize_rcu()``!!! |
+| block synchronize_rcu()!!! |
| Just who are you trying to fool??? |
+-----------------------------------------------------------------------+
| **Answer**: |
+-----------------------------------------------------------------------+
| First, if updaters do not wish to be blocked by readers, they can use |
-| ``call_rcu()`` or ``kfree_rcu()``, which will be discussed later. |
-| Second, even when using ``synchronize_rcu()``, the other update-side |
+| call_rcu() or kfree_rcu(), which will be discussed later. |
+| Second, even when using synchronize_rcu(), the other update-side |
| code does run concurrently with readers, whether pre-existing or not. |
+-----------------------------------------------------------------------+
@@ -170,34 +170,34 @@ recovery from node failure, more or less as follows:
29 WRITE_ONCE(state, STATE_NORMAL);
30 }
-The RCU read-side critical section in ``do_something_dlm()`` works with
-the ``synchronize_rcu()`` in ``start_recovery()`` to guarantee that
-``do_something()`` never runs concurrently with ``recovery()``, but with
-little or no synchronization overhead in ``do_something_dlm()``.
+The RCU read-side critical section in do_something_dlm() works with
+the synchronize_rcu() in start_recovery() to guarantee that
+do_something() never runs concurrently with recovery(), but with
+little or no synchronization overhead in do_something_dlm().
+-----------------------------------------------------------------------+
| **Quick Quiz**: |
+-----------------------------------------------------------------------+
-| Why is the ``synchronize_rcu()`` on line 28 needed? |
+| Why is the synchronize_rcu() on line 28 needed? |
+-----------------------------------------------------------------------+
| **Answer**: |
+-----------------------------------------------------------------------+
| Without that extra grace period, memory reordering could result in |
-| ``do_something_dlm()`` executing ``do_something()`` concurrently with |
-| the last bits of ``recovery()``. |
+| do_something_dlm() executing do_something() concurrently with |
+| the last bits of recovery(). |
+-----------------------------------------------------------------------+
In order to avoid fatal problems such as deadlocks, an RCU read-side
-critical section must not contain calls to ``synchronize_rcu()``.
+critical section must not contain calls to synchronize_rcu().
Similarly, an RCU read-side critical section must not contain anything
that waits, directly or indirectly, on completion of an invocation of
-``synchronize_rcu()``.
+synchronize_rcu().
Although RCU's grace-period guarantee is useful in and of itself, with
`quite a few use cases <https://lwn.net/Articles/573497/>`__, it would
be good to be able to use RCU to coordinate read-side access to linked
data structures. For this, the grace-period guarantee is not sufficient,
-as can be seen in function ``add_gp_buggy()`` below. We will look at the
+as can be seen in function add_gp_buggy() below. We will look at the
reader's code later, but in the meantime, just think of the reader as
locklessly picking up the ``gp`` pointer, and, if the value loaded is
non-\ ``NULL``, locklessly accessing the ``->a`` and ``->b`` fields.
@@ -256,8 +256,8 @@ Publish/Subscribe Guarantee
RCU's publish-subscribe guarantee allows data to be inserted into a
linked data structure without disrupting RCU readers. The updater uses
-``rcu_assign_pointer()`` to insert the new data, and readers use
-``rcu_dereference()`` to access data, whether new or old. The following
+rcu_assign_pointer() to insert the new data, and readers use
+rcu_dereference() to access data, whether new or old. The following
shows an example of insertion:
::
@@ -279,7 +279,7 @@ shows an example of insertion:
15 return true;
16 }
-The ``rcu_assign_pointer()`` on line 13 is conceptually equivalent to a
+The rcu_assign_pointer() on line 13 is conceptually equivalent to a
simple assignment statement, but also guarantees that its assignment
will happen after the two assignments in lines 11 and 12, similar to the
C11 ``memory_order_release`` store operation. It also prevents any
@@ -289,7 +289,7 @@ number of “interesting” compiler optimizations, for example, the use of
+-----------------------------------------------------------------------+
| **Quick Quiz**: |
+-----------------------------------------------------------------------+
-| But ``rcu_assign_pointer()`` does nothing to prevent the two |
+| But rcu_assign_pointer() does nothing to prevent the two |
| assignments to ``p->a`` and ``p->b`` from being reordered. Can't that |
| also cause problems? |
+-----------------------------------------------------------------------+
@@ -303,7 +303,7 @@ number of “interesting” compiler optimizations, for example, the use of
It is tempting to assume that the reader need not do anything special to
control its accesses to the RCU-protected data, as shown in
-``do_something_gp_buggy()`` below:
+do_something_gp_buggy() below:
::
@@ -321,11 +321,10 @@ control its accesses to the RCU-protected data, as shown in
12 }
However, this temptation must be resisted because there are a
-surprisingly large number of ways that the compiler (to say nothing of
-`DEC Alpha CPUs <https://h71000.www7.hp.com/wizard/wiz_2637.html>`__)
-can trip this code up. For but one example, if the compiler were short
-of registers, it might choose to refetch from ``gp`` rather than keeping
-a separate copy in ``p`` as follows:
+surprisingly large number of ways that the compiler (or weak ordering
+CPUs like the DEC Alpha) can trip this code up. For but one example, if
+the compiler were short of registers, it might choose to refetch from
+``gp`` rather than keeping a separate copy in ``p`` as follows:
::
@@ -345,7 +344,7 @@ If this function ran concurrently with a series of updates that replaced
the current structure with a new one, the fetches of ``gp->a`` and
``gp->b`` might well come from two different structures, which could
cause serious confusion. To prevent this (and much else besides),
-``do_something_gp()`` uses ``rcu_dereference()`` to fetch from ``gp``:
+do_something_gp() uses rcu_dereference() to fetch from ``gp``:
::
@@ -362,21 +361,21 @@ cause serious confusion. To prevent this (and much else besides),
11 return false;
12 }
-The ``rcu_dereference()`` uses volatile casts and (for DEC Alpha) memory
+The rcu_dereference() uses volatile casts and (for DEC Alpha) memory
barriers in the Linux kernel. Should a `high-quality implementation of
C11 ``memory_order_consume``
[PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
-ever appear, then ``rcu_dereference()`` could be implemented as a
+ever appear, then rcu_dereference() could be implemented as a
``memory_order_consume`` load. Regardless of the exact implementation, a
-pointer fetched by ``rcu_dereference()`` may not be used outside of the
+pointer fetched by rcu_dereference() may not be used outside of the
outermost RCU read-side critical section containing that
-``rcu_dereference()``, unless protection of the corresponding data
+rcu_dereference(), unless protection of the corresponding data
element has been passed from RCU to some other synchronization
mechanism, most commonly locking or `reference
counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
-In short, updaters use ``rcu_assign_pointer()`` and readers use
-``rcu_dereference()``, and these two RCU API elements work together to
+In short, updaters use rcu_assign_pointer() and readers use
+rcu_dereference(), and these two RCU API elements work together to
ensure that readers have a consistent view of newly added data elements.
Of course, it is also necessary to remove elements from RCU-protected
@@ -388,9 +387,9 @@ data structures, for example, using the following process:
the newly removed data element).
#. At this point, only the updater has a reference to the newly removed
data element, so it can safely reclaim the data element, for example,
- by passing it to ``kfree()``.
+ by passing it to kfree().
-This process is implemented by ``remove_gp_synchronous()``:
+This process is implemented by remove_gp_synchronous():
::
@@ -413,16 +412,16 @@ This process is implemented by ``remove_gp_synchronous()``:
This function is straightforward, with line 13 waiting for a grace
period before line 14 frees the old data element. This waiting ensures
-that readers will reach line 7 of ``do_something_gp()`` before the data
-element referenced by ``p`` is freed. The ``rcu_access_pointer()`` on
-line 6 is similar to ``rcu_dereference()``, except that:
+that readers will reach line 7 of do_something_gp() before the data
+element referenced by ``p`` is freed. The rcu_access_pointer() on
+line 6 is similar to rcu_dereference(), except that:
-#. The value returned by ``rcu_access_pointer()`` cannot be
+#. The value returned by rcu_access_pointer() cannot be
dereferenced. If you want to access the value pointed to as well as
- the pointer itself, use ``rcu_dereference()`` instead of
- ``rcu_access_pointer()``.
-#. The call to ``rcu_access_pointer()`` need not be protected. In
- contrast, ``rcu_dereference()`` must either be within an RCU
+ the pointer itself, use rcu_dereference() instead of
+ rcu_access_pointer().
+#. The call to rcu_access_pointer() need not be protected. In
+ contrast, rcu_dereference() must either be within an RCU
read-side critical section or in a code segment where the pointer
cannot change, for example, in code protected by the corresponding
update-side lock.
@@ -430,13 +429,13 @@ line 6 is similar to ``rcu_dereference()``, except that:
+-----------------------------------------------------------------------+
| **Quick Quiz**: |
+-----------------------------------------------------------------------+
-| Without the ``rcu_dereference()`` or the ``rcu_access_pointer()``, |
+| Without the rcu_dereference() or the rcu_access_pointer(), |
| what destructive optimizations might the compiler make use of? |
+-----------------------------------------------------------------------+
| **Answer**: |
+-----------------------------------------------------------------------+
-| Let's start with what happens to ``do_something_gp()`` if it fails to |
-| use ``rcu_dereference()``. It could reuse a value formerly fetched |
+| Let's start with what happens to do_something_gp() if it fails to |
+| use rcu_dereference(). It could reuse a value formerly fetched |
| from this same pointer. It could also fetch the pointer from ``gp`` |
| in a byte-at-a-time manner, resulting in *load tearing*, in turn |
| resulting a bytewise mash-up of two distinct pointer values. It might |
@@ -445,15 +444,15 @@ line 6 is similar to ``rcu_dereference()``, except that:
| update has changed the pointer to match the wrong guess. Too bad |
| about any dereferences that returned pre-initialization garbage in |
| the meantime! |
-| For ``remove_gp_synchronous()``, as long as all modifications to |
+| For remove_gp_synchronous(), as long as all modifications to |
| ``gp`` are carried out while holding ``gp_lock``, the above |
| optimizations are harmless. However, ``sparse`` will complain if you |
| define ``gp`` with ``__rcu`` and then access it without using either |
-| ``rcu_access_pointer()`` or ``rcu_dereference()``. |
+| rcu_access_pointer() or rcu_dereference(). |
+-----------------------------------------------------------------------+
In short, RCU's publish-subscribe guarantee is provided by the
-combination of ``rcu_assign_pointer()`` and ``rcu_dereference()``. This
+combination of rcu_assign_pointer() and rcu_dereference(). This
guarantee allows data elements to be safely added to RCU-protected
linked data structures without disrupting RCU readers. This guarantee
can be used in combination with the grace-period guarantee to also allow
@@ -462,9 +461,9 @@ again without disrupting RCU readers.
This guarantee was only partially premeditated. DYNIX/ptx used an
explicit memory barrier for publication, but had nothing resembling
-``rcu_dereference()`` for subscription, nor did it have anything
+rcu_dereference() for subscription, nor did it have anything
resembling the dependency-ordering barrier that was later subsumed
-into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The
+into rcu_dereference() and later still into READ_ONCE(). The
need for these operations made itself known quite suddenly at a
late-1990s meeting with the DEC Alpha architects, back in the days when
DEC was still a free-standing company. It took the Alpha architects a
@@ -474,7 +473,7 @@ documentation did not make this point clear. More recent work with the C
and C++ standards committees have provided much education on tricks and
traps from the compiler. In short, compilers were much less tricky in
the early 1990s, but in 2015, don't even think about omitting
-``rcu_dereference()``!
+rcu_dereference()!
Memory-Barrier Guarantees
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -484,31 +483,31 @@ demonstrates the need for RCU's stringent memory-ordering guarantees on
systems with more than one CPU:
#. Each CPU that has an RCU read-side critical section that begins
- before ``synchronize_rcu()`` starts is guaranteed to execute a full
+ before synchronize_rcu() starts is guaranteed to execute a full
memory barrier between the time that the RCU read-side critical
- section ends and the time that ``synchronize_rcu()`` returns. Without
+ section ends and the time that synchronize_rcu() returns. Without
this guarantee, a pre-existing RCU read-side critical section might
hold a reference to the newly removed ``struct foo`` after the
- ``kfree()`` on line 14 of ``remove_gp_synchronous()``.
+ kfree() on line 14 of remove_gp_synchronous().
#. Each CPU that has an RCU read-side critical section that ends after
- ``synchronize_rcu()`` returns is guaranteed to execute a full memory
- barrier between the time that ``synchronize_rcu()`` begins and the
+ synchronize_rcu() returns is guaranteed to execute a full memory
+ barrier between the time that synchronize_rcu() begins and the
time that the RCU read-side critical section begins. Without this
guarantee, a later RCU read-side critical section running after the
- ``kfree()`` on line 14 of ``remove_gp_synchronous()`` might later run
- ``do_something_gp()`` and find the newly deleted ``struct foo