Merge tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "These are the latest RCU updates for v5.12:

   - Documentation updates.

   - Miscellaneous fixes.

   - kfree_rcu() updates: Addition of mem_dump_obj() to provide
     allocator return addresses to more easily locate bugs. This has a
     couple of RCU-related commits, but is mostly MM. Was pulled in with
     akpm's agreement.

   - Per-callback-batch tracking of numbers of callbacks, which enables
     better debugging information and smarter reactions to large numbers
     of callbacks.

   - The first round of changes to allow CPUs to be runtime switched
     from and to callback-offloaded state.

   - CONFIG_PREEMPT_RT-related changes.

   - RCU CPU stall warning updates.

   - Addition of polling grace-period APIs for SRCU.

   - Torture-test and torture-test scripting updates, including a
     "torture everything" script that runs rcutorture, locktorture,
     scftorture, rcuscale, and refscale. Plus does an allmodconfig
     build.

   - nolibc fixes for the torture tests"

* tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits)
  percpu_ref: Dump mem_dump_obj() info upon reference-count underflow
  rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback
  mm: Make mem_obj_dump() vmalloc() dumps include start and length
  mm: Make mem_dump_obj() handle vmalloc() memory
  mm: Make mem_dump_obj() handle NULL and zero-sized pointers
  mm: Add mem_dump_obj() to print source of memory block
  tools/rcutorture: Fix position of -lgcc in mkinitrd.sh
  tools/nolibc: Fix position of -lgcc in the documented example
  tools/nolibc: Emit detailed error for missing alternate syscall number definitions
  tools/nolibc: Remove incorrect definitions of __ARCH_WANT_*
  tools/nolibc: Get timeval, timespec and timezone from linux/time.h
  tools/nolibc: Implement poll() based on ppoll()
  tools/nolibc: Implement fork() based on clone()
  tools/nolibc: Make getpgrp() fall back to getpgid(0)
  tools/nolibc: Make dup2() rely on dup3() when available
  tools/nolibc: Add the definition for dup()
  rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01
  torture: Maintain torture-specific set of CPUs-online books
  torture: Clean up after torture-test CPU hotplugging
  rcutorture: Make object_debug also double call_rcu() heap object
  ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2021-02-21 12:04:41 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2021-02-21 12:04:41 -0800
commit: d089f48fba28db14d0fe7753248f2575a9ddfc73 (patch)
tree: a3821c02dd38342193459e41ba453c058f75e3d2
parent: 3f6ec19f2d05d800bbc42d95dece433da7697864 (diff)
parent: 2b392cb11c0db645ba81a08b6a2e96c56ec1fc64 (diff)
download: linux-d089f48fba28db14d0fe7753248f2575a9ddfc73.tar.gz
linux-d089f48fba28db14d0fe7753248f2575a9ddfc73.tar.bz2
linux-d089f48fba28db14d0fe7753248f2575a9ddfc73.zip
63 files changed, 3108 insertions, 763 deletions
diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
index 72f0f6fbd53c..6f89cf1e567d 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
@@ -38,7 +38,7 @@ sections.
 RCU-preempt Expedited Grace Periods
 ===================================
 
-``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
+``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
 The overall flow of the handling of a given CPU by an RCU-preempt
 expedited grace period is shown in the following diagram:
 
@@ -112,7 +112,7 @@ things.
 RCU-sched Expedited Grace Periods
 ---------------------------------
 
-``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
+``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
 the handling of a given CPU by an RCU-sched expedited grace period is
 shown in the following diagram:
 
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index d4c9a016074b..38a39476fc24 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -72,13 +72,13 @@ understanding of this guarantee.
 
 RCU's grace-period guarantee allows updaters to wait for the completion
 of all pre-existing RCU read-side critical sections. An RCU read-side
-critical section begins with the marker ``rcu_read_lock()`` and ends
-with the marker ``rcu_read_unlock()``. These markers may be nested, and
+critical section begins with the marker rcu_read_lock() and ends
+with the marker rcu_read_unlock(). These markers may be nested, and
 RCU treats a nested set as one big RCU read-side critical section.
-Production-quality implementations of ``rcu_read_lock()`` and
-``rcu_read_unlock()`` are extremely lightweight, and in fact have
+Production-quality implementations of rcu_read_lock() and
+rcu_read_unlock() are extremely lightweight, and in fact have
 exactly zero overhead in Linux kernels built for production use with
-``CONFIG_PREEMPT=n``.
+``CONFIG_PREEMPTION=n``.
 
 This guarantee allows ordering to be enforced with extremely low
 overhead to readers, for example:
@@ -102,12 +102,12 @@ overhead to readers, for example:
       15   WRITE_ONCE(y, 1);
       16 }
 
-Because the ``synchronize_rcu()`` on line 14 waits for all pre-existing
-readers, any instance of ``thread0()`` that loads a value of zero from
-``x`` must complete before ``thread1()`` stores to ``y``, so that
+Because the synchronize_rcu() on line 14 waits for all pre-existing
+readers, any instance of thread0() that loads a value of zero from
+``x`` must complete before thread1() stores to ``y``, so that
 instance must also load a value of zero from ``y``. Similarly, any
-instance of ``thread0()`` that loads a value of one from ``y`` must have
-started after the ``synchronize_rcu()`` started, and must therefore also
+instance of thread0() that loads a value of one from ``y`` must have
+started after the synchronize_rcu() started, and must therefore also
 load a value of one from ``x``. Therefore, the outcome:
 
    ::
@@ -121,14 +121,14 @@ cannot happen.
 +-----------------------------------------------------------------------+
 | Wait a minute! You said that updaters can make useful forward         |
 | progress concurrently with readers, but pre-existing readers will     |
-| block ``synchronize_rcu()``!!!                                        |
+| block synchronize_rcu()!!!                                            |
 | Just who are you trying to fool???                                    |
 +-----------------------------------------------------------------------+
 | **Answer**:                                                           |
 +-----------------------------------------------------------------------+
 | First, if updaters do not wish to be blocked by readers, they can use |
-| ``call_rcu()`` or ``kfree_rcu()``, which will be discussed later.     |
-| Second, even when using ``synchronize_rcu()``, the other update-side  |
+| call_rcu() or kfree_rcu(), which will be discussed later.             |
+| Second, even when using synchronize_rcu(), the other update-side      |
 | code does run concurrently with readers, whether pre-existing or not. |
 +-----------------------------------------------------------------------+
 
@@ -170,34 +170,34 @@ recovery from node failure, more or less as follows:
       29   WRITE_ONCE(state, STATE_NORMAL);
       30 }
 
-The RCU read-side critical section in ``do_something_dlm()`` works with
-the ``synchronize_rcu()`` in ``start_recovery()`` to guarantee that
-``do_something()`` never runs concurrently with ``recovery()``, but with
-little or no synchronization overhead in ``do_something_dlm()``.
+The RCU read-side critical section in do_something_dlm() works with
+the synchronize_rcu() in start_recovery() to guarantee that
+do_something() never runs concurrently with recovery(), but with
+little or no synchronization overhead in do_something_dlm().
 
 +-----------------------------------------------------------------------+
 | **Quick Quiz**:                                                       |
 +-----------------------------------------------------------------------+
-| Why is the ``synchronize_rcu()`` on line 28 needed?                   |
+| Why is the synchronize_rcu() on line 28 needed?                       |
 +-----------------------------------------------------------------------+
 | **Answer**:                                                           |
 +-----------------------------------------------------------------------+
 | Without that extra grace period, memory reordering could result in    |
-| ``do_something_dlm()`` executing ``do_something()`` concurrently with |
-| the last bits of ``recovery()``.                                      |
+| do_something_dlm() executing do_something() concurrently with         |
+| the last bits of recovery().                                          |
 +-----------------------------------------------------------------------+
 
 In order to avoid fatal problems such as deadlocks, an RCU read-side
-critical section must not contain calls to ``synchronize_rcu()``.
+critical section must not contain calls to synchronize_rcu().
 Similarly, an RCU read-side critical section must not contain anything
 that waits, directly or indirectly, on completion of an invocation of
-``synchronize_rcu()``.
+synchronize_rcu().
 
 Although RCU's grace-period guarantee is useful in and of itself, with
 `quite a few use cases <https://lwn.net/Articles/573497/>`__, it would
 be good to be able to use RCU to coordinate read-side access to linked
 data structures. For this, the grace-period guarantee is not sufficient,
-as can be seen in function ``add_gp_buggy()`` below. We will look at the
+as can be seen in function add_gp_buggy() below. We will look at the
 reader's code later, but in the meantime, just think of the reader as
 locklessly picking up the ``gp`` pointer, and, if the value loaded is
 non-\ ``NULL``, locklessly accessing the ``->a`` and ``->b`` fields.
@@ -256,8 +256,8 @@ Publish/Subscribe Guarantee
 
 RCU's publish-subscribe guarantee allows data to be inserted into a
 linked data structure without disrupting RCU readers. The updater uses
-``rcu_assign_pointer()`` to insert the new data, and readers use
-``rcu_dereference()`` to access data, whether new or old. The following
+rcu_assign_pointer() to insert the new data, and readers use
+rcu_dereference() to access data, whether new or old. The following
 shows an example of insertion:
 
    ::
@@ -279,7 +279,7 @@ shows an example of insertion:
       15   return true;
       16 }
 
-The ``rcu_assign_pointer()`` on line 13 is conceptually equivalent to a
+The rcu_assign_pointer() on line 13 is conceptually equivalent to a
 simple assignment statement, but also guarantees that its assignment
 will happen after the two assignments in lines 11 and 12, similar to the
 C11 ``memory_order_release`` store operation. It also prevents any
@@ -289,7 +289,7 @@ number of “interesting” compiler optimizations, for example, the use of
 +-----------------------------------------------------------------------+
 | **Quick Quiz**:                                                       |
 +-----------------------------------------------------------------------+
-| But ``rcu_assign_pointer()`` does nothing to prevent the two          |
+| But rcu_assign_pointer() does nothing to prevent the two              |
 | assignments to ``p->a`` and ``p->b`` from being reordered. Can't that |
 | also cause problems?                                                  |
 +-----------------------------------------------------------------------+
@@ -303,7 +303,7 @@ number of “interesting” compiler optimizations, for example, the use of
 
 It is tempting to assume that the reader need not do anything special to
 control its accesses to the RCU-protected data, as shown in
-``do_something_gp_buggy()`` below:
+do_something_gp_buggy() below:
 
    ::
 
@@ -321,11 +321,10 @@ control its accesses to the RCU-protected data, as shown in
       12 }
 
 However, this temptation must be resisted because there are a
-surprisingly large number of ways that the compiler (to say nothing of
-`DEC Alpha CPUs <https://h71000.www7.hp.com/wizard/wiz_2637.html>`__)
-can trip this code up. For but one example, if the compiler were short
-of registers, it might choose to refetch from ``gp`` rather than keeping
-a separate copy in ``p`` as follows:
+surprisingly large number of ways that the compiler (or weak ordering
+CPUs like the DEC Alpha) can trip this code up. For but one example, if
+the compiler were short of registers, it might choose to refetch from
+``gp`` rather than keeping a separate copy in ``p`` as follows:
 
    ::
 
@@ -345,7 +344,7 @@ If this function ran concurrently with a series of updates that replaced
 the current structure with a new one, the fetches of ``gp->a`` and
 ``gp->b`` might well come from two different structures, which could
 cause serious confusion. To prevent this (and much else besides),
-``do_something_gp()`` uses ``rcu_dereference()`` to fetch from ``gp``:
+do_something_gp() uses rcu_dereference() to fetch from ``gp``:
 
    ::
 
@@ -362,21 +361,21 @@ cause serious confusion. To prevent this (and much else besides),
       11   return false;
       12 }
 
-The ``rcu_dereference()`` uses volatile casts and (for DEC Alpha) memory
+The rcu_dereference() uses volatile casts and (for DEC Alpha) memory
 barriers in the Linux kernel. Should a `high-quality implementation of
 C11 ``memory_order_consume``
 [PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
-ever appear, then ``rcu_dereference()`` could be implemented as a
+ever appear, then rcu_dereference() could be implemented as a
 ``memory_order_consume`` load. Regardless of the exact implementation, a
-pointer fetched by ``rcu_dereference()`` may not be used outside of the
+pointer fetched by rcu_dereference() may not be used outside of the
 outermost RCU read-side critical section containing that
-``rcu_dereference()``, unless protection of the corresponding data
+rcu_dereference(), unless protection of the corresponding data
 element has been passed from RCU to some other synchronization
 mechanism, most commonly locking or `reference
 counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
 
-In short, updaters use ``rcu_assign_pointer()`` and readers use
-``rcu_dereference()``, and these two RCU API elements work together to
+In short, updaters use rcu_assign_pointer() and readers use
+rcu_dereference(), and these two RCU API elements work together to
 ensure that readers have a consistent view of newly added data elements.
 
 Of course, it is also necessary to remove elements from RCU-protected
@@ -388,9 +387,9 @@ data structures, for example, using the following process:
    the newly removed data element).
 #. At this point, only the updater has a reference to the newly removed
    data element, so it can safely reclaim the data element, for example,
-   by passing it to ``kfree()``.
+   by passing it to kfree().
 
-This process is implemented by ``remove_gp_synchronous()``:
+This process is implemented by remove_gp_synchronous():
 
    ::
 
@@ -413,16 +412,16 @@ This process is implemented by ``remove_gp_synchronous()``:
 
 This function is straightforward, with line 13 waiting for a grace
 period before line 14 frees the old data element. This waiting ensures
-that readers will reach line 7 of ``do_something_gp()`` before the data
-element referenced by ``p`` is freed. The ``rcu_access_pointer()`` on
-line 6 is similar to ``rcu_dereference()``, except that:
+that readers will reach line 7 of do_something_gp() before the data
+element referenced by ``p`` is freed. The rcu_access_pointer() on
+line 6 is similar to rcu_dereference(), except that:
 
-#. The value returned by ``rcu_access_pointer()`` cannot be
+#. The value returned by rcu_access_pointer() cannot be
    dereferenced. If you want to access the value pointed to as well as
-   the pointer itself, use ``rcu_dereference()`` instead of
-   ``rcu_access_pointer()``.
-#. The call to ``rcu_access_pointer()`` need not be protected. In
-   contrast, ``rcu_dereference()`` must either be within an RCU
+   the pointer itself, use rcu_dereference() instead of
+   rcu_access_pointer().
+#. The call to rcu_access_pointer() need not be protected. In
+   contrast, rcu_dereference() must either be within an RCU
    read-side critical section or in a code segment where the pointer
    cannot change, for example, in code protected by the corresponding
    update-side lock.
@@ -430,13 +429,13 @@ line 6 is similar to ``rcu_dereference()``, except that:
 +-----------------------------------------------------------------------+
 | **Quick Quiz**:                                                       |
 +-----------------------------------------------------------------------+
-| Without the ``rcu_dereference()`` or the ``rcu_access_pointer()``,    |
+| Without the rcu_dereference() or the rcu_access_pointer(),            |
 | what destructive optimizations might the compiler make use of?        |
 +-----------------------------------------------------------------------+
 | **Answer**:                                                           |
 +-----------------------------------------------------------------------+
-| Let's start with what happens to ``do_something_gp()`` if it fails to |
-| use ``rcu_dereference()``. It could reuse a value formerly fetched    |
+| Let's start with what happens to do_something_gp() if it fails to     |
+| use rcu_dereference(). It could reuse a value formerly fetched        |
 | from this same pointer. It could also fetch the pointer from ``gp``   |
 | in a byte-at-a-time manner, resulting in *load tearing*, in turn      |
 | resulting a bytewise mash-up of two distinct pointer values. It might |
@@ -445,15 +444,15 @@ line 6 is similar to ``rcu_dereference()``, except that:
 | update has changed the pointer to match the wrong guess. Too bad      |
 | about any dereferences that returned pre-initialization garbage in    |
 | the meantime!                                                         |
-| For ``remove_gp_synchronous()``, as long as all modifications to      |
+| For remove_gp_synchronous(), as long as all modifications to          |
 | ``gp`` are carried out while holding ``gp_lock``, the above           |
 | optimizations are harmless. However, ``sparse`` will complain if you  |
 | define ``gp`` with ``__rcu`` and then access it without using either  |
-| ``rcu_access_pointer()`` or ``rcu_dereference()``.                    |
+| rcu_access_pointer() or rcu_dereference().                            |
 +-----------------------------------------------------------------------+
 
 In short, RCU's publish-subscribe guarantee is provided by the
-combination of ``rcu_assign_pointer()`` and ``rcu_dereference()``. This
+combination of rcu_assign_pointer() and rcu_dereference(). This
 guarantee allows data elements to be safely added to RCU-protected
 linked data structures without disrupting RCU readers. This guarantee
 can be used in combination with the grace-period guarantee to also allow
@@ -462,9 +461,9 @@ again without disrupting RCU readers.
 
 This guarantee was only partially premeditated. DYNIX/ptx used an
 explicit memory barrier for publication, but had nothing resembling
-``rcu_dereference()`` for subscription, nor did it have anything
+rcu_dereference() for subscription, nor did it have anything
 resembling the dependency-ordering barrier that was later subsumed
-into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The
+into rcu_dereference() and later still into READ_ONCE(). The
 need for these operations made itself known quite suddenly at a
 late-1990s meeting with the DEC Alpha architects, back in the days when
 DEC was still a free-standing company. It took the Alpha architects a
@@ -474,7 +473,7 @@ documentation did not make this point clear. More recent work with the C
 and C++ standards committees have provided much education on tricks and
 traps from the compiler. In short, compilers were much less tricky in
 the early 1990s, but in 2015, don't even think about omitting
-``rcu_dereference()``!
+rcu_dereference()!
 
 Memory-Barrier Guarantees
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -484,31 +483,31 @@ demonstrates the need for RCU's stringent memory-ordering guarantees on
 systems with more than one CPU:
 
 #. Each CPU that has an RCU read-side critical section that begins
-   before ``synchronize_rcu()`` starts is guaranteed to execute a full
+   before synchronize_rcu() starts is guaranteed to execute a full
    memory barrier between the time that the RCU read-side critical
-   section ends and the time that ``synchronize_rcu()`` returns. Without
+   section ends and the time that synchronize_rcu() returns. Without
    this guarantee, a pre-existing RCU read-side critical section might
    hold a reference to the newly removed ``struct foo`` after the
-   ``kfree()`` on line 14 of ``remove_gp_synchronous()``.
+   kfree() on line 14 of remove_gp_synchronous().
 #. Each CPU that has an RCU read-side critical section that ends after
-   ``synchronize_rcu()`` returns is guaranteed to execute a full memory
-   barrier between the time that ``synchronize_rcu()`` begins and the
+   synchronize_rcu() returns is guaranteed to execute a full memory
+   barrier between the time that synchronize_rcu() begins and the
    time that the RCU read-side critical section begins. Without this
    guarantee, a later RCU read-side critical section running after the
-   ``kfree()`` on line 14 of ``remove_gp_synchronous()`` might later run
-   ``do_something_gp()`` and find the newly deleted ``struct foo