Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes, perhaps most notably: - Throttling callback invocation based on the number of callbacks that are now ready to invoke instead of on the total number of callbacks - Several patches that suppress false-positive boot-time diagnostics, for example, due to lockdep not yet being initialized - Make expedited RCU CPU stall warnings dump stacks of any tasks that are blocking the stalled grace period. (Normal RCU CPU stall warnings have done this for many years) - Lazy-callback fixes to avoid delays during boot, suspend, and resume. (Note that lazy callbacks must be explicitly enabled, so this should not (yet) affect production use cases) - Make kfree_rcu() and friends take advantage of polled grace periods, thus reducing memory footprint by almost two orders of magnitude, admittedly on a microbenchmark This also begins the transition from kfree_rcu(p) to kfree_rcu_mightsleep(p). This transition was motivated by bugs where kfree_rcu(p), which can block, was typed instead of the intended kfree_rcu(p, rh) - SRCU updates, perhaps most notably fixing a bug that causes SRCU to fail when booted on a system with a non-zero boot CPU. This surprising situation actually happens for kdump kernels on the powerpc architecture This also adds an srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side critical section to be handed off from one task to another - Clean up the now-useless SRCU Kconfig option There are a few more commits that are not yet acked or pulled into maintainer trees, and these will be in a pull request for a later merge window - RCU-tasks updates, perhaps most notably these fixes: - A strange interaction between PID-namespace unshare and the RCU-tasks grace period that results in a low-probability but very real hang - A race between an RCU tasks rude grace period on a single-CPU system and CPU-hotplug addition of the second CPU that can result in a too-short grace period - A race between shrinking RCU tasks down to a single callback list and queuing a new callback to some other CPU, but where that queuing is delayed for more than an RCU grace period. This can result in that callback being stranded on the non-boot CPU - Torture-test updates and fixes - Torture-test scripting updates and fixes - Provide additional RCU CPU stall-warning information in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute timeout limit for expedited RCU CPU stall warnings * tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep() kernel/notifier: Remove CONFIG_SRCU init: Remove "select SRCU" fs/quota: Remove "select SRCU" fs/notify: Remove "select SRCU" fs/btrfs: Remove "select SRCU" fs: Remove CONFIG_SRCU drivers/pci/controller: Remove "select SRCU" drivers/net: Remove "select SRCU" drivers/md: Remove "select SRCU" drivers/hwtracing/stm: Remove "select SRCU" drivers/dax: Remove "select SRCU" drivers/base: Remove CONFIG_SRCU rcu: Disable laziness if lazy-tracking says so rcu: Track laziness during boot and suspend rcu: Remove redundant call to rcu_boost_kthread_setaffinity() rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts rcu: Align the output of RCU CPU stall warning messages rcu: Add RCU stall diagnosis information sched: Add helper nr_context_switches_cpu() ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2023-02-21 10:45:51 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2023-02-21 10:45:51 -0800
commit: 8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f (patch)
tree: 053ed1940a0ddb7ff2972c05637edf820772cbb8
parent: 8ca8d89b43caf9a02a18414d6eeff966d2b14512 (diff)
parent: bba8d3d17dc2678f9647962900aa421a18c25320 (diff)
download: linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.tar.gz
linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.tar.bz2
linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.zip
54 files changed, 1734 insertions, 839 deletions
diff --git a/Documentation/RCU/NMI-RCU.rst b/Documentation/RCU/NMI-RCU.rst
index 2a92bc685ef1..dff60a80b386 100644
--- a/Documentation/RCU/NMI-RCU.rst
+++ b/Documentation/RCU/NMI-RCU.rst
@@ -8,7 +8,7 @@ Although RCU is usually used to protect read-mostly data structures,
 it is possible to use RCU to provide dynamic non-maskable interrupt
 handlers, as well as dynamic irq handlers.  This document describes
 how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
-work in "arch/x86/kernel/traps.c".
+work in an old version of "arch/x86/kernel/traps.c".
 
 The relevant pieces of code are listed below, each followed by a
 brief explanation::
@@ -116,7 +116,7 @@ Answer to Quick Quiz:
 
 	This same sad story can happen on other CPUs when using
 	a compiler with aggressive pointer-value speculation
-	optimizations.
+	optimizations.  (But please don't!)
 
 	More important, the rcu_dereference_sched() makes it
 	clear to someone reading the code that the pointer is
diff --git a/Documentation/RCU/UP.rst b/Documentation/RCU/UP.rst
index e26dda27430c..8b20fd45f255 100644
--- a/Documentation/RCU/UP.rst
+++ b/Documentation/RCU/UP.rst
@@ -38,7 +38,7 @@ by having call_rcu() directly invoke its arguments only if it was called
 from process context.  However, this can fail in a similar manner.
 
 Suppose that an RCU-based algorithm again scans a linked list containing
-elements A, B, and C in process contexts, but that it invokes a function
+elements A, B, and C in process context, but that it invokes a function
 on each element as it is scanned.  Suppose further that this function
 deletes element B from the list, then passes it to call_rcu() for deferred
 freeing.  This may be a bit unconventional, but it is perfectly legal
@@ -59,7 +59,8 @@ Example 3: Death by Deadlock
 Suppose that call_rcu() is invoked while holding a lock, and that the
 callback function must acquire this same lock.  In this case, if
 call_rcu() were to directly invoke the callback, the result would
-be self-deadlock.
+be self-deadlock *even if* this invocation occurred from a later
+call_rcu() invocation a full grace period later.
 
 In some cases, it would possible to restructure to code so that
 the call_rcu() is delayed until after the lock is released.  However,
@@ -85,6 +86,14 @@ Quick Quiz #2:
 
 :ref:`Answers to Quick Quiz <answer_quick_quiz_up>`
 
+It is important to note that userspace RCU implementations *do*
+permit call_rcu() to directly invoke callbacks, but only if a full
+grace period has elapsed since those callbacks were queued.  This is
+the case because some userspace environments are extremely constrained.
+Nevertheless, people writing userspace RCU implementations are strongly
+encouraged to avoid invoking callbacks from call_rcu(), thus obtaining
+the deadlock-avoidance benefits called out above.
+
 Summary
 -------
 
diff --git a/Documentation/RCU/lockdep.rst b/Documentation/RCU/lockdep.rst
index 9308f1bdba05..2749f43ec1b0 100644
--- a/Documentation/RCU/lockdep.rst
+++ b/Documentation/RCU/lockdep.rst
@@ -69,9 +69,8 @@ checking of rcu_dereference() primitives:
 		value of the pointer itself, for example, against NULL.
 
 The rcu_dereference_check() check expression can be any boolean
-expression, but would normally include a lockdep expression.  However,
-any boolean expression can be used.  For a moderately ornate example,
-consider the following::
+expression, but would normally include a lockdep expression.  For a
+moderately ornate example, consider the following::
 
 	file = rcu_dereference_check(fdt->fd[fd],
 				     lockdep_is_held(&files->file_lock) ||
@@ -97,10 +96,10 @@ code, it could instead be written as follows::
 					 atomic_read(&files->count) == 1);
 
 This would verify cases #2 and #3 above, and furthermore lockdep would
-complain if this was used in an RCU read-side critical section unless one
-of these two cases held.  Because rcu_dereference_protected() omits all
-barriers and compiler constraints, it generates better code than do the
-other flavors of rcu_dereference().  On the other hand, it is illegal
+complain even if this was used in an RCU read-side critical section unless
+one of these two cases held.  Because rcu_dereference_protected() omits
+all barriers and compiler constraints, it generates better code than do
+the other flavors of rcu_dereference().  On the other hand, it is illegal
 to use rcu_dereference_protected() if either the RCU-protected pointer
 or the RCU-protected data that it points to can change concurrently.
 
diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
index 3cfe01ba9a49..bf6617b330a7 100644
--- a/Documentation/RCU/rcu.rst
+++ b/Documentation/RCU/rcu.rst
@@ -77,15 +77,17 @@ Frequently Asked Questions
   search for the string "Patent" in Documentation/RCU/RTFP.txt to find them.
   Of these, one was allowed to lapse by the assignee, and the
   others have been contributed to the Linux kernel under GPL.
+  Many (but not all) have long since expired.
   There are now also LGPL implementations of user-level RCU
   available (https://liburcu.org/).
 
 - I hear that RCU needs work in order to support realtime kernels?
 
-  Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
+  Realtime-friendly RCU are enabled via the CONFIG_PREEMPTION
   kernel configuration parameter.
 
 - Where can I find more information on RCU?
 
   See the Documentation/RCU/RTFP.txt file.
-  Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/).
+  Or point your browser at (https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit)
+  or (https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing).
diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index 81e828c8313b..3b739f6243c8 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -19,8 +19,9 @@ Follow these rules to keep your RCU code working properly:
 	can reload the value, and won't your code have fun with two
 	different values for a single pointer!  Without rcu_dereference(),
 	DEC Alpha can load a pointer, dereference that pointer, and
-	return data preceding initialization that preceded the store of
-	the pointer.
+	return data preceding initialization that preceded the store
+	of the pointer.  (As noted later, in recent kernels READ_ONCE()
+	also prevents DEC Alpha from playing these tricks.)
 
 	In addition, the volatile cast in rcu_dereference() prevents the
 	compiler from deducing the resulting pointer value.  Please see
@@ -34,7 +35,7 @@ Follow these rules to keep your RCU code working properly:
 	takes on the role of the lockless_dereference() primitive that
 	was removed in v4.15.
 
--	You are only permitted to use rcu_dereference on pointer values.
+-	You are only permitted to use rcu_dereference() on pointer values.
 	The compiler simply knows too much about integral values to
 	trust it to carry dependencies through integer operations.
 	There are a very few exceptions, namely that you can temporarily
@@ -240,6 +241,7 @@ precautions.  To see this, consider the following code fragment::
 		struct foo *q;
 		int r1, r2;
 
+		rcu_read_lock();
 		p = rcu_dereference(gp2);
 		if (p == NULL)
 			return;
@@ -248,7 +250,10 @@ precautions.  To see this, consider the following code fragment::
 		if (p == q) {
 			/* The compiler decides that q->c is same as p->c. */
 			r2 = p->c; /* Could get 44 on weakly order system. */
+		} else {
+			r2 = p->c - r1; /* Unconditional access to p->c. */
 		}
+		rcu_read_unlock();
 		do_something_with(r1, r2);
 	}
 
@@ -297,6 +302,7 @@ Then one approach is to use locking, for example, as follows::
 		struct foo *q;
 		int r1, r2;
 
+		rcu_read_lock();
 		p = rcu_dereference(gp2);
 		if (p == NULL)
 			return;
@@ -306,7 +312,12 @@ Then one approach is to use locking, for example, as follows::
 		if (p == q) {
 			/* The compiler decides that q->c is same as p->c. */
 			r2 = p->c; /* Locking guarantees r2 == 144. */
+		} else {
+			spin_lock(&q->lock);
+			r2 = q->c - r1;
+			spin_unlock(&q->lock);
 		}
+		rcu_read_unlock();
 		spin_unlock(&p->lock);
 		do_something_with(r1, r2);
 	}
@@ -364,7 +375,7 @@ the exact value of "p" even in the not-equals case.  This allows the
 compiler to make the return values independent of the load from "gp",
 in turn destroying the ordering between this load and the loads of the
 return values.  This can result in "p->b" returning pre-initialization
-garbage values.
+garbage values on weakly ordered systems.
 
 In short, rcu_dereference() is *not* optional when you are going to
 dereference the resulting pointer.
@@ -430,7 +441,7 @@ member of the rcu_dereference() to use in various situations:
 SPARSE CHECKING OF RCU-PROTECTED POINTERS
 -----------------------------------------
 
-The sparse static-analysis tool checks for direct access to RCU-protected
+The sparse static-analysis tool checks for non-RCU access to RCU-protected
 pointers, which can result in "interesting" bugs due to compiler
 optimizations involving invented loads and perhaps also load tearing.
 For example, suppose someone mistakenly does something like this::
diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst
index 3b4a24877496..6da7f66da2a8 100644
--- a/Documentation/RCU/rcubarrier.rst
+++ b/Documentation/RCU/rcubarrier.rst
@@ -5,37 +5,12 @@ RCU and Unloadable Modules
 
 [Originally published in LWN Jan. 14, 2007: http://lwn.net/Articles/217484/]
 
-RCU (read-copy update) is a synchronization mechanism that can be thought
-of as a replacement for read-writer locking (among other things), but with
-very low-overhead readers that are immune to deadlock, priority inversion,
-and unbounded latency. RCU read-side critical sections are delimited
-by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
-kernels, generate no code whatsoever.
-
-This means that RCU writers are unaware of the presence of concurrent
-readers, so that RCU updates to shared data must be undertaken quite
-carefully, leaving an old version of the data structure in place until all
-pre-existing readers have finished. These old versions are needed because
-such readers might hold a reference to them. RCU updates can therefore be
-rather expensive, and RCU is thus best suited for read-mostly situations.
-
-How can an RCU writer possibly determine when all readers are finished,
-given that readers might well leave absolutely no trace of their
-presence? There is a synchronize_rcu() primitive that blocks until all
-pre-existing readers have completed. An updater wishing to delete an
-element p from a linked list might do the following, while holding an
-appropriate lock, of course::
-
-	list_del_rcu(p);
-	synchronize_rcu();
-	kfree(p);
-
-But the above code cannot be used in IRQ context -- the call_rcu()
-primitive must be used instead. This primitive takes a pointer to an
-rcu_head struct placed within the RCU-protected data structure and
-another pointer to a function that may be invoked later to free that
-structure. Code to delete an element p from the linked list from IRQ
-context might then be as follows::
+RCU updaters sometimes use call_rcu() to initiate an asynchronous wait for
+a grace period to elapse.  This primitive takes a pointer to an rcu_head
+struct placed within the RCU-protected data structure and another pointer
+to a function that may be invoked later to free that structure. Code to
+delete an element p from the linked list from IRQ context might then be
+as follows::
 
 	list_del_rcu(p);
 	call_rcu(&p->rcu, p_callback);
@@ -54,7 +29,7 @@ IRQ context. The function p_callback() might be defined as follows::
 Unloading Modules That Use call_rcu()
 -------------------------------------
 
-But what if p_callback is defined in an unloadable module?
+But what if the p_callback() function is defined in an unloadable module?
 
 If we unload the module while some RCU callbacks are pending,
 the CPUs executing these callbacks are going to be severely
@@ -67,20 +42,21 @@ grace period to elapse, it does not wait for the callbacks to complete.
 
 One might be tempted to try several back-to-back synchronize_rcu()
 calls, but this is still not guaranteed to work. If there is a very
-heavy RCU-callback load, then some of the callbacks might be deferred
-in order to allow other processing to proceed. Such deferral is required
-in realtime kernels in order to avoid excessive scheduling latencies.
+heavy RCU-callback load, then some of the callbacks might be deferred in
+order to allow other processing to proceed. For but one example, such
+deferral is required in realtime kernels in order to avoid excessive
+scheduling latencies.
 
 
 rcu_barrier()
 -------------
 
-We instead need the rcu_barrier() primitive.  Rather than waiting for
-a grace period to elapse, rcu_barrier() waits for all outstanding RCU
-callbacks to complete.  Please note that rcu_barrier() does **not** imply
-synchronize_rcu(), in particular, if there are no RCU callbacks queued
-anywhere, rcu_barrier() is within its rights to return immediately,
-without waiting for a grace period to elapse.
+This situation can be handled by the rcu_barrier() primitive.  Rather
+than waiting for a grace period to elapse, rcu_barrier() waits for all
+outstanding RCU callbacks to complete.  Please note that rcu_barrier()
+does **not** imply synchronize_rcu(), in particular, if there are no RCU
+callbacks queued anywhere, rcu_barrier() is within its rights to return
+immediately, without waiting for anything, let alone a grace period.
 
 Pseudo-code using rcu_barrier() is as follows:
 
@@ -89,83 +65,86 @@ Pseudo-code using rcu_barrier() is as follows:
    3. Allow the module to be unloaded.
 
 There is also an srcu_barrier() function for SRCU, and you of course
-must match the flavor of rcu_barrier() with that of call_rcu().  If your
-module uses multiple flavors of call_rcu(), then it must also use multiple
-flavors of rcu_barrier() when unloading that module.  For example, if
-it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on
-srcu_struct_2, then the following three lines of code will be required
-when unloading::
-
- 1 rcu_barrier();
- 2 srcu_barrier(&srcu_struct_1);
- 3 srcu_barrier(&srcu_struct_2);
-
-The rcutorture module makes use of rcu_barrier() in its exit function
-as follows::
-
- 1  static void
- 2  rcu_torture_cleanup(void)
- 3  {
- 4    int i;
- 5
- 6    fullstop = 1;
- 7    if (shuffler_task != NULL) {
- 8     VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
- 9     kthread_stop(shuffler_task);
- 10   }
- 11   shuffler_task = NULL;
+must match the flavor of srcu_barrier() with that of call_srcu().
+If your module uses multiple srcu_struct structures, then it must also
+use multiple invocations of srcu_barrier() when unloading that module.
+For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and
+call_srcu() on srcu_struct_2, then the following three lines of code
+will be required when unloading::
+
+  1  rcu_barrier();
+  2  srcu_barrier(&srcu_struct_1);
+  3  srcu_barrier(&srcu_struct_2);
+
+If latency is of the essence, workqueues could be used to run these
+three functions concurrently.
+
+An ancient version of the rcutorture module makes use of rcu_barrier()
+in its exit function as follows::
+
+  1  static void
+  2  rcu_torture_cleanup(void)
+  3  {
+  4    int i;
+  5
+  6    fullstop = 1;
+  7    if (shuffler_task != NULL) {
+  8      VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
+  9      kthread_stop(shuffler_task);
+ 10    }
+ 11    shuffler_task = NULL;
  12
- 13   if (writer_task != NULL) {
- 14     VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
- 15     kthread_stop(writer_task);
- 16   }
- 17   writer_task = NULL;
+ 13    if (writer_task != NULL) {
+ 14      VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
+ 15      kthread_stop(writer_task);
+ 16    }
+ 17    writer_task = NULL;
  18
- 19   if (reader_tasks != NULL) {
- 20     for (i = 0; i < nrealreaders; i++) {
- 21       if (reader_tasks[i] != NULL) {
- 22         VERBOSE_PRINTK_STRING(
- 23           "Stopping rcu_torture_reader task");
- 24         kthread_stop(reader_tasks[i]);
- 25       }
- 26       reader_tasks[i] = NULL;
- 27     }
- 28     kfree(reader_tasks);
- 29     reader_tasks = NULL;
- 30   }
- 31   rcu_torture_current = NULL;
+ 19    if (reader_tasks != NULL) {
+ 20      for (i = 0; i < nrealreaders; i++) {
+ 21        if (reader_tasks[i] != NULL) {
+ 22          VERBOSE_PRINTK_STRING(
+ 23            "Stopping rcu_torture_reader task");
+ 24          kthread_stop(reader_tasks[i]);
+ 25        }
+ 26        reader_tasks[i] = NULL;
+ 27      }
+ 28      kfree(reader_tasks);
+ 29      reader_tasks = NULL;
+ 30    }
+ 31    rcu_torture_current = NULL;
  32
- 33   if (fakewriter_tasks != NULL) {
- 34     for (i = 0; i < nfakewriters; i++) {
- 35       if (fakewriter_tasks[i] != NULL) {
- 36         VERBOSE_PRINTK_STRING(
- 37           "Stopping rcu_torture_fakewriter task");
- 38         kthread_stop(fakewriter_tasks[i]);
- 39       }
- 40       fakewriter_tasks[i] = NULL;
- 41     }
- 42     kfree(fakewriter_tasks);
- 43     fakewriter_tasks = NULL;
- 44   }
+ 33    if (fakewriter_tasks != NULL) {
+ 34      for (i = 0; i < nfakewriters; i++) {
+ 35        if (fakewriter_tasks[i] != NULL) {
+ 36          VERBOSE_PRINTK_STRING(
+ 37            "Stopping rcu_torture_fakewriter task");
+ 38          kthread_stop(fakewriter_tasks[i]);
+ 39        }
+ 40        fakewriter_tasks[i] = NULL;
+ 41      }
+ 42      kfree(fakewriter_tasks);
+ 43      fakewriter_tasks = NULL;
+ 44    }
  45
- 46   if (stats_task != NULL) {
- 47     VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
- 48     kthread_stop(stats_task);
- 49   }
- 50   stats_task = NULL;
+ 46    if (stats_task != NULL) {
+ 47      VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
+ 48      kthread_stop(stats_task);
+ 49    }
+ 50    stats_task = NULL;
  51
- 52   /* Wait for all RCU callbacks to fire. */
- 53   rcu_barrier();
+ 52    /* Wait for all RCU callbacks to fire. */
+ 53    rcu_barrier();
  54
- 55   rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
+ 55    rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
  56
- 57   if (cur_ops->cleanup != NULL)
- 58     cur_ops->cleanup();
- 59   if (atomic_read(&n_rcu_torture_error))
- 60     rcu_torture_print_module_parms("End of test: FAILURE");
- 61   else
- 62     rcu_torture_print_module_parms("End of test: SUCCESS");
- 63 }
+ 57    if (cur_ops->cleanup != NULL)
+ 58      cur_ops->cleanup();
+ 59    if (atomic_read(&n_rcu_torture_error))
+ 60      rcu_torture_print_module_parms("End of test: FAILURE");
+ 61    else
+ 62      rcu_torture_print_module_parms("End of test: SUCCESS");
+ 63  }
 
 Line 6 sets a global variable that prevents any RCU callbacks from
 re-posting themselves. This will not be necessary in most cases, since
@@ -190,16 +169,17 @@ Quick Quiz #1:
 :ref:`Answer to Quick Quiz #1 <answer_rcubarrier_quiz_1>`
 
 Your module might have additional complications. For example, if your
-module invokes call_rcu() from timers, you will need to first cancel all
-the timers, and only then invoke rcu_barrier() to wait for any remaining
+module invokes call_rcu() from timers, you will need to first refrain
+from posting new timers, cancel (or wait for) all the already-posted
+timers, and only then invoke rcu_barrier() to wait for any remaining
 RCU callbacks to complete.
 
-Of course, if you module uses call_rcu(), you will need to invoke
+Of course, if your module uses call_rcu(), you will need to invoke
 rcu_barrier() before unloading.  Similarly, if your module uses
 call_srcu(), you will need to invoke srcu_barrier() before unloading,
 and on the same srcu_struct structure.  If your module uses call_rcu()
-**and** call_srcu(), then you will need to invoke rcu_barrier() **and**
-srcu_barrier().
+**and** call_srcu(), then (as noted above) you will need to invoke
+rcu_barrier() **and** srcu_barrier().
 
 
 Implementing rcu_barrier()
@@ -211,27 +191,40 @@ queues. His implementation queues an RCU callback on each of the per-CPU
 callback queues, and then waits until they have all started executing, at
 which point, all earlier RCU callbacks are guaranteed to have completed.
 
-The original code for rcu_barrier() was as follows::
-
- 1  void rcu_barrier(void)
author	Linus Torvalds <torvalds@linux-foundation.org>	2023-02-21 10:45:51 -0800
committer	Linus Torvalds <torvalds@linux-foundation.org>	2023-02-21 10:45:51 -0800
commit	8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f (patch)
tree	053ed1940a0ddb7ff2972c05637edf820772cbb8
parent	8ca8d89b43caf9a02a18414d6eeff966d2b14512 (diff)
parent	bba8d3d17dc2678f9647962900aa421a18c25320 (diff)
download	linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.tar.gz linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.tar.bz2 linux-8cc01d43f882fa1f44d8aa6727a6ea783d8fbe3f.zip