Diffstat (limited to 'Documentation')
76 files changed, 851 insertions, 808 deletions
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 75b8ca007a11..8f41ad0aa753 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -463,7 +463,7 @@ again without disrupting RCU readers. This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling ``rcu_dereference()`` for subscription, nor did it have anything -resembling the ``smp_read_barrier_depends()`` that was later subsumed +resembling the dependency-ordering barrier that was later subsumed into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when @@ -2583,7 +2583,12 @@ not work to have these markers in the trampoline itself, because there would need to be instructions following ``rcu_read_unlock()``. Although ``synchronize_rcu()`` would guarantee that execution reached the ``rcu_read_unlock()``, it would not be able to guarantee that execution -had completely left the trampoline. +had completely left the trampoline. Worse yet, in some situations +the trampoline's protection must extend a few instructions *prior* to +execution reaching the trampoline. For example, these few instructions +might calculate the address of the trampoline, so that entering the +trampoline would be pre-ordained a surprisingly long time before execution +actually reached the trampoline itself. The solution, in the form of `Tasks RCU <https://lwn.net/Articles/607117/>`__, is to have implicit read-side diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.rst index e98ff261a438..2efed9926c3f 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.rst @@ -1,4 +1,8 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +================================ Review Checklist for RCU Patches +================================ This document contains a checklist for producing and reviewing patches @@ -411,18 +415,21 @@ over a rather long period of time, but improvements are always welcome! __rcu sparse checks to validate your RCU code. These can help find problems as follows: - CONFIG_PROVE_LOCKING: check that accesses to RCU-protected data + CONFIG_PROVE_LOCKING: + check that accesses to RCU-protected data structures are carried out under the proper RCU read-side critical section, while holding the right combination of locks, or whatever other conditions are appropriate. - CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the + CONFIG_DEBUG_OBJECTS_RCU_HEAD: + check that you don't pass the same object to call_rcu() (or friends) before an RCU grace period has elapsed since the last time that you passed that same object to call_rcu() (or friends). - __rcu sparse checks: tag the pointer to the RCU-protected data + __rcu sparse checks: + tag the pointer to the RCU-protected data structure with __rcu, and sparse will warn you if you access that pointer without the services of one of the variants of rcu_dereference(). @@ -442,8 +449,8 @@ over a rather long period of time, but improvements are always welcome! You instead need to use one of the barrier functions: - o call_rcu() -> rcu_barrier() - o call_srcu() -> srcu_barrier() + - call_rcu() -> rcu_barrier() + - call_srcu() -> srcu_barrier() However, these barrier functions are absolutely -not- guaranteed to wait for a grace period. In fact, if there are no call_rcu() diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst index 81a0a1e5f767..e703d3dbe60c 100644 --- a/Documentation/RCU/index.rst +++ b/Documentation/RCU/index.rst @@ -1,3 +1,5 @@ +.. SPDX-License-Identifier: GPL-2.0 + .. 
_rcu_concepts: ============ @@ -8,10 +10,17 @@ RCU concepts :maxdepth: 3 arrayRCU + checklist + lockdep + lockdep-splat rcubarrier rcu_dereference whatisRCU rcu + rculist_nulls + rcuref + torture + stallwarn listRCU NMI-RCU UP diff --git a/Documentation/RCU/lockdep-splat.txt b/Documentation/RCU/lockdep-splat.rst index b8096316fd11..2a5c79db57dc 100644 --- a/Documentation/RCU/lockdep-splat.txt +++ b/Documentation/RCU/lockdep-splat.rst @@ -1,3 +1,9 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================= +Lockdep-RCU Splat +================= + Lockdep-RCU was added to the Linux kernel in early 2010 (http://lwn.net/Articles/371986/). This facility checks for some common misuses of the RCU API, most notably using one of the rcu_dereference() @@ -12,55 +18,54 @@ overwriting or worse. There can of course be false positives, this being the real world and all that. So let's look at an example RCU lockdep splat from 3.0-rc5, one that -has long since been fixed: - -============================= -WARNING: suspicious RCU usage ------------------------------ -block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage! 
- -other info that might help us debug this: - - -rcu_scheduler_active = 1, debug_locks = 0 -3 locks held by scsi_scan_6/1552: - #0: (&shost->scan_mutex){+.+.}, at: [<ffffffff8145efca>] -scsi_scan_host_selected+0x5a/0x150 - #1: (&eq->sysfs_lock){+.+.}, at: [<ffffffff812a5032>] -elevator_exit+0x22/0x60 - #2: (&(&q->__queue_lock)->rlock){-.-.}, at: [<ffffffff812b6233>] -cfq_exit_queue+0x43/0x190 - -stack backtrace: -Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 -Call Trace: - [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0 - [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120 - [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190 - [<ffffffff812a5046>] elevator_exit+0x36/0x60 - [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60 - [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10 - [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0 - [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10 - [<ffffffff817da069>] ? error_exit+0x29/0xb0 - [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80 - [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680 - [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c - [<ffffffff817da069>] ? error_exit+0x29/0xb0 - [<ffffffff812bcc60>] ? kobject_del+0x40/0x40 - [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0 - [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150 - [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90 - [<ffffffff8145f170>] do_scan_async+0x20/0x160 - [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90 - [<ffffffff810975b6>] kthread+0xa6/0xb0 - [<ffffffff817db154>] kernel_thread_helper+0x4/0x10 - [<ffffffff81066430>] ? finish_task_switch+0x80/0x110 - [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe - [<ffffffff81097510>] ? __kthread_init_worker+0x70/0x70 - [<ffffffff817db150>] ? 
gs_change+0xb/0xb - -Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows: +has long since been fixed:: + + ============================= + WARNING: suspicious RCU usage + ----------------------------- + block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage! + +other info that might help us debug this:: + + rcu_scheduler_active = 1, debug_locks = 0 + 3 locks held by scsi_scan_6/1552: + #0: (&shost->scan_mutex){+.+.}, at: [<ffffffff8145efca>] + scsi_scan_host_selected+0x5a/0x150 + #1: (&eq->sysfs_lock){+.+.}, at: [<ffffffff812a5032>] + elevator_exit+0x22/0x60 + #2: (&(&q->__queue_lock)->rlock){-.-.}, at: [<ffffffff812b6233>] + cfq_exit_queue+0x43/0x190 + + stack backtrace: + Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 + Call Trace: + [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0 + [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120 + [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190 + [<ffffffff812a5046>] elevator_exit+0x36/0x60 + [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60 + [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10 + [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0 + [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10 + [<ffffffff817da069>] ? error_exit+0x29/0xb0 + [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80 + [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680 + [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c + [<ffffffff817da069>] ? error_exit+0x29/0xb0 + [<ffffffff812bcc60>] ? kobject_del+0x40/0x40 + [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0 + [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150 + [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90 + [<ffffffff8145f170>] do_scan_async+0x20/0x160 + [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90 + [<ffffffff810975b6>] kthread+0xa6/0xb0 + [<ffffffff817db154>] kernel_thread_helper+0x4/0x10 + [<ffffffff81066430>] ? finish_task_switch+0x80/0x110 + [<ffffffff817d9c04>] ? 
retint_restore_args+0xe/0xe + [<ffffffff81097510>] ? __kthread_init_worker+0x70/0x70 + [<ffffffff817db150>] ? gs_change+0xb/0xb + +Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows:: if (rcu_dereference(ioc->ioc_data) == cic) { @@ -70,7 +75,7 @@ case. Instead, we hold three locks, one of which might be RCU related. And maybe that lock really does protect this reference. If so, the fix is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to take the struct request_queue "q" from cfq_exit_queue() as an argument, -which would permit us to invoke rcu_dereference_protected as follows: +which would permit us to invoke rcu_dereference_protected as follows:: if (rcu_dereference_protected(ioc->ioc_data, lockdep_is_held(&q->queue_lock)) == cic) { @@ -85,7 +90,7 @@ On the other hand, perhaps we really do need an RCU read-side critical section. In this case, the critical section must span the use of the return value from rcu_dereference(), or at least until there is some reference count incremented or some such. One way to handle this is to -add rcu_read_lock() and rcu_read_unlock() as follows: +add rcu_read_lock() and rcu_read_unlock() as follows:: rcu_read_lock(); if (rcu_dereference(ioc->ioc_data) == cic) { @@ -102,7 +107,7 @@ above lockdep-RCU splat. But in this particular case, we don't actually dereference the pointer returned from rcu_dereference(). Instead, that pointer is just compared to the cic pointer, which means that the rcu_dereference() can be replaced -by rcu_access_pointer() as follows: +by rcu_access_pointer() as follows:: if (rcu_access_pointer(ioc->ioc_data) == cic) { diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.rst index 89db949eeca0..f1fc8ae3846a 100644 --- a/Documentation/RCU/lockdep.txt +++ b/Documentation/RCU/lockdep.rst @@ -1,4 +1,8 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +======================== RCU and lockdep checking +======================== All flavors of RCU have lockdep checking available, so that lockdep is aware of when each task enters and leaves any flavor of RCU read-side @@ -8,7 +12,7 @@ tracking to include RCU state, which can sometimes help when debugging deadlocks and the like. In addition, RCU provides the following primitives that check lockdep's -state: +state:: rcu_read_lock_held() for normal RCU. rcu_read_lock_bh_held() for RCU-bh. @@ -63,7 +67,7 @@ checking of rcu_dereference() primitives: The rcu_dereference_check() check expression can be any boolean expression, but would normally include a lockdep expression. However, any boolean expression can be used. For a moderately ornate example, -consider the following: +consider the following:: file = rcu_dereference_check(fdt->fd[fd], lockdep_is_held(&files->file_lock) || @@ -82,7 +86,7 @@ RCU read-side critical sections, in case (2) the ->file_lock prevents any change from taking place, and finally, in case (3) the current task is the only task accessing the file_struct, again preventing any change from taking place. If the above statement was invoked only from updater -code, it could instead be written as follows: +code, it could instead be written as follows:: file = rcu_dereference_protected(fdt->fd[fd], lockdep_is_held(&files->file_lock) || @@ -105,7 +109,7 @@ false and they are called from outside any RCU read-side critical section. For example, the workqueue for_each_pwq() macro is intended to be used either within an RCU read-side critical section or with wq->mutex held. -It is thus implemented as |
