diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-13 09:49:06 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-13 09:49:06 -0700 |
| commit | c0b9620bc3f0a0f914996cc6631522d41870a9e0 (patch) | |
| tree | 8fdf2fed856bebe2c3e60302df3fb64cafa87339 | |
| parent | 736676f5c3abd1fc01c41813a95246e892937f6d (diff) | |
| parent | 64619b283bb35b12a96129e82b40304f7e5551b7 (diff) | |
| download | linux-c0b9620bc3f0a0f914996cc6631522d41870a9e0.tar.gz linux-c0b9620bc3f0a0f914996cc6631522d41870a9e0.tar.bz2 linux-c0b9620bc3f0a0f914996cc6631522d41870a9e0.zip | |
Merge tag 'rcu.next.v6.10' of https://github.com/urezki/linux
Pull RCU updates from Uladzislau Rezki:
- Fix a lockdep complain for lazy-preemptible kernel, remove redundant
BH disable for TINY_RCU, remove redundant READ_ONCE() in tree.c, fix
false positives KCSAN splat and fix buffer overflow in the
print_cpu_stall_info().
- Misc updates related to bpf, tracing and update the MAINTAINERS file.
- An improvement of a normal synchronize_rcu() call in terms of
latency. It maintains a separate track for sync. users only. This
approach bypasses per-cpu nocb-lists thus sync-users do not depend on
nocb-list length and how fast regular callbacks are processed.
- RCU tasks: switch tasks RCU grace periods to sleep at TASK_IDLE
priority, fix some comments, add some diagnostic warning to the
exit_tasks_rcu_start() and fix a buffer overflow in the
show_rcu_tasks_trace_gp_kthread().
- RCU torture: Increase memory to guest OS, fix a Tasks Rude RCU
testing, some updates for TREE09, dump mode information to debug GP
kthread state, remove redundant READ_ONCE(), fix some comments about
RCU_TORTURE_PIPE_LEN and pipe_count, remove some redundant pointer
initialization, fix a hung splat task by when the rcutorture tests
start to exit, fix invalid context warning, add '--do-kvfree'
parameter to torture test and use slow register unregister callbacks
only for rcutype test.
* tag 'rcu.next.v6.10' of https://github.com/urezki/linux: (48 commits)
rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test
torture: Scale --do-kvfree test time
rcutorture: Fix invalid context warning when enable srcu barrier testing
rcutorture: Make stall-tasks directly exit when rcutorture tests end
rcutorture: Removing redundant function pointer initialization
rcutorture: Make rcutorture support print rcu-tasks gp state
rcutorture: Use the gp_kthread_dbg operation specified by cur_ops
rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading
rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE()
rcu: Allocate WQ with WQ_MEM_RECLAIM bit set
rcu: Support direct wake-up of synchronize_rcu() users
rcu: Add a trace event for synchronize_rcu_normal()
rcu: Reduce synchronize_rcu() latency
rcu: Fix buffer overflow in print_cpu_stall_info()
rcu: Mollify sparse with RCU guard
rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow
rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer
rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE()
rcu: Remove redundant CONFIG_PROVE_RCU #if condition
...
29 files changed, 663 insertions, 137 deletions
@@ -467,7 +467,8 @@ Nadia Yvette Chambers <nyc@holomorphy.com> William Lee Irwin III <wli@holomorphy Naoya Horiguchi <nao.horiguchi@gmail.com> <n-horiguchi@ah.jp.nec.com> Naoya Horiguchi <nao.horiguchi@gmail.com> <naoya.horiguchi@nec.com> Nathan Chancellor <nathan@kernel.org> <natechancellor@gmail.com> -Neeraj Upadhyay <quic_neeraju@quicinc.com> <neeraju@codeaurora.org> +Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <quic_neeraju@quicinc.com> +Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <neeraju@codeaurora.org> Neil Armstrong <neil.armstrong@linaro.org> <narmstrong@baylibre.com> Nguyen Anh Quynh <aquynh@gmail.com> Nicholas Piggin <npiggin@gmail.com> <npiggen@suse.de> diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst index 872ac665223f..94838c65c7d9 100644 --- a/Documentation/RCU/whatisRCU.rst +++ b/Documentation/RCU/whatisRCU.rst @@ -427,7 +427,7 @@ their assorted primitives. This section shows a simple use of the core RCU API to protect a global pointer to a dynamically allocated structure. More-typical -uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst. +uses of RCU may be found in listRCU.rst and NMI-RCU.rst. :: struct foo { @@ -510,8 +510,8 @@ So, to sum up: data item. See checklist.rst for additional rules to follow when using RCU. -And again, more-typical uses of RCU may be found in listRCU.rst, -arrayRCU.rst, and NMI-RCU.rst. +And again, more-typical uses of RCU may be found in listRCU.rst +and NMI-RCU.rst. .. _4_whatisRCU: diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 535301f6dd09..5dd8b6b08dee 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5098,6 +5098,20 @@ delay, memory pressure or callback list growing too big. + rcutree.rcu_normal_wake_from_gp= [KNL] + Reduces a latency of synchronize_rcu() call. This approach + maintains its own track of synchronize_rcu() callers, so it + does not interact with regular callbacks because it does not + use a call_rcu[_hurry]() path. Please note, this is for a + normal grace period. + + How to enable it: + + echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp + or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1" + + Default is 0. + rcuscale.gp_async= [KNL] Measure performance of asynchronous grace-period primitives such as call_rcu(). diff --git a/MAINTAINERS b/MAINTAINERS index 68b67ab9c0e9..3ae91a5185d5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18600,7 +18600,7 @@ F: tools/testing/selftests/resctrl/ READ-COPY UPDATE (RCU) M: "Paul E. McKenney" <paulmck@kernel.org> M: Frederic Weisbecker <frederic@kernel.org> (kernel/rcu/tree_nocb.h) -M: Neeraj Upadhyay <quic_neeraju@quicinc.com> (kernel/rcu/tasks.h) +M: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> (kernel/rcu/tasks.h) M: Joel Fernandes <joel@joelfernandes.org> M: Josh Triplett <josh@joshtriplett.org> M: Boqun Feng <boqun.feng@gmail.com> diff --git a/arch/Kconfig b/arch/Kconfig index 30f7930275d8..a51addb1abb9 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -63,7 +63,7 @@ config KPROBES depends on MODULES depends on HAVE_KPROBES select KALLSYMS - select TASKS_RCU if PREEMPTION + select NEED_TASKS_RCU help Kprobes allows you to trap at almost any kernel address and execute a callback function. register_kprobe() establishes @@ -112,7 +112,7 @@ config STATIC_CALL_SELFTEST config OPTPROBES def_bool y depends on KPROBES && HAVE_OPTPROBES - select TASKS_RCU if PREEMPTION + select NEED_TASKS_RCU config KPROBES_ON_FTRACE def_bool y diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 17d7ed5f3ae6..dfd2399f2cde 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -401,15 +401,15 @@ static inline int debug_lockdep_rcu_enabled(void) } \ } while (0) -#if defined(CONFIG_PROVE_RCU) && !defined(CONFIG_PREEMPT_RCU) +#ifndef CONFIG_PREEMPT_RCU static inline void rcu_preempt_sleep_check(void) { RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map), "Illegal context switch in RCU read-side critical section"); } -#else /* #ifdef CONFIG_PROVE_RCU */ +#else // #ifndef CONFIG_PREEMPT_RCU static inline void rcu_preempt_sleep_check(void) { } -#endif /* #else #ifdef CONFIG_PROVE_RCU */ +#endif // #else // #ifndef CONFIG_PREEMPT_RCU #define rcu_sleep_check() \ do { \ @@ -809,9 +809,9 @@ static inline void rcu_read_unlock(void) { RCU_LOCKDEP_WARN(!rcu_is_watching(), "rcu_read_unlock() used illegally while idle"); + rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */ __release(RCU); __rcu_read_unlock(); - rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */ } /** @@ -1090,6 +1090,18 @@ rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f) extern int rcu_expedited; extern int rcu_normal; -DEFINE_LOCK_GUARD_0(rcu, rcu_read_lock(), rcu_read_unlock()) +DEFINE_LOCK_GUARD_0(rcu, + do { + rcu_read_lock(); + /* + * sparse doesn't call the cleanup function, + * so just release immediately and don't track + * the context. We don't need to anyway, since + * the whole point of the guard is to not need + * the explicit unlock. + */ + __release(RCU); + } while (0), + rcu_read_unlock()) #endif /* __LINUX_RCUPDATE_H */ diff --git a/include/linux/rcupdate_wait.h b/include/linux/rcupdate_wait.h index d07f0848802e..303ab9bee155 100644 --- a/include/linux/rcupdate_wait.h +++ b/include/linux/rcupdate_wait.h @@ -19,18 +19,18 @@ struct rcu_synchronize { }; void wakeme_after_rcu(struct rcu_head *head); -void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array, +void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array, struct rcu_synchronize *rs_array); -#define _wait_rcu_gp(checktiny, ...) \ -do { \ - call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \ - struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)]; \ - __wait_rcu_gp(checktiny, ARRAY_SIZE(__crcu_array), \ - __crcu_array, __rs_array); \ +#define _wait_rcu_gp(checktiny, state, ...) \ +do { \ + call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \ + struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)]; \ + __wait_rcu_gp(checktiny, state, ARRAY_SIZE(__crcu_array), __crcu_array, __rs_array); \ } while (0) -#define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__) +#define wait_rcu_gp(...) _wait_rcu_gp(false, TASK_UNINTERRUPTIBLE, __VA_ARGS__) +#define wait_rcu_gp_state(state, ...) _wait_rcu_gp(false, state, __VA_ARGS__) /** * synchronize_rcu_mult - Wait concurrently for multiple grace periods @@ -54,7 +54,7 @@ do { \ * grace period. */ #define synchronize_rcu_mult(...) \ - _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__) + _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), TASK_UNINTERRUPTIBLE, __VA_ARGS__) static inline void cond_resched_rcu(void) { diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h index 447133171d95..4d96bbdb45f0 100644 --- a/include/linux/srcutiny.h +++ b/include/linux/srcutiny.h @@ -64,8 +64,10 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp) { int idx; + preempt_disable(); // Needed for PREEMPT_AUTO idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1; WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1); + preempt_enable(); return idx; } diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h index 2ef9c719772a..31b3e0d3e65f 100644 --- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h @@ -708,6 +708,33 @@ TRACE_EVENT_RCU(rcu_invoke_kfree_bulk_callback, ); /* + * Tracepoint for a normal synchronize_rcu() states. The first argument + * is the RCU flavor, the second argument is a pointer to rcu_head the + * last one is an event. + */ +TRACE_EVENT_RCU(rcu_sr_normal, + + TP_PROTO(const char *rcuname, struct rcu_head *rhp, const char *srevent), + + TP_ARGS(rcuname, rhp, srevent), + + TP_STRUCT__entry( + __field(const char *, rcuname) + __field(void *, rhp) + __field(const char *, srevent) + ), + + TP_fast_assign( + __entry->rcuname = rcuname; + __entry->rhp = rhp; + __entry->srevent = srevent; + ), + + TP_printk("%s rhp=0x%p event=%s", + __entry->rcuname, __entry->rhp, __entry->srevent) +); + +/* * Tracepoint for exiting rcu_do_batch after RCU callbacks have been * invoked. The first argument is the name of the RCU flavor, * the second argument is number of callbacks actually invoked, diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig index bc25f5098a25..4100df44c665 100644 --- a/kernel/bpf/Kconfig +++ b/kernel/bpf/Kconfig @@ -28,7 +28,7 @@ config BPF_SYSCALL bool "Enable bpf() system call" select BPF select IRQ_WORK - select TASKS_RCU if PREEMPTION + select NEED_TASKS_RCU select TASKS_TRACE_RCU select BINARY_PRINTF select NET_SOCK_MSG if NET diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index db7599c59c78..88673a4267eb 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -333,7 +333,7 @@ static void bpf_tramp_image_put(struct bpf_tramp_image *im) int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP, NULL, im->ip_epilogue); WARN_ON(err); - if (IS_ENABLED(CONFIG_PREEMPTION)) + if (IS_ENABLED(CONFIG_TASKS_RCU)) call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu_tasks); else percpu_ref_kill(&im->pcref); diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig index e7d2dd267593..3e079de0f5b4 100644 --- a/kernel/rcu/Kconfig +++ b/kernel/rcu/Kconfig @@ -31,7 +31,7 @@ config PREEMPT_RCU config TINY_RCU bool - default y if !PREEMPTION && !SMP + default y if !PREEMPT_RCU && !SMP help This option selects the RCU implementation that is designed for UP systems from which real-time response @@ -85,9 +85,13 @@ config FORCE_TASKS_RCU idle, and user-mode execution as quiescent states. Not for manual selection in most cases. -config TASKS_RCU +config NEED_TASKS_RCU bool default n + +config TASKS_RCU + bool + default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO) select IRQ_WORK config FORCE_TASKS_RUDE_RCU diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 86fce206560e..38238e595a61 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -522,12 +522,18 @@ static inline void show_rcu_tasks_gp_kthreads(void) {} #ifdef CONFIG_TASKS_RCU struct task_struct *get_rcu_tasks_gp_kthread(void); +void rcu_tasks_get_gp_data(int *flags, unsigned long *gp_seq); #endif // # ifdef CONFIG_TASKS_RCU #ifdef CONFIG_TASKS_RUDE_RCU struct task_struct *get_rcu_tasks_rude_gp_kthread(void); +void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq); #endif // # ifdef CONFIG_TASKS_RUDE_RCU +#ifdef CONFIG_TASKS_TRACE_RCU +void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq); +#endif + #ifdef CONFIG_TASKS_RCU_GENERIC void tasks_cblist_init_generic(void); #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */ @@ -557,8 +563,7 @@ static inline void rcu_set_jiffies_lazy_flush(unsigned long j) { } #endif #if defined(CONFIG_TREE_RCU) -void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, - unsigned long *gp_seq); +void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq); void do_trace_rcu_torture_read(const char *rcutorturename, struct rcu_head *rhp, unsigned long secs, @@ -566,8 +571,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename, unsigned long c); void rcu_gp_set_torture_wait(int duration); #else -static inline void rcutorture_get_gp_data(enum rcutorture_type test_type, - int *flags, unsigned long *gp_seq) +static inline void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq) { *flags = 0; *gp_seq = 0; @@ -587,20 +591,16 @@ static inline void rcu_gp_set_torture_wait(int duration) { } #ifdef CONFIG_TINY_SRCU -static inline void srcutorture_get_gp_data(enum rcutorture_type test_type, - struct srcu_struct *sp, int *flags, +static inline void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags, unsigned long *gp_seq) { - if (test_type != SRCU_FLAVOR) - return; *flags = 0; *gp_seq = sp->srcu_idx; } #elif defined(CONFIG_TREE_SRCU) -void srcutorture_get_gp_data(enum rcutorture_type test_type, - struct srcu_struct *sp, int *flags, +void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags, unsigned long *gp_seq); #endif diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 45d6b4c3d199..807fbf6123a7 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -381,6 +381,9 @@ struct rcu_torture_ops { void (*gp_kthread_dbg)(void); bool (*check_boost_failed)(unsigned long gp_state, int *cpup); int (*stall_dur)(void); + void (*get_gp_data)(int *flags, unsigned long *gp_seq); + void (*gp_slow_register)(atomic_t *rgssp); + void (*gp_slow_unregister)(atomic_t *rgssp); long cbflood_max; int irq_capable; int can_boost; @@ -461,12 +464,13 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp) WRITE_ONCE(rp->rtort_chkp, NULL); smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire(). } - i = READ_ONCE(rp->rtort_pipe_count); + i = rp->rtort_pipe_count; if (i > RCU_TORTURE_PIPE_LEN) i = RCU_TORTURE_PIPE_LEN; atomic_inc(&rcu_torture_wcount[i]); WRITE_ONCE(rp->rtort_pipe_count, i + 1); - if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { + ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count); + if (i + 1 >= RCU_TORTURE_PIPE_LEN) { rp->rtort_mbtest = 0; return true; } @@ -564,10 +568,12 @@ static struct rcu_torture_ops rcu_ops = { .call = call_rcu_hurry, .cb_barrier = rcu_barrier, .fqs = rcu_force_quiescent_state, - .stats = NULL, .gp_kthread_dbg = show_rcu_gp_kthreads, .check_boost_failed = rcu_check_boost_fail, .stall_dur = rcu_jiffies_till_stall_check, + .get_gp_data = rcutorture_get_gp_data, + .gp_slow_register = rcu_gp_slow_register, + .gp_slow_unregister = rcu_gp_slow_unregister, .irq_capable = 1, .can_boost = IS_ENABLED(CONFIG_RCU_BOOST), .extendables = RCUTORTURE_MAX_EXTEND, @@ -611,9 +617,6 @@ static struct rcu_torture_ops rcu_busted_ops = { .sync = synchronize_rcu_busted, .exp_sync = synchronize_rcu_busted, .call = call_rcu_busted, - .cb_barrier = NULL, - .fqs = NULL, - .stats = NULL, .irq_capable = 1, .name = "busted" }; @@ -627,6 +630,11 @@ static struct srcu_struct srcu_ctld; static struct srcu_struct *srcu_ctlp = &srcu_ctl; static struct rcu_torture_ops srcud_ops; +static void srcu_get_gp_data(int *flags, unsigned long *gp_seq) +{ + srcutorture_get_gp_data(srcu_ctlp, flags, gp_seq); +} + static int srcu_torture_read_lock(void) { if (cur_ops == &srcud_ops) @@ -735,6 +743,7 @@ static struct rcu_torture_ops srcu_ops = { .call = srcu_torture_call, .cb_barrier = srcu_torture_barrier, .stats = srcu_torture_stats, + .get_gp_data = srcu_get_gp_data, .cbflood_max = 50000, .irq_capable = 1, .no_pi_lock = IS_ENABLED(CONFIG_TINY_SRCU), @@ -773,6 +782,7 @@ static struct rcu_torture_ops srcud_ops = { .call = srcu_torture_call, .cb_barrier = srcu_torture_barrier, .stats = srcu_torture_stats, + .get_gp_data = srcu_get_gp_data, .cbflood_max = 50000, .irq_capable = 1, .no_pi_lock = IS_ENABLED(CONFIG_TINY_SRCU), @@ -837,8 +847,6 @@ static struct rcu_torture_ops trivial_ops = { .get_gp_seq = rcu_no_completed, .sync = synchronize_rcu_trivial, .exp_sync = synchronize_rcu_trivial, - .fqs = NULL, - .stats = NULL, .irq_capable = 1, .name = "trivial" }; @@ -881,8 +889,7 @@ static struct rcu_torture_ops tasks_ops = { .call = call_rcu_tasks, .cb_barrier = rcu_barrier_tasks, .gp_kthread_dbg = show_rcu_tasks_classic_gp_kthread, - .fqs = NULL, - .stats = NULL, + .get_gp_data = rcu_tasks_get_gp_data, .irq_capable = 1, .slow_gps = 1, .name = "tasks" @@ -921,9 +928,8 @@ static struct rcu_torture_ops tasks_rude_ops = { .call = call_rcu_tasks_rude, .cb_barrier = rcu_barrier_tasks_rude, .gp_kthread_dbg = show_rcu_tasks_rude_gp_kthread, + .get_gp_data = rcu_tasks_rude_get_gp_data, .cbflood_max = 50000, - .fqs = NULL, - .stats = NULL, .irq_capable = 1, .name = "tasks-rude" }; @@ -973,9 +979,8 @@ static struct rcu_torture_ops tasks_tracing_ops = { .call = call_rcu_tasks_trace, .cb_barrier = rcu_barrier_tasks_trace, .gp_kthread_dbg = show_rcu_tasks_trace_gp_kthread, + .get_gp_data = rcu_tasks_trace_get_gp_data, .cbflood_max = 50000, - .fqs = NULL, - .stats = NULL, .irq_capable = 1, .slow_gps = 1, .name = "tasks-tracing" @@ -1399,6 +1404,7 @@ rcu_torture_writer(void *arg) if (rp == NULL) continue; rp->rtort_pipe_count = 0; + ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count); rcu_torture_writer_state = RTWS_DELAY; udelay(torture_random(&rand) & 0x3ff); rcu_torture_writer_state = RTWS_REPLACE; @@ -1414,6 +1420,7 @@ rcu_torture_writer(void *arg) atomic_inc(&rcu_torture_wcount[i]); WRITE_ONCE(old_rp->rtort_pipe_count, old_rp->rtort_pipe_count + 1); + ASSERT_EXCLUSIVE_WRITER(old_rp->rtort_pipe_count); // Make sure readers block polled grace periods. if (cur_ops->get_gp_state && cur_ops->poll_gp_state) { @@ -1586,7 +1593,8 @@ rcu_torture_writer(void *arg) if (list_empty(&rcu_tortures[i].rtort_free) && rcu_access_pointer(rcu_torture_current) != &rcu_tortures[i]) { tracing_off(); - show_rcu_gp_kthreads(); + if (cur_ops->gp_kthread_dbg) + cur_ops->gp_kthread_dbg(); WARN(1, "%s: rtort_pipe_count: %d\n", __func__, rcu_tortures[i].rtort_pipe_count); rcu_ftrace_dump(DUMP_ALL); } @@ -1997,7 +2005,8 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp, long myid) preempt_disable(); pipe_count = READ_ONCE(p->rtort_pipe_count); if (pipe_count > RCU_TORTURE_PIPE_LEN) { - /* Should not happen, but... */ + // Should not happen in a correct RCU implementation, + // happens quite often for torture_type=busted. pipe_count = RCU_TORTURE_PIPE_LEN; } completed = cur_ops->get_gp_seq(); @@ -2259,10 +2268,8 @@ rcu_torture_stats_print(void) int __maybe_unused flags = 0; unsigned long __maybe_unused gp_seq = 0; - rcutorture_get_gp_data(cur_ops->ttype, - &flags, &gp_seq); - srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, - &flags, &gp_seq); + if (cur_ops->get_gp_data) + cur_ops->get_gp_data(&flags, &gp_seq); wtp = READ_ONCE(writer_task); pr_alert("??? Writer stall state %s(%d) g%lu f%#x ->state %#x cpu %d\n", rcu_torture_writer_state_getname(), @@ -2486,8 +2493,8 @@ static int rcu_torture_stall(void *args) preempt_disable(); pr_alert("%s start on CPU %d.\n", __func__, raw_smp_processor_id()); - while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), - stop_at)) + while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) && + !kthread_should_stop()) if (stall_cpu_block) { #ifdef CONFIG_PREEMPTION preempt_schedule(); @@ -2832,13 +2839,14 @@ static void rcu_torture_fwd_prog_cr(struct rcu_fwd *rfp) if (!torture_must_stop() && !READ_ONCE(rcu_fwd_emergency_stop) && !shutdown_time_arrived()) { - WARN_ON(n_max_gps < MIN_FWD_CBS_LAUNDERED); - pr_alert("%s Duration %lu barrier: %lu pending %ld n_launders: %ld n_launders_sa: %ld n_max_gps: %ld n_max_cbs: %ld cver %ld gps %ld\n", + if (WARN_ON(n_max_gps < MIN_FWD_CBS_LAUNDERED) && cur_ops->gp_kthread_dbg) + cur_ops->gp_kthread_dbg(); + pr_alert("%s Duration %lu barrier: %lu pending %ld n_launders: %ld n_launders_sa: %ld n_max_gps: %ld n_max_cbs: %ld cver %ld gps %ld #online %u\n", __func__, stoppedat - rfp->rcu_fwd_startat, jiffies - stoppedat, n_launders + n_max_cbs - n_launders_cb_snap, n_launders, n_launders_sa, - n_max_gps, n_max_cbs, cver, gps); + n_max_gps, n_max_cbs, cver, gps, num_online_cpus()); atomic_long_add(n_max_cbs, &rcu_fwd_max_cbs); mutex_lock(&rcu_fwd_mutex); // Serialize histograms. rcu_torture_fwd_cb_hist(rfp); @@ -3040,11 +3048,12 @@ static void rcu_torture_barrier_cbf(struct rcu_head *rcu) } /* IPI handler to get callback posted on desired CPU, if online. */ -static void rcu_torture_barrier1cb(void *rcu_void) +static int rcu_torture_barrier1cb(void *rcu_void) { struct rcu_head *rhp = rcu_void; cur_ops->call(rhp, rcu_torture_barrier_cbf); + return 0; } /* kthread function to register callbacks used to test RCU barriers. */ @@ -3070,11 +3079,9 @@ static int rcu_torture_barrier_cbs(void *arg) * The above smp_load_acquire() ensures barrier_phase load * is ordered before the following ->call(). */ - if (smp_call_function_single(myid, rcu_torture_barrier1cb, - &rcu, 1)) { - // IPI failed, so use direct call from current CPU. + if (smp_call_on_cpu(myid, rcu_torture_barrier1cb, &rcu, 1)) cur_ops->call(&rcu, rcu_torture_barrier_cbf); - } + if (atomic_dec_and_test(&barrier_cbs_count)) wake_up(&barrier_wq); } while (!torture_must_stop()); @@ -3340,12 +3347,12 @@ rcu_torture_cleanup(void) pr_info("%s: Invoking %pS().\n", __func__, cur_ops->cb_barrier); cur_ops->cb_barrier(); } - rcu_gp_slow_unregister(NULL); + if (cur_ops->gp_slow_unregister) + cur_ops->gp_slow_unregister(NULL); return; } if (!cur_ops) { torture_cleanup_end(); - rcu_gp_slow_unregister(NULL); return; } @@ -3384,8 +3391,8 @@ rcu_torture_cleanup(void) fakewriter_tasks = NULL; } - rcutorture_get_gp_data(cur_ops->ttype, &flags, &gp_seq); - srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, &flags, &gp_seq); + if (cur_ops->get_gp_data) + cur_ops->get_gp_data(&flags, &gp_seq); pr_alert("%s: End-test grace-period state: g%ld f%#x total-gps=%ld\n", cur_ops->name, (long)gp_seq, flags, rcutorture_seq_diff(gp_seq, start_gp_seq)); @@ -3444,7 +3451,8 @@ rcu_torture_cleanup(void) else rcu_torture_print_module_parms(cur_ops, "End of test: SUCCESS"); torture_cleanup_end(); - rcu_gp_slow_unregister(&rcu_fwd_cb_nodelay); + if (cur_ops->gp_slow_unregister) + cur_ops->gp_slow_unregister(NULL); } #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD @@ -3756,8 +3764,8 @@ rcu_torture_init(void) nrealreaders = 1; } rcu_torture_print_module_parms(cur_ops, "Start of test"); - rcutorture_get_gp_data(cur_ops->ttype, &flags, &gp_seq); - srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, &flags, &gp_seq); + if (cur_ops->get_gp_data) + cur_ops->get_gp_data(&flags, &gp_seq); start_gp_seq = gp_seq; pr_alert("%s: Start-test grace-period state: g%ld f%#x\n", cur_ops->name, (long)gp_seq, flags); @@ -3926,7 +3934,8 @@ rcu_torture_init(void) if (object_debug) rcu_test_debug_objects(); torture_init_end(); - rcu_gp_slow_register(&rcu_fwd_cb_nodelay); + if (cur_ops->gp_slow_register && !WARN_ON_ONCE(!cur_ops->gp_slow_unregister)) + cur_ops->gp_slow_register(&rcu_fwd_cb_nodelay); return 0; unwind: diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c index c38e5933a5d6..5afd5cf494db 100644 --- a/kernel/rcu/srcutiny.c +++ b/kernel/rcu/srcutiny.c @@ -96,9 +96,12 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct); */ void __srcu_read_unlock(struct srcu_struct *ssp, int idx) { - int newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1; + int newval; + preempt_disable(); // Needed for PREEMPT_AUTO + newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1; WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval); + preempt_enable(); if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task()) swake_up_one(&ssp->srcu_wq); } @@ -117,8 +120,11 @@ void srcu_drive_gp(struct work_struct *wp) struct srcu_struct *ssp; ssp = container_of(wp, struct srcu_struct, srcu_work); - if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) + preempt_disable(); // Needed for PREEMPT_AUTO + if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) { return; /* Already running or nothing to do. */ + preempt_enable(); + } /* Remove recently arrived callbacks and wait for readers. */ WRITE_ONCE(ssp->srcu_gp_running, true); @@ -130,9 +136,12 @@ void srcu_drive_gp(struct work_struct *wp) idx = (ssp->srcu_idx & 0x2) / 2; WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1); WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */ + preempt_enable(); swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx])); + preempt_disable(); // Needed for PREEMPT_AUTO WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */ WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1); + preempt_enable(); /* Invoke the callbacks we removed above. */ while (lh) { @@ -150,8 +159,11 @@ void srcu_drive_gp(struct work_struct *wp) * at interrupt level, but the ->srcu_gp_running checks will * straighten that out. */ + preempt_disable(); // Needed for PREEMPT_AUTO WRITE_ONCE(ssp->srcu_gp_running, false); |
