2023-09-08 22:00:44

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 00/10] rcu cleanups

Hi,

Here is a bunch of accumulated cleanups. Many of them are trivial but
beware some tricky ordering changes in the middle :-)

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
rcu/dev

HEAD: 43d6b973aeb7459d29cd52054142291b099bf8ad

Thanks,
Frederic
---

Frederic Weisbecker (10):
rcu: Use rcu_segcblist_segempty() instead of open coding it
rcu: Rename jiffies_till_flush to jiffies_lazy_flush
rcu/nocb: Remove needless LOAD-ACQUIRE
rcu/nocb: Remove needless full barrier after callback advancing
rcu: Assume IRQS disabled from rcu_report_dead()
rcu: Assume rcu_report_dead() is always called locally
rcu: Conditionally build CPU-hotplug teardown callbacks
rcu: Standardize explicit CPU-hotplug calls
rcu: Remove references to rcu_migrate_callbacks() from diagrams
rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP


.../Expedited-Grace-Periods.rst | 2 +-
.../Memory-Ordering/TreeRCU-callback-registry.svg | 9 --
.../RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg | 4 +-
.../RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 13 +-
.../RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg | 4 +-
.../RCU/Design/Requirements/Requirements.rst | 4 +-
arch/arm64/kernel/smp.c | 4 +-
arch/powerpc/kernel/smp.c | 2 +-
arch/s390/kernel/smp.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
include/linux/interrupt.h | 2 +-
include/linux/rcupdate.h | 2 -
include/linux/rcutiny.h | 2 +-
include/linux/rcutree.h | 16 ++-
kernel/cpu.c | 13 +-
kernel/rcu/rcu.h | 8 +-
kernel/rcu/rcu_segcblist.c | 4 +-
kernel/rcu/rcuscale.c | 6 +-
kernel/rcu/tree.c | 138 ++++++++++-----------
kernel/rcu/tree_nocb.h | 24 ++--
20 files changed, 129 insertions(+), 132 deletions(-)


2023-09-08 22:07:51

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 08/10] rcu: Standardize explicit CPU-hotplug calls

rcu_report_dead() and rcutree_migrate_callbacks() have their headers in
rcupdate.h while those are pure rcutree calls, like the other CPU-hotplug
functions.

Also rcu_cpu_starting() and rcu_report_dead() have different naming
conventions while they mirror each other's effects.

Fix the headers and propose a naming that relates both functions and
aligns with the prefix of other rcutree CPU-hotplug functions.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
.../Expedited-Grace-Periods.rst | 2 +-
.../RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg | 4 ++--
.../RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 4 ++--
.../RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg | 4 ++--
.../RCU/Design/Requirements/Requirements.rst | 4 ++--
arch/arm64/kernel/smp.c | 4 ++--
arch/powerpc/kernel/smp.c | 2 +-
arch/s390/kernel/smp.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
include/linux/interrupt.h | 2 +-
include/linux/rcupdate.h | 2 --
include/linux/rcutiny.h | 2 +-
include/linux/rcutree.h | 7 ++++++-
kernel/cpu.c | 6 +++---
kernel/rcu/tree.c | 12 ++++++++----
15 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
index 93d899d53258..414f8a2012d6 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
@@ -181,7 +181,7 @@ operations is carried out at several levels:
of this wait (or series of waits, as the case may be) is to permit a
concurrent CPU-hotplug operation to complete.
#. In the case of RCU-sched, one of the last acts of an outgoing CPU is
- to invoke ``rcu_report_dead()``, which reports a quiescent state for
+ to invoke ``rcutree_report_cpu_dead()``, which reports a quiescent state for
that CPU. However, this is likely paranoia-induced redundancy.

+-----------------------------------------------------------------------+
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
index 7ddc094d7f28..d82a77d03d8c 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
@@ -1135,7 +1135,7 @@
font-weight="bold"
font-size="192"
id="text202-7-5-3-27-6-5"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
<text
xml:space="preserve"
x="3745.7725"
@@ -1256,7 +1256,7 @@
font-style="normal"
y="3679.27"
x="-3804.9949"
- xml:space="preserve">rcu_cpu_starting()</text>
+ xml:space="preserve">rcutree_report_cpu_starting()</text>
<g
style="fill:none;stroke-width:0.025in"
id="g3107-7-5-0"
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
index 069f6f8371c2..6e690a3be161 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
@@ -3274,7 +3274,7 @@
font-weight="bold"
font-size="192"
id="text202-7-5-3-27-6-5"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
<text
xml:space="preserve"
x="3745.7725"
@@ -3395,7 +3395,7 @@
font-style="normal"
y="3679.27"
x="-3804.9949"
- xml:space="preserve">rcu_cpu_starting()</text>
+ xml:space="preserve">rcutree_report_cpu_starting()</text>
<g
style="fill:none;stroke-width:0.025in"
id="g3107-7-5-0"
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
index 2c9310ba29ba..4fa7506082bf 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
@@ -607,7 +607,7 @@
font-weight="bold"
font-size="192"
id="text202-7-5-3-27-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
<text
xml:space="preserve"
x="3745.7725"
@@ -728,7 +728,7 @@
font-style="normal"
y="3679.27"
x="-3804.9949"
- xml:space="preserve">rcu_cpu_starting()</text>
+ xml:space="preserve">rcutree_report_cpu_starting()</text>
<g
style="fill:none;stroke-width:0.025in"
id="g3107-7-5-0"
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index f3b605285a87..cccafdaa1f84 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1955,12 +1955,12 @@ if offline CPUs block an RCU grace period for too long.

An offline CPU's quiescent state will be reported either:

-1. As the CPU goes offline using RCU's hotplug notifier (rcu_report_dead()).
+1. As the CPU goes offline using RCU's hotplug notifier (rcutree_report_cpu_dead()).
2. When grace period initialization (rcu_gp_init()) detects a
race either with CPU offlining or with a task unblocking on a leaf
``rcu_node`` structure whose CPUs are all offline.

-The CPU-online path (rcu_cpu_starting()) should never need to report
+The CPU-online path (rcutree_report_cpu_starting()) should never need to report
a quiescent state for an offline CPU. However, as a debugging measure,
it does emit a warning if a quiescent state was not already reported
for that CPU.
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ce672cb69f1c..90c0083438ea 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -215,7 +215,7 @@ asmlinkage notrace void secondary_start_kernel(void)
if (system_uses_irq_prio_masking())
init_gic_priority_masking();

- rcu_cpu_starting(cpu);
+ rcutree_report_cpu_starting(cpu);
trace_hardirqs_off();

/*
@@ -401,7 +401,7 @@ void __noreturn cpu_die_early(void)

/* Mark this CPU absent */
set_cpu_present(cpu, 0);
- rcu_report_dead();
+ rcutree_report_cpu_dead();

if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
update_cpu_boot_status(CPU_KILL_ME);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index fbbb695bae3d..a132e3290e98 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1619,7 +1619,7 @@ void start_secondary(void *unused)

smp_store_cpu_info(cpu);
set_dec(tb_ticks_per_jiffy);
- rcu_cpu_starting(cpu);
+ rcutree_report_cpu_starting(cpu);
cpu_callin_map[cpu] = 1;

if (smp_ops->setup_cpu)
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index f9a2b755f510..3e39a5e1bf48 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -894,7 +894,7 @@ static void smp_start_secondary(void *cpuvoid)
S390_lowcore.restart_flags = 0;
restore_access_regs(S390_lowcore.access_regs_save_area);
cpu_init();
- rcu_cpu_starting(cpu);
+ rcutree_report_cpu_starting(cpu);
init_cpu_timer();
vtime_init();
vdso_getcpu_init();
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e1aa2cd7734b..d25b952a2b91 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -288,7 +288,7 @@ static void notrace start_secondary(void *unused)

cpu_init();
fpu__init_cpu();
- rcu_cpu_starting(raw_smp_processor_id());
+ rcutree_report_cpu_starting(raw_smp_processor_id());
x86_cpuinit.early_percpu_clock_init();

ap_starting();
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce40b04b..d05e1e9a553c 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -566,7 +566,7 @@ enum
*
* _ RCU:
* 1) rcutree_migrate_callbacks() migrates the queue.
- * 2) rcu_report_dead() reports the final quiescent states.
+ * 2) rcutree_report_cpu_dead() reports the final quiescent states.
*
* _ IRQ_POLL: irq_poll_cpu_dead() migrates the queue
*/
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index aa351ddcbe8d..f7206b2623c9 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -122,8 +122,6 @@ static inline void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
void rcu_init(void);
extern int rcu_scheduler_active;
void rcu_sched_clock_irq(int user);
-void rcu_report_dead(void);
-void rcutree_migrate_callbacks(int cpu);

#ifdef CONFIG_TASKS_RCU_GENERIC
void rcu_init_tasks_generic(void);
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7f17acf29dda..04889a9602e7 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -169,6 +169,6 @@ static inline void rcu_all_qs(void) { barrier(); }
#define rcutree_offline_cpu NULL
#define rcutree_dead_cpu NULL
#define rcutree_dying_cpu NULL
-static inline void rcu_cpu_starting(unsigned int cpu) { }
+static inline void rcutree_report_cpu_starting(unsigned int cpu) { }

#endif /* __LINUX_RCUTINY_H */
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 7d75066c72aa..07d0fc1e0d31 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -109,7 +109,7 @@ void rcu_all_qs(void);
/* RCUtree hotplug events */
int rcutree_prepare_cpu(unsigned int cpu);
int rcutree_online_cpu(unsigned int cpu);
-void rcu_cpu_starting(unsigned int cpu);
+void rcutree_report_cpu_starting(unsigned int cpu);

#ifdef CONFIG_HOTPLUG_CPU
int rcutree_dead_cpu(unsigned int cpu);
@@ -121,4 +121,9 @@ int rcutree_offline_cpu(unsigned int cpu);
#define rcutree_offline_cpu NULL
#endif

+void rcutree_migrate_callbacks(int cpu);
+
+/* Called from hotplug and also arm64 early secondary boot failure */
+void rcutree_report_cpu_dead(void);
+
#endif /* __LINUX_RCUTREE_H */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 86f08eafbd9f..a41a6fff3c91 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1368,10 +1368,10 @@ void cpuhp_report_idle_dead(void)
struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);

BUG_ON(st->state != CPUHP_AP_OFFLINE);
- rcu_report_dead();
+ rcutree_report_cpu_dead();
st->state = CPUHP_AP_IDLE_DEAD;
/*
- * We cannot call complete after rcu_report_dead() so we delegate it
+ * We cannot call complete after rcutree_report_cpu_dead() so we delegate it
* to an online cpu.
*/
smp_call_function_single(cpumask_first(cpu_online_mask),
@@ -1575,7 +1575,7 @@ void notify_cpu_starting(unsigned int cpu)
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);

- rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
+ rcutree_report_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
cpumask_set_cpu(cpu, &cpus_booted_once_mask);

/*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 875f241db508..5698c3f30b1d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4207,7 +4207,7 @@ bool rcu_lockdep_current_cpu_online(void)
rdp = this_cpu_ptr(&rcu_data);
/*
* Strictly, we care here about the case where the current CPU is
- * in rcu_cpu_starting() and thus has an excuse for rdp->grpmask
+ * in rcutree_report_cpu_starting() and thus has an excuse for rdp->grpmask
* not being up to date. So arch_spin_is_locked() might have a
* false positive if it's held by some *other* CPU, but that's
* OK because that just means a false *negative* on the warning.
@@ -4436,8 +4436,10 @@ int rcutree_online_cpu(unsigned int cpu)
* from the incoming CPU rather than from the cpuhp_step mechanism.
* This is because this function must be invoked at a precise location.
* This incoming CPU must not have enabled interrupts yet.
+ *
+ * This mirrors the effects of rcutree_report_cpu_dead().
*/
-void rcu_cpu_starting(unsigned int cpu)
+void rcutree_report_cpu_starting(unsigned int cpu)
{
unsigned long mask;
struct rcu_data *rdp;
@@ -4491,8 +4493,10 @@ void rcu_cpu_starting(unsigned int cpu)
* Note that this function is special in that it is invoked directly
* from the outgoing CPU rather than from the cpuhp_step mechanism.
* This is because this function must be invoked at a precise location.
+ *
+ * This mirrors the effect of rcutree_report_cpu_starting().
*/
-void rcu_report_dead(void)
+void rcutree_report_cpu_dead(void)
{
unsigned long flags;
unsigned long mask;
@@ -5063,7 +5067,7 @@ void __init rcu_init(void)
pm_notifier(rcu_pm_notify, 0);
WARN_ON(num_online_cpus() > 1); // Only one CPU this early in boot.
rcutree_prepare_cpu(cpu);
- rcu_cpu_starting(cpu);
+ rcutree_report_cpu_starting(cpu);
rcutree_online_cpu(cpu);

/* Create workqueue for Tree SRCU and for expedited GPs. */
--
2.41.0

2023-09-08 22:18:24

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 07/10] rcu: Conditionally build CPU-hotplug teardown callbacks

Among the three CPU-hotplug teardown RCU callbacks, two of them early
exit if CONFIG_HOTPLUG_CPU=n, and one is left unchanged. In any case
all of them have an implementation when CONFIG_HOTPLUG_CPU=n.

Align instead with the common way to deal with CPU-hotplug teardown
callbacks and provide a proper stub when they are not supported.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
include/linux/rcutree.h | 11 +++-
kernel/rcu/tree.c | 114 +++++++++++++++++++---------------------
2 files changed, 63 insertions(+), 62 deletions(-)

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index af6ddbd291eb..7d75066c72aa 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -109,9 +109,16 @@ void rcu_all_qs(void);
/* RCUtree hotplug events */
int rcutree_prepare_cpu(unsigned int cpu);
int rcutree_online_cpu(unsigned int cpu);
-int rcutree_offline_cpu(unsigned int cpu);
+void rcu_cpu_starting(unsigned int cpu);
+
+#ifdef CONFIG_HOTPLUG_CPU
int rcutree_dead_cpu(unsigned int cpu);
int rcutree_dying_cpu(unsigned int cpu);
-void rcu_cpu_starting(unsigned int cpu);
+int rcutree_offline_cpu(unsigned int cpu);
+#else
+#define rcutree_dead_cpu NULL
+#define rcutree_dying_cpu NULL
+#define rcutree_offline_cpu NULL
+#endif

#endif /* __LINUX_RCUTREE_H */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 289c51417cbc..875f241db508 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4228,25 +4228,6 @@ static bool rcu_init_invoked(void)
return !!rcu_state.n_online_cpus;
}

-/*
- * Near the end of the offline process. Trace the fact that this CPU
- * is going offline.
- */
-int rcutree_dying_cpu(unsigned int cpu)
-{
- bool blkd;
- struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
- struct rcu_node *rnp = rdp->mynode;
-
- if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
- return 0;
-
- blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
- trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
- blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
- return 0;
-}
-
/*
* All CPUs for the specified rcu_node structure have gone offline,
* and all tasks that were preempted within an RCU read-side critical
@@ -4292,23 +4273,6 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
}
}

-/*
- * The CPU has been completely removed, and some other CPU is reporting
- * this fact from process context. Do the remainder of the cleanup.
- * There can only be one CPU hotplug operation at a time, so no need for
- * explicit locking.
- */
-int rcutree_dead_cpu(unsigned int cpu)
-{
- if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
- return 0;
-
- WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
- // Stop-machine done, so allow nohz_full to disable tick.
- tick_dep_clear(TICK_DEP_BIT_RCU);
- return 0;
-}
-
/*
* Propagate ->qsinitmask bits up the rcu_node tree to account for the
* first CPU in a given leaf rcu_node structure coming online. The caller
@@ -4461,29 +4425,6 @@ int rcutree_online_cpu(unsigned int cpu)
return 0;
}

-/*
- * Near the beginning of the process. The CPU is still very much alive
- * with pretty much all services enabled.
- */
-int rcutree_offline_cpu(unsigned int cpu)
-{
- unsigned long flags;
- struct rcu_data *rdp;
- struct rcu_node *rnp;
-
- rdp = per_cpu_ptr(&rcu_data, cpu);
- rnp = rdp->mynode;
- raw_spin_lock_irqsave_rcu_node(rnp, flags);
- rnp->ffmask &= ~rdp->grpmask;
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-
- rcutree_affinity_setting(cpu, cpu);
-
- // nohz_full CPUs need the tick for stop-machine to work quickly
- tick_dep_set(TICK_DEP_BIT_RCU);
- return 0;
-}
-
/*
* Mark the specified CPU as being online so that subsequent grace periods
* (both expedited and normal) will wait on it. Note that this means that
@@ -4637,7 +4578,60 @@ void rcutree_migrate_callbacks(int cpu)
cpu, rcu_segcblist_n_cbs(&rdp->cblist),
rcu_segcblist_first_cb(&rdp->cblist));
}
-#endif
+
+/*
+ * The CPU has been completely removed, and some other CPU is reporting
+ * this fact from process context. Do the remainder of the cleanup.
+ * There can only be one CPU hotplug operation at a time, so no need for
+ * explicit locking.
+ */
+int rcutree_dead_cpu(unsigned int cpu)
+{
+ WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
+ // Stop-machine done, so allow nohz_full to disable tick.
+ tick_dep_clear(TICK_DEP_BIT_RCU);
+ return 0;
+}
+
+/*
+ * Near the end of the offline process. Trace the fact that this CPU
+ * is going offline.
+ */
+int rcutree_dying_cpu(unsigned int cpu)
+{
+ bool blkd;
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ struct rcu_node *rnp = rdp->mynode;
+
+ blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
+ trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
+ blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
+ return 0;
+}
+
+/*
+ * Near the beginning of the process. The CPU is still very much alive
+ * with pretty much all services enabled.
+ */
+int rcutree_offline_cpu(unsigned int cpu)
+{
+ unsigned long flags;
+ struct rcu_data *rdp;
+ struct rcu_node *rnp;
+
+ rdp = per_cpu_ptr(&rcu_data, cpu);
+ rnp = rdp->mynode;
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
+ rnp->ffmask &= ~rdp->grpmask;
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+
+ rcutree_affinity_setting(cpu, cpu);
+
+ // nohz_full CPUs need the tick for stop-machine to work quickly
+ tick_dep_set(TICK_DEP_BIT_RCU);
+ return 0;
+}
+#endif /* #ifdef CONFIG_HOTPLUG_CPU */

/*
* On non-huge systems, use expedited RCU grace periods to make suspend
--
2.41.0

2023-09-08 22:19:24

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 03/10] rcu/nocb: Remove needless LOAD-ACQUIRE

The LOAD-ACQUIRE access performed on rdp->nocb_cb_sleep advertizes
ordering callback execution against grace period completion. However
this is contradicted by the following:

* This LOAD-ACQUIRE doesn't pair with anything. The only counterpart
barrier that can be found is the smp_mb() placed after callbacks
advancing in nocb_gp_wait(). However the barrier is placed _after_
->nocb_cb_sleep write.

* Callbacks can be concurrently advanced between the LOAD-ACQUIRE on
->nocb_cb_sleep and the call to rcu_segcblist_extract_done_cbs() in
rcu_do_batch(), making any ordering based on ->nocb_cb_sleep broken.

* Both rcu_segcblist_extract_done_cbs() and rcu_advance_cbs() are called
under the nocb_lock, the latter hereby providing already the desired
ACQUIRE semantics.

Therefore it is safe to access ->nocb_cb_sleep with a simple compiler
barrier.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/rcu/tree_nocb.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index b9eab359c597..6e63ba4788e1 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -933,8 +933,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
nocb_cb_wait_cond(rdp));

- // VVV Ensure CB invocation follows _sleep test.
- if (smp_load_acquire(&rdp->nocb_cb_sleep)) { // ^^^
+ if (READ_ONCE(rdp->nocb_cb_sleep)) {
WARN_ON(signal_pending(current));
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
}
--
2.41.0

2023-09-08 22:20:22

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 06/10] rcu: Assume rcu_report_dead() is always called locally

rcu_report_dead() has to be called locally by the CPU that is going to
exit the RCU state machine. Passing a cpu argument here is error-prone
and leaves the possibility for a racy remote call.

Use local access instead.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
arch/arm64/kernel/smp.c | 2 +-
include/linux/rcupdate.h | 2 +-
kernel/cpu.c | 2 +-
kernel/rcu/tree.c | 4 ++--
4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index edd63894d61e..ce672cb69f1c 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -401,7 +401,7 @@ void __noreturn cpu_die_early(void)

/* Mark this CPU absent */
set_cpu_present(cpu, 0);
- rcu_report_dead(cpu);
+ rcu_report_dead();

if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
update_cpu_boot_status(CPU_KILL_ME);
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 5e5f920ade90..aa351ddcbe8d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -122,7 +122,7 @@ static inline void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
void rcu_init(void);
extern int rcu_scheduler_active;
void rcu_sched_clock_irq(int user);
-void rcu_report_dead(unsigned int cpu);
+void rcu_report_dead(void);
void rcutree_migrate_callbacks(int cpu);

#ifdef CONFIG_TASKS_RCU_GENERIC
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 88a7ede322bd..86f08eafbd9f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1368,7 +1368,7 @@ void cpuhp_report_idle_dead(void)
struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);

BUG_ON(st->state != CPUHP_AP_OFFLINE);
- rcu_report_dead(smp_processor_id());
+ rcu_report_dead();
st->state = CPUHP_AP_IDLE_DEAD;
/*
* We cannot call complete after rcu_report_dead() so we delegate it
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8b5ebef32e17..289c51417cbc 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4551,11 +4551,11 @@ void rcu_cpu_starting(unsigned int cpu)
* from the outgoing CPU rather than from the cpuhp_step mechanism.
* This is because this function must be invoked at a precise location.
*/
-void rcu_report_dead(unsigned int cpu)
+void rcu_report_dead(void)
{
unsigned long flags;
unsigned long mask;
- struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */

/*
--
2.41.0

2023-09-08 22:28:26

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 10/10] rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP

The callbacks migration is performed through an explicit call from
the hotplug control CPU right after the death of the target CPU and
before proceeding with the CPUHP_ teardown functions.

This is unusual but necessary and yet uncommented. Summarize the reason
as explained in the changelog of:

a58163d8ca2c (rcu: Migrate callbacks earlier in the CPU-offline timeline)

Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/cpu.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index a41a6fff3c91..b135bb481be1 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1352,7 +1352,14 @@ static int takedown_cpu(unsigned int cpu)
cpuhp_bp_sync_dead(cpu);

tick_cleanup_dead_cpu(cpu);
+
+ /*
+ * Callbacks must be re-integrated right away to the RCU state machine.
+ * Otherwise an RCU callback could block a further teardown function
+ * waiting for its completion.
+ */
rcutree_migrate_callbacks(cpu);
+
return 0;
}

--
2.41.0

2023-09-08 22:36:18

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 04/10] rcu/nocb: Remove needless full barrier after callback advancing

A full barrier is issued from nocb_gp_wait() upon callbacks advancing
to order grace period completion with callbacks execution.

However these two events are already ordered by the
smp_mb__after_unlock_lock() barrier within the call to
raw_spin_lock_rcu_node() that is necessary for callbacks advancing to
happen.

The following litmus test shows the kind of guarantee that this barrier
provides:

C smp_mb__after_unlock_lock

{}

// rcu_gp_cleanup()
P0(spinlock_t *rnp_lock, int *gpnum)
{
// Grace period cleanup increase gp sequence number
spin_lock(rnp_lock);
WRITE_ONCE(*gpnum, 1);
spin_unlock(rnp_lock);
}

// nocb_gp_wait()
P1(spinlock_t *rnp_lock, spinlock_t *nocb_lock, int *gpnum, int *cb_ready)
{
int r1;

// Call rcu_advance_cbs() from nocb_gp_wait()
spin_lock(nocb_lock);
spin_lock(rnp_lock);
smp_mb__after_unlock_lock();
r1 = READ_ONCE(*gpnum);
WRITE_ONCE(*cb_ready, 1);
spin_unlock(rnp_lock);
spin_unlock(nocb_lock);
}

// nocb_cb_wait()
P2(spinlock_t *nocb_lock, int *cb_ready, int *cb_executed)
{
int r2;

// rcu_do_batch() -> rcu_segcblist_extract_done_cbs()
spin_lock(nocb_lock);
r2 = READ_ONCE(*cb_ready);
spin_unlock(nocb_lock);

// Actual callback execution
WRITE_ONCE(*cb_executed, 1);
}

P3(int *cb_executed, int *gpnum)
{
int r3;

WRITE_ONCE(*cb_executed, 2);
smp_mb();
r3 = READ_ONCE(*gpnum);
}

exists (1:r1=1 /\ 2:r2=1 /\ cb_executed=2 /\ 3:r3=0) (* Bad outcome. *)

Here the bad outcome only occurs if the smp_mb__after_unlock_lock() is
removed. This barrier orders the grace period completion against
callbacks advancing and even later callbacks invocation, thanks to the
opportunistic propagation via the ->nocb_lock to nocb_cb_wait().

Therefore the smp_mb() placed after callbacks advancing can be safely
removed.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/rcu/tree_nocb.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 6e63ba4788e1..2dc76f5e6e78 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -779,7 +779,6 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
if (rcu_segcblist_ready_cbs(&rdp->cblist)) {
needwake = rdp->nocb_cb_sleep;
WRITE_ONCE(rdp->nocb_cb_sleep, false);
- smp_mb(); /* CB invocation -after- GP end. */
} else {
needwake = false;
}
--
2.41.0

2023-09-08 22:43:59

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 01/10] rcu: Use rcu_segcblist_segempty() instead of open coding it

This makes the code more readable.

Reviewed-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/rcu/rcu_segcblist.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index f71fac422c8f..1693ea22ef1b 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -368,7 +368,7 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
smp_mb(); /* Ensure counts are updated before callback is entrained. */
rhp->next = NULL;
for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
- if (rsclp->tails[i] != rsclp->tails[i - 1])
+ if (!rcu_segcblist_segempty(rsclp, i))
break;
rcu_segcblist_inc_seglen(rsclp, i);
WRITE_ONCE(*rsclp->tails[i], rhp);
@@ -551,7 +551,7 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
* as their ->gp_seq[] grace-period completion sequence number.
*/
for (i = RCU_NEXT_READY_TAIL; i > RCU_DONE_TAIL; i--)
- if (rsclp->tails[i] != rsclp->tails[i - 1] &&
+ if (!rcu_segcblist_segempty(rsclp, i) &&
ULONG_CMP_LT(rsclp->gp_seq[i], seq))
break;

--
2.41.0

2023-09-08 22:50:27

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 05/10] rcu: Assume IRQS disabled from rcu_report_dead()

rcu_report_dead() is the last RCU word from the CPU down through the
hotplug path. It is called in the idle loop right before the CPU shuts
down for good. Because it removes the CPU from the grace period state
machine and reports an ultimate quiescent state if necessary, no further
use of RCU is allowed. Therefore it is expected that IRQs are disabled
upon calling this function and are not to be re-enabled again until the
CPU shuts down.

Remove the IRQs disablement from that function and verify instead that
it is actually called with IRQs disabled as it is expected at that
special point in the idle path.

Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/rcu/tree.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a83ecab77917..8b5ebef32e17 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4553,11 +4553,16 @@ void rcu_cpu_starting(unsigned int cpu)
*/
void rcu_report_dead(unsigned int cpu)
{
- unsigned long flags, seq_flags;
+ unsigned long flags;
unsigned long mask;
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */

+ /*
+ * IRQS must be disabled from now on and until the CPU dies, or an interrupt
+ * may introduce a new READ-side while it is actually off the QS masks.
+ */
+ lockdep_assert_irqs_disabled();
// Do any dangling deferred wakeups.
do_nocb_deferred_wakeup(rdp);

@@ -4565,7 +4570,6 @@ void rcu_report_dead(unsigned int cpu)

/* Remove outgoing CPU from mask in the leaf rcu_node structure. */
mask = rdp->grpmask;
- local_irq_save(seq_flags);
arch_spin_lock(&rcu_state.ofl_lock);
raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
@@ -4579,8 +4583,6 @@ void rcu_report_dead(unsigned int cpu)
WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
arch_spin_unlock(&rcu_state.ofl_lock);
- local_irq_restore(seq_flags);
-
rdp->cpu_started = false;
}

--
2.41.0

2023-09-08 23:02:18

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 09/10] rcu: Remove references to rcu_migrate_callbacks() from diagrams

This function is gone since:

53b46303da84 (rcu: Remove rsp parameter from rcu_boot_init_percpu_data() and friends)

Signed-off-by: Frederic Weisbecker <[email protected]>
---
.../Design/Memory-Ordering/TreeRCU-callback-registry.svg | 9 ---------
Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 9 ---------
2 files changed, 18 deletions(-)

diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
index 7ac6f9269806..63eff867175a 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
@@ -564,15 +564,6 @@
font-size="192"
id="text202-7-9-6"
style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_migrate_callbacks()</text>
- <text
- xml:space="preserve"
- x="8335.4873"
- y="5357.1006"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-7-9-6-0"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_migrate_callbacks()</text>
<text
xml:space="preserve"
x="8768.4678"
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
index 6e690a3be161..53e0dc2a2c79 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
@@ -1446,15 +1446,6 @@
font-size="192"
id="text202-7-9-6"
style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_migrate_callbacks()</text>
- <text
- xml:space="preserve"
- x="8335.4873"
- y="5357.1006"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-7-9-6-0"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_migrate_callbacks()</text>
<text
xml:space="preserve"
x="8768.4678"
--
2.41.0

2023-10-02 15:50:57

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 01/10] rcu: Use rcu_segcblist_segempty() instead of open coding it

On Fri, Sep 08, 2023 at 10:35:54PM +0200, Frederic Weisbecker wrote:
> This makes the code more readable.
>
> Reviewed-by: Qiuxu Zhuo <[email protected]>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> kernel/rcu/rcu_segcblist.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index f71fac422c8f..1693ea22ef1b 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -368,7 +368,7 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> smp_mb(); /* Ensure counts are updated before callback is entrained. */
> rhp->next = NULL;
> for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
> - if (rsclp->tails[i] != rsclp->tails[i - 1])
> + if (!rcu_segcblist_segempty(rsclp, i))
> break;
> rcu_segcblist_inc_seglen(rsclp, i);
> WRITE_ONCE(*rsclp->tails[i], rhp);
> @@ -551,7 +551,7 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
> * as their ->gp_seq[] grace-period completion sequence number.
> */
> for (i = RCU_NEXT_READY_TAIL; i > RCU_DONE_TAIL; i--)
> - if (rsclp->tails[i] != rsclp->tails[i - 1] &&
> + if (!rcu_segcblist_segempty(rsclp, i) &&
> ULONG_CMP_LT(rsclp->gp_seq[i], seq))
> break;
>
> --
> 2.41.0
>

2023-10-02 16:02:43

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 08/10] rcu: Standardize explicit CPU-hotplug calls

On Fri, Sep 08, 2023 at 10:36:01PM +0200, Frederic Weisbecker wrote:
> rcu_report_dead() and rcutree_migrate_callbacks() have their headers in
> rcupdate.h while those are pure rcutree calls, like the other CPU-hotplug
> functions.
>
> Also rcu_cpu_starting() and rcu_report_dead() have different naming
> conventions while they mirror each other's effects.
>
> Fix the headers and propose a naming that relates both functions and
> aligns with the prefix of other rcutree CPU-hotplug functions.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> .../Expedited-Grace-Periods.rst | 2 +-
> .../RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg | 4 ++--
> .../RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 4 ++--
> .../RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg | 4 ++--
> .../RCU/Design/Requirements/Requirements.rst | 4 ++--
> arch/arm64/kernel/smp.c | 4 ++--
> arch/powerpc/kernel/smp.c | 2 +-
> arch/s390/kernel/smp.c | 2 +-
> arch/x86/kernel/smpboot.c | 2 +-
> include/linux/interrupt.h | 2 +-
> include/linux/rcupdate.h | 2 --
> include/linux/rcutiny.h | 2 +-
> include/linux/rcutree.h | 7 ++++++-
> kernel/cpu.c | 6 +++---
> kernel/rcu/tree.c | 12 ++++++++----
> 15 files changed, 33 insertions(+), 26 deletions(-)
>
> diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
> index 93d899d53258..414f8a2012d6 100644
> --- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
> +++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
> @@ -181,7 +181,7 @@ operations is carried out at several levels:
> of this wait (or series of waits, as the case may be) is to permit a
> concurrent CPU-hotplug operation to complete.
> #. In the case of RCU-sched, one of the last acts of an outgoing CPU is
> - to invoke ``rcu_report_dead()``, which reports a quiescent state for
> + to invoke ``rcutree_report_cpu_dead()``, which reports a quiescent state for
> that CPU. However, this is likely paranoia-induced redundancy.
>
> +-----------------------------------------------------------------------+
> diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
> index 7ddc094d7f28..d82a77d03d8c 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
> +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
> @@ -1135,7 +1135,7 @@
> font-weight="bold"
> font-size="192"
> id="text202-7-5-3-27-6-5"
> - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
> <text
> xml:space="preserve"
> x="3745.7725"
> @@ -1256,7 +1256,7 @@
> font-style="normal"
> y="3679.27"
> x="-3804.9949"
> - xml:space="preserve">rcu_cpu_starting()</text>
> + xml:space="preserve">rcutree_report_cpu_starting()</text>
> <g
> style="fill:none;stroke-width:0.025in"
> id="g3107-7-5-0"
> diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> index 069f6f8371c2..6e690a3be161 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> @@ -3274,7 +3274,7 @@
> font-weight="bold"
> font-size="192"
> id="text202-7-5-3-27-6-5"
> - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
> <text
> xml:space="preserve"
> x="3745.7725"
> @@ -3395,7 +3395,7 @@
> font-style="normal"
> y="3679.27"
> x="-3804.9949"
> - xml:space="preserve">rcu_cpu_starting()</text>
> + xml:space="preserve">rcutree_report_cpu_starting()</text>
> <g
> style="fill:none;stroke-width:0.025in"
> id="g3107-7-5-0"
> diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
> index 2c9310ba29ba..4fa7506082bf 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
> +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
> @@ -607,7 +607,7 @@
> font-weight="bold"
> font-size="192"
> id="text202-7-5-3-27-6"
> - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_report_dead()</text>
> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_report_cpu_dead()</text>
> <text
> xml:space="preserve"
> x="3745.7725"
> @@ -728,7 +728,7 @@
> font-style="normal"
> y="3679.27"
> x="-3804.9949"
> - xml:space="preserve">rcu_cpu_starting()</text>
> + xml:space="preserve">rcutree_report_cpu_starting()</text>
> <g
> style="fill:none;stroke-width:0.025in"
> id="g3107-7-5-0"
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index f3b605285a87..cccafdaa1f84 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1955,12 +1955,12 @@ if offline CPUs block an RCU grace period for too long.
>
> An offline CPU's quiescent state will be reported either:
>
> -1. As the CPU goes offline using RCU's hotplug notifier (rcu_report_dead()).
> +1. As the CPU goes offline using RCU's hotplug notifier (rcutree_report_cpu_dead()).
> 2. When grace period initialization (rcu_gp_init()) detects a
> race either with CPU offlining or with a task unblocking on a leaf
> ``rcu_node`` structure whose CPUs are all offline.
>
> -The CPU-online path (rcu_cpu_starting()) should never need to report
> +The CPU-online path (rcutree_report_cpu_starting()) should never need to report
> a quiescent state for an offline CPU. However, as a debugging measure,
> it does emit a warning if a quiescent state was not already reported
> for that CPU.
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index ce672cb69f1c..90c0083438ea 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -215,7 +215,7 @@ asmlinkage notrace void secondary_start_kernel(void)
> if (system_uses_irq_prio_masking())
> init_gic_priority_masking();
>
> - rcu_cpu_starting(cpu);
> + rcutree_report_cpu_starting(cpu);
> trace_hardirqs_off();
>
> /*
> @@ -401,7 +401,7 @@ void __noreturn cpu_die_early(void)
>
> /* Mark this CPU absent */
> set_cpu_present(cpu, 0);
> - rcu_report_dead();
> + rcutree_report_cpu_dead();
>
> if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
> update_cpu_boot_status(CPU_KILL_ME);
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index fbbb695bae3d..a132e3290e98 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1619,7 +1619,7 @@ void start_secondary(void *unused)
>
> smp_store_cpu_info(cpu);
> set_dec(tb_ticks_per_jiffy);
> - rcu_cpu_starting(cpu);
> + rcutree_report_cpu_starting(cpu);
> cpu_callin_map[cpu] = 1;
>
> if (smp_ops->setup_cpu)
> diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
> index f9a2b755f510..3e39a5e1bf48 100644
> --- a/arch/s390/kernel/smp.c
> +++ b/arch/s390/kernel/smp.c
> @@ -894,7 +894,7 @@ static void smp_start_secondary(void *cpuvoid)
> S390_lowcore.restart_flags = 0;
> restore_access_regs(S390_lowcore.access_regs_save_area);
> cpu_init();
> - rcu_cpu_starting(cpu);
> + rcutree_report_cpu_starting(cpu);
> init_cpu_timer();
> vtime_init();
> vdso_getcpu_init();
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index e1aa2cd7734b..d25b952a2b91 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -288,7 +288,7 @@ static void notrace start_secondary(void *unused)
>
> cpu_init();
> fpu__init_cpu();
> - rcu_cpu_starting(raw_smp_processor_id());
> + rcutree_report_cpu_starting(raw_smp_processor_id());
> x86_cpuinit.early_percpu_clock_init();
>
> ap_starting();
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index a92bce40b04b..d05e1e9a553c 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -566,7 +566,7 @@ enum
> *
> * _ RCU:
> * 1) rcutree_migrate_callbacks() migrates the queue.
> - * 2) rcu_report_dead() reports the final quiescent states.
> + * 2) rcutree_report_cpu_dead() reports the final quiescent states.
> *
> * _ IRQ_POLL: irq_poll_cpu_dead() migrates the queue
> */
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index aa351ddcbe8d..f7206b2623c9 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -122,8 +122,6 @@ static inline void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
> void rcu_init(void);
> extern int rcu_scheduler_active;
> void rcu_sched_clock_irq(int user);
> -void rcu_report_dead(void);
> -void rcutree_migrate_callbacks(int cpu);
>
> #ifdef CONFIG_TASKS_RCU_GENERIC
> void rcu_init_tasks_generic(void);
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 7f17acf29dda..04889a9602e7 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -169,6 +169,6 @@ static inline void rcu_all_qs(void) { barrier(); }
> #define rcutree_offline_cpu NULL
> #define rcutree_dead_cpu NULL
> #define rcutree_dying_cpu NULL
> -static inline void rcu_cpu_starting(unsigned int cpu) { }
> +static inline void rcutree_report_cpu_starting(unsigned int cpu) { }
>
> #endif /* __LINUX_RCUTINY_H */
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 7d75066c72aa..07d0fc1e0d31 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -109,7 +109,7 @@ void rcu_all_qs(void);
> /* RCUtree hotplug events */
> int rcutree_prepare_cpu(unsigned int cpu);
> int rcutree_online_cpu(unsigned int cpu);
> -void rcu_cpu_starting(unsigned int cpu);
> +void rcutree_report_cpu_starting(unsigned int cpu);
>
> #ifdef CONFIG_HOTPLUG_CPU
> int rcutree_dead_cpu(unsigned int cpu);
> @@ -121,4 +121,9 @@ int rcutree_offline_cpu(unsigned int cpu);
> #define rcutree_offline_cpu NULL
> #endif
>
> +void rcutree_migrate_callbacks(int cpu);
> +
> +/* Called from hotplug and also arm64 early secondary boot failure */
> +void rcutree_report_cpu_dead(void);
> +
> #endif /* __LINUX_RCUTREE_H */
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 86f08eafbd9f..a41a6fff3c91 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1368,10 +1368,10 @@ void cpuhp_report_idle_dead(void)
> struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
>
> BUG_ON(st->state != CPUHP_AP_OFFLINE);
> - rcu_report_dead();
> + rcutree_report_cpu_dead();
> st->state = CPUHP_AP_IDLE_DEAD;
> /*
> - * We cannot call complete after rcu_report_dead() so we delegate it
> + * We cannot call complete after rcutree_report_cpu_dead() so we delegate it
> * to an online cpu.
> */
> smp_call_function_single(cpumask_first(cpu_online_mask),
> @@ -1575,7 +1575,7 @@ void notify_cpu_starting(unsigned int cpu)
> struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
> enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
>
> - rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
> + rcutree_report_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
> cpumask_set_cpu(cpu, &cpus_booted_once_mask);
>
> /*
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 875f241db508..5698c3f30b1d 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4207,7 +4207,7 @@ bool rcu_lockdep_current_cpu_online(void)
> rdp = this_cpu_ptr(&rcu_data);
> /*
> * Strictly, we care here about the case where the current CPU is
> - * in rcu_cpu_starting() and thus has an excuse for rdp->grpmask
> + * in rcutree_report_cpu_starting() and thus has an excuse for rdp->grpmask
> * not being up to date. So arch_spin_is_locked() might have a
> * false positive if it's held by some *other* CPU, but that's
> * OK because that just means a false *negative* on the warning.
> @@ -4436,8 +4436,10 @@ int rcutree_online_cpu(unsigned int cpu)
> * from the incoming CPU rather than from the cpuhp_step mechanism.
> * This is because this function must be invoked at a precise location.
> * This incoming CPU must not have enabled interrupts yet.
> + *
> + * This mirrors the effects of rcutree_report_cpu_dead().
> */
> -void rcu_cpu_starting(unsigned int cpu)
> +void rcutree_report_cpu_starting(unsigned int cpu)
> {
> unsigned long mask;
> struct rcu_data *rdp;
> @@ -4491,8 +4493,10 @@ void rcu_cpu_starting(unsigned int cpu)
> * Note that this function is special in that it is invoked directly
> * from the outgoing CPU rather than from the cpuhp_step mechanism.
> * This is because this function must be invoked at a precise location.
> + *
> + * This mirrors the effect of rcutree_report_cpu_starting().
> */
> -void rcu_report_dead(void)
> +void rcutree_report_cpu_dead(void)
> {
> unsigned long flags;
> unsigned long mask;
> @@ -5063,7 +5067,7 @@ void __init rcu_init(void)
> pm_notifier(rcu_pm_notify, 0);
> WARN_ON(num_online_cpus() > 1); // Only one CPU this early in boot.
> rcutree_prepare_cpu(cpu);
> - rcu_cpu_starting(cpu);
> + rcutree_report_cpu_starting(cpu);
> rcutree_online_cpu(cpu);
>
> /* Create workqueue for Tree SRCU and for expedited GPs. */
> --
> 2.41.0
>

2023-10-02 16:07:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 05/10] rcu: Assume IRQS disabled from rcu_report_dead()

On Fri, Sep 08, 2023 at 10:35:58PM +0200, Frederic Weisbecker wrote:
> rcu_report_dead() is the last RCU word from the CPU down through the
> hotplug path. It is called in the idle loop right before the CPU shuts
> down for good. Because it removes the CPU from the grace period state
> machine and reports an ultimate quiescent state if necessary, no further
> use of RCU is allowed. Therefore it is expected that IRQs are disabled
> upon calling this function and are not to be re-enabled again until the
> CPU shuts down.
>
> Remove the IRQs disablement from that function and verify instead that
> it is actually called with IRQs disabled as it is expected at that
> special point in the idle path.
>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> kernel/rcu/tree.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a83ecab77917..8b5ebef32e17 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4553,11 +4553,16 @@ void rcu_cpu_starting(unsigned int cpu)
> */
> void rcu_report_dead(unsigned int cpu)
> {
> - unsigned long flags, seq_flags;
> + unsigned long flags;
> unsigned long mask;
> struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
>
> + /*
> + * IRQS must be disabled from now on and until the CPU dies, or an interrupt
> + * may introduce a new READ-side while it is actually off the QS masks.
> + */
> + lockdep_assert_irqs_disabled();
> // Do any dangling deferred wakeups.
> do_nocb_deferred_wakeup(rdp);
>
> @@ -4565,7 +4570,6 @@ void rcu_report_dead(unsigned int cpu)
>
> /* Remove outgoing CPU from mask in the leaf rcu_node structure. */
> mask = rdp->grpmask;
> - local_irq_save(seq_flags);
> arch_spin_lock(&rcu_state.ofl_lock);
> raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
> rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
> @@ -4579,8 +4583,6 @@ void rcu_report_dead(unsigned int cpu)
> WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask);
> raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> arch_spin_unlock(&rcu_state.ofl_lock);
> - local_irq_restore(seq_flags);
> -
> rdp->cpu_started = false;
> }
>
> --
> 2.41.0
>

2023-10-02 16:10:15

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 09/10] rcu: Remove references to rcu_migrate_callbacks() from diagrams

On Fri, Sep 08, 2023 at 10:36:02PM +0200, Frederic Weisbecker wrote:
> This function is gone since:
>
> 53b46303da84 (rcu: Remove rsp parameter from rcu_boot_init_percpu_data() and friends)
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> .../Design/Memory-Ordering/TreeRCU-callback-registry.svg | 9 ---------
> Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 9 ---------
> 2 files changed, 18 deletions(-)
>
> diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
> index 7ac6f9269806..63eff867175a 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
> +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-callback-registry.svg
> @@ -564,15 +564,6 @@
> font-size="192"
> id="text202-7-9-6"
> style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_migrate_callbacks()</text>
> - <text
> - xml:space="preserve"
> - x="8335.4873"
> - y="5357.1006"
> - font-style="normal"
> - font-weight="bold"
> - font-size="192"
> - id="text202-7-9-6-0"
> - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_migrate_callbacks()</text>
> <text
> xml:space="preserve"
> x="8768.4678"
> diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> index 6e690a3be161..53e0dc2a2c79 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
> @@ -1446,15 +1446,6 @@
> font-size="192"
> id="text202-7-9-6"
> style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcutree_migrate_callbacks()</text>
> - <text
> - xml:space="preserve"
> - x="8335.4873"
> - y="5357.1006"
> - font-style="normal"
> - font-weight="bold"
> - font-size="192"
> - id="text202-7-9-6-0"
> - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_migrate_callbacks()</text>
> <text
> xml:space="preserve"
> x="8768.4678"
> --
> 2.41.0
>

2023-10-02 16:10:44

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 10/10] rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP

On Fri, Sep 08, 2023 at 10:36:03PM +0200, Frederic Weisbecker wrote:
> The callbacks migration is performed through an explicit call from
> the hotplug control CPU right after the death of the target CPU and
> before proceeding with the CPUHP_ teardown functions.
>
> This is unusual but necessary and yet uncommented. Summarize the reason
> as explained in the changelog of:
>
> a58163d8ca2c (rcu: Migrate callbacks earlier in the CPU-offline timeline)
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> kernel/cpu.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a41a6fff3c91..b135bb481be1 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1352,7 +1352,14 @@ static int takedown_cpu(unsigned int cpu)
> cpuhp_bp_sync_dead(cpu);
>
> tick_cleanup_dead_cpu(cpu);
> +
> + /*
> + * Callbacks must be re-integrated right away to the RCU state machine.
> + * Otherwise an RCU callback could block a further teardown function
> + * waiting for its completion.
> + */
> rcutree_migrate_callbacks(cpu);
> +
> return 0;
> }
>
> --
> 2.41.0
>

2023-10-02 16:45:37

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 06/10] rcu: Assume rcu_report_dead() is always called locally

On Fri, Sep 08, 2023 at 10:35:59PM +0200, Frederic Weisbecker wrote:
> rcu_report_dead() has to be called locally by the CPU that is going to
> exit the RCU state machine. Passing a cpu argument here is error-prone
> and leaves the possibility for a racy remote call.
>
> Use local access instead.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

I was going to ask for an assertion for "cpu" in cpu_die_early(), but
given that its value comes from smp_processor_id() just a few lines
earlier, there isn't a whole lot of point to that. So:

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> arch/arm64/kernel/smp.c | 2 +-
> include/linux/rcupdate.h | 2 +-
> kernel/cpu.c | 2 +-
> kernel/rcu/tree.c | 4 ++--
> 4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index edd63894d61e..ce672cb69f1c 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -401,7 +401,7 @@ void __noreturn cpu_die_early(void)
>
> /* Mark this CPU absent */
> set_cpu_present(cpu, 0);
> - rcu_report_dead(cpu);
> + rcu_report_dead();
>
> if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
> update_cpu_boot_status(CPU_KILL_ME);
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 5e5f920ade90..aa351ddcbe8d 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -122,7 +122,7 @@ static inline void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
> void rcu_init(void);
> extern int rcu_scheduler_active;
> void rcu_sched_clock_irq(int user);
> -void rcu_report_dead(unsigned int cpu);
> +void rcu_report_dead(void);
> void rcutree_migrate_callbacks(int cpu);
>
> #ifdef CONFIG_TASKS_RCU_GENERIC
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 88a7ede322bd..86f08eafbd9f 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1368,7 +1368,7 @@ void cpuhp_report_idle_dead(void)
> struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
>
> BUG_ON(st->state != CPUHP_AP_OFFLINE);
> - rcu_report_dead(smp_processor_id());
> + rcu_report_dead();
> st->state = CPUHP_AP_IDLE_DEAD;
> /*
> * We cannot call complete after rcu_report_dead() so we delegate it
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8b5ebef32e17..289c51417cbc 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4551,11 +4551,11 @@ void rcu_cpu_starting(unsigned int cpu)
> * from the outgoing CPU rather than from the cpuhp_step mechanism.
> * This is because this function must be invoked at a precise location.
> */
> -void rcu_report_dead(unsigned int cpu)
> +void rcu_report_dead(void)
> {
> unsigned long flags;
> unsigned long mask;
> - struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> + struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
>
> /*
> --
> 2.41.0
>

2023-10-04 16:58:01

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 07/10] rcu: Conditionally build CPU-hotplug teardown callbacks

On Fri, Sep 08, 2023 at 10:36:00PM +0200, Frederic Weisbecker wrote:
> Among the three CPU-hotplug teardown RCU callbacks, two of them early
> exit if CONFIG_HOTPLUG_CPU=n, and one is left unchanged. In any case
> all of them have an implementation when CONFIG_HOTPLUG_CPU=n.
>
> Align instead with the common way to deal with CPU-hotplug teardown
> callbacks and provide a proper stub when they are not supported.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Good eyes!

Reviewed-by: Paul E. McKenney <[email protected]>

> ---
> include/linux/rcutree.h | 11 +++-
> kernel/rcu/tree.c | 114 +++++++++++++++++++---------------------
> 2 files changed, 63 insertions(+), 62 deletions(-)
>
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index af6ddbd291eb..7d75066c72aa 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -109,9 +109,16 @@ void rcu_all_qs(void);
> /* RCUtree hotplug events */
> int rcutree_prepare_cpu(unsigned int cpu);
> int rcutree_online_cpu(unsigned int cpu);
> -int rcutree_offline_cpu(unsigned int cpu);
> +void rcu_cpu_starting(unsigned int cpu);
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> int rcutree_dead_cpu(unsigned int cpu);
> int rcutree_dying_cpu(unsigned int cpu);
> -void rcu_cpu_starting(unsigned int cpu);
> +int rcutree_offline_cpu(unsigned int cpu);
> +#else
> +#define rcutree_dead_cpu NULL
> +#define rcutree_dying_cpu NULL
> +#define rcutree_offline_cpu NULL
> +#endif
>
> #endif /* __LINUX_RCUTREE_H */
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 289c51417cbc..875f241db508 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4228,25 +4228,6 @@ static bool rcu_init_invoked(void)
> return !!rcu_state.n_online_cpus;
> }
>
> -/*
> - * Near the end of the offline process. Trace the fact that this CPU
> - * is going offline.
> - */
> -int rcutree_dying_cpu(unsigned int cpu)
> -{
> - bool blkd;
> - struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> - struct rcu_node *rnp = rdp->mynode;
> -
> - if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> - return 0;
> -
> - blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
> - trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
> - blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
> - return 0;
> -}
> -
> /*
> * All CPUs for the specified rcu_node structure have gone offline,
> * and all tasks that were preempted within an RCU read-side critical
> @@ -4292,23 +4273,6 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
> }
> }
>
> -/*
> - * The CPU has been completely removed, and some other CPU is reporting
> - * this fact from process context. Do the remainder of the cleanup.
> - * There can only be one CPU hotplug operation at a time, so no need for
> - * explicit locking.
> - */
> -int rcutree_dead_cpu(unsigned int cpu)
> -{
> - if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> - return 0;
> -
> - WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> - // Stop-machine done, so allow nohz_full to disable tick.
> - tick_dep_clear(TICK_DEP_BIT_RCU);
> - return 0;
> -}
> -
> /*
> * Propagate ->qsinitmask bits up the rcu_node tree to account for the
> * first CPU in a given leaf rcu_node structure coming online. The caller
> @@ -4461,29 +4425,6 @@ int rcutree_online_cpu(unsigned int cpu)
> return 0;
> }
>
> -/*
> - * Near the beginning of the process. The CPU is still very much alive
> - * with pretty much all services enabled.
> - */
> -int rcutree_offline_cpu(unsigned int cpu)
> -{
> - unsigned long flags;
> - struct rcu_data *rdp;
> - struct rcu_node *rnp;
> -
> - rdp = per_cpu_ptr(&rcu_data, cpu);
> - rnp = rdp->mynode;
> - raw_spin_lock_irqsave_rcu_node(rnp, flags);
> - rnp->ffmask &= ~rdp->grpmask;
> - raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> -
> - rcutree_affinity_setting(cpu, cpu);
> -
> - // nohz_full CPUs need the tick for stop-machine to work quickly
> - tick_dep_set(TICK_DEP_BIT_RCU);
> - return 0;
> -}
> -
> /*
> * Mark the specified CPU as being online so that subsequent grace periods
> * (both expedited and normal) will wait on it. Note that this means that
> @@ -4637,7 +4578,60 @@ void rcutree_migrate_callbacks(int cpu)
> cpu, rcu_segcblist_n_cbs(&rdp->cblist),
> rcu_segcblist_first_cb(&rdp->cblist));
> }
> -#endif
> +
> +/*
> + * The CPU has been completely removed, and some other CPU is reporting
> + * this fact from process context. Do the remainder of the cleanup.
> + * There can only be one CPU hotplug operation at a time, so no need for
> + * explicit locking.
> + */
> +int rcutree_dead_cpu(unsigned int cpu)
> +{
> + WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> + // Stop-machine done, so allow nohz_full to disable tick.
> + tick_dep_clear(TICK_DEP_BIT_RCU);
> + return 0;
> +}
> +
> +/*
> + * Near the end of the offline process. Trace the fact that this CPU
> + * is going offline.
> + */
> +int rcutree_dying_cpu(unsigned int cpu)
> +{
> + bool blkd;
> + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> + struct rcu_node *rnp = rdp->mynode;
> +
> + blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
> + trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
> + blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
> + return 0;
> +}
> +
> +/*
> + * Near the beginning of the process. The CPU is still very much alive
> + * with pretty much all services enabled.
> + */
> +int rcutree_offline_cpu(unsigned int cpu)
> +{
> + unsigned long flags;
> + struct rcu_data *rdp;
> + struct rcu_node *rnp;
> +
> + rdp = per_cpu_ptr(&rcu_data, cpu);
> + rnp = rdp->mynode;
> + raw_spin_lock_irqsave_rcu_node(rnp, flags);
> + rnp->ffmask &= ~rdp->grpmask;
> + raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> +
> + rcutree_affinity_setting(cpu, cpu);
> +
> + // nohz_full CPUs need the tick for stop-machine to work quickly
> + tick_dep_set(TICK_DEP_BIT_RCU);
> + return 0;
> +}
> +#endif /* #ifdef CONFIG_HOTPLUG_CPU */
>
> /*
> * On non-huge systems, use expedited RCU grace periods to make suspend
> --
> 2.41.0
>