2023-03-25 17:34:38

by Joel Fernandes

Subject: [PATCH v2 00/13] Core RCU patches for 6.4

Hello,

These are core RCU patches for 6.4. I am resending because there are a few
more patches with MAINTAINERS file changes, plus a few more tags. I have
also dropped Frederic's patch that Thomas took in to fix the entry code.

o MAINTAINERS files additions and changes.

o Fix hotplug warning in nohz code.

o Tick dependency changes by Zqiang.

o Lazy-RCU shrinker fixes by Zqiang.

o rcu-tasks stall reporting improvements by Neeraj.

o Other changes.

Let me know if there are any objections to anything.

thanks,

- Joel

Boqun Feng (1):
MAINTAINERS: Add Boqun to RCU entry

Joel Fernandes (Google) (3):
MAINTAINERS: Change Joel Fernandes from R: to M:
MAINTAINERS: Add Zqiang as a RCU reviewer
tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem

Neeraj Upadhyay (1):
rcu-tasks: Report stalls during synchronize_srcu() in
rcu_tasks_postscan()

Xu Panda (1):
rcu/trace: use strscpy() instead of strncpy()

Zheng Yejian (1):
rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being
kprobe-ed

Zqiang (6):
rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access

MAINTAINERS | 4 +++-
drivers/base/cpu.c | 3 ++-
include/linux/tick.h | 2 ++
include/trace/events/rcu.h | 4 +---
include/trace/events/timer.h | 3 ++-
kernel/rcu/tasks.h | 31 +++++++++++++++++++++++++++++++
kernel/rcu/tree.c | 16 +++++++++-------
kernel/rcu/tree_exp.h | 16 ++++++++++------
kernel/rcu/tree_nocb.h | 4 ++++
kernel/time/tick-sched.c | 16 +++++++++++++---
10 files changed, 77 insertions(+), 22 deletions(-)

--
2.40.0.348.gf938b09366-goog


2023-03-25 17:34:39

by Joel Fernandes

Subject: [PATCH v2 01/13] MAINTAINERS: Change Joel Fernandes from R: to M:

I have spent years learning / contributing to RCU with several features,
talks and presentations, with my most recent work being on Lazy-RCU.

Please consider me for M, so I can tell my wife why I spend a lot of my
weekends and evenings on this complicated and mysterious thing -- which is
mostly in the hopes of preventing the world from burning down because
everything runs on this one way or another. ;-)

Acked-by: Paul E. McKenney <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Boqun Feng <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d5bc223f305..698c330d37cf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17637,11 +17637,11 @@ READ-COPY UPDATE (RCU)
M: "Paul E. McKenney" <[email protected]>
M: Frederic Weisbecker <[email protected]> (kernel/rcu/tree_nocb.h)
M: Neeraj Upadhyay <[email protected]> (kernel/rcu/tasks.h)
+M: Joel Fernandes <[email protected]>
M: Josh Triplett <[email protected]>
R: Steven Rostedt <[email protected]>
R: Mathieu Desnoyers <[email protected]>
R: Lai Jiangshan <[email protected]>
-R: Joel Fernandes <[email protected]>
L: [email protected]
S: Supported
W: http://www.rdrop.com/users/paulmck/RCU/
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:34:45

by Joel Fernandes

Subject: [PATCH v2 03/13] MAINTAINERS: Add Zqiang as a RCU reviewer

I have spent about two years studying and contributing to RCU,
and sharing RCU-related knowledge within my team. If possible,
please consider me as R ;-).

Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Zqiang <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e9fb1c172ffe..e03067b857a2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17643,6 +17643,7 @@ M: Boqun Feng <[email protected]>
R: Steven Rostedt <[email protected]>
R: Mathieu Desnoyers <[email protected]>
R: Lai Jiangshan <[email protected]>
+R: Zqiang <[email protected]>
L: [email protected]
S: Supported
W: http://www.rdrop.com/users/paulmck/RCU/
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:34:52

by Joel Fernandes

Subject: [PATCH v2 04/13] tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem

For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU causing
test failures. Such failure happens on all architectures.

Fix it by asking the opinion of the nohz subsystem on whether the CPU can
be hotplugged.
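
As a rough illustration of what the fix buys us: a torture-style offlining
loop can now consult cpu_is_hotpluggable() and skip the timekeeping CPU. The
loop below is a hypothetical sketch (remove_cpu()/add_cpu() used loosely; it
is not actual rcutorture code):

	unsigned int cpu;

	for_each_online_cpu(cpu) {
		if (!cpu_is_hotpluggable(cpu))
			continue;	/* e.g. the nohz-full tick_do_timer_cpu */
		remove_cpu(cpu);	/* attempt the offline */
		add_cpu(cpu);		/* and bring the CPU back */
	}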

[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]

For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <[email protected]>

Cc: Frederic Weisbecker <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Zhouyi Zhou <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: rcu <[email protected]>
Cc: [email protected]
Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
drivers/base/cpu.c | 3 ++-
include/linux/tick.h | 2 ++
kernel/time/tick-sched.c | 11 ++++++++---
3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 182c6122f815..c1815b9dae68 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -487,7 +487,8 @@ static const struct attribute_group *cpu_root_attr_groups[] = {
bool cpu_is_hotpluggable(unsigned int cpu)
{
struct device *dev = get_cpu_device(cpu);
- return dev && container_of(dev, struct cpu, dev)->hotpluggable;
+ return dev && container_of(dev, struct cpu, dev)->hotpluggable
+ && tick_nohz_cpu_hotpluggable(cpu);
}
EXPORT_SYMBOL_GPL(cpu_is_hotpluggable);

diff --git a/include/linux/tick.h b/include/linux/tick.h
index bfd571f18cfd..9459fef5b857 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -216,6 +216,7 @@ extern void tick_nohz_dep_set_signal(struct task_struct *tsk,
enum tick_dep_bits bit);
extern void tick_nohz_dep_clear_signal(struct signal_struct *signal,
enum tick_dep_bits bit);
+extern bool tick_nohz_cpu_hotpluggable(unsigned int cpu);

/*
* The below are tick_nohz_[set,clear]_dep() wrappers that optimize off-cases
@@ -280,6 +281,7 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }

static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }
+static inline bool tick_nohz_cpu_hotpluggable(unsigned int cpu) { return true; }

static inline void tick_dep_set(enum tick_dep_bits bit) { }
static inline void tick_dep_clear(enum tick_dep_bits bit) { }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index b0e3c9205946..68d81a4283c8 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -527,7 +527,7 @@ void __init tick_nohz_full_setup(cpumask_var_t cpumask)
tick_nohz_full_running = true;
}

-static int tick_nohz_cpu_down(unsigned int cpu)
+bool tick_nohz_cpu_hotpluggable(unsigned int cpu)
{
/*
* The tick_do_timer_cpu CPU handles housekeeping duty (unbound
@@ -535,8 +535,13 @@ static int tick_nohz_cpu_down(unsigned int cpu)
* CPUs. It must remain online when nohz full is enabled.
*/
if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
- return -EBUSY;
- return 0;
+ return false;
+ return true;
+}
+
+static int tick_nohz_cpu_down(unsigned int cpu)
+{
+ return tick_nohz_cpu_hotpluggable(cpu) ? 0 : -EBUSY;
}

void __init tick_nohz_init(void)
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:34:56

by Joel Fernandes

Subject: [PATCH v2 05/13] rcu/trace: use strscpy() instead of strncpy()

From: Xu Panda <[email protected]>

This commit saves a line of code by switching from strncpy() to strscpy(),
which permits the removal of the subsequent manual NUL assignment. While in
the area, save another line by taking advantage of the 100-character line limit.
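
For reference, the semantic difference this relies on, as a minimal sketch
(not from the patch): strncpy() does not NUL-terminate when the source fills
the buffer, hence the manual terminator, while strscpy() always terminates
and reports truncation:

	char buf[8];
	const char *src = "some-longer-string";

	/* strncpy() may leave buf unterminated, so terminate by hand: */
	strncpy(buf, src, sizeof(buf));
	buf[sizeof(buf) - 1] = '\0';

	/* strscpy() always NUL-terminates and returns -E2BIG on truncation: */
	if (strscpy(buf, src, sizeof(buf)) == -E2BIG)
		pr_debug("source string was truncated\n");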

Signed-off-by: Xu Panda <[email protected]>
Signed-off-by: Yang Yang <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
include/trace/events/rcu.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 90b2fb0292cb..c19ac1fa8a60 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -776,9 +776,7 @@ TRACE_EVENT_RCU(rcu_torture_read,
),

TP_fast_assign(
- strncpy(__entry->rcutorturename, rcutorturename,
- RCUTORTURENAME_LEN);
- __entry->rcutorturename[RCUTORTURENAME_LEN - 1] = 0;
+ strscpy(__entry->rcutorturename, rcutorturename, RCUTORTURENAME_LEN);
__entry->rhp = rhp;
__entry->secs = secs;
__entry->c_old = c_old;
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:03

by Joel Fernandes

Subject: [PATCH v2 06/13] rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race

From: Zqiang <[email protected]>

For kernels built with CONFIG_NO_HZ_FULL=y, the following scenario can result
in the scheduling-clock interrupt remaining enabled on a holdout CPU after
its quiescent state has been reported:

        CPU1                                           CPU2
rcu_report_exp_cpu_mult                      synchronize_rcu_expedited_wait
   acquires rnp->lock                           mask = rnp->expmask;
                                                for_each_leaf_node_cpu_mask(rnp, cpu, mask)
   rnp->expmask = rnp->expmask & ~mask;            rdp = per_cpu_ptr(&rcu_data, cpu1);
   for_each_leaf_node_cpu_mask(rnp, cpu, mask)
      rdp = per_cpu_ptr(&rcu_data, cpu1);
      if (!rdp->rcu_forced_tick_exp)
         continue;                                 rdp->rcu_forced_tick_exp = true;
                                                   tick_dep_set_cpu(cpu1,
                                                      TICK_DEP_BIT_RCU_EXP);

The problem is that CPU2's sampling of rnp->expmask is obsolete by the
time it invokes tick_dep_set_cpu(), and CPU1 is not guaranteed to see
CPU2's store to ->rcu_forced_tick_exp in time to clear it. And even if
CPU1 does see that store, it might invoke tick_dep_clear_cpu() before
CPU2 got around to executing its tick_dep_set_cpu(), which would still
leave the victim CPU with its scheduler-clock tick running.

Either way, a nohz_full real-time application running on the victim
CPU would have its latency needlessly degraded.

Note that expedited RCU grace periods look at context-tracking
information, and so if the CPU is executing in nohz_full usermode
throughout, that CPU cannot be victimized in this manner.

This commit therefore causes synchronize_rcu_expedited_wait() to hold
the rcu_node structure's ->lock when checking for holdout CPUs, setting
TICK_DEP_BIT_RCU_EXP, and invoking tick_dep_set_cpu(), thus preventing
this race.
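
Condensed, the resulting pattern looks as follows (an illustration only; the
real change is in the diff below). Holding ->lock makes the ->expmask sample
and the tick_dep_set_cpu() invocation atomic with respect to
rcu_report_exp_cpu_mult(), which updates ->expmask under the same lock:

	raw_spin_lock_irqsave_rcu_node(rnp, flags);
	mask = READ_ONCE(rnp->expmask);		/* stable while ->lock is held */
	for_each_leaf_node_cpu_mask(rnp, cpu, mask) {
		rdp = per_cpu_ptr(&rcu_data, cpu);
		if (!rdp->rcu_forced_tick_exp) {
			rdp->rcu_forced_tick_exp = true;
			if (cpu_online(cpu))
				tick_dep_set_cpu(cpu, TICK_DEP_BIT_RCU_EXP);
		}
	}
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);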

Signed-off-by: Zqiang <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree_exp.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 249c2967d9e6..7cc4856da081 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -594,6 +594,7 @@ static void synchronize_rcu_expedited_wait(void)
struct rcu_data *rdp;
struct rcu_node *rnp;
struct rcu_node *rnp_root = rcu_get_root();
+ unsigned long flags;

trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait"));
jiffies_stall = rcu_exp_jiffies_till_stall_check();
@@ -602,17 +603,17 @@ static void synchronize_rcu_expedited_wait(void)
if (synchronize_rcu_expedited_wait_once(1))
return;
rcu_for_each_leaf_node(rnp) {
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
mask = READ_ONCE(rnp->expmask);
for_each_leaf_node_cpu_mask(rnp, cpu, mask) {
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rdp->rcu_forced_tick_exp)
continue;
rdp->rcu_forced_tick_exp = true;
- preempt_disable();
if (cpu_online(cpu))
tick_dep_set_cpu(cpu, TICK_DEP_BIT_RCU_EXP);
- preempt_enable();
}
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
j = READ_ONCE(jiffies_till_first_fqs);
if (synchronize_rcu_expedited_wait_once(j + HZ))
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:07

by Joel Fernandes

Subject: [PATCH v2 07/13] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check

From: Zqiang <[email protected]>

This commit adds checks for the TICK_DEP_MASK_RCU_EXP bit, thus enabling
RCU expedited grace periods to actually force-enable scheduling-clock
interrupts on holdout CPUs.
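
Background for reviewers: each tick dependency is one bit in a shared mask,
and every bit that tick_dep_set_cpu() can set must have a matching test in
check_tick_dependency(). The RCU_EXP bit had no such test, so the tick could
be stopped despite the dependency. A sketch of the bit/mask pairing (values
are illustrative; see include/linux/tick.h for the real definitions):

	enum tick_dep_bits {
		TICK_DEP_BIT_POSIX_TIMER	= 0,
		TICK_DEP_BIT_PERF_EVENTS	= 1,
		TICK_DEP_BIT_SCHED		= 2,
		TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
		TICK_DEP_BIT_RCU		= 4,
		TICK_DEP_BIT_RCU_EXP		= 5,
	};

	#define TICK_DEP_MASK_RCU	(1 << TICK_DEP_BIT_RCU)
	#define TICK_DEP_MASK_RCU_EXP	(1 << TICK_DEP_BIT_RCU_EXP)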

Fixes: df1e849ae455 ("rcu: Enable tick for nohz_full CPUs slow to provide expedited QS")
Signed-off-by: Zqiang <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Anna-Maria Behnsen <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
include/trace/events/timer.h | 3 ++-
kernel/time/tick-sched.c | 5 +++++
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 2e713a7d9aa3..3e8619c72f77 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -371,7 +371,8 @@ TRACE_EVENT(itimer_expire,
tick_dep_name(PERF_EVENTS) \
tick_dep_name(SCHED) \
tick_dep_name(CLOCK_UNSTABLE) \
- tick_dep_name_end(RCU)
+ tick_dep_name(RCU) \
+ tick_dep_name_end(RCU_EXP)

#undef tick_dep_name
#undef tick_dep_mask_name
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 68d81a4283c8..a46506f7ec6d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -281,6 +281,11 @@ static bool check_tick_dependency(atomic_t *dep)
return true;
}

+ if (val & TICK_DEP_MASK_RCU_EXP) {
+ trace_tick_stop(0, TICK_DEP_MASK_RCU_EXP);
+ return true;
+ }
+
return false;
}

--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:16

by Joel Fernandes

Subject: [PATCH v2 10/13] rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early

From: Zqiang <[email protected]>

According to the commit log of the patch that added it to the kernel,
start_poll_synchronize_rcu_expedited() can be invoked very early, as
in long before rcu_init() has been invoked. But before rcu_init(),
the rcu_data structure's ->mynode field has not yet been initialized.
This means that the start_poll_synchronize_rcu_expedited() function's
attempt to set the CPU's leaf rcu_node structure's ->exp_seq_poll_rq
field will result in a segmentation fault.

This commit therefore causes start_poll_synchronize_rcu_expedited() to
set ->exp_seq_poll_rq only after rcu_init() has initialized all CPUs'
rcu_data structures' ->mynode fields. It also removes the check from
the rcu_init() function so that start_poll_synchronize_rcu_expedited()
is unconditionally invoked. Yes, this might result in an unnecessary
boot-time grace period, but this is down in the noise.
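
For context, a hedged sketch of an early-boot caller of the polled expedited
API (the caller is hypothetical; the functions are the real ones touched by
this patch):

	unsigned long cookie;

	/* May legitimately run before rcu_init(). */
	cookie = start_poll_synchronize_rcu_expedited();

	/* ... much later ... */
	if (poll_state_synchronize_rcu(cookie))
		pr_info("expedited grace period completed\n");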

Signed-off-by: Zqiang <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree.c | 5 ++---
kernel/rcu/tree_exp.h | 5 +++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e80e8f128c57..90d54571126a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4942,9 +4942,8 @@ void __init rcu_init(void)
else
qovld_calc = qovld;

- // Kick-start any polled grace periods that started early.
- if (!(per_cpu_ptr(&rcu_data, cpu)->mynode->exp_seq_poll_rq & 0x1))
- (void)start_poll_synchronize_rcu_expedited();
+ // Kick-start in case any polled grace periods started early.
+ (void)start_poll_synchronize_rcu_expedited();

rcu_test_sync_prims();
}
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 7cc4856da081..5343f32e7d67 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -1066,9 +1066,10 @@ unsigned long start_poll_synchronize_rcu_expedited(void)
if (rcu_init_invoked())
raw_spin_lock_irqsave(&rnp->exp_poll_lock, flags);
if (!poll_state_synchronize_rcu(s)) {
- rnp->exp_seq_poll_rq = s;
- if (rcu_init_invoked())
+ if (rcu_init_invoked()) {
+ rnp->exp_seq_poll_rq = s;
queue_work(rcu_gp_wq, &rnp->exp_poll_wq);
+ }
}
if (rcu_init_invoked())
raw_spin_unlock_irqrestore(&rnp->exp_poll_lock, flags);
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:21

by Joel Fernandes

Subject: [PATCH v2 02/13] MAINTAINERS: Add Boqun to RCU entry

From: Boqun Feng <[email protected]>

Just to be clear, the "M:" tag before my name is short for "Minions" ;-)

Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 698c330d37cf..e9fb1c172ffe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17639,6 +17639,7 @@ M: Frederic Weisbecker <[email protected]> (kernel/rcu/tree_nocb.h)
M: Neeraj Upadhyay <[email protected]> (kernel/rcu/tasks.h)
M: Joel Fernandes <[email protected]>
M: Josh Triplett <[email protected]>
+M: Boqun Feng <[email protected]>
R: Steven Rostedt <[email protected]>
R: Mathieu Desnoyers <[email protected]>
R: Lai Jiangshan <[email protected]>
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:21

by Joel Fernandes

Subject: [PATCH v2 12/13] rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed

From: Zheng Yejian <[email protected]>

Registering a kprobe on __rcu_irq_enter_check_tick() can cause kernel
stack overflow as shown below. This issue can be reproduced by enabling
CONFIG_NO_HZ_FULL and booting the kernel with argument "nohz_full=",
and then giving the following commands at the shell prompt:

# cd /sys/kernel/tracing/
# echo 'p:mp1 __rcu_irq_enter_check_tick' >> kprobe_events
# echo 1 > events/kprobes/enable

This commit therefore adds __rcu_irq_enter_check_tick() to the kprobes
blacklist using NOKPROBE_SYMBOL().
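
The mechanism itself is a one-liner that any function running on an entry or
exception path can use; a hypothetical example (the function below is made
up, only NOKPROBE_SYMBOL() is real):

	#include <linux/kprobes.h>

	void my_entry_path_helper(void)
	{
		/* Runs before the kernel can tolerate a probe trap. */
	}
	NOKPROBE_SYMBOL(my_entry_path_helper);	/* kprobe registration now fails here */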

Insufficient stack space to handle exception!
ESR: 0x00000000f2000004 -- BRK (AArch64)
FAR: 0x0000ffffccf3e510
Task stack: [0xffff80000ad30000..0xffff80000ad38000]
IRQ stack: [0xffff800008050000..0xffff800008058000]
Overflow stack: [0xffff089c36f9f310..0xffff089c36fa0310]
CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19
Hardware name: linux,dummy-virt (DT)
pstate: 400003c5 (nZcv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __rcu_irq_enter_check_tick+0x0/0x1b8
lr : ct_nmi_enter+0x11c/0x138
sp : ffff80000ad30080
x29: ffff80000ad30080 x28: ffff089c82e20000 x27: 0000000000000000
x26: 0000000000000000 x25: ffff089c02a8d100 x24: 0000000000000000
x23: 00000000400003c5 x22: 0000ffffccf3e510 x21: ffff089c36fae148
x20: ffff80000ad30120 x19: ffffa8da8fcce148 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: ffffa8da8e44ea6c
x14: ffffa8da8e44e968 x13: ffffa8da8e03136c x12: 1fffe113804d6809
x11: ffff6113804d6809 x10: 0000000000000a60 x9 : dfff800000000000
x8 : ffff089c026b404f x7 : 00009eec7fb297f7 x6 : 0000000000000001
x5 : ffff80000ad30120 x4 : dfff800000000000 x3 : ffffa8da8e3016f4
x2 : 0000000000000003 x1 : 0000000000000000 x0 : 0000000000000000
Kernel panic - not syncing: kernel stack overflow
CPU: 5 PID: 190 Comm: bash Not tainted 6.2.0-rc2-00320-g1f5abbd77e2c #19
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0xf8/0x108
show_stack+0x20/0x30
dump_stack_lvl+0x68/0x84
dump_stack+0x1c/0x38
panic+0x214/0x404
add_taint+0x0/0xf8
panic_bad_stack+0x144/0x160
handle_bad_stack+0x38/0x58
__bad_stack+0x78/0x7c
__rcu_irq_enter_check_tick+0x0/0x1b8
arm64_enter_el1_dbg.isra.0+0x14/0x20
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
arm64_enter_el1_dbg.isra.0+0x14/0x20
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
arm64_enter_el1_dbg.isra.0+0x14/0x20
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
[...]
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
arm64_enter_el1_dbg.isra.0+0x14/0x20
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
arm64_enter_el1_dbg.isra.0+0x14/0x20
el1_dbg+0x2c/0x90
el1h_64_sync_handler+0xcc/0xe8
el1h_64_sync+0x64/0x68
__rcu_irq_enter_check_tick+0x0/0x1b8
el1_interrupt+0x28/0x60
el1h_64_irq_handler+0x18/0x28
el1h_64_irq+0x64/0x68
__ftrace_set_clr_event_nolock+0x98/0x198
__ftrace_set_clr_event+0x58/0x80
system_enable_write+0x144/0x178
vfs_write+0x174/0x738
ksys_write+0xd0/0x188
__arm64_sys_write+0x4c/0x60
invoke_syscall+0x64/0x180
el0_svc_common.constprop.0+0x84/0x160
do_el0_svc+0x48/0xe8
el0_svc+0x34/0xd0
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x190/0x194
SMP: stopping secondary CPUs
Kernel Offset: 0x28da86000000 from 0xffff800008000000
PHYS_OFFSET: 0xfffff76600000000
CPU features: 0x00000,01a00100,0000421b
Memory Limit: none

Link: https://lore.kernel.org/all/[email protected]/
Fixes: aaf2bc50df1f ("rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()")
Signed-off-by: Zheng Yejian <[email protected]>
Cc: [email protected]
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 90d54571126a..ee27a03d7576 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -640,6 +640,7 @@ void __rcu_irq_enter_check_tick(void)
}
raw_spin_unlock_rcu_node(rdp->mynode);
}
+NOKPROBE_SYMBOL(__rcu_irq_enter_check_tick);
#endif /* CONFIG_NO_HZ_FULL */

/*
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:35

by Joel Fernandes

Subject: [PATCH v2 13/13] rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access

From: Zqiang <[email protected]>

For kernels built with CONFIG_PREEMPT_RCU=y, the following scenario can
result in a NULL-pointer dereference:

        CPU1                                    CPU2
rcu_preempt_deferred_qs_irqrestore        rcu_print_task_exp_stall
  if (special.b.blocked)                    READ_ONCE(rnp->exp_tasks) != NULL
    raw_spin_lock_rcu_node
    np = rcu_next_node_entry(t, rnp)
    if (&t->rcu_node_entry == rnp->exp_tasks)
      WRITE_ONCE(rnp->exp_tasks, np)
      ....
      raw_spin_unlock_irqrestore_rcu_node
                                            raw_spin_lock_irqsave_rcu_node
                                            t = list_entry(rnp->exp_tasks->prev,
                                                   struct task_struct, rcu_node_entry)
                                            (if rnp->exp_tasks is NULL, this
                                             will dereference a NULL pointer)

The problem is that CPU2 accesses the rcu_node structure's ->exp_tasks
field without holding the rcu_node structure's ->lock, and so CPU2 might
not observe CPU1's change to the rcu_node structure's ->exp_tasks in time.
Therefore, if CPU1 sets the rcu_node structure's ->exp_tasks pointer to NULL,
then CPU2 might dereference that NULL pointer.

This commit therefore holds the rcu_node structure's ->lock while
accessing that structure's ->exp_tasks field.

[ paulmck: Apply Frederic Weisbecker feedback. ]

Signed-off-by: Zqiang <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree_exp.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 5343f32e7d67..3b7abb58157d 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -803,9 +803,11 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
int ndetected = 0;
struct task_struct *t;

- if (!READ_ONCE(rnp->exp_tasks))
- return 0;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
+ if (!rnp->exp_tasks) {
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ return 0;
+ }
t = list_entry(rnp->exp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:35:55

by Joel Fernandes

Subject: [PATCH v2 11/13] rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()

From: Neeraj Upadhyay <[email protected]>

The call to synchronize_srcu() from rcu_tasks_postscan() can be stalled
by a task getting stuck in do_exit() between that function's calls to
exit_tasks_rcu_start() and exit_tasks_rcu_finish(). To ease diagnosis
of this situation, print a stall warning message every rcu_task_stall_info
period when rcu_tasks_postscan() is stalled.
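
The mechanism is a self-rearming timer that fires every rcu_task_stall_info
jiffies while the wait is pending. A minimal stand-alone sketch of that
pattern (the names and the ten-second period are hypothetical; the patch's
real timer is in the diff below):

	#include <linux/timer.h>

	static void stall_warn_fn(struct timer_list *unused);
	static DEFINE_TIMER(stall_warn_timer, stall_warn_fn);

	static void stall_warn_fn(struct timer_list *unused)
	{
		pr_info("still waiting for the grace period\n");
		/* Re-arm so the warning repeats until the wait completes. */
		mod_timer(&stall_warn_timer, jiffies + 10 * HZ);
	}

	/* Around the potentially stalling wait: */
	mod_timer(&stall_warn_timer, jiffies + 10 * HZ);
	synchronize_srcu(&some_srcu);		/* the potentially stalled wait */
	del_timer_sync(&stall_warn_timer);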

[ paulmck: Adjust to handle CONFIG_SMP=n. ]

Reported-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/rcu/20230111212736.GA1062057@paulmck-ThinkPad-P17-Gen-1/
Signed-off-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tasks.h | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index bfb5e1549f2b..baf7ec178155 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -139,6 +139,12 @@ static struct rcu_tasks rt_name = \
/* Track exiting tasks in order to allow them to be waited for. */
DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);

+#ifdef CONFIG_TASKS_RCU
+/* Report delay in synchronize_srcu() completion in rcu_tasks_postscan(). */
+static void tasks_rcu_exit_srcu_stall(struct timer_list *unused);
+static DEFINE_TIMER(tasks_rcu_exit_srcu_stall_timer, tasks_rcu_exit_srcu_stall);
+#endif
+
/* Avoid IPIing CPUs early in the grace period. */
#define RCU_TASK_IPI_DELAY (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) ? HZ / 2 : 0)
static int rcu_task_ipi_delay __read_mostly = RCU_TASK_IPI_DELAY;
@@ -830,6 +836,13 @@ static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
/* Processing between scanning taskslist and draining the holdout list. */
static void rcu_tasks_postscan(struct list_head *hop)
{
+ int rtsi = READ_ONCE(rcu_task_stall_info);
+
+ if (!IS_ENABLED(CONFIG_TINY_RCU)) {
+ tasks_rcu_exit_srcu_stall_timer.expires = jiffies + rtsi;
+ add_timer(&tasks_rcu_exit_srcu_stall_timer);
+ }
+
/*
* Exiting tasks may escape the tasklist scan. Those are vulnerable
* until their final schedule() with TASK_DEAD state. To cope with
@@ -848,6 +861,9 @@ static void rcu_tasks_postscan(struct list_head *hop)
* call to synchronize_rcu().
*/
synchronize_srcu(&tasks_rcu_exit_srcu);
+
+ if (!IS_ENABLED(CONFIG_TINY_RCU))
+ del_timer_sync(&tasks_rcu_exit_srcu_stall_timer);
}

/* See if tasks are still holding out, complain if so. */
@@ -923,6 +939,21 @@ static void rcu_tasks_postgp(struct rcu_tasks *rtp)
void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func);
DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks");

+static void tasks_rcu_exit_srcu_stall(struct timer_list *unused)
+{
+#ifndef CONFIG_TINY_RCU
+ int rtsi;
+
+ rtsi = READ_ONCE(rcu_task_stall_info);
+ pr_info("%s: %s grace period number %lu (since boot) gp_state: %s is %lu jiffies old.\n",
+ __func__, rcu_tasks.kname, rcu_tasks.tasks_gp_seq,
+ tasks_gp_state_getname(&rcu_tasks), jiffies - rcu_tasks.gp_jiffies);
+ pr_info("Please check any exiting tasks stuck between calls to exit_tasks_rcu_start() and exit_tasks_rcu_finish()\n");
+ tasks_rcu_exit_srcu_stall_timer.expires = jiffies + rtsi;
+ add_timer(&tasks_rcu_exit_srcu_stall_timer);
+#endif // #ifndef CONFIG_TINY_RCU
+}
+
/**
* call_rcu_tasks() - Queue an RCU for invocation task-based grace period
* @rhp: structure to be used for queueing the RCU updates.
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:36:00

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH v2 08/13] rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels

From: Zqiang <[email protected]>

The lazy_rcu_shrink_count() shrinker function is registered even in
kernels built with CONFIG_RCU_LAZY=n, in which case this function
uselessly consumes cycles learning that no CPU has any lazy callbacks
queued.

This commit therefore registers this shrinker function only in the kernels
built with CONFIG_RCU_LAZY=y, where it might actually do something useful.
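
The pattern is to compile both the shrinker and its registration only when
the config option is set; a condensed sketch under a made-up option (the
real guard is CONFIG_RCU_LAZY, as in the diff below):

	#ifdef CONFIG_MY_CACHE
	static struct shrinker my_shrinker = {
		.count_objects	= my_count,	/* report how many objects are freeable */
		.scan_objects	= my_scan,	/* actually free some of them */
		.seeks		= DEFAULT_SEEKS,
	};
	#endif

	/* ... in the init path ... */
	#ifdef CONFIG_MY_CACHE
	if (register_shrinker(&my_shrinker, "my-cache"))
		pr_err("Failed to register my-cache shrinker!\n");
	#endif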

Signed-off-by: Zqiang <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree_nocb.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 9e1c8caec5ce..f2280616f9d5 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1312,6 +1312,7 @@ int rcu_nocb_cpu_offload(int cpu)
}
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);

+#ifdef CONFIG_RCU_LAZY
static unsigned long
lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
{
@@ -1360,6 +1361,7 @@ static struct shrinker lazy_rcu_shrinker = {
.batch = 0,
.seeks = DEFAULT_SEEKS,
};
+#endif // #ifdef CONFIG_RCU_LAZY

void __init rcu_init_nohz(void)
{
@@ -1391,8 +1393,10 @@ void __init rcu_init_nohz(void)
if (!rcu_state.nocb_is_setup)
return;

+#ifdef CONFIG_RCU_LAZY
if (register_shrinker(&lazy_rcu_shrinker, "rcu-lazy"))
pr_err("Failed to register lazy_rcu shrinker!\n");
+#endif // #ifdef CONFIG_RCU_LAZY

if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
--
2.40.0.348.gf938b09366-goog

2023-03-25 17:36:02

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH v2 09/13] rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()

From: Zqiang <[email protected]>

The rcu_accelerate_cbs() function is invoked by rcu_report_qs_rdp()
only if there is a grace period in progress that is still blocked
by at least one CPU on this rcu_node structure. This means that
rcu_accelerate_cbs() should never return the value true, and thus that
this function should never set the needwake variable and in turn never
invoke rcu_gp_kthread_wake().

This commit therefore removes the needwake variable and the invocation
of rcu_gp_kthread_wake() in favor of a WARN_ON_ONCE() on the call to
rcu_accelerate_cbs(). The purpose of this new WARN_ON_ONCE() is to
detect situations where the system's opinion differs from ours.

Signed-off-by: Zqiang <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/tree.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8e880c09ab59..e80e8f128c57 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1955,7 +1955,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
{
unsigned long flags;
unsigned long mask;
- bool needwake = false;
bool needacc = false;
struct rcu_node *rnp;

@@ -1987,7 +1986,12 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
* NOCB kthreads have their own way to deal with that...
*/
if (!rcu_rdp_is_offloaded(rdp)) {
- needwake = rcu_accelerate_cbs(rnp, rdp);
+ /*
+ * The current GP has not yet ended, so it
+ * should not be possible for rcu_accelerate_cbs()
+ * to return true. So complain, but don't awaken.
+ */
+ WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
} else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
/*
* ...but NOCB kthreads may miss or delay callbacks acceleration
@@ -1999,8 +2003,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
rcu_disable_urgency_upon_qs(rdp);
rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
/* ^^^ Released rnp->lock */
- if (needwake)
- rcu_gp_kthread_wake();

if (needacc) {
rcu_nocb_lock_irqsave(rdp, flags);
--
2.40.0.348.gf938b09366-goog

2023-03-26 19:20:29

by Frederic Weisbecker

Subject: Re: [PATCH v2 01/13] MAINTAINERS: Change Joel Fernandes from R: to M:

On Sat, Mar 25, 2023 at 05:33:04PM +0000, Joel Fernandes (Google) wrote:
> I have spent years learning / contributing to RCU with several features,
> talks and presentations, with my most recent work being on Lazy-RCU.
>
> Please consider me for M, so I can tell my wife why I spend a lot of my
> weekends and evenings on this complicated and mysterious thing -- which is
> mostly in the hopes of preventing the world from burning down because
> everything runs on this one way or another. ;-)
>
> Acked-by: Paul E. McKenney <[email protected]>
> Cc: "Paul E. McKenney" <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Neeraj Upadhyay <[email protected]>
> Cc: Boqun Feng <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Acked-by: Frederic Weisbecker <[email protected]>

2023-03-26 19:35:02

by Frederic Weisbecker

Subject: Re: [PATCH v2 02/13] MAINTAINERS: Add Boqun to RCU entry

On Sat, Mar 25, 2023 at 05:33:05PM +0000, Joel Fernandes (Google) wrote:
> From: Boqun Feng <[email protected]>
>
> Just to be clear, the "M:" tag before my name is short for "Minions" ;-)
>
> Acked-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Boqun Feng <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Acked-by: Frederic Weisbecker <[email protected]>

2023-03-26 19:52:06

by Frederic Weisbecker

Subject: Re: [PATCH v2 04/13] tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem

On Sat, Mar 25, 2023 at 05:33:07PM +0000, Joel Fernandes (Google) wrote:
> For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
> However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
> torture tests that do offlining to end up trying to offline this CPU causing
> test failures. Such failure happens on all architectures.

It might be worth noting that hotplug failure is fine on hotplug testing.
The issue here is the repetitive error message in the logs.

Other than that:

Acked-by: Frederic Weisbecker <[email protected]>

2023-03-26 20:06:10

by Frederic Weisbecker

Subject: Re: [PATCH v2 11/13] rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()

On Sat, Mar 25, 2023 at 05:33:14PM +0000, Joel Fernandes (Google) wrote:
> From: Neeraj Upadhyay <[email protected]>
>
> The call to synchronize_srcu() from rcu_tasks_postscan() can be stalled
> by a task getting stuck in do_exit() between that function's calls to
> exit_tasks_rcu_start() and exit_tasks_rcu_finish(). To ease diagnosis
> of this situation, print a stall warning message every rcu_task_stall_info
> period when rcu_tasks_postscan() is stalled.
>
> [ paulmck: Adjust to handle CONFIG_SMP=n. ]
>
> Reported-by: Mark Brown <[email protected]>
> Link: https://lore.kernel.org/rcu/20230111212736.GA1062057@paulmck-ThinkPad-P17-Gen-1/
> Signed-off-by: Neeraj Upadhyay <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Acked-by: Frederic Weisbecker <[email protected]>

2023-03-30 15:50:54

by Joel Fernandes

Subject: Re: [PATCH v2 04/13] tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem

On Sun, Mar 26, 2023 at 09:34:35PM +0200, Frederic Weisbecker wrote:
> On Sat, Mar 25, 2023 at 05:33:07PM +0000, Joel Fernandes (Google) wrote:
> > For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
> > However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
> > torture tests that do offlining to end up trying to offline this CPU causing
> > test failures. Such failure happens on all architectures.
>
> It might be worth noting that hotplug failure is fine on hotplug testing.
> The issue here is the repetitive error message in the logs.
>
> Other than that:
>
> Acked-by: Frederic Weisbecker <[email protected]>

Thank you, below is the reworded update. Let me know if there are any other comments.

-------8<-------

From: "Joel Fernandes (Google)" <[email protected]>
Subject: [PATCH] tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz
subsystem

For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU causing
test failures. Such failure happens on all architectures.

Fix the repeated error messages thrown as a result (even if the hotplug
errors are harmless), by asking the opinion of the nohz subsystem on whether
the CPU can be hotplugged.

[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]

For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Zhouyi Zhou <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: rcu <[email protected]>
Cc: [email protected]
Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
drivers/base/cpu.c | 3 ++-
include/linux/tick.h | 2 ++
kernel/time/tick-sched.c | 11 ++++++++---
3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 182c6122f815..c1815b9dae68 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -487,7 +487,8 @@ static const struct attribute_group *cpu_root_attr_groups[] = {
bool cpu_is_hotpluggable(unsigned int cpu)
{
struct device *dev = get_cpu_device(cpu);
- return dev && container_of(dev, struct cpu, dev)->hotpluggable;
+ return dev && container_of(dev, struct cpu, dev)->hotpluggable
+ && tick_nohz_cpu_hotpluggable(cpu);
}
EXPORT_SYMBOL_GPL(cpu_is_hotpluggable);

diff --git a/include/linux/tick.h b/include/linux/tick.h
index bfd571f18cfd..9459fef5b857 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -216,6 +216,7 @@ extern void tick_nohz_dep_set_signal(struct task_struct *tsk,
enum tick_dep_bits bit);
extern void tick_nohz_dep_clear_signal(struct signal_struct *signal,
enum tick_dep_bits bit);
+extern bool tick_nohz_cpu_hotpluggable(unsigned int cpu);

/*
* The below are tick_nohz_[set,clear]_dep() wrappers that optimize off-cases
@@ -280,6 +281,7 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }

static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }
+static inline bool tick_nohz_cpu_hotpluggable(unsigned int cpu) { return true; }

static inline void tick_dep_set(enum tick_dep_bits bit) { }
static inline void tick_dep_clear(enum tick_dep_bits bit) { }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ba2ac1469d47..a46506f7ec6d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -532,7 +532,7 @@ void __init tick_nohz_full_setup(cpumask_var_t cpumask)
tick_nohz_full_running = true;
}

-static int tick_nohz_cpu_down(unsigned int cpu)
+bool tick_nohz_cpu_hotpluggable(unsigned int cpu)
{
/*
* The tick_do_timer_cpu CPU handles housekeeping duty (unbound
@@ -540,8 +540,13 @@ static int tick_nohz_cpu_down(unsigned int cpu)
* CPUs. It must remain online when nohz full is enabled.
*/
if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
- return -EBUSY;
- return 0;
+ return false;
+ return true;
+}
+
+static int tick_nohz_cpu_down(unsigned int cpu)
+{
+ return tick_nohz_cpu_hotpluggable(cpu) ? 0 : -EBUSY;
}

void __init tick_nohz_init(void)
--
2.40.0.rc1.284.g88254d51c5-goog