As part of testing the 5.16 rc series we noticed a new BUG message originating from check_preemption_disabled().
We submitted a patch to move a call to smp_processor_id() into an RCU read-side critical section within the same function.
See https://lore.kernel.org/linux-rdma/[email protected]/T/#u.
Much to my surprise, the BUG still shows up in testing with that patch applied!
Further testing has shown that placing an explicit preempt_disable()/preempt_enable() pair around the RCU critical section silences the warning.
The RCU config is:
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_RCU_NOCB_CPU=y
# end of RCU Subsystem
It looks like there is a difference between the checking in check_preemption_disabled() and the implicit preemption disabling in __rcu_read_lock().
The implicit disable looks like:
static void rcu_preempt_read_enter(void)
{
	WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
}
The checking code uses the x86 definition of preempt_count():
static __always_inline int preempt_count(void)
{
	return raw_cpu_read_4(__preempt_count) & ~PREEMPT_NEED_RESCHED;
}
An explicit disable uses this x86 code:
static __always_inline void __preempt_count_add(int val)
{
	raw_cpu_add_4(__preempt_count, val);
}
The difference seems to be the use of __preempt_count vs. rcu_read_lock_nesting.
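
To make the mismatch concrete, here is a minimal userspace sketch of the two counters. It is not kernel code: every model_* name is made up for illustration, and it assumes the check reduces to "is the preempt count non-zero", ignoring the other early exits in the real check_preemption_disabled().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_task {
	int rcu_read_lock_nesting;	/* the field rcu_preempt_read_enter() bumps */
};

static int model_preempt_count;		/* stands in for per-CPU __preempt_count */
static struct model_task model_current;	/* stands in for "current" */

/* Modeled on the CONFIG_PREEMPT_RCU=y read-side entry/exit: task-local only. */
static void model_rcu_read_lock(void)   { model_current.rcu_read_lock_nesting++; }
static void model_rcu_read_unlock(void) { model_current.rcu_read_lock_nesting--; }

/* Modeled on the explicit disable/enable: touches the preempt count. */
static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/*
 * Assumption: the check only looks at the preempt count. The real
 * function has further early exits that are omitted here.
 */
static bool model_check_passes(void)
{
	return model_preempt_count != 0;
}

static void model_demo(void)
{
	model_rcu_read_lock();
	/* The RCU nesting count alone is invisible to the check. */
	assert(!model_check_passes());

	model_preempt_disable();
	/* Only the explicit disable changes what the check reads. */
	assert(model_check_passes());
	model_preempt_enable();

	model_rcu_read_unlock();
}
```

Calling model_demo() runs through both cases: inside the modeled RCU section alone the check fails, and it passes only once the preempt count has been raised, which matches what we see in testing.
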
This can't be good...
Mike