2020-11-18 16:13:56

by Joel Fernandes

Subject: [PATCH] rcu/segcblist: Add debug check for whether seglen is 0 for empty list

After rcu_do_batch(), add a check for whether the seglen counts went to
zero if the list was indeed empty.

Signed-off-by: Joel Fernandes (Google) <[email protected]>

---
kernel/rcu/rcu_segcblist.c | 12 ++++++++++++
kernel/rcu/rcu_segcblist.h | 3 +++
kernel/rcu/tree.c | 1 +
3 files changed, 16 insertions(+)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 5059b6102afe..6e98bb3804f0 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -94,6 +94,18 @@ static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
 	return READ_ONCE(rsclp->seglen[seg]);
 }

+/* Return number of callbacks in segmented callback list by totalling seglen. */
+long rcu_segcblist_n_segment_cbs(struct rcu_segcblist *rsclp)
+{
+	long len = 0;
+	int i;
+
+	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
+		len += rcu_segcblist_get_seglen(rsclp, i);
+
+	return len;
+}
+
 /* Set the length of a segment of the rcu_segcblist structure. */
 static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
 {
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index cd35c9faaf51..46a42d77f7e1 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -15,6 +15,9 @@ static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp)
 	return READ_ONCE(rclp->len);
 }

+/* Return number of callbacks in segmented callback list by totalling seglen. */
+long rcu_segcblist_n_segment_cbs(struct rcu_segcblist *rsclp);
+
 void rcu_cblist_init(struct rcu_cblist *rclp);
 void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp);
 void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f5b61e10f1de..928bd10c9c3b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2553,6 +2553,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	WARN_ON_ONCE(count == 0 && !rcu_segcblist_empty(&rdp->cblist));
 	WARN_ON_ONCE(!IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
 		     count != 0 && rcu_segcblist_empty(&rdp->cblist));
+	WARN_ON_ONCE(count == 0 && !rcu_segcblist_n_segment_cbs(&rdp->cblist));

 	rcu_nocb_unlock_irqrestore(rdp, flags);

--
2.29.2.299.gdc1121823c-goog
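
The condition added to rcu_do_batch() can be modelled in user space to see when it fires. The sketch below is an illustration only, not kernel code: struct model_segcblist, model_n_segment_cbs(), posted_check_fires() and described_check_fires() are hypothetical names, and the second predicate is an assumption about the intent, read from the commit message ("whether the seglen counts went to zero if the list was indeed empty"). The summation loop follows the one in the patch, without READ_ONCE(). Going by the hunk header, the new WARN_ON_ONCE() sits at kernel/rcu/tree.c line 2556.

```c
#include <stdbool.h>

/* Segment indices named as in the kernel; everything else is a user-space model. */
enum {
	RCU_DONE_TAIL,
	RCU_WAIT_TAIL,
	RCU_NEXT_READY_TAIL,
	RCU_NEXT_TAIL,
	RCU_CBLIST_NSEGS
};

/* Hypothetical stand-in for struct rcu_segcblist: only the per-segment counts. */
struct model_segcblist {
	long seglen[RCU_CBLIST_NSEGS];
};

/* Total the per-segment counts, as rcu_segcblist_n_segment_cbs() does in the patch. */
static long model_n_segment_cbs(const struct model_segcblist *l)
{
	long len = 0;
	int i;

	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
		len += l->seglen[i];

	return len;
}

/* Condition as posted: true when count is 0 and the segment total is also 0. */
static bool posted_check_fires(long count, const struct model_segcblist *l)
{
	return count == 0 && !model_n_segment_cbs(l);
}

/* Assumed intent: true when count is 0 but some segment count is still nonzero. */
static bool described_check_fires(long count, const struct model_segcblist *l)
{
	return count == 0 && model_n_segment_cbs(l) != 0;
}
```

With count == 0 and every seglen entry at zero, which is the consistent state of an empty list, posted_check_fires() returns true and described_check_fires() returns false. With count == 0 and a stale nonzero seglen entry, the results swap. This suggests the posted condition may be inverted relative to the commit message.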


2020-11-19 07:02:39

by kernel test robot

Subject: [rcu/segcblist] a15faaf2a0: WARNING:at_kernel/rcu/tree.c:#rcu_do_batch


Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: a15faaf2a0d190b88cc793b75735fa65ba218f00 ("[PATCH] rcu/segcblist: Add debug check for whether seglen is 0 for empty list")
url: https://github.com/0day-ci/linux/commits/Joel-Fernandes-Google/rcu-segcblist-Add-debug-check-for-whether-seglen-is-0-for-empty-list/20201119-001442
base: https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


+--------------------------------------------+------------+------------+
|                                            | 5fa06308db | a15faaf2a0 |
+--------------------------------------------+------------+------------+
| boot_successes                             | 27         | 0          |
| WARNING:at_kernel/rcu/tree.c:#rcu_do_batch | 0          | 34         |
| EIP:rcu_do_batch                           | 0          | 34         |
+--------------------------------------------+------------+------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


[ 2.400207] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:2556 rcu_do_batch+0x56b/0xcb5
[ 2.400975] Modules linked in:
[ 2.401298] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc1-00165-ga15faaf2a0d1 #7
[ 2.402138] EIP: rcu_do_batch+0x56b/0xcb5
[ 2.402587] Code: 06 00 8b 45 c8 e8 3a 5c 00 00 85 c0 0f 85 ca 03 00 00 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 e0 2b 17 c2 e8 ba b3 06 00 <0f> 0b bb 01 00 00 00 e9 72 01 00 00 b8 b2 60 ce c1 e8 a0 13 85 00
[ 2.404353] EAX: c2172be0 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 2.405007] ESI: 00000000 EDI: ee4a3000 EBP: c2c25f78 ESP: c2c25f24
[ 2.405651] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
[ 2.406359] CR0: 80050033 CR2: 00000000 CR3: 023d9000 CR4: 000406b0
[ 2.407014] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 2.407640] DR6: fffe0ff0 DR7: 00000400
[ 2.408036] Call Trace:
[ 2.408316] <SOFTIRQ>
[ 2.408607] ? rcu_do_batch+0x256/0xcb5
[ 2.409013] rcu_core+0x2c2/0x546
[ 2.409357] ? __do_softirq+0x8d/0x513
[ 2.409676] rcu_core_si+0x8/0xa
[ 2.409676] __do_softirq+0xb9/0x513
[ 2.409676] ? __entry_text_end+0x6/0x6
[ 2.409676] call_on_stack+0x47/0x53
[ 2.409676] </SOFTIRQ>
[ 2.409676] ? irq_exit_rcu+0xd1/0xe4
[ 2.409676] ? sysvec_apic_timer_interrupt+0x22/0x31
[ 2.409676] ? handle_exception+0x119/0x119
[ 2.409676] ? acpi_os_release_object+0x8/0xc
[ 2.409676] ? sysvec_call_function_single+0x32/0x32
[ 2.409676] ? kmem_cache_free+0x95/0x6a4
[ 2.409676] ? sysvec_call_function_single+0x32/0x32
[ 2.409676] ? kmem_cache_free+0x95/0x6a4
[ 2.409676] ? acpi_ps_complete_final_op+0xe9/0x101
[ 2.409676] ? acpi_os_release_object+0x8/0xc
[ 2.409676] ? acpi_ut_delete_generic_state+0x13/0x16
[ 2.409676] ? acpi_ds_scope_stack_clear+0x1d/0x22
[ 2.409676] ? acpi_ps_parse_aml+0x15d/0x270
[ 2.409676] ? acpi_ps_execute_method+0x135/0x165
[ 2.409676] ? acpi_ns_evaluate+0x18a/0x211
[ 2.409676] ? acpi_evaluate_object+0x103/0x1e0
[ 2.409676] ? acpi_evaluate_integer+0x27/0x5b
[ 2.409676] ? acpi_match_device_ids+0x14/0x1d
[ 2.409676] ? acpi_device_is_battery+0x24/0x34
[ 2.409676] ? acpi_bus_get_status+0x4b/0x7f
[ 2.409676] ? acpi_add_single_object+0x364/0x5e6
[ 2.409676] ? up+0x4d/0x64
[ 2.409676] ? acpi_os_signal_semaphore+0x23/0x36
[ 2.409676] ? acpi_bus_check_add+0xcb/0x3d6
[ 2.409676] ? preempt_count_sub+0x80/0x172
[ 2.409676] ? up+0x4d/0x64
[ 2.409676] ? acpi_add_single_object+0x5e6/0x5e6
[ 2.409676] ? acpi_ns_walk_namespace+0xc6/0x171
[ 2.409676] ? acpi_walk_namespace+0x6f/0x99
[ 2.409676] ? acpi_add_single_object+0x5e6/0x5e6
[ 2.409676] ? acpi_sleep_proc_init+0x20/0x20
[ 2.409676] ? acpi_bus_scan+0x5c/0x65
[ 2.409676] ? acpi_add_single_object+0x5e6/0x5e6
[ 2.409676] ? acpi_scan_init+0xf3/0x200
[ 2.409676] ? acpi_init+0x25c/0x29f
[ 2.409676] ? do_one_initcall+0x6c/0x357
[ 2.409676] ? rcu_read_unlock_sched_notrace+0x27/0x38
[ 2.409676] ? trace_initcall_level+0xbb/0xc2
[ 2.409676] ? do_initcalls+0x88/0xd0
[ 2.409676] ? do_initcalls+0xad/0xd0
[ 2.409676] ? kernel_init_freeable+0x94/0xbe
[ 2.409676] ? rest_init+0x11d/0x11d
[ 2.409676] ? kernel_init+0x8/0xdf
[ 2.409676] ? ret_from_fork+0x1c/0x28
[ 2.409676] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc1-00165-ga15faaf2a0d1 #7
[ 2.409676] Call Trace:
[ 2.409676] <SOFTIRQ>
[ 2.409676] dump_stack+0x6d/0x8b
[ 2.409676] __warn.cold+0x24/0x4d
[ 2.409676] ? rcu_do_batch+0x56b/0xcb5
[ 2.409676] report_bug+0x9d/0xcb
[ 2.409676] ? exc_overflow+0x4c/0x4c
[ 2.409676] handle_bug+0x28/0x44
[ 2.409676] exc_invalid_op+0x1b/0x6c
[ 2.409676] ? check_preemption_disabled+0x43/0xe9
[ 2.409676] handle_exception+0x119/0x119
[ 2.409676] EIP: rcu_do_batch+0x56b/0xcb5
[ 2.409676] Code: 06 00 8b 45 c8 e8 3a 5c 00 00 85 c0 0f 85 ca 03 00 00 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 e0 2b 17 c2 e8 ba b3 06 00 <0f> 0b bb 01 00 00 00 e9 72 01 00 00 b8 b2 60 ce c1 e8 a0 13 85 00
[ 2.409676] EAX: c2172be0 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 2.409676] ESI: 00000000 EDI: ee4a3000 EBP: c2c25f78 ESP: c2c25f24
[ 2.409676] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010006
[ 2.409676] ? bit_xfer_atomic.cold+0x22/0x2d
[ 2.409676] ? exc_overflow+0x4c/0x4c
[ 2.409676] ? rcu_do_batch+0x56b/0xcb5
[ 2.409676] ? rcu_do_batch+0x256/0xcb5
[ 2.409676] rcu_core+0x2c2/0x546
[ 2.409676] ? __do_softirq+0x8d/0x513
[ 2.409676] rcu_core_si+0x8/0xa
[ 2.409676] __do_softirq+0xb9/0x513
[ 2.409676] ? __entry_text_end+0x6/0x6
[ 2.409676] call_on_stack+0x47/0x53
[ 2.409676] </SOFTIRQ>
[ 2.409676] ? irq_exit_rcu+0xd1/0xe4
[ 2.409676] ? sysvec_apic_timer_interrupt+0x22/0x31
[ 2.409676] ? handle_exception+0x119/0x119
[ 2.409676] ? acpi_os_release_object+0x8/0xc
[ 2.409676] ? sysvec_call_function_single+0x32/0x32
[ 2.409676] ? kmem_cache_free+0x95/0x6a4
[ 2.409676] ? sysvec_call_function_single+0x32/0x32
[ 2.409676] ? kmem_cache_free+0x95/0x6a4


To reproduce:

# build kernel
cd linux
cp config-5.10.0-rc1-00165-ga15faaf2a0d1 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Oliver Sang


Attachments:
(No filename) (6.99 kB)
config-5.10.0-rc1-00165-ga15faaf2a0d1 (126.60 kB)
job-script (4.85 kB)
dmesg.xz (13.16 kB)