The below calltrace was reported, the cause is that timer is delayed
bigger than 3 seconds
Hardware name: New H3C Technologies Co.,Ltd. UniServer R4950 G3/RS41R4950, BIOS 2.00.06 V700R003
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xee/0x100
...
Call Trace:
process_one_work+0x18c/0x3a0
worker_thread+0x30/0x380
? process_one_work+0x3a0/0x3a0
kthread+0x113/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x40
---[ end trace 41bd884127493e39 ]---
then write a program to test timer latency, it can reproduce this issue
static void sched_l_tick(struct timer_list *t)
{
unsigned long delta = jiffies - set_time;
if (delta > 3*HZ)
printk("abnormal %ld %d\n", delta, raw_smp_processor_id());
set_time = jiffies+HZ;
mod_timer(t, jiffies + HZ);
}
further investigation shows jiffies maybe change when advence this base clk,
twice read of jiffies maybe lead to that base clk is bigger than truely next
event, and fire timer is skipped, so read jiffies once,
Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: Liang ZhiCheng <[email protected]>
---
kernel/time/timer.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 343c7ba33b1c..e2dbd0223635 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1593,24 +1593,27 @@ void timer_clear_idle(void)
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
+ unsigned long jnow;
+
+ jnow = READ_ONCE(jiffies);
/*
* NOHZ optimization. After a long idle sleep we need to forward the
* base to current jiffies. Avoid a loop by searching the bitfield for
* the next expiring timer.
*/
- if ((long)(jiffies - base->clk) > 2) {
+ if ((long)(jnow - base->clk) > 2) {
unsigned long next = __next_timer_interrupt(base);
/*
* If the next timer is ahead of time forward to current
* jiffies, otherwise forward to the next expiry time:
*/
- if (time_after(next, jiffies)) {
+ if (time_after(next, jnow)) {
/*
* The call site will increment base->clk and then
* terminate the expiry loop immediately.
*/
- base->clk = jiffies;
+ base->clk = jnow;
return 0;
}
base->clk = next;
--
2.16.2
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: e430d802d6a3aaf61bd3ed03d9404888a29b9bf9
Gitweb: https://git.kernel.org/tip/e430d802d6a3aaf61bd3ed03d9404888a29b9bf9
Author: Li RongQing <[email protected]>
AuthorDate: Thu, 19 Sep 2019 20:04:47 +08:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 19 Sep 2019 17:50:11 +02:00
timer: Read jiffies once when forwarding base clk
The timer delayed for more than 3 seconds warning was triggered during
testing.
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xee/0x100
...
Call Trace:
process_one_work+0x18c/0x3a0
worker_thread+0x30/0x380
kthread+0x113/0x130
ret_from_fork+0x22/0x40
The reason is that the code in collect_expired_timers() uses jiffies
unprotected:
if (next_event > jiffies)
base->clk = jiffies;
As the compiler is allowed to reload the value base->clk can advance
between the check and the store and in the worst case advance farther than
next event. That causes the timer expiry to be delayed until the wheel
pointer wraps around.
Convert the code to use READ_ONCE()
Fixes: 236968383cf5 ("timers: Optimize collect_expired_timers() for NOHZ")
Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: Liang ZhiCheng <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/time/timer.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 0e315a2..4820823 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1678,24 +1678,26 @@ void timer_clear_idle(void)
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
+ unsigned long now = READ_ONCE(jiffies);
+
/*
* NOHZ optimization. After a long idle sleep we need to forward the
* base to current jiffies. Avoid a loop by searching the bitfield for
* the next expiring timer.
*/
- if ((long)(jiffies - base->clk) > 2) {
+ if ((long)(now - base->clk) > 2) {
unsigned long next = __next_timer_interrupt(base);
/*
* If the next timer is ahead of time forward to current
* jiffies, otherwise forward to the next expiry time:
*/
- if (time_after(next, jiffies)) {
+ if (time_after(next, now)) {
/*
* The call site will increment base->clk and then
* terminate the expiry loop immediately.
*/
- base->clk = jiffies;
+ base->clk = now;
return 0;
}
base->clk = next;