I saw the following error when testing the latest nohz code on Power:
[ 85.295384] BUG: using smp_processor_id() in preemptible [00000000] code: rsyslogd/3493
[ 85.295396] caller is .tick_nohz_task_switch+0x1c/0xb8
[ 85.295402] Call Trace:
[ 85.295408] [c0000001fababab0] [c000000000012dc4] .show_stack+0x110/0x25c (unreliable)
[ 85.295420] [c0000001fababba0] [c0000000007c4b54] .dump_stack+0x20/0x30
[ 85.295430] [c0000001fababc10] [c00000000044eb74] .debug_smp_processor_id+0xf4/0x124
[ 85.295438] [c0000001fababca0] [c0000000000d7594] .tick_nohz_task_switch+0x1c/0xb8
[ 85.295447] [c0000001fababd20] [c0000000000b9748] .finish_task_switch+0x13c/0x160
[ 85.295455] [c0000001fababdb0] [c0000000000bbe50] .schedule_tail+0x50/0x124
[ 85.295463] [c0000001fababe30] [c000000000009dc8] .ret_from_fork+0x4/0x54
It seems to me that we could just use raw_smp_processor_id() here. Even
if the tick_nohz_full_cpu() check is done on a !nohz_full CPU and the
task is then moved to a nohz_full CPU, the context switch caused by
that migration would call tick_nohz_task_switch() again to re-evaluate
the need for the tick.
I don't know whether I missed something here.
Signed-off-by: Li Zhong <[email protected]>
---
kernel/time/tick-sched.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index da53c8f..0aa575b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -251,7 +251,7 @@ void tick_nohz_task_switch(struct task_struct *tsk)
 {
 	unsigned long flags;
 
-	if (!tick_nohz_full_cpu(smp_processor_id()))
+	if (!tick_nohz_full_cpu(raw_smp_processor_id()))
 		return;
 
 	local_irq_save(flags);
--
1.7.1
2013/4/27 Li Zhong <[email protected]>:
> It seems to me that we could just use raw_smp_processor_id() here. Even
> if the tick_nohz_full_cpu() check is done on a !nohz_full CPU and the
> task is then moved to a nohz_full CPU, the context switch caused by
> that migration would call tick_nohz_task_switch() again to re-evaluate
> the need for the tick.
>
> I don't know whether I missed something here.
You're right, it looks safe to do so. But I suggest we instead move the
test inside the local_irq_save()/local_irq_restore() section, to avoid
confusing reviewers.
Thanks!
On Sat, 2013-04-27 at 15:40 +0200, Frederic Weisbecker wrote:
> You're right, it looks safe to do so. But I suggest we instead move the
> test inside the local_irq_save()/local_irq_restore() section, to avoid
> confusing reviewers.
OK, I'll send an updated version that uses local_irq_save() to protect it.
I tried using raw_* because it seemed that it could avoid some unnecessary
irq disabling...
Thanks, Zhong
I saw the following error when testing the latest nohz code on Power:
[ 85.295384] BUG: using smp_processor_id() in preemptible [00000000] code: rsyslogd/3493
[ 85.295396] caller is .tick_nohz_task_switch+0x1c/0xb8
[ 85.295402] Call Trace:
[ 85.295408] [c0000001fababab0] [c000000000012dc4] .show_stack+0x110/0x25c (unreliable)
[ 85.295420] [c0000001fababba0] [c0000000007c4b54] .dump_stack+0x20/0x30
[ 85.295430] [c0000001fababc10] [c00000000044eb74] .debug_smp_processor_id+0xf4/0x124
[ 85.295438] [c0000001fababca0] [c0000000000d7594] .tick_nohz_task_switch+0x1c/0xb8
[ 85.295447] [c0000001fababd20] [c0000000000b9748] .finish_task_switch+0x13c/0x160
[ 85.295455] [c0000001fababdb0] [c0000000000bbe50] .schedule_tail+0x50/0x124
[ 85.295463] [c0000001fababe30] [c000000000009dc8] .ret_from_fork+0x4/0x54
The code below moves the test into the local_irq_save()/local_irq_restore()
section, so that smp_processor_id() is only called with interrupts
disabled, which avoids the above complaint.
Signed-off-by: Li Zhong <[email protected]>
---
kernel/time/tick-sched.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index da53c8f..1c9f53b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -251,14 +251,15 @@ void tick_nohz_task_switch(struct task_struct *tsk)
 {
 	unsigned long flags;
 
-	if (!tick_nohz_full_cpu(smp_processor_id()))
-		return;
-
 	local_irq_save(flags);
 
+	if (!tick_nohz_full_cpu(smp_processor_id()))
+		goto out;
+
 	if (tick_nohz_tick_stopped() && !can_stop_full_tick())
 		tick_nohz_full_kick();
 
+out:
 	local_irq_restore(flags);
 }
--
1.7.1
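
For readers following the thread, the difference between the three
variants discussed (the original code, the raw_smp_processor_id()
patch, and the local_irq_save() patch) can be sketched in a small
userspace model. This is only an illustration under stated assumptions:
every name below (model_cpu_id(), model_irq_save(), reports_for(), the
nohz_full_cpu[] table, and so on) is a hypothetical stand-in, not kernel
code, and the checked read is reduced to the two conditions relevant
here (preempt count and irq state); the kernel's real debug check has
further exemptions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only: every name below is a hypothetical stand-in,
 * not the kernel's real definition.
 */
static bool irqs_off;              /* models the local "interrupts disabled" state */
static int preempt_depth;          /* models preempt_count() */
static int current_cpu;            /* CPU the modelled task is running on */
static int bug_reports;            /* count of "BUG: using smp_processor_id()" reports */
static int kicks;                  /* count of modelled tick_nohz_full_kick() calls */
static bool tick_stopped = true;   /* models tick_nohz_tick_stopped() */
static bool can_stop_tick = false; /* models can_stop_full_tick() */
static const bool nohz_full_cpu[4] = { false, true, true, false };

/* Unchecked read, in the spirit of raw_smp_processor_id(). */
static int model_raw_cpu_id(void)
{
	return current_cpu;
}

/* Checked read: complain when the caller could be preempted or migrated. */
static int model_cpu_id(void)
{
	if (preempt_depth == 0 && !irqs_off)
		bug_reports++;
	return current_cpu;
}

static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_off;

	irqs_off = true;
	return flags;
}

static void model_irq_restore(unsigned long flags)
{
	irqs_off = (flags != 0);
}

static void kick_if_needed(void)
{
	if (tick_stopped && !can_stop_tick)
		kicks++;
}

/* Original code: checked read before interrupts are disabled. */
static void task_switch_original(void)
{
	unsigned long flags;

	if (!nohz_full_cpu[model_cpu_id()])
		return;

	flags = model_irq_save();
	kick_if_needed();
	model_irq_restore(flags);
}

/* First patch: unchecked read, still outside the protected section. */
static void task_switch_raw(void)
{
	unsigned long flags;

	if (!nohz_full_cpu[model_raw_cpu_id()])
		return;

	flags = model_irq_save();
	kick_if_needed();
	model_irq_restore(flags);
}

/* Second patch: checked read moved inside the protected section. */
static void task_switch_irq_protected(void)
{
	unsigned long flags = model_irq_save();

	if (!nohz_full_cpu[model_cpu_id()])
		goto out;

	kick_if_needed();
out:
	model_irq_restore(flags);
}

/* Run one variant from a preemptible context on `cpu`; return the report count. */
static int reports_for(void (*variant)(void), int cpu)
{
	irqs_off = false;
	preempt_depth = 0;
	bug_reports = 0;
	kicks = 0;
	current_cpu = cpu;

	variant();

	assert(!irqs_off); /* every variant must restore the interrupt state */
	return bug_reports;
}
```

Calling reports_for(task_switch_original, 1) counts one report in this
model, while the raw and irq-protected variants count none. The
irq-protected variant pays for a save/restore pair even on a !nohz_full
CPU, which is the cost the raw_* version was trying to avoid; in
exchange, the checked read is visibly done with interrupts disabled.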
Commit-ID: 6296ace467c8640317414ba589b124323806f7ce
Gitweb: http://git.kernel.org/tip/6296ace467c8640317414ba589b124323806f7ce
Author: Li Zhong <[email protected]>
AuthorDate: Sun, 28 Apr 2013 11:25:58 +0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 29 Apr 2013 13:17:33 +0200
nohz: Protect smp_processor_id() in tick_nohz_task_switch()
I saw following error when testing the latest nohz code on
Power:
[ 85.295384] BUG: using smp_processor_id() in preemptible [00000000] code: rsyslogd/3493
[ 85.295396] caller is .tick_nohz_task_switch+0x1c/0xb8
[ 85.295402] Call Trace:
[ 85.295408] [c0000001fababab0] [c000000000012dc4] .show_stack+0x110/0x25c (unreliable)
[ 85.295420] [c0000001fababba0] [c0000000007c4b54] .dump_stack+0x20/0x30
[ 85.295430] [c0000001fababc10] [c00000000044eb74] .debug_smp_processor_id+0xf4/0x124
[ 85.295438] [c0000001fababca0] [c0000000000d7594] .tick_nohz_task_switch+0x1c/0xb8
[ 85.295447] [c0000001fababd20] [c0000000000b9748] .finish_task_switch+0x13c/0x160
[ 85.295455] [c0000001fababdb0] [c0000000000bbe50] .schedule_tail+0x50/0x124
[ 85.295463] [c0000001fababe30] [c000000000009dc8] .ret_from_fork+0x4/0x54
The code below moves the test into local_irq_save/restore
section to avoid the above complaint.
Signed-off-by: Li Zhong <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul McKenney <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/time/tick-sched.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index da53c8f..1c9f53b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -251,14 +251,15 @@ void tick_nohz_task_switch(struct task_struct *tsk)
 {
 	unsigned long flags;
 
-	if (!tick_nohz_full_cpu(smp_processor_id()))
-		return;
-
 	local_irq_save(flags);
 
+	if (!tick_nohz_full_cpu(smp_processor_id()))
+		goto out;
+
 	if (tick_nohz_tick_stopped() && !can_stop_full_tick())
 		tick_nohz_full_kick();
 
+out:
 	local_irq_restore(flags);
 }