An experimental hack to tease out whether we are continuing to
run the softirq handler past the point of needing scheduling.
It allows only one trip through __do_softirq() as long as need_resched()
is set which hopefully creates the back pressure needed to get ksoftirqd
scheduled.
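For reference, here is roughly where the check lands in __do_softirq()'s restart
loop (a heavily abbreviated sketch; accounting, tracing, and error handling are
omitted, and the actual one-line diff is further below):

asmlinkage void __do_softirq(void)
{
	struct softirq_action *h;
	__u32 pending = local_softirq_pending();
	int max_restart = MAX_SOFTIRQ_RESTART;	/* 10 */

restart:
	/* reset the pending bitmask before re-enabling irqs */
	set_softirq_pending(0);
	local_irq_enable();

	h = softirq_vec;
	do {
		if (pending & 1)
			h->action(h);	/* e.g. blk_done_softirq() */
		h++;
		pending >>= 1;
	} while (pending);

	local_irq_disable();
	pending = local_softirq_pending();

	/* proposed change: stop looping as soon as a reschedule is due */
	if (pending && --max_restart && !need_resched())
		goto restart;

	/* whatever is still pending gets handed to ksoftirqd */
	if (pending)
		wakeup_softirqd();
}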
Targeted to address reports like the following that are produced
with i/o tests to a sas domain with a large number of disks (48+), and
lots of debugging enabled (slub_debug, lockdep) that makes the
block+scsi softirq path more cpu-expensive than normal.
With this patch applied the softlockup detector seems appeased, but it
seems odd to need changes to kernel/softirq.c so maybe I have overlooked
something that needs changing at the block/scsi level?
BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
irq event stamp: 26260303
hardirqs last enabled at (26260302): [<ffffffff814becf4>] restore_args+0x0/0x30
hardirqs last disabled at (26260303): [<ffffffff814c60ee>] apic_timer_interrupt+0x6e/0x80
softirqs last enabled at (26220386): [<ffffffff81033edd>] __do_softirq+0x1ae/0x1bd
softirqs last disabled at (26220665): [<ffffffff814c696c>] call_softirq+0x1c/0x26
CPU 3
Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
Pid: 78, comm: kworker/3:1 Not tainted 3.3.0-rc3-7ada1dd-isci-3.0.183+ #1 Intel Corporation ROSECITY/ROSECITY
RIP: 0010:[<ffffffff814be8b6>] [<ffffffff814be8b6>] _raw_spin_unlock_irq+0x34/0x4b
RSP: 0000:ffff8800bb8c3c50 EFLAGS: 00000202
RAX: ffff8800375f3ec0 RBX: ffffffff814becf4 RCX: ffff8800bb8c3c00
RDX: 0000000000000001 RSI: ffff880035bbc348 RDI: ffff8800375f4588
RBP: ffff8800bb8c3c60 R08: 0000000000000000 R09: ffff880035aed150
R10: 0000000000018f3b R11: ffff8800bb8c39e0 R12: ffff8800bb8c3bc8
R13: ffffffff814c60f3 R14: ffff8800bb8c3c60 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000f2e028 CR3: 00000000b11b3000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/3:1 (pid: 78, threadinfo ffff8800377d2000, task ffff8800375f3ec0)
Stack:
ffff88003555f800 ffff88003555f800 ffff8800bb8c3cc0 ffffffffa00512c4
ffffffff814be8b2 ffff880035dfc000 ffff880035dfe000 000000000553a265
ffff8800bb8c3cb0 ffff88003555f800 ffff8800b20af200 ffff880035dfe000
Call Trace:
<IRQ>
[<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
[<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
[<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
[<ffffffff813317a8>] ? spin_lock+0x9/0xb
[<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
[<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
[<ffffffff8124b869>] blk_run_queue+0x26/0x3a
[<ffffffff813319a5>] scsi_run_queue+0x1fb/0x20a
[<ffffffff81332136>] scsi_next_command+0x3b/0x4c
[<ffffffff81332b66>] scsi_io_completion+0x205/0x44f
[<ffffffff813316b8>] ? spin_unlock_irqrestore+0x9/0xb
[<ffffffff8132b3ab>] scsi_finish_command+0xeb/0xf4
[<ffffffff81333a04>] scsi_softirq_done+0x112/0x11b
[<ffffffff812540ac>] blk_done_softirq+0x7e/0x96
[<ffffffff81033e0c>] __do_softirq+0xdd/0x1bd
[<ffffffff814c696c>] call_softirq+0x1c/0x26
[<ffffffff81003ce6>] do_softirq+0x4b/0xa5
[<ffffffff81034916>] irq_exit+0x55/0xc2
[<ffffffff814c6a9c>] smp_apic_timer_interrupt+0x7c/0x8a
[<ffffffff814c60f3>] apic_timer_interrupt+0x73/0x80
<EOI>
[<ffffffff814be8b6>] ? _raw_spin_unlock_irq+0x34/0x4b
[<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
[<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
[<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
[<ffffffff813317a8>] ? spin_lock+0x9/0xb
[<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
[<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
[<ffffffff8125f3bb>] cfq_kick_queue+0x2f/0x41
[<ffffffff8104462e>] process_one_work+0x1c8/0x336
[<ffffffff81044599>] ? process_one_work+0x133/0x336
[<ffffffff81044306>] ? spin_lock_irq+0x9/0xb
[<ffffffff8125f38c>] ? cfq_init_queue+0x2a3/0x2a3
[<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
[<ffffffff81046085>] worker_thread+0xac/0x151
[<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
[<ffffffff8104a618>] kthread+0x8a/0x92
[<ffffffff8107654e>] ? trace_hardirqs_on_caller+0x16/0x16d
[<ffffffff814c6874>] kernel_thread_helper+0x4/0x10
[<ffffffff814becf4>] ? retint_restore_args+0x13/0x13
[<ffffffff8104a58e>] ? kthread_create_on_node+0x14d/0x14d
[<ffffffff814c6870>] ? gs_change+0x13/0x13
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: James Bottomley <[email protected]>
Reported-by: Lukasz Dorau <[email protected]>
Reported-by: Andrzej Jakowski <[email protected]>
Not-yet-signed-off-by: Dan Williams <[email protected]>
---
kernel/softirq.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 4eb3a0f..82a3f43 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -255,7 +255,7 @@ restart:
local_irq_disable();
pending = local_softirq_pending();
- if (pending && --max_restart)
+ if (pending && --max_restart && !need_resched())
goto restart;
if (pending)
On Mon, Feb 27, 2012 at 12:38:47PM -0800, Dan Williams wrote:
> An experimental hack to tease out whether we are continuing to
> run the softirq handler past the point of needing scheduling.
>
> It allows only one trip through __do_softirq() as long as need_resched()
> is set which hopefully creates the back pressure needed to get ksoftirqd
> scheduled.
>
> Targeted to address reports like the following that are produced
> with i/o tests to a sas domain with a large number of disks (48+), and
> lots of debugging enabled (slub_debug, lockdep) that makes the
> block+scsi softirq path more cpu-expensive than normal.
>
> With this patch applied the softlockup detector seems appeased, but it
> seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> something that needs changing at the block/scsi level?
But being stuck in softirq for 22s still seems odd.
I guess the reason your patch works is that softirq returns before
handling BLOCK_SOFTIRQ, but who knows, that is just a guess.
Does the kernel command line option 'threadirqs' solve your issue?
Thanks,
Yong
>
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
> Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
> irq event stamp: 26260303
> hardirqs last enabled at (26260302): [<ffffffff814becf4>] restore_args+0x0/0x30
> hardirqs last disabled at (26260303): [<ffffffff814c60ee>] apic_timer_interrupt+0x6e/0x80
> softirqs last enabled at (26220386): [<ffffffff81033edd>] __do_softirq+0x1ae/0x1bd
> softirqs last disabled at (26220665): [<ffffffff814c696c>] call_softirq+0x1c/0x26
> CPU 3
> Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
>
> Pid: 78, comm: kworker/3:1 Not tainted 3.3.0-rc3-7ada1dd-isci-3.0.183+ #1 Intel Corporation ROSECITY/ROSECITY
> RIP: 0010:[<ffffffff814be8b6>] [<ffffffff814be8b6>] _raw_spin_unlock_irq+0x34/0x4b
> RSP: 0000:ffff8800bb8c3c50 EFLAGS: 00000202
> RAX: ffff8800375f3ec0 RBX: ffffffff814becf4 RCX: ffff8800bb8c3c00
> RDX: 0000000000000001 RSI: ffff880035bbc348 RDI: ffff8800375f4588
> RBP: ffff8800bb8c3c60 R08: 0000000000000000 R09: ffff880035aed150
> R10: 0000000000018f3b R11: ffff8800bb8c39e0 R12: ffff8800bb8c3bc8
> R13: ffffffff814c60f3 R14: ffff8800bb8c3c60 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000f2e028 CR3: 00000000b11b3000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kworker/3:1 (pid: 78, threadinfo ffff8800377d2000, task ffff8800375f3ec0)
> Stack:
> ffff88003555f800 ffff88003555f800 ffff8800bb8c3cc0 ffffffffa00512c4
> ffffffff814be8b2 ffff880035dfc000 ffff880035dfe000 000000000553a265
> ffff8800bb8c3cb0 ffff88003555f800 ffff8800b20af200 ffff880035dfe000
> Call Trace:
> <IRQ>
> [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
> [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
> [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
> [<ffffffff813317a8>] ? spin_lock+0x9/0xb
> [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
> [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
> [<ffffffff8124b869>] blk_run_queue+0x26/0x3a
> [<ffffffff813319a5>] scsi_run_queue+0x1fb/0x20a
> [<ffffffff81332136>] scsi_next_command+0x3b/0x4c
> [<ffffffff81332b66>] scsi_io_completion+0x205/0x44f
> [<ffffffff813316b8>] ? spin_unlock_irqrestore+0x9/0xb
> [<ffffffff8132b3ab>] scsi_finish_command+0xeb/0xf4
> [<ffffffff81333a04>] scsi_softirq_done+0x112/0x11b
> [<ffffffff812540ac>] blk_done_softirq+0x7e/0x96
> [<ffffffff81033e0c>] __do_softirq+0xdd/0x1bd
> [<ffffffff814c696c>] call_softirq+0x1c/0x26
> [<ffffffff81003ce6>] do_softirq+0x4b/0xa5
> [<ffffffff81034916>] irq_exit+0x55/0xc2
> [<ffffffff814c6a9c>] smp_apic_timer_interrupt+0x7c/0x8a
> [<ffffffff814c60f3>] apic_timer_interrupt+0x73/0x80
> <EOI>
> [<ffffffff814be8b6>] ? _raw_spin_unlock_irq+0x34/0x4b
> [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
> [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
> [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
> [<ffffffff813317a8>] ? spin_lock+0x9/0xb
> [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
> [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
> [<ffffffff8125f3bb>] cfq_kick_queue+0x2f/0x41
> [<ffffffff8104462e>] process_one_work+0x1c8/0x336
> [<ffffffff81044599>] ? process_one_work+0x133/0x336
> [<ffffffff81044306>] ? spin_lock_irq+0x9/0xb
> [<ffffffff8125f38c>] ? cfq_init_queue+0x2a3/0x2a3
> [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
> [<ffffffff81046085>] worker_thread+0xac/0x151
> [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
> [<ffffffff8104a618>] kthread+0x8a/0x92
> [<ffffffff8107654e>] ? trace_hardirqs_on_caller+0x16/0x16d
> [<ffffffff814c6874>] kernel_thread_helper+0x4/0x10
> [<ffffffff814becf4>] ? retint_restore_args+0x13/0x13
> [<ffffffff8104a58e>] ? kthread_create_on_node+0x14d/0x14d
> [<ffffffff814c6870>] ? gs_change+0x13/0x13
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: James Bottomley <[email protected]>
> Reported-by: Lukasz Dorau <[email protected]>
> Reported-by: Andrzej Jakowski <[email protected]>
> Not-yet-signed-off-by: Dan Williams <[email protected]>
> ---
> kernel/softirq.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 4eb3a0f..82a3f43 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -255,7 +255,7 @@ restart:
> local_irq_disable();
>
> pending = local_softirq_pending();
> - if (pending && --max_restart)
> + if (pending && --max_restart && !need_resched())
> goto restart;
>
> if (pending)
>
--
Only stand for myself
On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
> An experimental hack to tease out whether we are continuing to
> run the softirq handler past the point of needing scheduling.
>
> It allows only one trip through __do_softirq() as long as need_resched()
> is set which hopefully creates the back pressure needed to get ksoftirqd
> scheduled.
>
> Targeted to address reports like the following that are produced
> with i/o tests to a sas domain with a large number of disks (48+), and
> > lots of debugging enabled (slub_debug, lockdep) that makes the
> block+scsi softirq path more cpu-expensive than normal.
>
> With this patch applied the softlockup detector seems appeased, but it
> seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> something that needs changing at the block/scsi level?
>
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
So you're stuck in softirq for 22s+, max_restart is 10, this gives that
on average you spend 2.2s+ per softirq invocation, this is completely
absolutely bonkers. Softirq handlers should never consume significant
amount of cpu-time.
Thomas, think it's about time we put something like the below in?
---
kernel/softirq.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index ff066a4..6137ee1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -210,6 +210,7 @@ asmlinkage void __do_softirq(void)
__u32 pending;
int max_restart = MAX_SOFTIRQ_RESTART;
int cpu;
+ u64 start, callback, now;
pending = local_softirq_pending();
account_system_vtime(current);
@@ -223,6 +224,8 @@ asmlinkage void __do_softirq(void)
/* Reset the pending bitmask before enabling irqs */
set_softirq_pending(0);
+ start = callback = cpu_clock(cpu);
+
local_irq_enable();
h = softirq_vec;
@@ -246,6 +249,15 @@ asmlinkage void __do_softirq(void)
preempt_count() = prev_count;
}
+ now = cpu_clock(cpu);
+ if (now - callback > TICK_NSEC / 4) {
+ printk(KERN_ERR "softirq took longer than 1/4 tick: "
+ "%u %s %p\n", vec_nr,
+ softirq_to_name[vec_nr],
+ h->action);
+ }
+ callback = now;
+
rcu_bh_qs(cpu);
}
h++;
@@ -254,6 +266,10 @@ asmlinkage void __do_softirq(void)
local_irq_disable();
+ now = cpu_clock(cpu);
+ if (now - start > TICK_NSEC / 2)
+ printk(KERN_ERR "softirq loop took longer than 1/2 tick\n");
+
pending = local_softirq_pending();
if (pending && --max_restart)
goto restart;
On Tue, Feb 28, 2012 at 1:48 AM, Peter Zijlstra <[email protected]> wrote:
>> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>
> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
> on average you spend 2.2s+ per softirq invocation, this is completely
> absolutely bonkers. Softirq handlers should never consume significant
> amount of cpu-time.
I'm not sure whether this was just one __do_softirq() invocation, or
multiple. I'll see if I can instrument this, but the suspicion is
that max_restart can't help when softirq is constantly re-triggered.
I'll also add the below patch to see if any single invocation is
violating these expectations.
--
Dan
On Tue, 28 Feb 2012, Peter Zijlstra wrote:
> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
> > An experimental hack to tease out whether we are continuing to
> > run the softirq handler past the point of needing scheduling.
> >
> > It allows only one trip through __do_softirq() as long as need_resched()
> > is set which hopefully creates the back pressure needed to get ksoftirqd
> > scheduled.
> >
> > Targeted to address reports like the following that are produced
> > with i/o tests to a sas domain with a large number of disks (48+), and
> > lots of debugging enabled (slub_debug, lockdep) that makes the
> > block+scsi softirq path more cpu-expensive than normal.
> >
> > With this patch applied the softlockup detector seems appeased, but it
> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> > something that needs changing at the block/scsi level?
> >
> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>
> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
> on average you spend 2.2s+ per softirq invocation, this is completely
> absolutely bonkers. Softirq handlers should never consume significant
> amount of cpu-time.
>
> Thomas, think it's about time we put something like the below in?
Absolutely. Anything which consumes more than a few microseconds in
the softirq handler needs to be sorted out, no matter what.
On Tue, Feb 28, 2012 at 1:41 PM, Thomas Gleixner <[email protected]> wrote:
> On Tue, 28 Feb 2012, Peter Zijlstra wrote:
>
>> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
>> > An experimental hack to tease out whether we are continuing to
>> > run the softirq handler past the point of needing scheduling.
>> >
>> > It allows only one trip through __do_softirq() as long as need_resched()
>> > is set which hopefully creates the back pressure needed to get ksoftirqd
>> > scheduled.
>> >
>> > Targeted to address reports like the following that are produced
>> > with i/o tests to a sas domain with a large number of disks (48+), and
>> > lots of debugging enabled (slub_debug, lockdep) that makes the
>> > block+scsi softirq path more cpu-expensive than normal.
>> >
>> > With this patch applied the softlockup detector seems appeased, but it
>> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
>> > something that needs changing at the block/scsi level?
>> >
>> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>>
>> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
>> on average you spend 2.2s+ per softirq invocation, this is completely
>> absolutely bonkers. Softirq handlers should never consume significant
>> amount of cpu-time.
>>
>> Thomas, think it's about time we put something like the below in?
>
> Absolutely. Anything which consumes more than a few microseconds in
> the softirq handler needs to be sorted out, no matter what.
Looks like everyone is guilty:
[ 422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
...
[ 423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
[ 423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
[ 423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
[ 424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
[ 424.008691] softirq loop took longer than 1/2 tick need_resched:
yes max_restart: 10
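For reference, the 1/2-tick check was extended locally to also report
need_resched() and the remaining max_restart; roughly this (a sketch of the
extra print, inferred from the output above rather than an exact hunk):

	now = cpu_clock(cpu);
	if (now - start > TICK_NSEC / 2)
		printk(KERN_ERR "softirq loop took longer than 1/2 tick "
		       "need_resched: %s max_restart: %d\n",
		       need_resched() ? "yes" : "no", max_restart);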
As expected, whenever that 1/2 tick message gets emitted the softirq
handler is almost always running in a need_resched() context.
$ grep need_resched.*no log | wc -l
295
$ grep need_resched.*yes log | wc -l
3201
One of these warning messages gets printed out at a rate of 1 every
40ms. (468 second log w/ 11,725 of these messages).
So is it a good idea to get more aggressive about scheduling ksoftirqd?
--
Dan
On Tue, Feb 28, 2012 at 2:16 PM, Dan Williams <[email protected]> wrote:
> On Tue, Feb 28, 2012 at 1:41 PM, Thomas Gleixner <[email protected]> wrote:
>> On Tue, 28 Feb 2012, Peter Zijlstra wrote:
>>
>>> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
>>> > An experimental hack to tease out whether we are continuing to
>>> > run the softirq handler past the point of needing scheduling.
>>> >
>>> > It allows only one trip through __do_softirq() as long as need_resched()
>>> > is set which hopefully creates the back pressure needed to get ksoftirqd
>>> > scheduled.
>>> >
>>> > Targeted to address reports like the following that are produced
>>> > with i/o tests to a sas domain with a large number of disks (48+), and
>>> > lots of debugging enabled (slub_debug, lockdep) that makes the
>>> > block+scsi softirq path more cpu-expensive than normal.
>>> >
>>> > With this patch applied the softlockup detector seems appeased, but it
>>> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
>>> > something that needs changing at the block/scsi level?
>>> >
>>> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>>>
>>> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
>>> on average you spend 2.2s+ per softirq invocation, this is completely
>>> absolutely bonkers. Softirq handlers should never consume significant
>>> amount of cpu-time.
>>>
>>> Thomas, think it's about time we put something like the below in?
>>
>> Absolutely. Anything which consumes more than a few microseconds in
>> the softirq handler needs to be sorted out, no matter what.
>
> Looks like everyone is guilty:
>
> [ 422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> ...
> [ 423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> [ 423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> [ 423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> [ 424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> [ 424.008691] softirq loop took longer than 1/2 tick need_resched:
> yes max_restart: 10
>
> As expected, whenever that 1/2 tick message gets emitted the softirq
> handler is almost always running in a need_resched() context.
>
> $ grep need_resched.*no log | wc -l
> 295
> $ grep need_resched.*yes log | wc -l
> 3201
>
> One of these warning messages gets printed out at a rate of 1 every
> 40ms. (468 second log w/ 11,725 of these messages).
>
> So is it a good idea to get more aggressive about scheduling ksoftirqd?
>
Keep in mind this is with slub_debug and lockdep turned on, but we are
basically looking to do extended i/o runs on large topologies and want
the soft lockup detector to not generate false positives.
On Tue, 2012-02-28 at 14:16 -0800, Dan Williams wrote:
> Looks like everyone is guilty:
>
> [ 422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> ...
> [ 423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> [ 423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> [ 423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> [ 424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> [ 424.008691] softirq loop took longer than 1/2 tick need_resched:
/me kicks himself for not printing the actual duration.. :-)
> As expected, whenever that 1/2 tick message gets emitted the softirq
> handler is almost always running in a need_resched() context.
Yeah.. that's quite expected.
> So is it a good idea to get more aggressive about scheduling ksoftirqd?
Nah, moving away from softirq more like. I'll put moving the
load-balancer into a kthread on the todo list. And it looks like
everybody else should move to kthreads too.
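Roughly the pattern being suggested, as a generic illustrative sketch (all names
here are made up, this is not from any posted patch): the irq handler queues the
completion and calls wake_up_process(), and the work then runs in a normal,
preemptible, scheduler-visible kthread instead of softirq context:

struct my_work_list {			/* made-up per-cpu completion list */
	struct list_head list;
	spinlock_t lock;
};

static int deferred_work_thread(void *arg)
{
	struct my_work_list *q = arg;

	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (list_empty(&q->list))
			schedule();	/* sleep until the irq handler wakes us */
		__set_current_state(TASK_RUNNING);

		process_deferred_work(q);	/* made-up; runs fully preemptible */
	}
	return 0;
}

/* started once per cpu at init time, e.g.:
 *	kthread_run(deferred_work_thread, q, "blkdone/%d", cpu);
 */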
On Wed, Feb 29, 2012 at 1:17 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2012-02-28 at 14:16 -0800, Dan Williams wrote:
>> Looks like everyone is guilty:
>>
>> [ 422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
>> ...
>> [ 423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
>> [ 423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
>> [ 423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
>> [ 424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
>> [ 424.008691] softirq loop took longer than 1/2 tick need_resched:
>
> /me kicks himself for not printing the actual duration.. :-)
>
>> As expected, whenever that 1/2 tick message gets emitted the softirq
>> handler is almost always running in a need_resched() context.
>
> Yeah.. that's quite expected.
>
>> So is it a good idea to get more aggressive about scheduling ksoftirqd?
>
> Nah, moving away from softirq more like. I'll put moving the
> load-balancer into a kthread on the todo list. And it looks like
> everybody else should move to kthreads too.
I seem to recall this running into resistance (but maybe things have
changed in the last few years)?
https://lkml.org/lkml/2007/6/29/155