2006-08-10 11:05:11

by Jens Axboe

Subject: softirq considered harmful

Hi,

Ok maybe that is a little too strong, but I am indeed seeing some very
sucky behaviour with softirqs here. The block layer uses it for doing
request completions, and several cases in the past months have made me
suspect it was not doing too well. So recently I added a bit of timing debug
info, noting how long it would take from when a request goes on the
per-CPU completion list and raise_softirq_irqoff() is called, to when
the softirq handler was actually called. I logged the 10 slowest times.
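
The debug is nothing fancy, roughly along the lines of the sketch below
(simplified; the struct and the record_slowest() helper are made-up
names for illustration, not the actual patch):

struct completion_debug {
        struct list_head donelist;
        unsigned long long stamp;       /* sched_clock() at queue time */
};

/* each CPU's list is INIT_LIST_HEAD'ed at boot (elided) */
static DEFINE_PER_CPU(struct list_head, done_list);
static unsigned long long slowest[10];  /* ten worst deltas, in usecs */

/* irq context: stamp the request and queue it for the softirq */
static void queue_completion(struct completion_debug *c)
{
        unsigned long flags;

        local_irq_save(flags);
        c->stamp = sched_clock();
        list_add_tail(&c->donelist, &__get_cpu_var(done_list));
        raise_softirq_irqoff(BLOCK_SOFTIRQ);
        local_irq_restore(flags);
}

/* softirq context: measure how long the entry sat on the list */
static void complete_one(struct completion_debug *c)
{
        unsigned long long delta = (sched_clock() - c->stamp) / 1000;

        record_slowest(slowest, delta); /* made-up helper */
        /* ...normal completion work follows... */
}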

The results were very alarming. Doing a simple kernel untar (6 of them)
or a tiotest run, this is what I got (times in microseconds):

centera:/sys/block/sdc/queue # cat complete
7790, 10119, 9711, 7830, 8678, 8171, 9364, 7629, 9033, 8383

so within a few minutes, we've seen 8-10 msec delays! From the IO point
of view, this is time wasted where we could have notified the IO
producer that their work was done and put the disk to other use.

Then I thought perhaps this was due to punting to ksoftirqd (which runs
at nice 19), so I tried to renice ksoftirqdX. Initially the results
seemed better, but they ended up looking like this:

centera:/sys/block/sdc/queue # cat complete
10854, 6091, 4514, 8638, 9346, 9518, 8528, 5629, 4016, 3583

So probably the same results, though I'm not sure yet how much they
differ. I'm not quite sure what is going on here yet, so I thought I'd
post a note and see if anyone has clues or has seen something similar.
One final clue is that run_timer_softirq() sometimes takes way too long.
This was noted with some debug code checking how long a handler took,
and dumping the offender if it took longer than 2 msecs:

run_timer_softirq+0x0/0x18e: took 3750
run_timer_softirq+0x0/0x18e: took 2595
run_timer_softirq+0x0/0x18e: took 6265
run_timer_softirq+0x0/0x18e: took 2608

So from 2.6 to 6.2 msecs for just that handler, ouch. During normal
running, the 2.6 msec variant triggers quite often.
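
For reference, the check is just bracketing each handler call with
timestamps, something like this sketch (not the exact debug patch),
inside the __do_softirq() handler loop, where h is the current
softirq_action:

        unsigned long long start = sched_clock(), took;

        h->action(h);

        took = (sched_clock() - start) / 1000;  /* usecs */
        if (took > 2000) {
                print_symbol("%s", (unsigned long)h->action);
                printk(": took %llu\n", took);
        }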

--
Jens Axboe


2006-08-10 11:41:26

by David Miller

Subject: Re: softirq considered harmful

From: Jens Axboe <[email protected]>
Date: Thu, 10 Aug 2006 13:06:27 +0200

> run_timer_softirq+0x0/0x18e: took 3750
> run_timer_softirq+0x0/0x18e: took 2595
> run_timer_softirq+0x0/0x18e: took 6265
> run_timer_softirq+0x0/0x18e: took 2608
>
> So from 2.6 to 6.2 msecs for just that handler, ouch. During normal
> running, the 2.6 msec variant triggers quite often.

It would be interesting to know what timers ran when
the overhead got this high.

You can probably track this with a per-cpu array
of pointers: have run_timer_softirq() record the
t->func pointers into the array as it runs the
current slew of timers, then, if the "took" value
is very large, dump the array.
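
Something like this, say (just a sketch, sizes and names arbitrary):

#define TRACK_MAX 64

static DEFINE_PER_CPU(void *, ran_fns[TRACK_MAX]);
static DEFINE_PER_CPU(int, ran_count);

/* in run_timer_softirq(), for each timer about to be run */
static inline void track_timer(struct timer_list *t)
{
        int idx = __get_cpu_var(ran_count)++;

        if (idx < TRACK_MAX)
                __get_cpu_var(ran_fns)[idx] = t->function;
}

/* after the pass, if the "took" value crossed the threshold */
static void dump_timers(void)
{
        int i, n = min(__get_cpu_var(ran_count), TRACK_MAX);

        for (i = 0; i < n; i++)
                print_symbol("ran: %s\n",
                             (unsigned long)__get_cpu_var(ran_fns)[i]);
        __get_cpu_var(ran_count) = 0;
}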

2006-08-10 12:24:55

by Jens Axboe

Subject: Re: softirq considered harmful

On Thu, Aug 10 2006, David Miller wrote:
> From: Jens Axboe <[email protected]>
> Date: Thu, 10 Aug 2006 13:06:27 +0200
>
> > run_timer_softirq+0x0/0x18e: took 3750
> > run_timer_softirq+0x0/0x18e: took 2595
> > run_timer_softirq+0x0/0x18e: took 6265
> > run_timer_softirq+0x0/0x18e: took 2608
> >
> > So from 2.6 to 6.2 msecs for just that handler, ouch. During normal
> > running, the 2.6 msec variant triggers quite often.
>
> It would be interesting to know what timers ran when
> the overhead got this high.
>
> You can probably track this with a per-cpu array
> of pointers: have run_timer_softirq() record the
> t->func pointers into the array as it runs the
> current slew of timers, then, if the "took" value
> is very large, dump the array.

I did that, and it got me nowhere; no timers are to blame. Trying to
reproduce the reported slowdowns, I noticed that these tests were run
with full preemption enabled. Running the exact same test that showed:

centera:/sys/block/sdc/queue # cat complete
8596, 8148, 6833, 7768, 11332, 7781, 8652, 6956, 6716, 13710

before (almost 14 msecs!!), now, with preemption disabled, shows:

centera:/sys/block/sdc/queue # cat complete
172, 138, 125, 152, 122, 179, 157, 155, 142, 155

which is a hell of a lot more reasonable. Preempting the IO completion
is, well, shall we say pretty suboptimal. So much for improving latency
:-)

--
Jens Axboe

2006-08-12 23:29:03

by Andrew Morton

Subject: Re: softirq considered harmful

On Thu, 10 Aug 2006 13:06:27 +0200
Jens Axboe <[email protected]> wrote:

> Ok maybe that is a little too strong, but I am indeed seeing some very
> sucky behaviour with softirqs here. The block layer uses it for doing
> request completions

I wasn't even aware that this change had been made. I don't recall (and I
cannot locate) any mailing list discussion of it.

Maybe I missed the discussion. But if not, this is yet another case of
significant changes getting into mainline via a git merge and sneaking
under everyone's radar.


It seems like a bad idea to me. Any additional latency at all in disk
completion adds directly onto process waiting time - we do a _lot_ of
synchronous disk IO.

There is no mention in the changelog of any observed problems which this
patch solves. Can we revert it please?

2006-08-13 00:43:08

by David Miller

Subject: Re: softirq considered harmful

From: Andrew Morton <[email protected]>
Date: Sat, 12 Aug 2006 16:28:57 -0700

> Maybe I missed the discussion. But if not, this is yet another case of
> significant changes getting into mainline via a git merge and sneaking
> under everyone's radar.

SCSI has been doing command completions via a per-cpu softirq handler
for as long as we've had an SMP implementation more advanced than
lock_kernel() :-)

2006-08-13 00:45:58

by Andrew Morton

Subject: Re: softirq considered harmful

On Sat, 12 Aug 2006 17:43:24 -0700 (PDT)
David Miller <[email protected]> wrote:

> From: Andrew Morton <[email protected]>
> Date: Sat, 12 Aug 2006 16:28:57 -0700
>
> > Maybe I missed the discussion. But if not, this is yet another case of
> > significant changes getting into mainline via a git merge and sneaking
> > under everyone's radar.
>
> SCSI has been doing command completions via a per-cpu softirq handler
> for as long as we've had an SMP implementation more advanced than
> lock_kernel() :-)

Is that also adding 150 usecs to each IO operation?

2006-08-13 01:09:26

by David Miller

Subject: Re: softirq considered harmful

From: Andrew Morton <[email protected]>
Date: Sat, 12 Aug 2006 17:45:49 -0700

> Is that also adding 150 usecs to each IO operation?

I have no idea; Jens hasn't done enough to narrow down the true cause
of the latencies he is seeing. So pinning it on anything specific is
highly premature at this stage.

My point was merely to encourage you to find out the facts before
tossing accusations around. :-) I/O completions via softirqs aren't
some new thing in general, as the SCSI example shows, and git didn't
even exist when the SCSI command-completion-via-softirq changes went
in.

Heck I think even Eric Youngdale was still an active developer of our
SCSI stack back then, that's a long time ago! :-)

2006-08-13 01:22:42

by Andrew Morton

Subject: Re: softirq considered harmful

On Sat, 12 Aug 2006 18:09:44 -0700 (PDT)
David Miller <[email protected]> wrote:

> From: Andrew Morton <[email protected]>
> Date: Sat, 12 Aug 2006 17:45:49 -0700
>
> > Is that also adding 150 usecs to each IO operation?
>
> I have no idea; Jens hasn't done enough to narrow down the true cause
> of the latencies he is seeing. So pinning it on anything specific is
> highly premature at this stage.

Determining whether pre-conversion scsi was impacted in the same manner
would be part of that pinpointing process.

Deferring to softirq _has_ to add latency, and any latency addition in
synchronous disk IO is very bad. That being said, 150 usecs per request
is so bad that I'd suspect it's not affecting most people, else we'd
have heard.

> My point was merely to encourage you to find out the facts before
> tossing accusations around. :-)

No, your point was that slotting this change into mainline without telling
anyone was OK because SCSI has been doing something similar.

2006-08-13 14:31:37

by Jens Axboe

Subject: Re: softirq considered harmful

On Sat, Aug 12 2006, Andrew Morton wrote:
> On Thu, 10 Aug 2006 13:06:27 +0200
> Jens Axboe <[email protected]> wrote:
>
> > Ok maybe that is a little too strong, but I am indeed seeing some very
> > sucky behaviour with softirqs here. The block layer uses it for doing
> > request completions
>
> I wasn't even aware that this change had been made. I don't recall (and I
> cannot locate) any mailing list discussion of it.
>
> Maybe I missed the discussion. But if not, this is yet another case of
> significant changes getting into mainline via a git merge and sneaking
> under everyone's radar.

It's not a significant change; it's pretty much just a code relocation.
The softirq completion stuff was made generic, with SCSI as the primary
user of it. The completion path didn't change for SCSI.

> It seems like a bad idea to me. Any additional latency at all in disk
> completion adds directly onto process waiting time - we do a _lot_ of
> synchronous disk IO.

Doesn't seem like a good idea to me either, hence I'm investigating the
current possible problems with it...

> There is no mention in the changelog of any observed problems which this
> patch solves. Can we revert it please?

As you should be able to see now, reverting it won't change anything.
The problem would be the same.

--
Jens Axboe

2006-08-14 07:33:11

by Jens Axboe

Subject: Re: softirq considered harmful

On Sat, Aug 12 2006, Andrew Morton wrote:
> On Sat, 12 Aug 2006 17:43:24 -0700 (PDT)
> David Miller <[email protected]> wrote:
>
> > From: Andrew Morton <[email protected]>
> > Date: Sat, 12 Aug 2006 16:28:57 -0700
> >
> > > Maybe I missed the discussion. But if not, this is yet another case of
> > > significant changes getting into mainline via a git merge and sneaking
> > > under everyone's radar.
> >
> > SCSI has been doing command completions via a per-cpu softirq handler
> > for as long as we've had an SMP implementation more advanced than
> > lock_kernel() :-)
>
> Is that also adding 150 usecs to each IO operation?

It is, it's the identical mechanism. SCSI used to do completions via
tasklets; it was converted to softirqs a long time ago, but to my
knowledge no one ever did timings on it... From the few timings I
showed, 150 usecs is a _best_ case time on my hardware. 10 msecs was
seen as well, which is just bad beyond description.

My suggestion (I'll code this up) is that we scrap the softirq
completion and just do it from the irq event. The typical completion
doesn't even need to grab any locks.
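
Conceptually it's just collapsing the deferral, something along these
lines (a sketch only; get_completed_req() stands in for the driver
specific bits, and locking is elided):

/* today: stash the request, let BLOCK_SOFTIRQ finish it later */
static irqreturn_t isr_deferred(int irq, void *dev_id, struct pt_regs *regs)
{
        struct request *rq = get_completed_req(dev_id); /* hypothetical */

        blk_complete_request(rq);       /* queues rq, raises BLOCK_SOFTIRQ */
        return IRQ_HANDLED;
}

/* suggested: finish the request right there in the irq handler */
static irqreturn_t isr_direct(int irq, void *dev_id, struct pt_regs *regs)
{
        struct request *rq = get_completed_req(dev_id); /* hypothetical */

        end_that_request_last(rq, 1);   /* no softirq round trip */
        return IRQ_HANDLED;
}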

--
Jens Axboe

2006-08-14 07:35:47

by Jens Axboe

Subject: Re: softirq considered harmful

On Sat, Aug 12 2006, Andrew Morton wrote:
> On Sat, 12 Aug 2006 18:09:44 -0700 (PDT)
> David Miller <[email protected]> wrote:
>
> > From: Andrew Morton <[email protected]>
> > Date: Sat, 12 Aug 2006 17:45:49 -0700
> >
> > > Is that also adding 150 usecs to each IO operation?
> >
> > I have no idea; Jens hasn't done enough to narrow down the true cause
> > of the latencies he is seeing. So pinning it on anything specific is
> > highly premature at this stage.
>
> Determining whether pre-conversion scsi was impacted in the same manner
> would be part of that pinpointing process.
>
> Deferring to softirq _has_ to add latency, and any latency addition in
> synchronous disk IO is very bad. That being said, 150 usecs per request
> is so bad that I'd suspect it's not affecting most people, else we'd
> have heard.

Hopefully you often end up doing > 1 request for a busy IO subsystem,
otherwise the softirq stuff is pointless. But it's still pretty bad for
single requests.

> > My point was merely to encourage you to find out the facts before
> > tossing accusations around. :-)
>
> No, your point was that slotting this change into mainline without telling
> anyone was OK because SCSI has been doing something similar.

Not similar, identical. Andrew, there was _no_ real change there!

--
Jens Axboe

2006-08-14 08:44:51

by David Miller

Subject: Re: softirq considered harmful

From: Jens Axboe <[email protected]>
Date: Mon, 14 Aug 2006 09:37:25 +0200

> Hopefully you often end up doing > 1 request for a busy IO subsystem,
> otherwise the softirq stuff is pointless. But it's still pretty bad for
> single requests.

Note that the per-cpu softirq completion of I/O events means
that the queue can be processed lockless.
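
What makes it lockless is that producer and consumer share a CPU, so
the softirq can grab the whole batch under a brief local_irq_disable().
Roughly (a sketch of the pattern, not the exact block layer code):

static void done_softirq(struct softirq_action *h)
{
        LIST_HEAD(local);

        /* the per-cpu list is private to this CPU -- no spinlock */
        local_irq_disable();
        list_splice_init(&__get_cpu_var(blk_cpu_done), &local);
        local_irq_enable();

        /* walk "local" with interrupts on, still lockless */
}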

I'm not saying this justifies softirq I/O completion, I'm just
mentioning it as one benefit of the scheme. This is also why
networking uses softirqs for the core irq events instead of
tasklets.

It appears the scsi code uses hw IRQ locking for all of its locking,
so that should be fine.

However there might be some scsi exception handling dependencies
on the completions being run in software irq context, but this is
just a guess.

iSCSI would need to be checked out too.

2006-08-14 08:50:10

by Jens Axboe

Subject: Re: softirq considered harmful

On Mon, Aug 14 2006, David Miller wrote:
> From: Jens Axboe <[email protected]>
> Date: Mon, 14 Aug 2006 09:37:25 +0200
>
> > Hopefully you often end up doing > 1 request for a busy IO subsystem,
> > otherwise the softirq stuff is pointless. But it's still pretty bad for
> > single requests.
>
> Note that the per-cpu softirq completion of I/O events means
> that the queue can be processed lockless.
>
> I'm not saying this justifies softirq I/O completion, I'm just
> mentioning it as one benefit of the scheme. This is also why
> networking uses softirqs for the core irq events instead of
> tasklets.

As I wrote, the actual processing of the completion can often be done
lockless in the first place, so there's little benefit to offloading
those completions to softirq context. We definitely have room for
improvement.

> It appears the scsi code uses hw IRQ locking for all of it's
> locking so that should be fine.

Yep

> However there might be some scsi exception handling dependencies
> on the completions being run in software irq context, but this is
> just a guess.
>
> iSCSI would need to be checked out too.

It certainly isn't an easy job; lots of stuff needs to be eyeballed.

--
Jens Axboe