2010-06-10 17:26:15

by Linus Torvalds

Subject: Slow pty's (was Re: libdivecomputer interfaces?)

Greg, Alan, Hirofumi-san,

I thought we long since (ie back last fall) fixed the latency
problems with pty's, but there does seem to be something very fishy
going on there still.

On Thu, Jun 10, 2010 at 8:01 AM, Linus Torvalds
<[email protected]> wrote:
> On Sat, May 29, 2010 at 12:53 PM, Jef Driesen <[email protected]> wrote:
>> BTW, now that I have your attention, could you maybe help me with a linux
>> kernel problem I'm experiencing in this area? I reported the problem on LKML
>> but got no response:
>>
>> http://www.divesoftware.org/libdc/simulator.html
>> http://groups.google.com/group/linux.kernel/browse_thread/thread/5a2b00e35b0864a7
>
> [ Hmm.. Testing.. ]
>
> Yeah, it's slow. Your test thing takes one and a quarter minutes for
> me. That's ridiculous.
>
> And no, we shouldn't need the low-latency flag, we're supposed to do
> this all automatically correctly. I'll talk to the tty people.

This is clearly not a regression (it's been going on forever, I
suspect), but taking over a minute to transfer just over half a MB of
data over a pty seems crazy.

Maybe it's not a kernel problem, and it's something done wrong by
rx/sx/socat, I haven't looked at what they do. But since setting
low_latency apparently helps (I didn't test that part, but I did test
"ridiculously slow"), it sounds very much like something is still
wrong in the kernel unless there is some really subtle timing issue in
user space.

From Jef's original lkml report linked to above:

> You can reproduce the problem by running these commands in three
> different terminals:
>
> # Terminal 1: Setup the pty's.
> socat PTY,link=/tmp/ttyS0 PTY,link=/tmp/ttyS1
> # Terminal 2: Send some data.
> dd if=/dev/urandom of=input.bin bs=538368 count=1
> sx input.bin >>/tmp/ttyS0 </tmp/ttyS0
> # Terminal 3: Receive the data.
> time rx output.bin >/tmp/ttyS1 </tmp/ttyS1

and yeah, it's pretty clear to see. A "perf report" on that receiving
side just shows queue_delayed_work_on(), but that doesn't mean much.
It's clearly just sleeping all the time...

Any ideas?

Linus


2010-06-10 18:07:16

by OGAWA Hirofumi

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

Linus Torvalds <[email protected]> writes:

> From Jef's original lkml report linked to above:
>
>> You can reproduce the problem by running these commands in three
>> different terminals:
>>
>> # Terminal 1: Setup the pty's.
>> socat PTY,link=/tmp/ttyS0 PTY,link=/tmp/ttyS1
>> # Terminal 2: Send some data.
>> dd if=/dev/urandom of=input.bin bs=538368 count=1
>> sx input.bin >>/tmp/ttyS0 </tmp/ttyS0
>> # Terminal 3: Receive the data.
>> time rx output.bin >/tmp/ttyS1 </tmp/ttyS1
>
> and yeah, it's pretty clear to see. A "perf report" on that receiving
> side just shows queue_delayed_work_on(), but that doesn't mean much.
> It's clearly just sleeping all the time...

I haven't read all of this thread yet, so sorry if I'm missing its
point.

FWIW, I remember there was an unnecessary wait related to the background
flusher. It was in input_available_p(). I'm not sure at all whether it
is related to this problem, and the patch below is totally untested (it
calls tty_flush_to_ldisc() only when data is unavailable). Could someone
test the following, or something like it, and see whether anything changes?

Well, anyway, I'll read this thread this weekend and see whether
someone has already fixed it.


I.e. the following or something like it:

static inline int input_available_p(struct tty_struct *tty, int amt)
{
	int try = 0;

retry:
	if (tty->icanon) {
		if (tty->canon_data)
			return 1;
	} else if (tty->read_cnt >= (amt ? amt : 1))
		return 1;

	/* Nothing readable yet: flush pending data to the ldisc once, then recheck. */
	if (!try) {
		tty_flush_to_ldisc(tty);
		try = 1;
		goto retry;
	}

	return 0;
}
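
As a rough illustration of what the single retry is meant to buy, here is
a userspace sketch, not kernel code. struct tty_model, its pending field
and flush_to_ldisc_model() are hypothetical stand-ins made up for this
example, and only the non-canonical branch is modelled. The assumption is
that data can sit in an intermediate buffer until a flush moves it to
where the reader counts it.

```c
#include <assert.h>

/* Hypothetical model of the fields the check looks at. */
struct tty_model {
	int read_cnt;	/* bytes the reader can already see */
	int pending;	/* bytes still waiting to be flushed (assumed) */
	int flushes;	/* how many times we had to flush */
};

/* Stand-in for tty_flush_to_ldisc(): move pending bytes to the reader. */
static void flush_to_ldisc_model(struct tty_model *tty)
{
	tty->read_cnt += tty->pending;
	tty->pending = 0;
	tty->flushes++;
}

/* Non-canonical case only: flush at most once, and only if nothing is readable. */
static int input_available_model(struct tty_model *tty, int amt)
{
	int tried = 0;

	assert(amt >= 0);
	for (;;) {
		if (tty->read_cnt >= (amt ? amt : 1))
			return 1;
		if (tried)
			return 0;
		flush_to_ldisc_model(tty);
		tried = 1;
	}
}
```

With data already readable the model never flushes; with data only in
pending it flushes once and then reports it; with no data at all it
flushes once and gives up.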

--
OGAWA Hirofumi <[email protected]>

2010-06-10 18:20:32

by Chris Wedgwood

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

(sorry if this response isn't on target, i was just pointed to this
thread a few minutes ago)


On Thu, Jun 10, 2010 at 10:25:36AM -0700, Linus Torvalds wrote:

> I thought we long since (ie back last fall) fixed the latency
> problems with pty's, but there does seem to be something very fishy
> going on there still.

this might not be related, but i have slow serial ports with NOHZ that
goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.


commit 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad
Author: Mike Galbraith <[email protected]>
Date: Thu Mar 11 17:17:13 2010 +0100

sched: Rate-limit nohz

Entering nohz code on every micro-idle is costing ~10% throughput for netperf
TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises
ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128.

The higher the context switch rate, the more nohz entry costs. With this patch
and some cycle recovery patches in my tree, max cross cpu context switch rate is
improved by ~16%, a large portion of which of which is this ratelimiting.

and looking at the only two interesting hunks it's not clear why:

+int nohz_ratelimit(int cpu)
+{
+	struct rq *rq = cpu_rq(cpu);
+	u64 diff = rq->clock - rq->nohz_stamp;
+
+	rq->nohz_stamp = rq->clock;
+
+	return diff < (NSEC_PER_SEC / HZ) >> 1;
+}

+	if (nohz_ratelimit(cpu))
+		goto end;
+
+

network latency is fine, and if i create lots of wakeups (network IO
is fine) then the serial port latency is noticeable
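
for reference, the arithmetic in that first hunk can be tried out in
userspace. this is only a sketch: struct rq_model and
nohz_ratelimit_model() are hypothetical names, HZ is passed in as a
parameter instead of being a build-time constant, and the clock is set
by hand. the threshold works out to half a tick, because >> binds
tighter than <.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the two runqueue fields the hunk touches. */
struct rq_model {
	uint64_t clock;		/* current time in ns */
	uint64_t nohz_stamp;	/* time of the previous call in ns */
};

/* Returns 1 when the previous call was less than half a tick ago. */
static int nohz_ratelimit_model(struct rq_model *rq, unsigned int hz)
{
	uint64_t diff;

	assert(hz > 0);
	diff = rq->clock - rq->nohz_stamp;
	rq->nohz_stamp = rq->clock;	/* every call restarts the window */

	return diff < ((NSEC_PER_SEC / hz) >> 1);
}
```

with hz = 1000 the threshold is 500000 ns, so two calls 400 us apart are
rate limited and two calls 500 us apart are not.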

2010-06-10 22:25:43

by Brian Bloniarz

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On 06/10/2010 02:10 PM, Chris Wedgwood wrote:
> (sorry if this response isn't on target, i was just pointed to this
> thread a few minutes ago)
>
>
> On Thu, Jun 10, 2010 at 10:25:36AM -0700, Linus Torvalds wrote:
>
>> I thought we long since (ie back last fall) fixed the latency
>> problems with pty's, but there does seem to be something very fishy
>> going on there still.
>
> this might not be related, but i have slow serial ports with NOHZ that
> goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.

Unrelated or not, I think Chris is right about this. Somewhere before
-rc1, the emulated serial console on my KVM instance became slow
to echo input. I just tested with the commit reverted and it's
back to normal.

> commit 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad
> Author: Mike Galbraith <[email protected]>
> Date: Thu Mar 11 17:17:13 2010 +0100
>
> sched: Rate-limit nohz
>
> Entering nohz code on every micro-idle is costing ~10% throughput for netperf
> TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises
> ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128.
>
> The higher the context switch rate, the more nohz entry costs. With this patch
> and some cycle recovery patches in my tree, max cross cpu context switch rate is
> improved by ~16%, a large portion of which of which is this ratelimiting.
>
> and looking at the only two interesting hunks it's not clear why:
>
> +int nohz_ratelimit(int cpu)
> +{
> + struct rq *rq = cpu_rq(cpu);
> + u64 diff = rq->clock - rq->nohz_stamp;
> +
> + rq->nohz_stamp = rq->clock;
> +
> + return diff < (NSEC_PER_SEC / HZ) >> 1;
> +}
>
> + if (nohz_ratelimit(cpu))
> + goto end;
> +
>
> network latency is fine, and if i create lots of wakeups (network IO
> is fine) then the serial port latency is noticeable

2010-06-10 22:30:28

by Linus Torvalds

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, Jun 10, 2010 at 3:25 PM, Brian Bloniarz <[email protected]> wrote:
> On 06/10/2010 02:10 PM, Chris Wedgwood wrote:
>>
>> this might not be related, but i have slow serial ports with NOHZ that
>> goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.
>
> Unrelated or not, I think Chris is right about this. Somewhere before
> -rc1, the emulated serial console on my KVM instance became slow
> to echo input. I just tested with the commit reverted and it's
> back to normal.

I suspect it's related. The tty subsystem clearly ends up depending on
the timer tick to move things forward even if there are pending
readers. So I suspect the slowness is very much related to the same
issue.

Linus

2010-06-16 15:03:28

by Jiri Kosina

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 10 Jun 2010, Brian Bloniarz wrote:

> On 06/10/2010 02:10 PM, Chris Wedgwood wrote:
> > (sorry if this response isn't on target, i was just pointed to this
> > thread a few minutes ago)
> >
> >
> > On Thu, Jun 10, 2010 at 10:25:36AM -0700, Linus Torvalds wrote:
> >
> >> I thought we long since (ie back last fall) fixed the latency
> >> problems with pty's, but there does seem to be something very fishy
> >> going on there still.
> >
> > this might not be related, but i have slow serial ports with NOHZ that
> > goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.
>
> Unrelated or not, I think Chris is right about this. Somewhere before
> -rc1, the emulated serial console on my KVM instance became slow
> to echo input. I just tested with the commit reverted and it's
> back to normal.

So let's CC Mike then.

>
> > commit 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad
> > Author: Mike Galbraith <[email protected]>
> > Date: Thu Mar 11 17:17:13 2010 +0100
> >
> > sched: Rate-limit nohz
> >
> > Entering nohz code on every micro-idle is costing ~10% throughput for netperf
> > TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises
> > ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128.
> >
> > The higher the context switch rate, the more nohz entry costs. With this patch
> > and some cycle recovery patches in my tree, max cross cpu context switch rate is
> > improved by ~16%, a large portion of which of which is this ratelimiting.
> >
> > and looking at the only two interesting hunks it's not clear why:
> >
> > +int nohz_ratelimit(int cpu)
> > +{
> > + struct rq *rq = cpu_rq(cpu);
> > + u64 diff = rq->clock - rq->nohz_stamp;
> > +
> > + rq->nohz_stamp = rq->clock;
> > +
> > + return diff < (NSEC_PER_SEC / HZ) >> 1;
> > +}
> >
> > + if (nohz_ratelimit(cpu))
> > + goto end;
> > +
> >
> > network latency is fine, and if i create lots of wakeups (network IO
> > is fine) then the serial port latency is noticeable

--
Jiri Kosina
SUSE Labs, Novell Inc.

2010-06-16 15:16:58

by Mike Galbraith

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Wed, 2010-06-16 at 17:03 +0200, Jiri Kosina wrote:
> On Thu, 10 Jun 2010, Brian Bloniarz wrote:
>
> > On 06/10/2010 02:10 PM, Chris Wedgwood wrote:
> > > (sorry if this response isn't on target, i was just pointed to this
> > > thread a few minutes ago)
> > >
> > >
> > > On Thu, Jun 10, 2010 at 10:25:36AM -0700, Linus Torvalds wrote:
> > >
> > >> I thought we long since (ie back last fall) fixed the latency
> > >> problems with pty's, but there does seem to be something very fishy
> > >> going on there still.
> > >
> > > this might not be related, but i have slow serial ports with NOHZ that
> > > goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.
> >
> > Unrelated or not, I think Chris is right about this. Somewhere before
> > -rc1, the emulated serial console on my KVM instance became slow
> > to echo input. I just tested with the commit reverted and it's
> > back to normal.
>
> So let's CC Mike then.

Chris already gave me a heads up, it's on my todo. The old P4 box I use
for a serial console box is exploding on boot, or I would have already
had a look.

> > > and looking at the only two interesting hunks it's not clear why:

Complete mystery to me.

> > > +int nohz_ratelimit(int cpu)
> > > +{
> > > + struct rq *rq = cpu_rq(cpu);
> > > + u64 diff = rq->clock - rq->nohz_stamp;
> > > +
> > > + rq->nohz_stamp = rq->clock;
> > > +
> > > + return diff < (NSEC_PER_SEC / HZ) >> 1;
> > > +}
> > >
> > > + if (nohz_ratelimit(cpu))
> > > + goto end;
> > > +
> > >
> > > network latency is fine, and if i create lots of wakeups (network IO
> > > is fine) then the serial port latency is noticeable

2010-06-17 06:39:36

by Mike Galbraith

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Wed, 2010-06-16 at 17:16 +0200, Mike Galbraith wrote:
> On Wed, 2010-06-16 at 17:03 +0200, Jiri Kosina wrote:
> > On Thu, 10 Jun 2010, Brian Bloniarz wrote:
> >
> > > On 06/10/2010 02:10 PM, Chris Wedgwood wrote:
> > > > (sorry if this response isn't on target, i was just pointed to this
> > > > thread a few minutes ago)
> > > >
> > > >
> > > > On Thu, Jun 10, 2010 at 10:25:36AM -0700, Linus Torvalds wrote:
> > > >
> > > >> I thought we long since (ie back last fall) fixed the latency
> > > >> problems with pty's, but there does seem to be something very fishy
> > > >> going on there still.
> > > >
> > > > this might not be related, but i have slow serial ports with NOHZ that
> > > > goes away when i revert 39c0cbe2150cbd848a25ba6cdb271d1ad46818ad.
> > >
> > > Unrelated or not, I think Chris is right about this. Somewhere before
> > > -rc1, the emulated serial console on my KVM instance became slow
> > > to echo input. I just tested with the commit reverted and it's
> > > back to normal.
> >
> > So let's CC Mike then.
>
> Chris already gave me a heads up, it's on my todo. The old P4 box I use
> for a serial console box is exploding on boot, or I would have already
> had a look.

(Removing filth and re-seating ram seems to have revived poor old P4)

I'm not seeing any problem with serial console here, seems to work just
fine P4->Q6600, both running NOHZ kernels with nohz_ratelimit(), 33.5 on
the P4, and tip.today on the Q6600.

Eyeballing it, perhaps we need to proceed downward if any needs_cpu
condition is true, despite having just been here a wee bit ago.

Does this help anyone's woes?

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5f171f0..ec72fad 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -262,7 +262,7 @@ void tick_nohz_stop_sched_tick(int inidle)
ktime_t last_update, expires, now;
struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
u64 time_delta;
- int cpu;
+ int cpu, cpu_needed;

local_irq_save(flags);

@@ -315,7 +315,9 @@ void tick_nohz_stop_sched_tick(int inidle)
goto end;
}

- if (nohz_ratelimit(cpu))
+ cpu_needed = rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) || arch_needs_cpu(cpu);
+
+ if (!cpu_needed && nohz_ratelimit(cpu))
goto end;

ts->idle_calls++;
@@ -327,8 +329,7 @@ void tick_nohz_stop_sched_tick(int inidle)
time_delta = timekeeping_max_deferment();
} while (read_seqretry(&xtime_lock, seq));

- if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
- arch_needs_cpu(cpu)) {
+ if (cpu_needed) {
next_jiffies = last_jiffies + 1;
delta_jiffies = 1;
} else {

2010-06-17 07:00:43

by Mike Galbraith

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 2010-06-17 at 08:39 +0200, Mike Galbraith wrote:

> I'm not seeing any problem with serial console here, seems to work just
> fine P4->Q6600, both running NOHZ kernels with nohz_ratelimit(), 33.5 on
> the P4, and tip.today on the Q6600.

Of course, as soon as I say that, the problem appeared. Hopefully,
it'll stick around a while.

-Mike

2010-06-17 10:50:50

by Mike Galbraith

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 2010-06-17 at 09:00 +0200, Mike Galbraith wrote:
> On Thu, 2010-06-17 at 08:39 +0200, Mike Galbraith wrote:
>
> > I'm not seeing any problem with serial console here, seems to work just
> > fine P4->Q6600, both running NOHZ kernels with nohz_ratelimit(), 33.5 on
> > the P4, and tip.today on the Q6600.
>
> Of course, as soon as I say that, the problem appeared. Hopefully,
> it'll stick around a while.

Actually, it's fully reproducible, I just have _way_ too many (49)
kernels to choose from.

I had to go back to virgin 34, apply 39c0cbe and fixlet to fully test,
as git/tip network isn't working quite right for me atm. At any rate,
the below fixed it up for me, and cross-cpu throughput gain is intact.

sched: do not ratelimit NOHZ when the tick is stopped.

Chris Wedgwood reports that 39c0cbe sched: Rate-limit nohz causes a serial
console regression, unresponsiveness, and indeed it does. The below fixes
it by not skipping out when the tick has been stopped.

Tested that the throughput benefit of ratelimiting is still intact. It is.

Signed-off-by: Mike Galbraith <[email protected]>
Reported-by: Chris Wedgwood <[email protected]>
LKML-Reference: <new-submission>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5f171f0..83c5129 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -315,7 +315,7 @@ void tick_nohz_stop_sched_tick(int inidle)
goto end;
}

- if (nohz_ratelimit(cpu))
+ if (!ts->tick_stopped && nohz_ratelimit(cpu))
goto end;

ts->idle_calls++;

2010-06-17 13:24:53

by Peter Zijlstra

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 2010-06-17 at 12:50 +0200, Mike Galbraith wrote:

> sched: do not ratelimit NOHZ when the tick is stopped.
>
> Chris Wedgwood reports that 39c0cbe sched: Rate-limit nohz causes a serial
> console regression, unresponsiveness, and indeed it does. The below fixes
> it by not skipping out when the tick has been stopped.
>
> Tested that the throughput benefit of ratelimiting is still intact. It is.
>
> Signed-off-by: Mike Galbraith <[email protected]>
> Reported-by: Chris Wedgwood <[email protected]>

> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 5f171f0..83c5129 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -315,7 +315,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> goto end;
> }
>
> - if (nohz_ratelimit(cpu))
> + if (!ts->tick_stopped && nohz_ratelimit(cpu))
> goto end;
>
> ts->idle_calls++;
>


Humm.. the code around there suggests something like the below, but I
must admit it's been a while since I really read all that nohz stuff.
Thomas, any preferences?

---
kernel/time/tick-sched.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5f171f0..e0707ea 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
goto end;
}

- if (nohz_ratelimit(cpu))
- goto end;
-
ts->idle_calls++;
/* Read jiffies and the time when jiffies were updated last */
do {
@@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
} while (read_seqretry(&xtime_lock, seq));

if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
- arch_needs_cpu(cpu)) {
+ arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
next_jiffies = last_jiffies + 1;
delta_jiffies = 1;
} else {

2010-06-17 14:12:50

by Thomas Gleixner

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 17 Jun 2010, Peter Zijlstra wrote:

> On Thu, 2010-06-17 at 12:50 +0200, Mike Galbraith wrote:
>
> > sched: do not ratelimit NOHZ when the tick is stopped.
> >
> > Chris Wedgwood reports that 39c0cbe sched: Rate-limit nohz causes a serial
> > console regression, unresponsiveness, and indeed it does. The below fixes
> > it by not skipping out when the tick has been stopped.
> >
> > Tested that the throughput benefit of ratelimiting is still intact. It is.
> >
> > Signed-off-by: Mike Galbraith <[email protected]>
> > Reported-by: Chris Wedgwood <[email protected]>
>
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 5f171f0..83c5129 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -315,7 +315,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> > goto end;
> > }
> >
> > - if (nohz_ratelimit(cpu))
> > + if (!ts->tick_stopped && nohz_ratelimit(cpu))
> > goto end;
> >
> > ts->idle_calls++;
> >
>
>
> Humm.. the code around there suggests something like the below, but I
> must admit it's been a while since I really read all that nohz stuff.
> Thomas, any preferences?

The version below is better as it solves the problem and follows the
nohz_ratelimit() advice even in the case where it changes after the
tick has been stopped.

Thanks,

tglx
> ---
> kernel/time/tick-sched.c | 5 +----
> 1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 5f171f0..e0707ea 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
> goto end;
> }
>
> - if (nohz_ratelimit(cpu))
> - goto end;
> -
> ts->idle_calls++;
> /* Read jiffies and the time when jiffies were updated last */
> do {
> @@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> } while (read_seqretry(&xtime_lock, seq));
>
> if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
> - arch_needs_cpu(cpu)) {
> + arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
> next_jiffies = last_jiffies + 1;
> delta_jiffies = 1;
> } else {
>
>

2010-06-17 14:14:54

by Mike Galbraith

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On Thu, 2010-06-17 at 15:24 +0200, Peter Zijlstra wrote:

> Humm.. the code around there suggests something like the below, but I
> must admit it's been a while since I really read all that nohz stuff.
> Thomas, any preferences?

Mine works. Yours fits and works. -+ vs ----+, you win :)


> ---
> kernel/time/tick-sched.c | 5 +----
> 1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 5f171f0..e0707ea 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
> goto end;
> }
>
> - if (nohz_ratelimit(cpu))
> - goto end;
> -
> ts->idle_calls++;
> /* Read jiffies and the time when jiffies were updated last */
> do {
> @@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> } while (read_seqretry(&xtime_lock, seq));
>
> if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
> - arch_needs_cpu(cpu)) {
> + arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
> next_jiffies = last_jiffies + 1;
> delta_jiffies = 1;
> } else {

2010-06-17 14:56:20

by Brian Bloniarz

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On 06/17/2010 09:24 AM, Peter Zijlstra wrote:
> On Thu, 2010-06-17 at 12:50 +0200, Mike Galbraith wrote:
>
>> sched: do not ratelimit NOHZ when the tick is stopped.
>>
>> Chris Wedgwood reports that 39c0cbe sched: Rate-limit nohz causes a serial
>> console regression, unresponsiveness, and indeed it does. The below fixes
>> it by not skipping out when the tick has been stopped.
>>
>> Tested that the throughput benefit of ratelimiting is still intact. It is.
>>
>> Signed-off-by: Mike Galbraith <[email protected]>
>> Reported-by: Chris Wedgwood <[email protected]>
>
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 5f171f0..83c5129 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -315,7 +315,7 @@ void tick_nohz_stop_sched_tick(int inidle)
>> goto end;
>> }
>>
>> - if (nohz_ratelimit(cpu))
>> + if (!ts->tick_stopped && nohz_ratelimit(cpu))
>> goto end;
>>
>> ts->idle_calls++;
>>
>
>
> Humm.. the code around there suggests something like the below, but I
> must admit it's been a while since I really read all that nohz stuff.
> Thomas, any preferences?

I tested Peter's variant, it eliminates the kvm console echo latency
that I was seeing. (I haven't tried Mike's earlier variant).

Tested-by: Brian Bloniarz <[email protected]>

> ---
> kernel/time/tick-sched.c | 5 +----
> 1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 5f171f0..e0707ea 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
> goto end;
> }
>
> - if (nohz_ratelimit(cpu))
> - goto end;
> -
> ts->idle_calls++;
> /* Read jiffies and the time when jiffies were updated last */
> do {
> @@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> } while (read_seqretry(&xtime_lock, seq));
>
> if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
> - arch_needs_cpu(cpu)) {
> + arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
> next_jiffies = last_jiffies + 1;
> delta_jiffies = 1;
> } else {
>

2010-06-17 16:03:28

by Peter Zijlstra

Subject: [PATCH] nohz: Fix nohz ratelimit

Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
serial console regression, unresponsiveness, and indeed it does. The
below fixes it by not skipping out when the tick has been stopped.

Reported-by: Chris Wedgwood <[email protected]>
Tested-by: Brian Bloniarz <[email protected]>
Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/time/tick-sched.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5f171f0..e0707ea 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
goto end;
}

- if (nohz_ratelimit(cpu))
- goto end;
-
ts->idle_calls++;
/* Read jiffies and the time when jiffies were updated last */
do {
@@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
} while (read_seqretry(&xtime_lock, seq));

if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
- arch_needs_cpu(cpu)) {
+ arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
next_jiffies = last_jiffies + 1;
delta_jiffies = 1;
} else {

2010-06-17 17:40:33

by Peter Zijlstra

Subject: [tip:timers/urgent] nohz: Fix nohz ratelimit

Commit-ID: 3310d4d38fbc514e7b18bd3b1eea8effdd63b5aa
Gitweb: http://git.kernel.org/tip/3310d4d38fbc514e7b18bd3b1eea8effdd63b5aa
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu, 17 Jun 2010 18:02:37 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 17 Jun 2010 19:37:29 +0200

nohz: Fix nohz ratelimit

Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
serial console regression, unresponsiveness, and indeed it does. The
reason is that the nohz code is skipped even when the tick was already
stopped before the nohz_ratelimit(cpu) condition changed.

Move the nohz_ratelimit() check to the other conditions which prevent
long idle sleeps.

Reported-by: Chris Wedgwood <[email protected]>
Tested-by: Brian Bloniarz <[email protected]>
Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: OGAWA Hirofumi <[email protected]>
Cc: Jef Driesen <[email protected]>
LKML-Reference: <1276790557.27822.516.camel@twins>
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/time/tick-sched.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1d7b9bc..783fbad 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -315,9 +315,6 @@ void tick_nohz_stop_sched_tick(int inidle)
goto end;
}

- if (nohz_ratelimit(cpu))
- goto end;
-
ts->idle_calls++;
/* Read jiffies and the time when jiffies were updated last */
do {
@@ -328,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
} while (read_seqretry(&xtime_lock, seq));

if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
- arch_needs_cpu(cpu)) {
+ arch_needs_cpu(cpu) || nohz_ratelimit(cpu)) {
next_jiffies = last_jiffies + 1;
delta_jiffies = 1;
} else {
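
The effect of moving the check can be sketched in userspace as a small
decision table. This is a simplified model under stated assumptions, not
the kernel's code: enum tick_outcome, decide_before(), decide_after() and
check_models() are hypothetical names, needs_cpu stands for the combined
rcu/printk/arch conditions, and everything else the real function does
(including how the next timer expiry is computed) is left out.

```c
#include <assert.h>

/* Hypothetical outcome labels; the kernel has no such enum. */
enum tick_outcome {
	TICK_LEFT_ALONE,	/* "goto end": nothing is reprogrammed */
	TICK_NEXT_JIFFY,	/* delta_jiffies = 1 */
	TICK_NEXT_TIMER,	/* else branch: sleep until the next timer */
};

/* Before the change: a rate-limited call bails out early. */
static enum tick_outcome decide_before(int needs_cpu, int ratelimited)
{
	if (ratelimited)
		return TICK_LEFT_ALONE;
	return needs_cpu ? TICK_NEXT_JIFFY : TICK_NEXT_TIMER;
}

/* After the change: rate limiting is one more reason to keep the sleep short. */
static enum tick_outcome decide_after(int needs_cpu, int ratelimited)
{
	if (needs_cpu || ratelimited)
		return TICK_NEXT_JIFFY;
	return TICK_NEXT_TIMER;
}

/* The two versions should differ only when the call is rate limited. */
static void check_models(void)
{
	for (int needs_cpu = 0; needs_cpu <= 1; needs_cpu++) {
		assert(decide_before(needs_cpu, 0) == decide_after(needs_cpu, 0));
		assert(decide_before(needs_cpu, 1) == TICK_LEFT_ALONE);
		assert(decide_after(needs_cpu, 1) == TICK_NEXT_JIFFY);
	}
}
```

In this model the only behavioural difference is the rate-limited case:
the old code leaves an already stopped tick untouched, while the new code
asks for the next jiffy, which matches the changelog's description of the
nohz code no longer being skipped.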

2010-11-12 18:50:43

by Jef Driesen

Subject: Re: Slow pty's (was Re: libdivecomputer interfaces?)

On 10/06/10 19:25, Linus Torvalds wrote:
> Greg, Alan, Hirofumi-san,
>
> I thought we long since (ie back last fall) fixed the latency
> problems with pty's, but there does seem to be something very fishy
> going on there still.
>
> On Thu, Jun 10, 2010 at 8:01 AM, Linus Torvalds
> <[email protected]> wrote:
>> On Sat, May 29, 2010 at 12:53 PM, Jef Driesen <[email protected]> wrote:
>>> BTW, now that I have your attention, could you maybe help me with a linux
>>> kernel problem I'm experiencing in this area? I reported the problem on LKML
>>> but got no response:
>>>
>>> http://www.divesoftware.org/libdc/simulator.html
>>> http://groups.google.com/group/linux.kernel/browse_thread/thread/5a2b00e35b0864a7
>>
>> [ Hmm.. Testing.. ]
>>
>> Yeah, it's slow. Your test thing takes one and a quarter minutes for
>> me. That's ridiculous.
>>
>> And no, we shouldn't need the low-latency flag, we're supposed to do
>> this all automatically correctly. I'll talk to the tty people.
>
> This is clearly not a regression (it's been going on forever, I
> suspect), but taking over a minute to transfer just over half a MB of
> data over a pty seems crazy.
>
> Maybe it's not a kernel problem, and it's something done wrong by
> rx/sx/socat, I haven't looked at what they do. But since setting
> low_latency apparently helps (I didn't test that part, but I did test
> "ridiculously slow"), it sounds very much like something is still
> wrong in the kernel unless there is some really subtle timing issue in
> user space.
>
> From Jef's original lkml report linked to above:
>
>> You can reproduce the problem by running these commands in three
>> different terminals:
>>
>> # Terminal 1: Setup the pty's.
>> socat PTY,link=/tmp/ttyS0 PTY,link=/tmp/ttyS1
>> # Terminal 2: Send some data.
>> dd if=/dev/urandom of=input.bin bs=538368 count=1
>> sx input.bin >>/tmp/ttyS0 </tmp/ttyS0
>> # Terminal 3: Receive the data.
>> time rx output.bin >/tmp/ttyS1 </tmp/ttyS1
>
> and yeah, it's pretty clear to see. A "perf report" on that receiving
> side just shows queue_delayed_work_on(), but that doesn't mean much.
> It's clearly just sleeping all the time...
>
> Any ideas?
>
> Linus

Just out of curiosity, is there any progress on this issue? There was
some discussion on NOHZ related changes in the remainder of this thread,
but they don't appear to have fixed the problem I reported above. I
still need to patch my kernel to set the low-latency flag to get decent
performance.

I wish I could look into this myself, but unfortunately my kernel
experience is still too limited. But if there is anything that I could
do to help, just let me know.

Thanks for your time.

Jef