2011-04-12 02:37:51

by Adam McLaurin

Subject: Loopback and Nagle's algorithm

I understand that disabling Nagle's algorithm via TCP_NODELAY will
generally degrade throughput. However, in my scenario (150 byte
messages, sending as fast as possible), the actual throughput penalty
over the network is marginal (maybe 10% at most).

However, when I disable Nagle's algorithm when connecting over loopback,
the performance hit is *huge* - 10x reduction in throughput.

The question is, why is disabling Nagle's algorithm on loopback so much
worse w.r.t. throughput? Is there anything I can do to reduce the
incurred throughput penalty?
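
For illustration, a minimal sketch of this kind of sender, assuming an
already-connected TCP socket fd (the function name and message contents
are illustrative, not taken from the actual test program):

    /* Disable Nagle's algorithm on a connected TCP socket and push
     * fixed-size 150-byte messages as fast as possible.  Error handling
     * is trimmed for brevity. */
    #include <string.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    static void send_small_messages(int fd)
    {
            int one = 1;
            char msg[150];

            /* TCP_NODELAY turns Nagle's algorithm off for this socket. */
            setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

            memset(msg, 'x', sizeof(msg));
            for (;;) {
                    if (send(fd, msg, sizeof(msg), 0) < 0)
                            break;
            }
    }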

Thanks,
Adam


Subject: Re: Loopback and Nagle's algorithm

On Mon, 11 Apr 2011 22:37:49 -0400,
"Adam McLaurin" <[email protected]> wrote:

Just CCing netdev

> I understand that disabling Nagle's algorithm via TCP_NODELAY will
> generally degrade throughput. However, in my scenario (150 byte
> messages, sending as fast as possible), the actual throughput penalty
> over the network is marginal (maybe 10% at most).
>
> However, when I disable Nagle's algorithm when connecting over loopback,
> the performance hit is *huge* - 10x reduction in throughput.
>
> The question is, why is disabling Nagle's algorithm on loopback so much
> worse w.r.t. throughput? Is there anything I can do to reduce the
> incurred throughput penalty?
>
> Thanks,
> Adam

2011-04-12 09:45:22

by Will Newton

Subject: Re: Loopback and Nagle's algorithm

On Tue, Apr 12, 2011 at 3:37 AM, Adam McLaurin <[email protected]> wrote:
> I understand that disabling Nagle's algorithm via TCP_NODELAY will
> generally degrade throughput. However, in my scenario (150 byte
> messages, sending as fast as possible), the actual throughput penalty
> over the network is marginal (maybe 10% at most).
>
> However, when I disable Nagle's algorithm when connecting over loopback,
> the performance hit is *huge* - 10x reduction in throughput.
>
> The question is, why is disabling Nagle's algorithm on loopback so much
> worse w.r.t. throughput? Is there anything I can do to reduce the
> incurred throughput penalty?

It may be caused by an increase in context switch rate, as both sender
and receiver are on the same machine.
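
One way to check that hypothesis is to compare the process's
context-switch counters with and without TCP_NODELAY. A minimal sketch
using getrusage() (the function name and where it would be called from
are assumptions, not part of the original test):

    #include <stdio.h>
    #include <sys/resource.h>

    /* Print voluntary/involuntary context switches accumulated so far.
     * Calling this before and after the send loop, once with and once
     * without TCP_NODELAY, shows whether the switch rate tracks the
     * throughput gap. */
    static void report_ctx_switches(const char *tag)
    {
            struct rusage ru;

            if (getrusage(RUSAGE_SELF, &ru) == 0)
                    printf("%s: voluntary=%ld involuntary=%ld\n",
                           tag, ru.ru_nvcsw, ru.ru_nivcsw);
    }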

2011-04-12 11:04:08

by Adam McLaurin

Subject: Re: Loopback and Nagle's algorithm

On Tue, 12 Apr 2011 10:45 +0100, "Will Newton" <[email protected]>
wrote:
> It may be caused by an increase in context switch rate, as both sender
> and receiver are on the same machine.

I'm not sure that's what's happening, since the box where I'm running
this test has 8 physical CPUs and 32 cores in total.

Thanks,
Adam

2011-04-12 11:54:33

by Jiri Kosina

Subject: Re: Loopback and Nagle's algorithm

On Tue, 12 Apr 2011, Adam McLaurin wrote:

> > It may be caused by an increase in context switch rate, as both sender
> > and receiver are on the same machine.
>
> I'm not sure that's what's happening, since the box where I'm running
> this test has 8 physical CPUs and 32 cores in total.

Have you tried firing up the testcase under perf, to see what it reveals
as the bottleneck?

--
Jiri Kosina
SUSE Labs, Novell Inc.

2011-04-12 13:08:29

by Eric Dumazet

Subject: Re: Loopback and Nagle's algorithm

On Tuesday 12 April 2011 at 13:54 +0200, Jiri Kosina wrote:
> On Tue, 12 Apr 2011, Adam McLaurin wrote:
>
> > > It may be caused by an increase in context switch rate, as both sender
> > > and receiver are on the same machine.
> >
> > I'm not sure that's what's happening, since the box where I'm running
> > this test has 8 physical CPUs and 32 cores in total.
>
> Have you tried firing up the testcase under perf, to see what it reveals
> as the bottleneck?
>
CC netdev

This rings a bell here.

I suspect we hit mod_timer() / lock_timer_base() because the delack
timer is constantly being re-armed.

I remember raising this point last year:

http://kerneltrap.org/mailarchive/linux-netdev/2010/5/20/6277741

David's answer:
http://kerneltrap.org/mailarchive/linux-netdev/2010/6/2/6278430

I am afraid no change was made...
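
One rough way to probe whether the delayed-ACK path is involved,
assuming the receiving side of the test can be modified, is to force
quick ACKs after each read with TCP_QUICKACK. This is only a diagnostic
sketch (the wrapper name and descriptor rfd are illustrative), not a
fix suggested in this thread:

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* TCP_QUICKACK is not sticky on Linux, so it has to be re-armed
     * after reads if quick ACKs should keep being sent.  If throughput
     * with TCP_NODELAY recovers when the receiver does this, the
     * delayed-ACK timer path is a plausible suspect. */
    static ssize_t read_with_quickack(int rfd, void *buf, size_t len)
    {
            int one = 1;
            ssize_t n = read(rfd, buf, len);

            setsockopt(rfd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
            return n;
    }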