I understand that disabling Nagle's algorithm via TCP_NODELAY will
generally degrade throughput. However, in my scenario (150 byte
messages, sending as fast as possible), the actual throughput penalty
over the network is marginal (maybe 10% at most).
However, when I disable Nagle's algorithm when connecting over loopback,
the performance hit is *huge* - 10x reduction in throughput.
The question is, why is disabling Nagle's algorithm on loopback so much
worse w.r.t. throughput? Is there anything I can do to reduce the
incurred throughput penalty?
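For reference, the send side of the test is essentially the following (a
trimmed-down sketch rather than my actual code; the port number is
arbitrary and error handling is omitted):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    char msg[150];                  /* 150-byte application messages */
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* Disable Nagle: every write() is pushed out as its own segment. */
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);    /* arbitrary port for this sketch */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    connect(fd, (struct sockaddr *)&addr, sizeof(addr));

    memset(msg, 'x', sizeof(msg));
    for (;;)                        /* send as fast as possible */
        if (write(fd, msg, sizeof(msg)) < 0)
            break;

    close(fd);
    return 0;
}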
Thanks,
Adam
On Mon, 11 Apr 2011 22:37:49 -0400, "Adam McLaurin" <[email protected]> wrote:
Just CCing netdev
> I understand that disabling Nagle's algorithm via TCP_NODELAY will
> generally degrade throughput. However, in my scenario (150 byte
> messages, sending as fast as possible), the actual throughput penalty
> over the network is marginal (maybe 10% at most).
>
> However, when I disable Nagle's algorithm when connecting over loopback,
> the performance hit is *huge* - 10x reduction in throughput.
>
> The question is, why is disabling Nagle's algorithm on loopback so much
> worse w.r.t. throughput? Is there anything I can do to reduce the
> incurred throughput penalty?
>
> Thanks,
> Adam
On Tue, Apr 12, 2011 at 3:37 AM, Adam McLaurin <[email protected]> wrote:
> I understand that disabling Nagle's algorithm via TCP_NODELAY will
> generally degrade throughput. However, in my scenario (150 byte
> messages, sending as fast as possible), the actual throughput penalty
> over the network is marginal (maybe 10% at most).
>
> However, when I disable Nagle's algorithm when connecting over loopback,
> the performance hit is *huge* - 10x reduction in throughput.
>
> The question is, why is disabling Nagle's algorithm on loopback so much
> worse w.r.t. throughput? Is there anything I can do to reduce the
> incurred throughput penalty?
It may be caused by an increase in context switch rate, as both sender
and receiver are on the same machine.
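A quick way to check is to read the context-switch counters around the
benchmark with getrusage() and compare the numbers with and without
TCP_NODELAY (minimal sketch; run_test() is just a stand-in for whatever
your send/receive loop is):

#include <stdio.h>
#include <sys/resource.h>

static void run_test(void)
{
    /* stand-in: put the actual send/receive loop here */
}

int main(void)
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);
    run_test();
    getrusage(RUSAGE_SELF, &after);

    /* ru_nvcsw: voluntary switches (blocked in the kernel),
     * ru_nivcsw: involuntary switches (preempted). */
    printf("voluntary: %ld, involuntary: %ld\n",
           after.ru_nvcsw - before.ru_nvcsw,
           after.ru_nivcsw - before.ru_nivcsw);
    return 0;
}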
On Tue, 12 Apr 2011 10:45 +0100, "Will Newton" <[email protected]>
wrote:
> It may be caused by an increase in context switch rate, as both sender
> and receiver are on the same machine.
I'm not sure that's what's happening, since the box where I'm running
this test has 8 physical CPUs and 32 cores in total.
Thanks,
Adam
On Tue, 12 Apr 2011, Adam McLaurin wrote:
> > It may be caused by an increase in context switch rate, as both sender
> > and receiver are on the same machine.
>
> I'm not sure that's what's happening, since the box where I'm running
> this test has 8 physical CPUs and 32 cores in total.
Have you tried firing up the testcase under perf, to see what it reveals
as the bottleneck?
--
Jiri Kosina
SUSE Labs, Novell Inc.
On Tue, 12 Apr 2011 at 13:54 +0200, Jiri Kosina wrote:
> On Tue, 12 Apr 2011, Adam McLaurin wrote:
>
> > > It may be caused by an increase in context switch rate, as both sender
> > > and receiver are on the same machine.
> >
> > I'm not sure that's what's happening, since the box where I'm running
> > this test has 8 physical CPUs and 32 cores in total.
>
> Have you tried firing up the testcase under perf, to see what it reveals
> as the bottleneck?
>
CC netdev
This rings a bell here.
I suspect we hit mod_timer() / lock_timer_base() because the delack
timer is constantly being rearmed.
I remember raising this point last year:
http://kerneltrap.org/mailarchive/linux-netdev/2010/5/20/6277741
David's answer:
http://kerneltrap.org/mailarchive/linux-netdev/2010/6/2/6278430
I am afraid no change was made...
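If someone wants to check whether the delack timer is really what hurts
here, one experiment (untested, just an idea) is to put the receiver
into quickack mode, so ACKs are sent immediately instead of being
scheduled on the delayed-ACK timer. TCP_QUICKACK is not permanent, so
it has to be set again around each read; roughly (pairs with the sender
sketch earlier in the thread, same arbitrary port, no error handling):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    char buf[150];
    struct sockaddr_in addr;
    int one = 1;
    ssize_t n;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);    /* must match the sender sketch */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 1);
    cfd = accept(lfd, NULL, NULL);

    do {
        /* TCP_QUICKACK is transient, so re-arm it before every read. */
        setsockopt(cfd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
        n = read(cfd, buf, sizeof(buf));
    } while (n > 0);

    close(cfd);
    close(lfd);
    return 0;
}

Whether that actually avoids the mod_timer() path is exactly the kind of
thing a perf profile would have to confirm.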