2001-03-15 15:26:54

by Sampsa Ranta

Subject: Performance is weird (fwd)

--
Subject: Performance is weird
The following message was first posted to the linux-atm mailing list; it
is followed by one of the replies I got (thanks, Werner Almesberger
<[email protected]>).

Actually, with the 2.4.3pre4 kernel I got something like 66Mbit/s, which was
better than the 2.4.2 results.
--

Hello,

I am running a set of ForeRunner LE 155 cards on two Athlon 900
machines. The cards are currently back to back connected. I am having
problems with performance and this problem seems a bit curious to me.

The boxes are running kernel versions 2.4.2 with the builtin ATM
functionality.

First, when the machine is idle and I run ttcp_atm, the result is:

[root@akvagw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 3.805066 real seconds = 4305.838585 KB/sec
(35.273430 Mb/sec)

I get the same result however many times I run it while the machine is
idle. However, the performance increases a lot when I give the processor
something to do. For example, while gcc is compiling the kernel, I get
better results:

[root@akvagw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 0.997561 real seconds = 16424.058278 KB/sec (134.545885 Mb/sec)

For the record, the remote machine does not affect the tests, because the
machine just sends data even when no one is listening.

Can someone explain, and maybe do something, please? Or am I supposed to
compile the kernel all the time on my production ATM routers?

The same seems to apply when I stream UDP via my 3C905C card to one of my
routers: first I get 60Mbit/s, then 94Mbit/s when I start to compile
the kernel.

Thanks,
Sampsa Ranta
[email protected]


"
Don't know where those "negative CPU cycles" come from. It's probably
a driver problem. Could be that either you're triggering scheduling of a
softirq or such, where there normally wouldn't be one (but should be),
or that there's a race condition leading to the loss of an event
(softirq, tasklet, wait queue, etc.), and background activity makes this
happen in the correct order.
"






2001-03-15 18:16:31

by Manfred Spraul

Subject: Re: Performance is weird (fwd)

One difference between idle and a running user space app is that the
kernel->user space return path checks for pending softirqs, but the idle
thread doesn't.

Perhaps cpu_idle() should also check for pending softirq's before
hlt'ing?

idle thread is running.
* hw interrupt
* * hw interrupt handler
* * * packet arrives
* * * softirq marked
* * hw interrupt handler returns
* do_softirq
* * net_rx called
* * * an hw interrupt interrupts net_rx
* * * * a second packet arrives, softirq marked again.
* * * hw interrupt returns
* * net_rx returns
* do_softirq notices that net_rx is queued again, but doesn't process
it immediately (otherwise it would cause an endless loop)
* hw interrupt returns
idle thread sleeps again.
!! one packet is waiting unprocessed

What about adding if(softirq_active...) do_softirq() into default_idle?
--
Manfred


2001-03-15 19:59:04

by Manfred Spraul

Subject: Re: Performance is weird (fwd)

--- 2.4/arch/i386/kernel/process.c Thu Feb 22 22:28:52 2001
+++ build-2.4/arch/i386/kernel/process.c Thu Mar 15 20:35:12 2001
@@ -81,6 +81,11 @@
 {
        if (current_cpu_data.hlt_works_ok && !hlt_counter) {
                __cli();
+               if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) {
+                       __sti();
+                       do_softirq();
+                       return;
+               }
                if (!current->need_resched)
                        safe_halt();
                else


Attachments:
patch-proc (399.00 B)

2001-03-16 02:42:46

by Sampsa Ranta

Subject: Re: Performance is weird (fwd) -> results

On Thu, 15 Mar 2001, Manfred Spraul wrote:

> I've attached a patch.
> I tried to trigger the problem with my 10 MBit ne2k-pci connection, but
> without success.
>
> Could you try it?
> I've tested it with -ac17, and it applies to 2.4.2 cleanly.

On 2.4.2:

Before:
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 3.829257 real seconds = 4278.636822 KB/sec
(35.050593 Mb/sec)

After either of your patches, the result was the same, sorry.

I also applied the patch to 2.4.3 and still got the better result with
it, although compiling the kernel still improved the performance.

First:

[root@ropogw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.994121 real seconds = 8216.151377 KB/sec
(67.306712 Mb/sec)
[root@ropogw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.995773 real seconds = 8209.350462 KB/sec
(67.250999 Mb/sec)
[root@ropogw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.989680 real seconds = 8234.489968 KB/sec
(67.456942 Mb/sec)

(start to compile kernel on other console)

[root@ropogw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.072744 real seconds = 15272.982184 KB/sec
(125.116270 Mb/sec)
[root@ropogw test]# ./ttcp_atm -t -a -s 0.90
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 atm -> 0.90
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.140261 real seconds = 14368.640162 KB/sec
(117.70790

I also ran the test on the 3Com card:

Before kernel compiling, patch applied or not:

ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 udp -> not.for.your.eyes
ttcp-t: socket
ttcp-t: 16777216 bytes in 2.218013 real seconds = 7386.791691 KB/sec
(60.512598 Mb/sec)

While compiling the kernel:

ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5013 udp -> not.for.your.eyes
ttcp-t: socket
ttcp-t: 16777216 bytes in 1.428264 real seconds = 11471.268617 KB/sec
(93.972633 Mb/sec)

Thanks,

Sampsa Ranta
[email protected]

2001-03-16 15:24:21

by Manfred Spraul

Subject: Re: Performance is weird (fwd) -> results

Sampsa Ranta wrote:
>
> After either of your patches, the result was the same, sorry.
>
Is apm or acpi running?

--
Manfred

2001-03-17 13:56:41

by Sampsa Ranta

Subject: Re: Performance is weird (fwd) -> results

On Fri, 16 Mar 2001, Manfred Spraul wrote:

> Sampsa Ranta wrote:
> >
> > After either of your patches, the result was the same, sorry.
> >
> Is apm or acpi running?

No. I tried both SMP and non-SMP versions of the kernel; the machine is,
however, a single-processor Athlon 900. CONFIG_ACPI is not set and
CONFIG_APM is not set. 2.4.3pre4 still does 66M/s without "the load" and
124M/s+ with load. However, there is a big difference between 2.4.2 and
2.4.3pre: about 33M/s versus 66M/s.

- Sampsa Ranta
[email protected]

2001-03-24 12:43:54

by Sampsa Ranta

Subject: Re: Performance is weird (fwd)


Hi,

I got my ATM cards working properly: both the LE155 and the PCA200E gave
good throughput once I found the problem. There was a fancy option in the
BIOS setup, described as something like "Enhance chip performance"; after
turning it on, everything started to rock. So my best guess is that
something was terribly wrong in my PCI or chipset setup and this option
fixed the configuration.

My benchmark results can be found at
http://www.netsonic.fi/~sampsa/netpipe/ if anyone else is interested.
Based on them, I would say the ForeRunner LE155 and PCA-200E and their
Linux drivers benchmark equally well on throughput; I have no idea about
CPU usage, though.

Thanks,
Sampsa

