2003-03-20 14:47:12

by Srihari Vijayaraghavan

Subject: Bottleneck on /dev/null

Linux-2.4.latest
PACKET_MMAP
PCAP_FRAMES=max for tcpdump-3.8/libpcap-0.8 (from http://public.lanl.gov/cpw/)
e1000 driver

2 * Xeon 2800 MHz, 512 KB L2
1 GB RAM
70 GB HW RAID-0 on SmartArray 5i
2 * 2 port Intel GigE cards (only using 1 per card for the testing purposes)

Capturing all packets and writing them to /dev/null causes more packet drops than
writing to the hard drives (approx 40,000 packets/sec of 70 bytes each, for a
couple of minutes). I will have a comparison between those figures in a day or
two, but the drop rate with /dev/null was well above that of the SCSI hard
drives. I thought writing to /dev/null (even multiple writers simultaneously)
should be faster than the fastest SCSI drives out there :) Interesting.

(And yes, I see plenty of "errors", "dropped", and "overruns" in the ifconfig
stats on those interfaces. %system is over 80%, and tcpdump goes into the "D"
state many times. Simon Kirby suggested using IRQ smp_affinity to see if that
helps reduce the %system time. A well-optimised e1000 would definitely help, as
tg3 handles this very well.)

I mean to test this /dev/null behaviour on a 2 * tg3 configuration, perhaps in
a couple of days. (The 2 tg3 cards, with out-of-the-box NAPI support on
2.4.latest, don't lose a single packet even while writing to the hard drives,
so I didn't bother to test them against /dev/null.)

BTW I found a 2.5.51 backport of e1000 NAPI support at
http://havoc.gtf.org/lunz/linux/net/
Does anyone know of a more recent or improved backport for 2.4.latest
(including 2.4.21-pre5 or -pre6)? Patches for testing or URLs are welcome.

Thanks
--
Hari
[email protected]



2003-03-20 15:42:02

by Richard B. Johnson

Subject: Re: Bottleneck on /dev/null

On Fri, 21 Mar 2003, Srihari Vijayaraghavan wrote:

> Linux-2.4.latest
> PACKET_MMAP
> PCAP_FRAMES=max for tcpdump-3.8/libpcap-0.8 (from http://public.lanl.gov/cpw/)
> e1000 driver
>
> 2 * Xeon 2800 MHz, 512 KB L2
> 1 GB RAM
> 70 GB HW RAID-0 on SmartArray 5i
> 2 * 2 port Intel GigE cards (only using 1 per card for the testing purposes)
>
> Capturing all packets and writing them to /dev/null causes more packet drops than
> writing to the hard drives (approx 40,000 packets/sec of 70 bytes each, for a
> couple of minutes). I will have a comparison between those figures in a day or
> two, but the drop rate with /dev/null was well above that of the SCSI hard
> drives. I thought writing to /dev/null (even multiple writers simultaneously)
> should be faster than the fastest SCSI drives out there :) Interesting.
>
> (And yes, I see plenty of "errors", "dropped", and "overruns" in the ifconfig
> stats on those interfaces. %system is over 80%, and tcpdump goes into the "D"
> state many times. Simon Kirby suggested using IRQ smp_affinity to see if that
> helps reduce the %system time. A well-optimised e1000 would definitely help, as
> tg3 handles this very well.)
>
> I mean to test this /dev/null behaviour on a 2 * tg3 configuration, perhaps in
> a couple of days. (The 2 tg3 cards, with out-of-the-box NAPI support on
> 2.4.latest, don't lose a single packet even while writing to the hard drives,
> so I didn't bother to test them against /dev/null.)
>
> BTW I found a 2.5.51 backport of e1000 NAPI support at
> http://havoc.gtf.org/lunz/linux/net/
> Does anyone know of a more recent or improved backport for 2.4.latest
> (including 2.4.21-pre5 or -pre6)? Patches for testing or URLs are welcome.
>
> Thanks
> --
> Hari
> [email protected]

You are correct, and here's a little program to show the problem and to
demonstrate when it gets corrected.




#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>

#define BUF_LEN 0x10000

unsigned long amount = 0L;

void timer(int unused) {
    fprintf(stdout, "Kilobytes / sec = %lu\n", amount >> 10);
    fflush(stdout);
    amount = 0L;
    alarm(1);
}

int main() {
    int fd, len;
    char *buf;
    if((fd = open("/dev/null", O_RDWR)) < 0)
        exit(EXIT_FAILURE);
    if((buf = malloc(BUF_LEN)) == NULL)
        exit(EXIT_FAILURE);
    (void)signal(SIGALRM, timer);
    alarm(1);
    while((len = write(fd, buf, BUF_LEN)) > 0)
        amount += (unsigned long) len;
    free(buf);
    return 0;
}

On my SMP system, using kernel version 2.4.20, I get:

Kilobytes / sec = 3987136
Kilobytes / sec = 4101760
Kilobytes / sec = 1984
Kilobytes / sec = 4138304
Kilobytes / sec = 33664
Kilobytes / sec = 4189888
Kilobytes / sec = 432
Kilobytes / sec = 4122880

That is an awfully big variation, and I'm the only one on
this system!! If I disconnect the network line so I
truly get the entire attention of both CPUs, I get:

Kilobytes / sec = 3717402
Kilobytes / sec = 320250
Kilobytes / sec = 239501
Kilobytes / sec = 1893527
Kilobytes / sec = 23
Kilobytes / sec = 6783920
Kilobytes / sec = 1296789
Kilobytes / sec = 5109001


What the :F: makes the thing stumble down to 23 kilobytes
per second? 'Taint right.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-03-20 16:10:21

by Tim Schmielau

Subject: Re: Bottleneck on /dev/null

On Thu, 20 Mar 2003, Richard B. Johnson wrote:

> unsigned long amount = 0L;

try 'volatile' to get the deviation down...

Tim

2003-03-20 16:23:01

by Bernd Petrovitsch

Subject: Re: Bottleneck on /dev/null

Tim Schmielau <[email protected]> wrote:
>On Thu, 20 Mar 2003, Richard B. Johnson wrote:
>
>> unsigned long amount = 0L;
>
>try 'volatile' to get the deviation down...

... and try "long long" to avoid an overrun.

Bernd
--
Bernd Petrovitsch Email : [email protected]
g.a.m.s gmbh Fax : +43 1 205255-900
Prinz-Eugen-Straße 8 A-1040 Vienna/Austria/Europe
LUGA : http://www.luga.at


2003-03-20 16:48:15

by Richard B. Johnson

Subject: Re: Bottleneck on /dev/null

On Thu, 20 Mar 2003, Bernd Petrovitsch wrote:

> Tim Schmielau <[email protected]> wrote:
> >On Thu, 20 Mar 2003, Richard B. Johnson wrote:
> >
> >> unsigned long amount = 0L;
> >
> >try 'volatile' to get the deviation down...
>
> .. and try "long long" to avoid an overrun.
>
> Bernd
> --

Yes, that's better. It may also have been a diagnostic error
in the code of the first person reporting this.

The data rate is so high that I might have wrapped several
times! I didn't think it would be that high: 2 to 3
gigabytes/second, not over 4 GB/s (with 130 MHz RAM, no less)


#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>

#define BUF_LEN 0x10000

volatile unsigned long long amount = 0LL;

void timer(int unused) {
    fprintf(stdout, "Kilobytes / sec = %llu\n", amount >> 10);
    fflush(stdout);
    amount = 0LL;
    alarm(1);
}

int main() {
    int fd, len;
    char *buf;
    if((fd = open("/dev/null", O_RDWR)) < 0)
        exit(EXIT_FAILURE);
    if((buf = malloc(BUF_LEN)) == NULL)
        exit(EXIT_FAILURE);
    (void)signal(SIGALRM, timer);
    alarm(1);
    while((len = write(fd, buf, BUF_LEN)) > 0)
        amount += (unsigned long long) len;
    free(buf);
    return 0;
}


With network:

Kilobytes / sec = 46170080
Kilobytes / sec = 46171576
Kilobytes / sec = 46172944
Kilobytes / sec = 46172192
Kilobytes / sec = 46171840
Kilobytes / sec = 46171576

Without network:

Kilobytes / sec = 46128168
Kilobytes / sec = 46128200
Kilobytes / sec = 46128152
Kilobytes / sec = 46128142
Kilobytes / sec = 46128208
Kilobytes / sec = 46128198
Kilobytes / sec = 46128202


It's interesting that the data rate is higher with the network
plugged in and receiving all those M$ broadcast messages. But, as
expected, it's more stable without.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-03-20 16:56:31

by Matti Aarnio

Subject: Re: Bottleneck on /dev/null

On Thu, Mar 20, 2003 at 12:01:06PM -0500, Richard B. Johnson wrote:
...
> Yes, that's better. It may also have been a diagnostic error
> in the code of the first person reporting this.
>
> The data rate is so high that I might have wrapped several
> times! I didn't think it would be that high: 2 to 3
> gigabytes/second, not over 4 GB/s (with 130 MHz RAM, no less)

Furthermore, on Linux you are really measuring syscall overhead:
a write to /dev/null never does memory transfers of any kind
from user space to the kernel.

...
> Its interesting that the data-rate is higher with the network
> plugged in and getting all those M$ broadcast messages. But, as
> expected, its more stable without.

Quite so.

> Cheers,
> Dick Johnson

/Matti Aarnio