2001-07-22 03:33:22

by Jimmie Mayfield

Subject: Interesting disk throughput performance problem

Hi. I'm running into some disk throughput issues that I can't explain.
Hopefully someone reading this can offer an explanation.

One of my machines is running 2.4.5 and has 2 hard drives: a 7200 rpm
ATA100 Maxtor and a 5400 rpm ATA33 IBM. Each drive is a master on its own
controller (AMI CMD649 as found on the IWill KT266-R). Both drives contain
reiserfs 3.6x filesystems.

By all local benchmarks, the 7200 rpm drive is the faster drive. But this
doesn't seem to be the case for large files originating from remote clients.
Witness:

My crude test involves scp'ing a 100MB file from another machine on my home
network over 100bT ethernet.

1) scp to the 5400rpm drive: roughly 10MB/sec.
2) scp to the 7200rpm drive: roughly 2MB/sec.
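To put those two rates in perspective, they can be converted into transfer times and compared with the raw 100 Mbit/s line rate. This is only a rough sketch: the function name is illustrative, and protocol and encryption overhead are ignored.

```python
def transfer_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time in seconds to move size_mb megabytes at a steady rate."""
    return size_mb / rate_mb_per_s

# 100 Mbit/s Ethernet carries at most 100 / 8 = 12.5 MB/s before any
# protocol overhead, so roughly 10 MB/s is already close to wire speed.
LINE_RATE_MB_PER_S = 100 / 8

FILE_MB = 100
for label, rate in (("5400rpm", 10.0), ("7200rpm", 2.0)):
    secs = transfer_seconds(FILE_MB, rate)
    share = rate / LINE_RATE_MB_PER_S
    print(f"{label}: {secs:.0f} s, {share:.0%} of line rate")
```

So the 100MB copy takes about 10 seconds to the 5400rpm drive and about 50 seconds to the 7200rpm drive.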

I've tried 'tail' and 'notail' mount options with no change (as expected since
this is a single large file). I suspect that the machine would become CPU-bound
somewhere in the 20MB/sec range (see below for my reasoning).

I see the same sort of behavior using Samba though not nearly as
pronounced (the 5400rpm drive is merely 2x as fast as the 7200rpm drive).

Okay. Since the test involved 2 separate drives with different geometries,
I figured this might be due to physical block location. Perhaps the file
is getting allocated to the fastest cylinders on the 5400 rpm drive and
the slowest cylinders on the 7200 rpm drive. Or it could be a fragmentation
issue.

So I tried the test locally: with the file stored on the 5400rpm drive,
scp it to localhost and write it to the 7200rpm drive. Results were a little
below 10MB/sec (CPU near 100% presumably due to encrypting/decrypting on
the fly).

Any ideas why the 7200rpm drive performs so poorly for remote clients but
performs wonderfully well when those same operations are performed locally?

Jimmie


2001-07-22 09:24:26

by Hans Reiser

Subject: Re: Interesting disk throughput performance problem

I'm just guessing here, but is write caching active on one but not the other?

Hans


> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2001-07-22 10:30:41

by Jakob Oestergaard

Subject: Re: Interesting disk throughput performance problem

On Sat, Jul 21, 2001 at 11:33:13PM -0400, Jimmie Mayfield wrote:
> Hi. I'm running into some disk throughput issues that I can't explain.
> Hopefully someone reading this can offer an explanation.
>
> One of my machines is running 2.4.5 and has 2 hard drives: a 7200 rpm
> ATA100 Maxtor and a 5400 rpm ATA33 IBM. Each drive is a master on its own
> controller (AMI CMD649 as found on the IWill KT266-R). Both drives contain
> reiserfs 3.6x filesystems.
>
> By all local benchmarks, the 7200 rpm drive is the faster drive. But this
> doesn't seem to be the case for large files originating from remote clients.
> Witness:
....
> So I tried the test locally: with the file stored on the 5400rpm drive,
> scp it to localhost and write it to the 7200rpm drive. Results were a little
> below 10MB/sec (CPU near 100% presumably due to encrypting/decrypting on
> the fly).
>
> Any ideas why the 7200rpm drive performs so poorly for remote clients but
> performs wonderfully well when those same operations are performed locally?

This is a wild guess:

Try cat /proc/interrupts

Would the 7200 rpm drive controller happen to share an IRQ with your NIC ?

If so, something is horribly wrong since that shouldn't give that kind of
performance penalty. But it's my best guess :)

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2001-07-22 10:44:56

by Mike Black

Subject: Re: Interesting disk throughput performance problem

Not enough info (plenty for guessing though :-)

First off show "hdparm -i /dev/hd_" and "hdparm /dev/hd_" -- this will
ensure both drives have things like DMA, etc.
Next -- you didn't say what benchmarks you're using locally.
And as the previous poster said provide "cat /proc/interrupts".
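For anyone who wants a quick, repeatable local number to post alongside the hdparm output, something along these lines could serve as a crude sequential-write check. It is only a sketch and not the benchmark used in the original post; the function name, the 64 KB block size and the default file size are all assumptions.

```python
import os
import tempfile
import time

def write_throughput_mb_s(directory: str, total_mb: int = 100, block_kb: int = 64) -> float:
    """Write total_mb of zeros to a temp file in `directory`, fsync it, return MB/s."""
    block = bytes(block_kb * 1024)
    blocks = (total_mb * 1024 * 1024) // len(block)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to push data to the drive
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return (blocks * len(block)) / (1024 * 1024) / elapsed
```

Run it once with a directory on each drive and compare the two figures. The drive's own write cache still influences the result, so treat the numbers as relative rather than absolute.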


2001-07-22 14:22:27

by toon

Subject: Re: Interesting disk throughput performance problem

On Sun, Jul 22, 2001 at 01:20:24PM +0400, Hans Reiser wrote:
>
> I'm just guessing here, but is write caching active on one but not the other?

What do you mean?
That he should activate the write caching in all his drives?
I thought that was plain stupid and wrong, because the filesystem
expects journal data to hit the disk immediately. The journal
is written synchronously, isn't it?
So I would expect you to advise everybody to deactivate any
caching in drives and controllers. And then we are back with
Jimmie's question: the throughput performance of his drives
is bad, and the (theoretically) fastest drive is worst.

Regards,
Toon (running a newsserver with reiserfs-3.5.32 on top of LVM-0.9.1-beta7 on
top of DAC960 hardware RAID5 with write-caching turned off, using linux-2.2.19)


2001-07-22 14:45:22

by Hans Reiser

Subject: Re: Interesting disk throughput performance problem

toon wrote:
>
> On Sun, Jul 22, 2001 at 01:20:24PM +0400, Hans Reiser wrote:
> >
> > I'm just guessing here, but is write caching active on one but not the other?
>
> What do you mean?
> That he should activate the write caching in all his drives?

Didn't say that. I simply asked a question to collect data before formulating any opinion.

> I thought that was plain stupid and wrong, because the filesystem
> expects journal data to hit the disk immediately. The journal
> is written synchronously, isn't it?
> So I would expect you to advise everybody to deactivate any
> caching in drives and controllers. And then we are back with
> Jimmie's question: the throughput performance of his drives
> is bad, and the (theoretically) fastest drive is worst.

A difference that large in throughput is not what I would first guess.


2001-07-24 15:06:47

by Jimmie Mayfield

Subject: Re: Interesting disk throughput performance problem

On Sun, Jul 22, 2001 at 06:44:38AM -0400, Mike Black wrote:
> Not enough info (plenty for guessing though :-)
>
> First off show "hdparm -i /dev/hd_" and "hdparm /dev/hd_" -- this will
> ensure both drives have things like DMA, etc.
> Next -- you didn't say what benchmarks you're using locally.
> And as the previous poster said provide "cat /proc/interrupts".

/proc/interrupts looks like this:
           CPU0
  0:   17319250          XT-PIC  timer
  1:      85980          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:    1084144          XT-PIC  ide3, usb-uhci, usb-uhci
  6:        811          XT-PIC  floppy
  9:    1284463          XT-PIC  mga@PCI:1:0:0
 10:     142416          XT-PIC  eth0
 11:    8296678          XT-PIC  eth1, C-Media PCI CM8738
 12:     385915          XT-PIC  PS/2 Mouse
 15:          7          XT-PIC  ide1
NMI:          0
LOC:   17319087
ERR:       3022
MIS:          0
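
The shared-IRQ question can be answered mechanically from a listing like the one above. The following is a minimal sketch that assumes the single-CPU layout shown here (one count column, then the controller type, then a comma-separated device list); the function name and the trimmed sample are illustrative only.

```python
def shared_irqs(interrupts_text: str) -> dict[str, list[str]]:
    """Map each IRQ that has more than one device to its device list."""
    shared = {}
    for line in interrupts_text.splitlines():
        irq, sep, rest = line.strip().partition(":")
        if not sep or not irq.isdigit():
            continue  # skip the CPU0 header and the NMI/LOC/ERR/MIS counters
        fields = rest.split(None, 2)  # count, controller type, device list
        if len(fields) < 3:
            continue
        devices = [d.strip() for d in fields[2].split(",")]
        if len(devices) > 1:
            shared[irq] = devices
    return shared

# A few lines copied from the listing above.
SAMPLE = """\
 5: 1084144 XT-PIC ide3, usb-uhci, usb-uhci
10: 142416 XT-PIC eth0
11: 8296678 XT-PIC eth1, C-Media PCI CM8738
15: 7 XT-PIC ide1
"""
print(shared_irqs(SAMPLE))
```

In this listing ide3 shares IRQ 5 with the two usb-uhci controllers and eth1 shares IRQ 11 with the C-Media sound chip, so neither IDE channel shares an interrupt with a NIC.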


I would like to make a correction to my original post. In that post I said
that the drives are "masters" on their own controller. This was false.
They share a controller (CMD 649) with the Maxtor drive being "master" and
the IBM drive being "slave". To test if this was the problem, I reran the
tests (see URL below) with the IBM drive completely disconnected. I didn't
notice any difference.

I collected my benchmarks and tests into a simple webpage to avoid cluttering
this list. Hopefully someone will see something obvious that I've
misconfigured.

http://sackheads.org/~mayfield/dp.html


Jimmie

--
Jimmie Mayfield
http://www.sackheads.org/mayfield email: [email protected]
My mail provider does not welcome UCE -- http://www.sackheads.org/uce

2001-07-24 17:11:42

by Tim Schmielau

Subject: Re: Interesting disk throughput performance problem

Did you monitor network traffic for lost packets?

Just another wild guess: Considering the weakness of the VIA chipset,
maybe the NIC cannot do enough DMA during UDMA5 transfers.
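
One way to look for lost packets is to compare the error and drop counters in /proc/net/dev before and after a transfer. Below is a minimal sketch that assumes the usual layout of that file (two header lines, then eight receive and eight transmit counters per interface); the function name is illustrative and the sample numbers are made up, not taken from the machine in question.

```python
def rx_tx_problems(net_dev_text: str) -> dict[str, dict[str, int]]:
    """Return rx/tx error and drop counters per interface from /proc/net/dev text."""
    result = {}
    for line in net_dev_text.splitlines()[2:]:  # the first two lines are headers
        name, sep, stats = line.partition(":")
        if not sep:
            continue
        f = stats.split()
        if len(f) < 16:
            continue
        # Receive: bytes packets errs drop ...; transmit starts at index 8.
        result[name.strip()] = {
            "rx_errs": int(f[2]), "rx_drop": int(f[3]),
            "tx_errs": int(f[10]), "tx_drop": int(f[11]),
        }
    return result

# Hypothetical counters, for illustration only.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0
  eth1: 5000 50 3 1 0 0 0 0 6000 60 0 2 0 0 0 0
"""
print(rx_tx_problems(SAMPLE))
```

Counters that climb during the slow scp but stay flat during the fast one would point toward the network side rather than the disk.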

Tim
