2002-06-21 14:03:09

by T.Raykoff

Subject: 2.4.18 IDE channels block each other under load?

Can someone tell me what is going on here?

dd if=/dev/zero of=/dev/hda bs=1024 count=1000000

then in another vt:
fdisk /dev/hdc, then immediately press "q".

fdisk "hangs" for a long, long time.
ps -aux says state of dd and fdisk are both "D"
strace says fdisk is hanging on the close()
/proc/interrupts tell me that ide1 (/dev/hdc) is getting no
int activity for a long, long time. ide0 is very busy.
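
One way to put numbers on the /proc/interrupts observation is to take two snapshots a few seconds apart and compare the counters for ide0 and ide1. The following is only a sketch: the sample snapshots are made up for illustration, and the parser assumes the usual layout of an IRQ label, per-CPU counters, then controller and device names.

```python
# Hypothetical snapshots of /proc/interrupts, invented for illustration.
SAMPLE_BEFORE = """\
           CPU0       CPU1
 14:     120000     118000   IO-APIC-edge  ide0
 15:        900        850   IO-APIC-edge  ide1
"""
SAMPLE_AFTER = """\
           CPU0       CPU1
 14:     125000     121000   IO-APIC-edge  ide0
 15:        900        850   IO-APIC-edge  ide1
"""


def irq_counts(interrupts_text, names):
    """Return {name: interrupts summed over all CPUs} for the given device names."""
    totals = {name: 0 for name in names}
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data rows start with a label such as "14:"; the CPU header row does not.
        if not fields or not fields[0].endswith(":"):
            continue
        # The per-CPU counters are the leading run of integer fields.
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break
            counts.append(int(field))
        devices = [f.rstrip(",") for f in fields[1 + len(counts):]]
        for name in names:
            if name in devices:
                totals[name] += sum(counts)
    return totals


def irq_delta(before_text, after_text, names=("ide0", "ide1")):
    """Interrupts that arrived between two snapshots, per device name."""
    before = irq_counts(before_text, names)
    after = irq_counts(after_text, names)
    return {name: after[name] - before[name] for name in names}


if __name__ == "__main__":
    print(irq_delta(SAMPLE_BEFORE, SAMPLE_AFTER))
```

On a live box the two texts would come from reading /proc/interrupts twice while the dd is running. A delta of zero for ide1 matches the "no int activity" symptom.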

It is not just dd/fdisk. Any intensive writes on one IDE
channel (direct to the hd? device) seem to block any IO on
the other device.

Intel SAI2 MB, ServerWorks IDE chipset, 2.4.18, two IDE
hard drives /dev/hda and /dev/hdc, 1024MB RAM, RH73 kernel
build.

Also seen on Promise PDCx IDE controllers hanging off the PCI.

hdparm settings appear to have no influence on this behavior.

Thanks,
TR.




2002-06-21 14:17:14

by Roy Sigurd Karlsbakk

Subject: Re: 2.4.18 IDE channels block each other under load?

On Friday 21 June 2002 16:03, Taavo Raykoff wrote:
> Can someone tell me what is going on here?
> <snip>
> hdparm settings appear to have no influence on this behavior.

Are you running DMA on these controllers? It looks like they're running PIO

Can you check the output of 'hdparm /dev/hd_' and "cat
/proc/ide/hd_/settings"?
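
A rough sketch of checking the DMA flag from the settings file follows. It assumes the file is a whitespace-separated table with name and value in the first two columns and that the flag is called using_dma. The sample text is hypothetical, not output from the machine in question.

```python
# Hypothetical contents of /proc/ide/hdX/settings, assumed layout.
SAMPLE_SETTINGS = """\
name                    value           min             max             mode
----                    -----           ---             ---             ----
io_32bit                0               0               3               rw
using_dma               1               0               1               rw
"""


def parse_settings(text):
    """Map setting name -> integer value, skipping header and separator rows."""
    settings = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            settings[fields[0]] = int(fields[1])
        except ValueError:
            # Header ("name value ...") and dashed separator rows end up here.
            continue
    return settings


def dma_enabled(text):
    """True if the (assumed) using_dma setting is present and set to 1."""
    return parse_settings(text).get("using_dma") == 1


if __name__ == "__main__":
    print(dma_enabled(SAMPLE_SETTINGS))
```
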
--
Roy Sigurd Karlsbakk, Datavaktmester

Computers are like air conditioners.
They stop working when you open Windows.

2002-06-21 15:14:34

by T.Raykoff

Subject: Re: 2.4.18 IDE channels block each other under load?

Yes. Baffling. hdparm and /proc/ide/hdx both show dma_mode is on.

Normally, I have these running with -X69 (UltraDMA 100).
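
For reference, the -X value follows the convention documented in the hdparm man page: PIO mode n is 8+n, multiword DMA mode n is 32+n, and UltraDMA mode n is 64+n. So 69 is UltraDMA mode 5. A small sketch of the arithmetic (the function and table names are made up here, and the rates are the nominal interface maximums, not measured throughput):

```python
# Nominal UltraDMA interface rates in MB/s, indexed by mode number.
UDMA_MB_PER_S = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}


def decode_xfer_mode(value):
    """Split an hdparm -X value into (family, mode number)."""
    if value >= 64:
        return ("udma", value - 64)
    if value >= 32:
        return ("mdma", value - 32)
    if value >= 16:
        return ("sdma", value - 16)
    if value >= 8:
        return ("pio", value - 8)
    return ("default", value)


if __name__ == "__main__":
    family, mode = decode_xfer_mode(69)
    print(family, mode, UDMA_MB_PER_S.get(mode))
```
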

TR.


On Fri, 21 Jun 2002, Roy Sigurd Karlsbakk wrote:

> On Friday 21 June 2002 16:03, Taavo Raykoff wrote:
> > Can someone tell me what is going on here?
> > <snip>
> > hdparm settings appear to have no influence on this behavior.
>
> Are you running DMA on these controllers? It looks like they're running PIO
>
> Can you check the output of 'hdparm /dev/hd_' and "cat
> /proc/ide/hd_/settings"?
> --
> Roy Sigurd Karlsbakk, Datavaktmester
>
> Computers are like air conditioners.
> They stop working when you open Windows.
>
>

2002-07-22 20:49:53

by Aniket Malatpure

Subject: Re: 2.4.18 IDE channels block each other under load?

Hi

In the IDE Protocol, only 1 DMA transfer can take place on the bus at
one time.
So if the IDE bus is busy doing the DMA transfer for 1 drive, the other
drive cannot receive any commands.

I guess what happens is that the IDE driver in the kernel doesn't
distribute commands in a "fair" manner between drive 0 & drive 1. If it
receives commands for drive 0, it keeps sending those commands to the
drive 0, one after the other, without checking if there is a command for
drive 1.

A driver distributing commands in a "fair" manner would send 1 command
to drive 0, check if there is a command for drive 1, if there is, it
would send a command to drive 1, then send the next command to drive 0
and so on...
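
The two policies described above can be sketched as a toy model. This is not the kernel's IDE driver: the queue contents and function names are invented, and the "unfair" version is simplified to draining a static queue.

```python
from collections import deque


def make_queues():
    """Hypothetical pending commands: a long write burst and a single read."""
    return {
        "drive0": deque(["w1", "w2", "w3"]),
        "drive1": deque(["r1"]),
    }


def greedy_dispatch(queues):
    """Unfair policy: drain one drive's queue completely before the next."""
    order = []
    for drive, queue in queues.items():
        while queue:
            order.append((drive, queue.popleft()))
    return order


def fair_dispatch(queues):
    """Round-robin policy: at most one command per drive on each pass."""
    order = []
    while any(queues.values()):
        for drive, queue in queues.items():
            if queue:
                order.append((drive, queue.popleft()))
    return order


if __name__ == "__main__":
    print(greedy_dispatch(make_queues()))
    print(fair_dispatch(make_queues()))
```

In the greedy order the drive 1 command is issued last. In the round-robin order it is issued second, right after the first drive 0 command.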

The above is a guess...would anyone care to verify?

Thanks
Aniket







Taavo Raykoff wrote:
>
> Can someone tell me what is going on here?
>
> dd if=/dev/zero of=/dev/hda bs=1024 count=1000000
>
> then in another vt:
> fdisk /dev/hdc, then immediately press "q".
>
> fdisk "hangs" for a long, long time.
> ps -aux says state of dd and fdisk are both "D"
> strace says fdisk is hanging on the close()
> /proc/interrupts tell me that ide1 (/dev/hdc) is getting no
> int activity for a long, long time. ide0 is very busy.
>
> It is not just dd/fdisk. Any intensive writes on one IDE
> channel (direct to the hd? device) seem to block any IO on
> the other device.
>
> Intel SAI2 MB, ServerWorks IDE chipset, 2.4.18, two IDE
> hard drives /dev/hda and /dev/hdc, 1024MB RAM, RH73 kernel
> build.
>
> Also seen on Promise PDCx IDE controllers hanging off the PCI.
>
> hdparm settings appear to have no influence on this behavior.
>
> Thanks,
> TR.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2002-07-22 21:11:13

by Andre Hedrick

Subject: Re: 2.4.18 IDE channels block each other under load?


One of the issues causing problems is that the driver does not pull its queue
off the channel. This is the ideal model as one is permitted to thread
the channel wisely. This was scheduled for 2.5 but now I have to run a
shorter development cycle to get it correct for the remainder of 2.4.

Taskfile is designed to deal with a threaded model, but that is only the
base core. Also with the advent of SATA, the desire to make a PATA core
threaded is not so pressing.

Cheers,

On Mon, 22 Jul 2002, Aniket Malatpure wrote:

> Hi
>
> In the IDE Protocol, only 1 DMA transfer can take place on the bus at
> one time.
> So if the IDE bus is busy doing the DMA transfer for 1 drive, the other
> drive cannot receive any commands.
>
> I guess what happens is that the IDE driver in the kernel doesn't
> distribute commands in a "fair" manner between drive 0 & drive 1. If it
> receives commands for drive 0, it keeps sending those commands to the
> drive 0, one after the other, without checking if there is a command for
> drive 1.
>
> A driver distributing commands in a "fair" manner would send 1 command
> to drive 0, check if there is a command for drive 1, if there is, it
> would send a command to drive 1, then send the next command to drive 0
> and so on...
>
> The above is a guess...would anyone care to verify?
>
> Thanks
> Aniket

Andre Hedrick
LAD Storage Consulting Group

2002-07-22 23:01:08

by T.Raykoff

Subject: Re: 2.4.18 IDE channels block each other under load?

On Mon, 22 Jul 2002, Mark Hahn wrote:

> > I guess what happens is that the IDE driver in the kernel doesn't
> > distribute commands in a "fair" manner between drive 0 & drive 1. If it
>
> but this is between hda and hdc, which are on different IDE channels.
>
> I'm guessing this is straightforward memory thrashing - the dd is
> abusing the system, and getting anything else done is hard.
> for instance, what happens if bs is reasonable instead?
> what happens if the load is read, rather than write?

Not sure about the bs dependency, but I seem to remember that it had
little effect.

This lockup only happens under write load. Heavy reads don't cause the
prob. Hmmmm.

Not sure that it really is memory thrashing. The box is unloaded and
really has about 1GB free, to use for buffer as it sees fit. No I/O to
the swap file going on, cause there is no mounted swap.

Check this out:

dd if=/dev/zero of=/dev/hda bs=1024

then:

fdisk /dev/hdc

"q"

fdisk blocks in the close() call.... for well over 15 minutes!

As soon as dd ends cause /dev/hda is at EOF, fdisk::close() returns in a
moment.

Doesn't sound like simple system abuse to me.

Taavo.




2002-07-30 16:21:00

by Bill Davidsen

Subject: Re: 2.4.18 IDE channels block each other under load?

On Mon, 22 Jul 2002, T.Raykoff wrote:

> This lockup only happens under write load. Heavy reads don't cause the
> prob. Hmmmm.
>
> Not sure that it really is memory thrashing. The box is unloaded and
> really has about 1GB free, to use for buffer as it sees fit. No I/O to
> the swap file going on, cause there is no mounted swap.

The aa kernels can be tuned to reduce this to a great extent. It seems to
happen on machines with large memory, so if you don't have that this is
not part of your problem. When doing heavy writes, a lot of data gets
buffered, then bdflush checks and sees that there is a shitload of data to
write and queues it. I used to see it with mkisofs CD images, where I
would get one or even two images in memory before the write started.

Andrea gave me some tips on tuning bdflush, although they work best on his
kernels they help a lot on other kernels as well. The trick seems to be to
check more often and be more aggressive about keeping the size of the
buffered data down by writing at a low threshold. No IDE system is going to
like trying to write 500MB all at once, and the best you can do until low
level changes go in is to start sooner getting the data out.
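
A back-of-the-envelope sketch of why a lower threshold helps. This is a deliberately crude model, not how bdflush actually decides: it assumes nothing is written until dirty data reaches a fixed percentage of RAM. The 1024 MB figure is from the original report; the 50% and 10% thresholds and the 20 MB/s disk rate are assumptions chosen only to make the arithmetic concrete.

```python
def backlog_at_flush_start(ram_mb, dirty_threshold_pct):
    """Dirty data (MB) accumulated before writeback begins, in this toy model."""
    return ram_mb * dirty_threshold_pct / 100.0


def drain_seconds(backlog_mb, disk_mb_per_s):
    """Time to write the backlog out at a sustained disk rate."""
    return backlog_mb / disk_mb_per_s


if __name__ == "__main__":
    ram_mb = 1024          # from the original report
    disk_mb_per_s = 20.0   # assumed sustained write rate
    for pct in (50, 10):   # assumed high and low thresholds
        backlog = backlog_at_flush_start(ram_mb, pct)
        print(pct, backlog, drain_seconds(backlog, disk_mb_per_s))
```

With these assumed numbers, a 50% threshold lets about 512 MB pile up before anything is written, while 10% caps the initial burst at roughly 102 MB.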

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-07-30 17:23:53

by Roman Kagan

Subject: Re: 2.4.18 IDE channels block each other under load?

Hi,

I'm by no means an expert in this, just a guess:

On Tue, Jul 30, 2002 at 05:09:22PM +0000, T.Raykoff wrote:
> This lockup only happens under write load. Heavy reads don't cause the
> prob. Hmmmm.
>
> Not sure that it really is memory thrashing. The box is unloaded and
> really has about 1GB free, to use for buffer as it sees fit. No I/O to
> the swap file going on, cause there is no mounted swap.
>
> Check this out:
>
> dd if=/dev/zero of=/dev/hda bs=1024
>
> then:
>
> fdisk /dev/hdc
>
> "q"
>
> fdisk blocks in the close() call.... for well over 15 minutes!
>
> As soon as dd ends cause /dev/hda is at EOF, fdisk::close() returns in a
> moment.

When you quit from fdisk it does a sync() right after the close(). I
suspect that fdisk gets stuck in that sync() rather than close(). (You
said strace reported close() as the last syscall - it's the last one
completed.) The write on one of the channels doesn't let sync() return.

To make sure I'd try to check (e.g. with /proc/<pid>/fd) if fdisk still
has /dev/hdc open during the dd.
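
A minimal sketch of that check, assuming a Linux /proc and enough permissions to read the target process's fd directory. The helper names are made up, and the pid of the stuck fdisk would have to come from ps.

```python
import os


def open_paths(pid):
    """Return the targets of all symlinks under /proc/<pid>/fd."""
    fd_dir = "/proc/%d/fd" % pid
    paths = []
    for name in os.listdir(fd_dir):
        try:
            paths.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return paths


def has_open(pid, target):
    """True if the process currently holds a descriptor pointing at target."""
    return target in open_paths(pid)


if __name__ == "__main__":
    # Demonstrated on this process itself; for the case in this thread,
    # pass the pid of the hung fdisk and "/dev/hdc" instead.
    print(open_paths(os.getpid()))
```

If has_open(pid, "/dev/hdc") is false while fdisk is still in state D, the close() has already completed and the process is blocked somewhere later.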

Cheers,
Roman.