2002-01-04 14:37:05

by Matti Aarnio

Subject: 2.4.17 RAID-1 EXT3 reliable to hang....

For the past few weeks I have wondered why my web-server machine
is hanging semi-regularly.

I have:
- Two 30+ GB SCSI Ultra2-Wide disks
- onboard AIC7XXX controller
- Disks with identical partition maps
- RAID-1 bound pairwise on those partitions
(RAIDTAB entries md3/md4/md5 - the md0/md1/md2 were on
another, older disk, which was removed later..)
- EXT3 filesystem on all partitions (except the 2 GB swap..)
Mounted with default options
- machine with dual-P-III 750 MHz, and 768 MB memory (3*256MB)

When the machine is up all the way, and MD disks have finished
syncing, I execute the command:

dd if=/dev/zero bs=1024k of=test.file count=8000

which will lead to a hard system hangup where the keyboard won't
react, the SCSI LED shines constantly, but nothing happens.
Right at the moment when the keyboard becomes unresponsive,
the disk LED will continue to flicker for a few seconds, but
then the flicker will stop, and the LED stays constantly on.
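
The write load generated by that dd command can be sketched in Python
for anyone who wants to vary the block size or count. This is only an
illustration: the name write_zeros, its parameters and the final fsync
are assumptions made here, and plain dd does not fsync. The dd command
above remains the actual reproducer.

```python
import os

MIB = 1024 * 1024  # dd's "bs=1024k" means one MiB per block


def write_zeros(path, count, block_size=MIB):
    """Write `count` blocks of zero bytes to `path`.

    Roughly: dd if=/dev/zero bs=<block_size> of=<path> count=<count>
    Returns the number of bytes written.
    """
    block = bytes(block_size)  # a buffer of zero bytes
    written = 0
    with open(path, "wb") as out:
        for _ in range(count):
            written += out.write(block)
        out.flush()
        # Assumption: force the data out to disk before returning.
        os.fsync(out.fileno())
    return written
```

Calling write_zeros("test.file", 8000) would match the count in the
report (8000 MiB, about 7.8 GiB), so it should only be run on a scratch
filesystem with enough free space.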

Earlier guesses at using "noapic" have had no effect on the
system hangups. The same command causes one quite soon. Even
"noapic nosmp" does hang.

A large amount of RAM may contribute, but this 3*256MB
does not need e.g. PAE mode extensions.
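
The arithmetic behind that remark, as a small sketch. The 4 GiB figure
is the limit of 32-bit physical addressing, above which PAE is needed.
The 896 MiB lowmem figure is an assumption here (the usual value with
the default 3G/1G split on 32-bit x86), and the names are illustrative
only.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

PAE_THRESHOLD = 4 * GIB   # limit of 32-bit physical addresses
LOWMEM_LIMIT = 896 * MIB  # assumption: typical lowmem with a 3G/1G split


def memory_mode(ram_bytes):
    """Classify installed RAM by the addressing support it would need."""
    if ram_bytes > PAE_THRESHOLD:
        return "pae"
    if ram_bytes > LOWMEM_LIMIT:
        return "highmem"
    return "lowmem"


installed = 3 * 256 * MIB  # three 256 MB modules = 768 MiB
```

With these assumed thresholds, memory_mode(installed) gives "lowmem":
768 MiB is well below both limits.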

/Matti Aarnio


2002-01-07 08:01:03

by Matti Aarnio

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Fri, Jan 04, 2002 at 04:36:35PM +0200, Matti Aarnio wrote:
> For the past few weeks I have wondered why my web-server machine
> is hanging semi-regularly.

Over the weekend I tried something new.

> I have:
> - Two 30+ GB SCSI Ultra2-Wide disks
> - onboard AIC7XXX controller
> - Disks with identical partition maps
> - RAID-1 bound pairwise on those partitions
> (RAIDTAB entries md3/md4/md5 - the md0/md1/md2 were on
> another, older disk, which was removed later..)
> - EXT3 filesystem on all partitions (except the 2 GB swap..)
> Mounted with default options
> - machine with dual-P-III 750 MHz, and 768 MB memory (3*256MB)

.. running 2.4.17 code compiled with "SMP" disabled, but
with APICs enabled (i.e. Local and IO-APIC).
It appears to me as:
- SMP code, SMP mode: hangup in use
- SMP code, "nosmp" boot option: hangup in use
- UP code: works

Tool chain:

$ gcc -v
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-97.1)
$ ld -V
GNU ld version 2.11.92.0.12 20011121


I have partial evidence that EXT3 may be part of the problem,
as another machine with RAID-1 disks with EXT2 filesystems
is not hanging up when running RedHat 2.4.16-0.9custom kernel.
That other machine has, however, IDE disks.

Earlier experiments with the hanging box used that same RH
kernel, and hangups were observed there too..

> When the machine is up all the way, and MD disks have finished
> syncing, I execute the command:
>
> dd if=/dev/zero bs=1024k of=test.file count=8000
>
> which will lead to a hard system hangup where the keyboard won't
> react, the SCSI LED shines constantly, but nothing happens.
> Right at the moment when the keyboard becomes unresponsive,
> the disk LED will continue to flicker for a few seconds, but
> then the flicker will stop, and the LED stays constantly on.

Of this flicker I am not entirely sure anymore.
Maybe it happens, maybe not.

I also tried SGI's kdb on 2.4.17, but when the system
hangs, "pause/break" won't react at all.

> Earlier guesses at using "noapic" have had no effect on the
> system hangups. The same command causes one quite soon. Even
> "noapic nosmp" does hang.
>
> A large amount of RAM may contribute, but this 3*256MB
> does not need e.g. PAE mode extensions.

/Matti Aarnio

2002-01-07 08:25:38

by Andrew Morton

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

Matti Aarnio wrote:
>
> I have partial evidence that EXT3 may be part of the problem,
> as another machine with RAID-1 disks with EXT2 filesystems
> is not hanging up when running RedHat 2.4.16-0.9custom kernel.
> That other machine has, however, IDE disks.

I'd be surprised if an ext3 bug could cause a freeze as solid
as this one. ext3's write submission patterns are somewhat different
from other filesystems, and we've exposed a few problems in underlying
layers in the past because of this. But who knows...

Have you enabled the NMI watchdog? nmi_watchdog=1 on the LILO
commandline?

Also, I'd be inclined to enable all the kernel debug options,
including SLAB debug.


2002-01-07 08:39:22

by Matti Aarnio

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Mon, Jan 07, 2002 at 12:19:56AM -0800, Andrew Morton wrote:
> Matti Aarnio wrote:
> > I have partial evidence that EXT3 may be part of the problem,
> > as another machine with RAID-1 disks with EXT2 filesystems
> > is not hanging up when running RedHat 2.4.16-0.9custom kernel.
> > That other machine has, however, IDE disks.
>
> I'd be surprised if an ext3 bug could cause a freeze as solid
> as this one. ext3's write submission patterns are somewhat different
> from other filesystems, and we've exposed a few problems in underlying
> layers in the past because of this. But who knows...
>
> Have you enabled the NMI watchdog? nmi_watchdog=1 on the LILO
> commandline?

# cat /proc/cmdline
auto BOOT_IMAGE=up ro root=905 BOOT_FILE=/boot/vmlinuz-2.4.17up nmi_watchdog=1

This is the apparently stable UP mode kernel, but this option
has been present in all variants, although I don't recall
what the NMI count was previously -- same or lower than that
of the LOC count in /proc/interrupts.
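
One way to compare the NMI and LOC rows is to take two snapshots of
/proc/interrupts and check that the NMI counters keep rising. The
following is a minimal sketch of that idea; the function names are
invented for illustration and the sample text is hypothetical, not
output from the machine discussed here.

```python
def parse_interrupt_counts(text):
    """Return {label: [count_cpu0, count_cpu1, ...]} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        per_cpu = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/device name columns
            per_cpu.append(int(field))
        if per_cpu:
            counts[label.strip()] = per_cpu
    return counts


def nmi_is_ticking(before, after):
    """True if the NMI count rose on every CPU between two snapshots."""
    old = parse_interrupt_counts(before).get("NMI", [])
    new = parse_interrupt_counts(after).get("NMI", [])
    return bool(old) and len(old) == len(new) and all(
        n > o for o, n in zip(old, new)
    )


# Hypothetical snapshot, for illustration only.
SAMPLE = """\
           CPU0       CPU1
  0:     500000     499000    IO-APIC-edge  timer
NMI:     999000     999000
LOC:     999100     999050
ERR:          0
"""
```

On a live system the two snapshots would be read from /proc/interrupts
a few seconds apart. If the NMI row stays flat, the watchdog is
probably not armed.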

> Also, I'd be inclined to enable all the kernel debug options,
> including SLAB debug.

CONFIG_SCSI_DEBUG=y
CONFIG_JBD_DEBUG=y
# CONFIG_DEVFS_DEBUG is not set
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_IOVIRT=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_BUGVERBOSE=y

/Matti Aarnio

2002-01-07 08:40:12

by Oliver Paukstadt

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Mon, 7 Jan 2002, Andrew Morton wrote:

> Matti Aarnio wrote:
> >
> > I have partial evidence that EXT3 may be part of the problem,
> > as another machine with RAID-1 disks with EXT2 filesystems
> > is not hanging up when running RedHat 2.4.16-0.9custom kernel.
> > That other machine has, however, IDE disks.
>
> I'd be surprised if an ext3 bug could cause a freeze as solid
> as this one. ext3's write submission patterns are somewhat different
> from other filesystems, and we've exposed a few problems in underlying
> layers in the past because of this. But who knows...
>
> Have you enabled the NMI watchdog? nmi_watchdog=1 on the LILO
> commandline?
>
> Also, I'd be inclined to enable all the kernel debug options,
> including SLAB debug.
Heavy traffic on ext3 seems to cause short system freezes.

Seems only to happen on 2 or more processor boxes.

I'm not deep into kernel nor ext3, but how is the journal flushed if
full?

Greetings Oli

+++the jungle near manaos - the amazonas full of piranhas - the birds of++
+++paradies - disapear into the green desert - for years an years we're+++
+++hungry and desperate - for the only thing worth living - the E><CESS+++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


2002-01-07 08:55:13

by Andrew Morton

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

Oliver Paukstadt wrote:
>
> Heavy traffic on ext3 seems to cause short system freezes.

This could be due to disk request elevator latency and VM imbalance.
Your application has a page dropped due to the competing write activity,
and it takes ages to be restored, due to the write activity.

> Seems only to happen on 2 or more processor boxes.

In which case the above theory is wrong.

> I'm not deep into kernel nor ext3, but how is the journal flushed if
> full?

Nothing special, really - we just pump a stream of data out to disk.
While this is happening, other processes can still attach data to the
journal without getting blocked. Up to a point. Our handling of this
is a bit sudden at present. Some people have reported benefit from
radically decreasing the buffer flushtimes. See Daniel Robbins' article
at http://www-106.ibm.com/developerworks/linux/library/l-fs8/ for this.
Yes, improvements are needed in this area. Not only in ext3.

You haven't really defined "freeze", but it's certainly different
from Matti's freeze.


2002-01-07 09:51:36

by Richard Guenther

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Mon, 7 Jan 2002, Oliver Paukstadt wrote:

> Heavy traffic on ext3 seems to cause short system freezes.

I see dropped frames while watching TV (bttv chip, xawtv in overlay mode,
XFree 4.1.0) since I started using ext3 (2.4.16&17). Always during disk
activity (IDE, unmask irq and dma enabled). From what I know of what the
bttv driver does, it seems to lose interrupts!? This doesn't happen with
ext2.

> Seems only to happen on 2 or more processor boxes.

Nope, UP Athlon.

> I'm not deep into kernel nor ext3, but how is the journal flushed if
> full?

By any chance, is some global lock held during any IO intensive part of
ext3?

Richard.

--
Richard Guenther <[email protected]>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
The GLAME Project: http://www.glame.de/

2002-01-07 10:15:16

by Andrew Morton

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

Richard Guenther wrote:
>
> On Mon, 7 Jan 2002, Oliver Paukstadt wrote:
>
> > Heavy traffic on ext3 seems to cause short system freezes.
>
> I see dropped frames while watching TV (bttv chip, xawtv in overlay mode,
> XFree 4.1.0) since I started using ext3 (2.4.16&17). Always during disk
> activity (IDE, unmask irq and dma enabled). From what I know of what the
> bttv driver does, it seems to lose interrupts!? This doesn't happen with
> ext2.

ext3 never blocks interrupts. It _may_ cause higher interrupt
latency than ext2 because of the very large linear writes which it does.
These may cause other parts of the kernel to block interrupts for longer.

However, more likely that it's a scheduling latency problem. Sigh.
I spent *ages* on the ext3 buffer writeout code and it's still not
ideal. Can you test with this patch applied?

http://www.zipworld.com.au/~akpm/linux/2.4/2.4.18-pre1/mini-ll.patch

It should go into 2.4.17 OK.

> ...
> By any chance, is some global lock held during any IO intensive part of
> ext3?

Yes, a couple. But on uniprocessor it's more a matter of the kernel
failing to context switch promptly.


2002-01-07 10:33:35

by Richard Guenther

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Mon, 7 Jan 2002, Andrew Morton wrote:

> Richard Guenther wrote:
> >
> > On Mon, 7 Jan 2002, Oliver Paukstadt wrote:
> >
> > > Heavy traffic on ext3 seems to cause short system freezes.
> >
> > I see dropped frames while watching TV (bttv chip, xawtv in overlay mode,
> > XFree 4.1.0) since I started using ext3 (2.4.16&17). Always during disk
> > activity (IDE, unmask irq and dma enabled). From what I know of what the
> > bttv driver does, it seems to lose interrupts!? This doesn't happen with
> > ext2.
>
> ext3 never blocks interrupts. It _may_ cause higher interrupt
> latency than ext2 because of the very large linear writes which it does.
> These may cause other parts of the kernel to block interrupts for longer.
>
> However, more likely that it's a scheduling latency problem. Sigh.
> I spent *ages* on the ext3 buffer writeout code and it's still not
> ideal. Can you test with this patch applied?
>
> http://www.zipworld.com.au/~akpm/linux/2.4/2.4.18-pre1/mini-ll.patch

I'll check that later tonight.

> > ...
> > By any chance, is some global lock held during any IO intensive part of
> > ext3?
>
> Yes, a couple. But on uniprocessor it's more a matter of the kernel
> failing to context switch promptly.

Umm, I doubt that - the bttv driver does overlay grabbing completely
within an interrupt handler. I'll check whether the bttv card shares an
interrupt with IDE, or maybe the driver's way of operation changed. Oh well.

Richard.

--
Richard Guenther <[email protected]>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
The GLAME Project: http://www.glame.de/

2002-01-07 13:36:07

by Alan

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

> I see dropped frames while watching TV (bttv chip, xawtv in overlay mode,
> XFree 4.1.0) since I started using ext3 (2.4.16&17). Always during disk
> activity (IDE, unmask irq and dma enabled). From what I know of what the
> bttv driver does, it seems to lose interrupts!? This doesn't happen with
> ext2.

The really important bit there is that you see dropped frames in overlay
mode. In overlay mode the hardware is copying directly. The only way you
should lose frames in overlay mode is if the chip couldn't sync to that
frame or the PCI bus was fully loaded by other traffic and the transfer
failed. There are some other corner cases too (certainly video cards can
run out of bandwidth during accelerated operations like bitblt).

Alan

2002-01-20 17:32:13

by Matti Aarnio

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Mon, Jan 07, 2002 at 02:09:29AM -0800, Andrew Morton wrote:
...
> I spent *ages* on the ext3 buffer writeout code and it's still not
> ideal. Can you test with this patch applied?
>
> http://www.zipworld.com.au/~akpm/linux/2.4/2.4.18-pre1/mini-ll.patch
>
> It should go into 2.4.17 OK.

I just tried this on 2.4.18-pre4 and it still hard-hangs
the RAID-1 + EXT3 on SMP.

/Matti Aarnio

2002-01-20 19:25:09

by John Jasen

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

Really?

Software raid?

I have a quad ppro that is doing ext3/md just fine, and it's running
2.4.17.

On Sun, 20 Jan 2002, Matti Aarnio wrote:

> Date: Sun, 20 Jan 2002 19:31:41 +0200
> From: Matti Aarnio <[email protected]>
> To: Andrew Morton <[email protected]>
> Cc: Richard Guenther <[email protected]>,
> Oliver Paukstadt <[email protected]>,
> Linux-Kernel <[email protected]>, [email protected]
> Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....
>
> On Mon, Jan 07, 2002 at 02:09:29AM -0800, Andrew Morton wrote:
> ...
> > I spent *ages* on the ext3 buffer writeout code and it's still not
> > ideal. Can you test with this patch applied?
> >
> > http://www.zipworld.com.au/~akpm/linux/2.4/2.4.18-pre1/mini-ll.patch
> >
> > It should go into 2.4.17 OK.
>
> I just tried this on 2.4.18-pre4 and it still hard-hangs
> the RAID-1 + EXT3 on SMP.
>
> /Matti Aarnio
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
-- John E. Jasen ([email protected])
-- In theory, theory and practise are the same. In practise, they aren't.

2002-01-20 19:58:05

by Matti Aarnio

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Sun, Jan 20, 2002 at 02:24:40PM -0500, John Jasen wrote:
> Really?
> Software raid?
> I have a quad ppro that is doing ext3/md just fine, and it's running
> 2.4.17.

Indeed. I didn't have problems with dual PPro200, but after I
upgraded it to dual P-III-750, it does hang up. The machines have
128 MB and 750 MB memory, respectively. (Disks and host controller
were moved over.)

I recall having run the same earlier PPro-optimized 2.4 kernel on PPro200,
and on P-III (2.4.6-ac1, or 2.4.10). It hung up too, which prompted
research on kernel versions.

This all does point to some sort of deadlock window somewhere.
It appears to be practically untriggerable with PPro200, but
trivial to hit with P-III-750.

Now to have a reliable way to find where the CPUs are spinning
when the thing does not work... (I have tested kdb: keyboard
dies at hangup -> kdb becomes non-functional...)

> --
> -- John E. Jasen ([email protected])
> -- In theory, theory and practise are the same. In practise, they aren't.

/Matti Aarnio

2002-01-21 00:24:40

by Oliver Xymoron

Subject: Re: 2.4.17 RAID-1 EXT3 reliable to hang....

On Sun, 20 Jan 2002, Matti Aarnio wrote:

> Now to have a reliable way to find where the CPUs are spinning
> when the thing does not work... (I have tested kdb: keyboard
> dies at hangup -> kdb becomes non-functional...)

KDB can still be triggered with an NMI if your mobo has an NMI button
(often a small pinhole button on the back).

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."