2001-11-29 02:22:02

by Randy Hron

Subject: fsync02 test hangs 2.5.1-pre3 + patch

2.5.1-pre3 with patch http://marc.theaimsgroup.com/?l=linux-kernel&m=100699614529904&w=4

LTP Test: fsync02 - Create a sparse file, fsync it, and time the fsync

Test usually takes about 3 seconds.

Symptoms after the test started:
w, login, ps do not return.
exiting mp3blaster, bash, bitchx don't return.
Sysrq Sync and Unmount do not return.
Sysrq showPc shows "swapper".
Sysrq tErm kIll killalL terminate some processes (ppp - wvdial).
Sysrq reBoot does not reboot.
ncftp completed the download of patch-2.4.17-pre1.bz2, but the file was corrupt
(size is okay, but checksum is bad).

Linux version 2.5.1-pre3 (gcc version 2.95.3 20010315 (release)) #8 Wed Nov 28 20:18:11 EST 2001

Gnu C 2.95.3
Gnu make 3.79.1
binutils 2.11.2
util-linux 2.11m
mount 2.11m
modutils 2.4.12
e2fsprogs 1.25
reiserfsprogs 3.x.0j
PPP 2.4.1
Linux C Library 2.2.4
Dynamic linker (ldd) 2.2.4
Procps 2.0.7
Net-tools 1.60
Kbd 1.06
Sh-utils 2.0
Modules Loaded

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
00:0f.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04)

Hope this helps.
--
Randy Hron


2001-11-29 02:33:16

by Alexander Viro

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch



On Wed, 28 Nov 2001 [email protected] wrote:

> 2.5.1-pre3 with patch http://marc.theaimsgroup.com/?l=linux-kernel&m=100699614529904&w=4

Umm... With which patch? Sorry for being dense, but I see no patches
- neither in the posting you'd referred to nor in your own posting...

2001-11-29 03:01:08

by Randy Hron

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch

On Wed, Nov 28, 2001 at 09:32:43PM -0500, Alexander Viro wrote:
> Umm... With which patch? Sorry for being dense, but I see no patches

Oops.

[PATCH] fix for drivers/char/pc_keyb.c in 2.5.1-pre3

2.5.1-pre3 would not compile without Alan's patch for pc_keyb.c, etc.

I also noticed the logfile LTP "runalltests" was writing to has binary
data and snippets of kernel code in it. (Normally it would all be text).

This was on a reiserfs system, btw.

--
Randy Hron

2001-11-29 04:13:00

by Alexander Viro

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch



On Wed, 28 Nov 2001 [email protected] wrote:

> On Wed, Nov 28, 2001 at 09:32:43PM -0500, Alexander Viro wrote:
> > Umm... With which patch? Sorry for being dense, but I see no patches
>
> Oops.
>
> [PATCH] fix for drivers/char/pc_keyb.c in 2.5.1-pre3
>
> 2.5.1-pre3 would not compile without Alan's patch for pc_keyb.c, etc.
>
> I also noticed the logfile LTP "runalltests" was writing to has binary
> data and snippets of kernel code in it. (Normally it would all be text).
>
> This was on a reiserfs system, btw.

Interesting... I see two candidates in -pre3 - either fs/super.c cleanups
are fscked and something leaves superblock or superblock list locked
(either would have such effect, but that would have to happen at mount
or umount time and thing would lock up much earlier) or bio.c+ll_rw_blk.c+...
changes are acting up.

Obvious tests:
a) 2.5.1-pre2
b) 2.5.1-pre2 + fs/super.c from 2.5.1-pre3
c) 2.5.1-pre3 + fs/super.c from 2.5.1-pre2
(fs/super.c changes are independent from everything else).

Another possibility is silent fs corruption from 2.5.0/2.4.15 - if you
ran these kernels you really ought to do fsck -f (or whatever is used
to force recovery of reiserfs).

2001-11-29 06:50:30

by Randy Hron

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch

On Wed, Nov 28, 2001 at 11:12:22PM -0500, Alexander Viro wrote:
> > This was on a reiserfs system, btw.
>
> Obvious tests:
> a) 2.5.1-pre2

2.5.1-pre2 was fine.

> b) 2.5.1-pre2 + fs/super.c from 2.5.1-pre3
> c) 2.5.1-pre3 + fs/super.c from 2.5.1-pre2
> (fs/super.c changes are independent from everything else).

I'll try option c and let you know what happens.

> Another possibility is silent fs corruption from 2.5.0/2.4.15 - if you
> ran these kernels you really ought to do fsck -f (or whatever is used
> to force recovery of reiserfs).

I did have some nasty errors while running dbench around those kernel versions.
To be safe, I did a cpio backup and re-mkreiserfs'd three filesystems with 2.4.16.
The filesystem that fsync02 hung on was one of the recently rebuilt filesystems.

--
Randy Hron

2001-11-29 06:50:30

by Randy Hron

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch

On Thu, Nov 29, 2001 at 12:18:59AM -0500, [email protected] wrote:
> > c) 2.5.1-pre3 + fs/super.c from 2.5.1-pre2
> > (fs/super.c changes are independent from everything else).
>
> I'll try option c and let you know what happens.

I ran the fsync02 test by itself, and that went fine. When I started
the "runalltests", the system locked up when "tail -f" showed fsync02.
It's possible that one of the next tests, fsync03 or ftruncate01, is
the actual trigger for the "can't write" lockup.

--
Randy Hron

2001-11-29 06:50:30

by Alexander Viro

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch



On Thu, 29 Nov 2001 [email protected] wrote:

> On Thu, Nov 29, 2001 at 12:18:59AM -0500, [email protected] wrote:
> > > c) 2.5.1-pre3 + fs/super.c from 2.5.1-pre2
> > > (fs/super.c changes are independent from everything else).
> >
> > I'll try option c and let you know what happens.
>
> I ran the fsync02 test by itself, and that went fine. When I started
> the "runalltests", the system locked up when "tail -f" showed fsync02.
> It's possible that one of the next tests, fsync03 or ftruncate01, is
> the actual trigger for the "can't write" lockup.

OK, so what we have is breakage in bio parts merged in -pre3 _or_ breakage
going back to -pre2. If you could rerun these tests for vanilla 2.5.1-pre2
and see if they break...

2001-11-29 07:06:13

by Jens Axboe

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch

On Thu, Nov 29 2001, Alexander Viro wrote:
>
>
> On Thu, 29 Nov 2001 [email protected] wrote:
>
> > On Thu, Nov 29, 2001 at 12:18:59AM -0500, [email protected] wrote:
> > > > c) 2.5.1-pre3 + fs/super.c from 2.5.1-pre2
> > > > (fs/super.c changes are independent from everything else).
> > >
> > > I'll try option c and let you know what happens.
> >
> > I ran the fsync02 test by itself, and that went fine. When I started
> > the "runalltests", the system locked up when "tail -f" showed fsync02.
> > It's possible that one of the next tests, fsync03 or ftruncate01, is
> > the actual trigger for the "can't write" lockup.
>
> OK, so what we have is breakage in bio parts merged in -pre3 _or_ breakage
> going back to -pre2. If you could rerun these tests for vanilla 2.5.1-pre2
> and see if they break...

-pre3 breakage is very possible, let me drink a cup of coffee and take a
closer look at this.

--
Jens Axboe

2001-11-29 07:18:27

by Alexander Viro

Subject: Re: fsync02 test hangs 2.5.1-pre3 + patch



On Thu, 29 Nov 2001, Alexander Viro wrote:

> OK, so what we have is breakage in bio parts merged in -pre3 _or_ breakage
> going back to -pre2. If you could rerun these tests for vanilla 2.5.1-pre2
> and see if they break...

Latency is a bitch... Sorry - I got your previous posting (saying that
2.5.1-pre2 was OK) only now.

So it looks like it had been introduced in -pre3 minus fs/super.c changes.
Which leaves only bio stuff - AFAICS nothing aside from these two had any
chance to give a deadlock of that sort.