2002-02-05 20:23:28

by Pavel Machek

Subject: Warning, 2.5.3 eats filesystems

Hi!

2.5.3 managed to damage my ext2 filesystem (a few lost directories);
beware.
Pavel

PCI devices found:
Bus 0, device 0, function 0:
Host bridge: PCI device 10b9:1647 (Acer Laboratories Inc. [ALi]) (rev 4).
Prefetchable 32 bit memory at 0xf0000000 [0xf3ffffff].
Bus 0, device 1, function 0:
PCI bridge: Acer Laboratories Inc. [ALi] M5247 (rev 0).
Master Capable. No bursts. Min Gnt=12.
Bus 0, device 2, function 0:
USB Controller: Acer Laboratories Inc. [ALi] M5237 USB (rev 3).
IRQ 9.
Master Capable. Latency=16. Max Lat=80.
Non-prefetchable 32 bit memory at 0xfff70000 [0xfff70fff].
Bus 0, device 4, function 0:
CardBus bridge: Texas Instruments PCI1420 (rev 0).
IRQ 11.
Master Capable. Latency=168. Min Gnt=192.Max Lat=5.
Non-prefetchable 32 bit memory at 0x10000000 [0x10000fff].
Bus 0, device 4, function 1:
CardBus bridge: Texas Instruments PCI1420 (#2) (rev 0).
IRQ 11.
Master Capable. Latency=168. Min Gnt=192.Max Lat=5.
Non-prefetchable 32 bit memory at 0x10001000 [0x10001fff].
Bus 0, device 6, function 0:
Bridge: Acer Laboratories Inc. [ALi] M7101 PMU (rev 0).
Bus 0, device 7, function 0:
ISA bridge: Acer Laboratories Inc. [ALi] M1533 PCI to ISA Bridge [Aladdin IV] (rev 0).
Bus 0, device 8, function 0:
Multimedia audio controller: ESS Technology ES1988 Allegro-1 (rev 18).
IRQ 5.
Master Capable. Latency=64. Min Gnt=2.Max Lat=24.
I/O at 0x1400 [0x14ff].
Bus 0, device 8, function 1:
Communication controller: ESS Technology ESS Modem (rev 18).
IRQ 5.
I/O at 0x1800 [0x18ff].
Bus 0, device 15, function 0:
IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev 195).
Master Capable. Latency=64. Min Gnt=2.Max Lat=4.
I/O at 0x1000 [0x100f].
Bus 0, device 16, function 0:
Ethernet controller: Accton Technology Corporation EN-1216 Ethernet Adapter (rev 17).
IRQ 11.
Master Capable. Latency=64. Min Gnt=255.Max Lat=255.
I/O at 0x1c00 [0x1cff].
Non-prefetchable 32 bit memory at 0xea001000 [0xea0013ff].
Bus 1, device 0, function 0:
VGA compatible controller: Trident Microsystems CyberBlade/XP (rev 99).
IRQ 11.
Master Capable. Latency=64.
Non-prefetchable 32 bit memory at 0xee000000 [0xefffffff].
Non-prefetchable 32 bit memory at 0xea400000 [0xea7fffff].
Non-prefetchable 32 bit memory at 0xec000000 [0xedffffff].
Non-prefetchable 32 bit memory at 0xea100000 [0xea107fff].



--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa


2002-02-05 23:15:23

by Daniel Pittman

Subject: Re: Warning, 2.5.3 eats filesystems

On Tue, 5 Feb 2002, Pavel Machek wrote:
> Hi!
>
> 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> beware.

I can confirm that there are filesystem corruption issues with 2.5.3;
after this message I rebooted and did a forced fsck which turned up
around a half dozen inodes where the block count in the inode itself was
too high.

The box has been rock solid before this. Configuration:

00:07.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)

ext3 filesystem, data=journal mode. P-II 400, 288MB.

Daniel

--
C makes it easy to shoot yourself in the foot.
C++ makes it harder, but when you do, it blows away your whole leg.
-- Bjarne Stroustrup

2002-02-06 01:02:54

by Skip Ford

Subject: Re: Warning, 2.5.3 eats filesystems

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Daniel Pittman wrote:
> On Tue, 5 Feb 2002, Pavel Machek wrote:
> > Hi!
> >
> > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > beware.
>
> I can confirm that there are filesystem corruption issues with 2.5.3;
> after this message I rebooted and did a forced fsck which turned up
> around a half dozen inodes where the block count in the inode itself was
> too high.

I can confirm inode errors also. However, I can't be sure it's 2.5.3 that
did it.

All of the errors I've had all seemed to be files included in the
pre-patch that broke Configure.help into pieces. I don't know the code
well enough, but if the errors could only have happened at file creation
then that would rule out 2.5.3.

- --
Skip ID: 0x7EDDDB0A
-----BEGIN PGP SIGNATURE-----

iEYEARECAAYFAjxgf/EACgkQBMKxVH7d2wppeACg3g9HDKuibe/u0rpFlU4u+xrb
pzAAoKfOKnvXJTzGYfUzs+/YeVDDVyMP
=UhON
-----END PGP SIGNATURE-----

2002-02-06 01:39:55

by Dave Jones

Subject: Re: Warning, 2.5.3 eats filesystems

On Tue, Feb 05, 2002 at 07:59:57PM -0500, Skip Ford wrote:
> I can confirm inode errors also. However, I can't be sure it's 2.5.3 that
> did it.

Recall that pre3/pre4/pre5 had the missing ext2_inode_info initialisation bug.
If you booted any of those, and have only just done a fsck, it could
be a leftover artifact of a now-fixed bug.

> All of the errors I've had all seemed to be files included in the
> pre-patch that broke Configure.help into pieces. I don't know the code
> well enough, but if the errors could only have happened at file creation
> then that would rule out 2.5.3.

Indeed, that change was in pre5, which was the last pre to feature
aforementioned buglet.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-02-06 04:59:42

by Daniel Pittman

Subject: Re: Warning, 2.5.3 eats filesystems

On Wed, 6 Feb 2002, Dave Jones wrote:
> On Tue, Feb 05, 2002 at 07:59:57PM -0500, Skip Ford wrote:
> > I can confirm inode errors also. However, I can't be sure it's
> > 2.5.3 that did it.
>
> Recall that pre3/pre4/pre5 had the missing ext2_inode_info
> initialisation bug. If you booted any of those, and have only just
> done a fsck, it could be a leftover artifact of a now-fixed bug.

I jumped directly from 2.4.18pre1 to 2.5.3, and I didn't see any issues
with the 2.4 kernel in the time I ran it.

This doesn't rule out a pre-existing corruption, of course, but it seems
to me unlikely in the extreme. Er, in my case, of course. I can't speak
for others.

Daniel

--
Yes, I hate that. You spend all this time trying to explain to people that
they don't *have* to hammer nails into their own heads all the time, only to
discover that they *like* it because it's all they've ever known.
Tim Bradshaw, _comp.lang.lisp_

2002-02-06 17:50:04

by Olaf Zaplinski

Subject: Re: Warning, 2.5.3 eats filesystems

Pavel Machek wrote:
> Hi!
>
> 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> beware.
> Pavel

/me too. 2.5.3 ate my bookmarks.html (and perhaps other files too; I have not
checked yet). I ran it only until the first freeze, then switched back to
2.4.17. Reiserfs then did a rather long log replay...

Olaf

2002-02-06 22:46:43

by Pavel Machek

Subject: Re: Warning, 2.5.3 eats filesystems

Hi!

> I can confirm inode errors also. However, I can't be sure it's 2.5.3 that
> did it.

I *know* it is 2.5.3.
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

2002-02-06 23:25:48

by Heinz Diehl

Subject: Re: Warning, 2.5.3 eats filesystems

On Wed Feb 06 2002, Daniel Pittman wrote:

> > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > beware.

> I can confirm that there are filesystem corruption issues with 2.5.3;
> after this message I rebooted and did a forced fsck which turned up
> around a half dozen inodes where the block count in the inode itself was
> too high.

Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relic of
a 2.5.3-pre patch, because I switched directly from 2.4.17 to 2.5.3
without ever using any pre patch on this machine.

--
# Heinz Diehl, 68259 Mannheim, Germany

2002-02-06 23:39:38

by Alexander Viro

Subject: Re: Warning, 2.5.3 eats filesystems



On Thu, 7 Feb 2002, Heinz Diehl wrote:

> On Wed Feb 06 2002, Daniel Pittman wrote:
>
> > > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > > beware.
>
> > I can confirm that there are filesystem corruption issues with 2.5.3;
> > after this message I rebooted and did a forced fsck which turned up
> > around a half dozen inodes where the block count in the inode itself was
> > too high.
>
> Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relict from
> a 2.5.3-pre patch because I switched directly from 2.4.17 to 2.5.3
> without ever using any pre patch at this machine.

Very interesting. Which filesystems are mounted (other than ext2), and
have you been able to reproduce it on 2.5.3-pre6?

2002-02-07 20:30:14

by Heinz Diehl

Subject: Re: Warning, 2.5.3 eats filesystems

On Wed Feb 06 2002, Alexander Viro wrote:

> Very interesting. Which filesystems are mounted (other than ext2) and
> are you been able to reproduce it on 2.5.3-pre6?

There are only ext2 filesystems available and one cd-rom.

I installed 2.5.3-pre6 and the machine has been running for about 6 hours now
(heavy load) with no errors so far.

--
# Heinz Diehl, 68259 Mannheim, Germany

2002-02-08 21:12:55

by Pavel Machek

Subject: Re: Warning, 2.5.3 eats filesystems

Hi!

> > > > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > > > beware.
> >
> > > I can confirm that there are filesystem corruption issues with 2.5.3;
> > > after this message I rebooted and did a forced fsck which turned up
> > > around a half dozen inodes where the block count in the inode itself was
> > > too high.
> >
> > Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relict from
> > a 2.5.3-pre patch because I switched directly from 2.4.17 to 2.5.3
> > without ever using any pre patch at this machine.
>
> Very interesting. Which filesystems are mounted (other than ext2) and
> are you been able to reproduce it on 2.5.3-pre6?

For me, mounted filesystems look like this:

none on /proc type proc (rw)
none on /proc type proc (rw)
none on /proc type proc (rw)
/dev/hda3 on /suse type ext2 (rw)
none on /proc type proc (rw)
none on /proc/bus/usb type usbdevfs (rw)
/dev/cfs0 on /overlay type coda (rw)

Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-08 21:30:57

by Pavel Machek

Subject: Re: Warning, 2.5.3 eats filesystems

Hi!

> > > > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > > > beware.
> >
> > > I can confirm that there are filesystem corruption issues with 2.5.3;
> > > after this message I rebooted and did a forced fsck which turned up
> > > around a half dozen inodes where the block count in the inode itself was
> > > too high.
> >
> > Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relict from
> > a 2.5.3-pre patch because I switched directly from 2.4.17 to 2.5.3
> > without ever using any pre patch at this machine.
>
> Very interesting. Which filesystems are mounted (other than ext2) and
> are you been able to reproduce it on 2.5.3-pre6?

Mounted filesystems:

/dev/hda2 on / type ext2 (rw)
none on /proc type proc (rw)
...
none on /proc type proc (rw)
/dev/hda3 on /suse type ext2 (rw)
none on /proc type proc (rw)
none on /proc/bus/usb type usbdevfs (rw)
/dev/cfs0 on /overlay type coda (rw)

(I wonder what is responsible for mounting /proc a hundred times?)
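
One way to see how often each entry repeats is to count identical lines in the `mount` output. The following is only a rough sketch: the regular expression assumes the classic `device on mountpoint type fstype (options)` layout shown above, and the names `MOUNT_LINE`, `count_mounts`, `duplicates` and `SAMPLE` are invented for this illustration.

```python
import re
from collections import Counter

# Assumed line layout: "<device> on <mountpoint> type <fstype> (<options>)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def count_mounts(mount_output):
    """Count how many times each (device, mountpoint, fstype) triple appears."""
    counts = Counter()
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:  # lines such as "..." simply do not match and are skipped
            device, mount_point, fs_type, _options = match.groups()
            counts[(device, mount_point, fs_type)] += 1
    return counts


def duplicates(mount_output):
    """Return only the entries that are listed more than once."""
    return {key: n for key, n in count_mounts(mount_output).items() if n > 1}


# The listing quoted in this message, including the elided "..." line.
SAMPLE = """\
/dev/hda2 on / type ext2 (rw)
none on /proc type proc (rw)
...
none on /proc type proc (rw)
/dev/hda3 on /suse type ext2 (rw)
none on /proc type proc (rw)
none on /proc/bus/usb type usbdevfs (rw)
/dev/cfs0 on /overlay type coda (rw)
"""

if __name__ == "__main__":
    for (device, mount_point, fs_type), n in duplicates(SAMPLE).items():
        print("%s on %s (%s) listed %d times" % (device, mount_point, fs_type, n))
```

This only counts what is visible in the listing; it does not tell which script or init step issued the repeated mounts.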

But... you should know that I strongly suspect the IDE subsystem:

Feb 8 12:08:02 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:08:02 amd kernel: hda: drive not ready for command
Feb 8 12:08:02 amd kernel: ide0: reset: success
Feb 8 12:09:26 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:09:26 amd kernel: hda: drive not ready for command
Feb 8 12:09:26 amd kernel: ide0: reset: success
Feb 8 12:12:27 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:12:27 amd kernel: hda: drive not ready for command
Feb 8 12:12:27 amd kernel: ide0: reset: success
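
For readers unfamiliar with the status byte: the value 0xd0 in the lines above can be split into individual ATA status register bits. The sketch below is illustrative only. The bit masks follow the ATA status register layout, while the names `ATA_STATUS_BITS`, `decode_status` and `describe` are made up for this example and are not kernel identifiers.

```python
# Status register bit masks as defined by the ATA specification.
# The human-readable names are chosen for this example.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits that are set in the status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def describe(status):
    """Summarise a status byte, treating the other bits as invalid while BSY is set."""
    if status & 0x80:
        return "Busy"
    names = decode_status(status)
    return " ".join(names) if names else "(no bits set)"


if __name__ == "__main__":
    print(hex(0xD0), "raw bits:", decode_status(0xD0))
    print(hex(0xD0), "summary: {", describe(0xD0), "}")
```

While BSY is set the remaining bits are not considered valid, which is presumably why the log prints only `{ Busy }` even though 0xd0 has two more bits set.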

I'm trying to test it with md5sum, but so far it behaves OK. [I wonder
what directory I'll lose this time ... :-(]
Pavel

--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-08 21:31:16

by Pavel Machek

Subject: WARNING: 2.5.3 -- IDE damages data! [was Re: Warning, 2.5.3 eats filesystems]

Hi!

> > > > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > > > beware.
> >
> > > I can confirm that there are filesystem corruption issues with 2.5.3;
> > > after this message I rebooted and did a forced fsck which turned up
> > > around a half dozen inodes where the block count in the inode itself was
> > > too high.
> >
> > Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relict from
> > a 2.5.3-pre patch because I switched directly from 2.4.17 to 2.5.3
> > without ever using any pre patch at this machine.
>
> Very interesting. Which filesystems are mounted (other than ext2) and
> are you been able to reproduce it on 2.5.3-pre6?

[This is an ext2 machine; the IDE controller is

Bus 0, device 15, function 0:
IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev 195).
Master Capable. Latency=64. Min Gnt=2.Max Lat=4.
I/O at 0x1000 [0x100f].

]


I did more testing, and yes, 2.5.3 does corrupt data. This appeared
in syslog:

Feb 8 12:08:02 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:08:02 amd kernel: hda: drive not ready for command
Feb 8 12:08:02 amd kernel: ide0: reset: success
Feb 8 12:09:26 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:09:26 amd kernel: hda: drive not ready for command
Feb 8 12:09:26 amd kernel: ide0: reset: success
Feb 8 12:12:27 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:12:27 amd kernel: hda: drive not ready for command
Feb 8 12:12:27 amd kernel: ide0: reset: success
Feb 8 12:13:05 amd log1n[103]: ROOT LOGIN on `tty6'
Feb 8 12:15:00 amd sendmail[150]: alias database /etc/aliases.db out
of date
Feb 8 12:15:00 amd sendmail[150]: MAA00150: from=pavel, size=2519,
class=0, pri=62519, nrcpts=2, msgid=<[email protected]>,
relay=pavel@localhost
Feb 8 12:15:04 amd sendmail[152]: MAA00150:
[email protected],[email protected], ctladdr=pavel
(8/100), delay=00:00:04, xdelay=00:00:04, mailer=relay,
relay=[10.0.0.1] [10.0.0.1], stat=Sent (MAA10440 Message accepted for
delivery)
Feb 8 12:18:17 amd kernel: hda: status timeout: status=0xd0 { Busy }
Feb 8 12:18:17 amd kernel: hda: drive not ready for command
Feb 8 12:18:18 amd kernel: ide0: reset: master: error (0x00?)

And this happened on console:

root@amd:~# cat /dev/urandom > /tmp/delme

root@amd:~# ls -al /tmp/delme
-rw-r--r-- 1 root root 246845440 Feb 8 12:11 /tmp/delme
root@amd:~# cp /tmp/delme /tmp/delme2
root@amd:~# cp /tmp/delme /tmp/delme3
root@amd:~# cp /tmp/delme /tmp/delme4
root@amd:~# md5sum /tmp/delme*
2da1568c45e298938353672d3a642714 /tmp/delme
bc99762f3cf9a104e58e3f5708eeba99 /tmp/delme2
2da1568c45e298938353672d3a642714 /tmp/delme3
2da1568c45e298938353672d3a642714 /tmp/delme4
root@amd:~# cd /tmp
root@amd:/tmp# ls -al delme delme2 delme3
-rw-r--r-- 1 root root 246845440 Feb 8 12:11 delme
-rw-r--r-- 1 root root 246845440 Feb 8 12:14 delme2
-rw-r--r-- 1 root root 246845440 Feb 8 12:17 delme3
root@amd:/tmp#

As you can see, delme2 was damaged: its md5sum differs from that of the
source file and of the other two copies. I tried to do some heavy reads,
but those seem to be okay.
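
The same copy-and-compare test can be scripted so that it is easy to repeat. What follows is a minimal sketch, not the procedure used above: the helper names `md5_of` and `copy_and_verify`, the scratch directory and the default sizes are all assumptions made for this example.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(directory, size_bytes, copies=3):
    """Write random data to 'delme', copy it, and return the names of bad copies."""
    source = os.path.join(directory, "delme")
    with open(source, "wb") as handle:
        remaining = size_bytes
        while remaining > 0:
            block = os.urandom(min(remaining, 1 << 20))
            handle.write(block)
            remaining -= len(block)
    expected = md5_of(source)
    damaged = []
    for index in range(2, copies + 2):  # delme2, delme3, ...
        target = os.path.join(directory, "delme%d" % index)
        shutil.copyfile(source, target)
        if md5_of(target) != expected:
            damaged.append(os.path.basename(target))
    return damaged


if __name__ == "__main__":
    # A small size keeps the demo quick; a real test would use far more data.
    with tempfile.TemporaryDirectory() as scratch:
        bad = copy_and_verify(scratch, size_bytes=8 * 1024 * 1024)
        print("damaged copies:", bad or "none")
```

Note that a freshly written file may be read back from the page cache rather than from the disk, so a small test file can look fine even when the on-disk data is bad. Using a file larger than available memory, or remounting before checking, makes the comparison more meaningful.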

root@amd:/tmp# hdparm /dev/hda

/dev/hda:
multcount = 0 (off)
I/O support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 0 (off)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 2432/255/63, sectors = 39070080, start = 0
busstate = 1 (on)
root@amd:/tmp#

Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-02-08 21:43:16

by Heinz Diehl

Subject: Re: Warning, 2.5.3 eats filesystems

On Fri Feb 08 2002, Pavel Machek wrote:

> For me, mounted filesystems look like this:
[....]

Yep. I sent the "mount" output already to Alexander Viro, unfortunately I
did not Cc: the mail to lkml.

/dev/hda1 on / type ext2 (rw)
proc on /proc type proc (rw)
/dev/hda6 on /usr type ext2 (rw)
/dev/hda5 on /home type ext2 (rw)
/dev/hdb5 on /var/spool/news type ext2 (rw)
tmpfs on /dev/shm type shm (rw)
tmpfs on /tmp type tmpfs (rw)
tmpfs on /var/tmp type tmpfs (rw)

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
(prog-if 8a [Master SecP PriP])
Flags: bus master, medium devsel, latency 64
I/O ports at e000 [size=16]

[....]
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c586b (rev 41) IDE UDMA33 controller on pci00:07.1
ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:DMA
hda: IBM-DHEA-36481, ATA DISK drive
hdb: Conner Peripherals 1275MB - CFS1275A, ATA DISK drive
hdc: CD-540E, ATAPI CD/DVD-ROM drive
hdd: CD-W54E, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 12692736 sectors (6499 MB) w/472KiB Cache, CHS=790/255/63, UDMA(33)
hdb: 2496876 sectors (1278 MB) w/64KiB Cache, CHS=2477/16/63, DMA
[....]

chiara:~ # hdparm /dev/hda /dev/hdb
/dev/hda:
multcount = 16 (on)
I/O support = 3 (32-bit w/sync)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 790/255/63, sectors = 12692736, start = 0

/dev/hdb:
multcount = 8 (on)
I/O support = 3 (32-bit w/sync)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 619/64/63, sectors = 2496876, start = 0
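
As a cross-check, the sector counts reported above can be converted to sizes and compared with the CHS geometry. This is a small arithmetic sketch that assumes 512-byte sectors and decimal megabytes; `SECTOR_BYTES`, `capacity_mb` and `chs_sectors` are names invented for the example.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte ATA sectors


def capacity_mb(sectors):
    """Capacity in decimal megabytes (10**6 bytes), rounded to the nearest MB."""
    return round(sectors * SECTOR_BYTES / 10**6)


def chs_sectors(cylinders, heads, sectors_per_track):
    """Number of sectors addressable through the given CHS geometry."""
    return cylinders * heads * sectors_per_track


if __name__ == "__main__":
    # (name, reported sector count, CHS geometry from the hdparm output above)
    drives = [
        ("hda", 12692736, (790, 255, 63)),
        ("hdb", 2496876, (619, 64, 63)),
    ]
    for name, sectors, geometry in drives:
        print("%s: %d MB, CHS covers %d of %d sectors"
              % (name, capacity_mb(sectors), chs_sectors(*geometry), sectors))
```

With these assumptions the sizes agree with the "6499 MB" and "1278 MB" figures in the boot messages, and the CHS products come out slightly below the reported sector counts, as expected when the geometry is rounded down to whole cylinders.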

--
# Heinz Diehl, 68259 Mannheim, Germany

2002-02-08 22:00:58

by Andre Hedrick

Subject: Re: Warning, 2.5.3 eats filesystems

On Fri, 8 Feb 2002, Pavel Machek wrote:

> Hi!
>
> > > > > 2.5.3 managed to damage my ext2 filesystem (few lost directories);
> > > > > beware.
> > >
> > > > I can confirm that there are filesystem corruption issues with 2.5.3;
> > > > after this message I rebooted and did a forced fsck which turned up
> > > > around a half dozen inodes where the block count in the inode itself was
> > > > too high.
> > >
> > > Exactly the same thing here, and I bet it _is_ 2.5.3 and not a relict from
> > > a 2.5.3-pre patch because I switched directly from 2.4.17 to 2.5.3
> > > without ever using any pre patch at this machine.
> >
> > Very interesting. Which filesystems are mounted (other than ext2) and
> > are you been able to reproduce it on 2.5.3-pre6?
>
> Mounted filesystems:
>
> /dev/hda2 on / type ext2 (rw)
> none on /proc type proc (rw)
> ...
> none on /proc type proc (rw)
> /dev/hda3 on /suse type ext2 (rw)
> none on /proc type proc (rw)
> none on /proc/bus/usb type usbdevfs (rw)
> /dev/cfs0 on /overlay type coda (rw)
>
> (I wander what is responsible for mounting /proc hundred times?)
>
> But... you should know that I'm strongly suspecting ide subsystem:
>
> Feb 8 12:08:02 amd kernel: hda: status timeout: status=0xd0 { Busy }
> Feb 8 12:08:02 amd kernel: hda: drive not ready for command
> Feb 8 12:08:02 amd kernel: ide0: reset: success
> Feb 8 12:09:26 amd kernel: hda: status timeout: status=0xd0 { Busy }
> Feb 8 12:09:26 amd kernel: hda: drive not ready for command
> Feb 8 12:09:26 amd kernel: ide0: reset: success
> Feb 8 12:12:27 amd kernel: hda: status timeout: status=0xd0 { Busy }
> Feb 8 12:12:27 amd kernel: hda: drive not ready for command
> Feb 8 12:12:27 amd kernel: ide0: reset: success
>
> I'm trying to test it with md5sum, but so far it behaves ok. [I wonder
> what directory I'll loose this time ... :-(]
> Pavel
>
> --
> (about SSSCA) "I don't say this lightly. However, I really think that the U.S.
> no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

Yep, I warned about multmode PIO.
I think I finally have a fix which does not use a copy of the request.


Andre Hedrick
Linux Disk Certification Project Linux ATA Development