2006-10-10 07:09:35

by Jan-Benedict Glaw

Subject: Re: 2.6.18-mm2: ext3 BUG?

On Sun, 2006-10-08 08:33:30 +0200, Jan-Benedict Glaw <[email protected]> wrote:
> On Thu, 2006-10-05 17:14:28 -0700, Andrew Morton <[email protected]> wrote:

> In one case, there was a test case mentioned. I'll run that on my
> affected box in a non-productive LV, like this:
>
> dd bs=1M count=200 if=/dev/zero of=test0
> while :; do
> echo "cp 0-1"; cp test0 test1 || break
> echo "cp 1-2"; cp test1 test2 || break
> echo "cp 2-3"; cp test2 test3 || break
> echo "cp 3-4"; cp test3 test4 || break
> echo "od 0" ; od test0 || break
> echo "rm 1"; rm test1 || break
> echo "rm 2"; rm test2 || break
> echo "rm 3"; rm test3 || break
> echo "rm 4"; rm test4 || break
> done

While I could reproduce it with a 200MB file, it seems I can't break
it with a 10MB file.
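
For comparing file sizes, the loop above can be wrapped in a function that takes the size as a parameter. This is only a sketch: the name run_copy_test, the bounded round count and the cmp check are additions for illustration and are not part of the original test case.

```shell
#!/bin/sh
# run_copy_test SIZE_MB ROUNDS [DIR]
# Bounded variant of the cp/rm loop; returns non-zero on the first failure.
run_copy_test() {
    size_mb=$1
    rounds=$2
    dir=${3:-.}
    dd bs=1M count="$size_mb" if=/dev/zero of="$dir/test0" 2>/dev/null || return 1
    i=0
    while [ "$i" -lt "$rounds" ]; do
        cp "$dir/test0" "$dir/test1" || return 1
        cp "$dir/test1" "$dir/test2" || return 1
        cp "$dir/test2" "$dir/test3" || return 1
        cp "$dir/test3" "$dir/test4" || return 1
        # Added check, not in the original loop: the last copy must match test0.
        cmp -s "$dir/test0" "$dir/test4" || return 1
        rm "$dir/test1" "$dir/test2" "$dir/test3" "$dir/test4" || return 1
        i=$((i + 1))
    done
    return 0
}
```

Something like `run_copy_test 200 50 /mnt` versus `run_copy_test 10 50 /mnt` would then compare the two sizes; the round count of 50 is arbitrary.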

MfG, JBG

--
Jan-Benedict Glaw [email protected] +49-172-7608481
Signature of: Lauf nicht vor Deinem Glück davon:
the second : Es könnte hinter Dir stehen!



2006-10-11 10:42:17

by Jan Kara

Subject: Re: 2.6.18-mm2: ext3 BUG?

> On Sun, 2006-10-08 08:33:30 +0200, Jan-Benedict Glaw <[email protected]> wrote:
> > On Thu, 2006-10-05 17:14:28 -0700, Andrew Morton <[email protected]> wrote:
>
> > In one case, there was a test case mentioned. I'll run that on my
> > affected box in a non-productive LV, like this:
> >
> > dd bs=1M count=200 if=/dev/zero of=test0
> > while :; do
> > echo "cp 0-1"; cp test0 test1 || break
> > echo "cp 1-2"; cp test1 test2 || break
> > echo "cp 2-3"; cp test2 test3 || break
> > echo "cp 3-4"; cp test3 test4 || break
> > echo "od 0" ; od test0 || break
> > echo "rm 1"; rm test1 || break
> > echo "rm 2"; rm test2 || break
> > echo "rm 3"; rm test3 || break
> > echo "rm 4"; rm test4 || break
> > done
>
> While I could reproduce it with a 200MB file, it seems I can't break
> it with a 10MB file.
Hmm, I was running the test for several hours without any problem...
The kernel is 2.6.17.6, ext3 in ordered data mode, standard SATA disk. I'm
now running it again and trying my luck ;). What is your testing environment?

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs

2006-10-23 08:13:56

by Jan-Benedict Glaw

Subject: Re: 2.6.18-mm2: ext3 BUG?

On Wed, 2006-10-11 12:42:02 +0200, Jan Kara <[email protected]> wrote:
> > On Sun, 2006-10-08 08:33:30 +0200, Jan-Benedict Glaw <[email protected]> wrote:
> > While I could reproduce it with a 200MB file, it seems I can't break
> > it with a 10MB file.
> Hmm, I was running the test for several hours without any problem...
> The kernel is 2.6.17.6, ext3 in ordered data mode, standard SATA disk. I'm
> now running it again and trying my luck ;). What is your testing environment?

kolbe34-backup:/mnt# uname -a
Linux kolbe34-backup 2.6.17-2-686 #1 SMP Wed Sep 13 16:34:10 UTC 2006 i686 GNU/Linux
kolbe34-backup:/mnt# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 448.674
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up
bogomips : 898.38
kolbe34-backup:/mnt# grep -i preem /boot/config-2.6.17-2-686
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
kolbe34-backup:/mnt# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0e.0 Ethernet controller: Advanced Micro Devices [AMD] 79c970 [PCnet32 LANCE] (rev 16)
01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X (rev 5c)
kolbe34-backup:/mnt# lspci -n
00:00.0 0600: 8086:7190 (rev 03)
00:01.0 0604: 8086:7191 (rev 03)
00:07.0 0601: 8086:7110 (rev 02)
00:07.1 0101: 8086:7111 (rev 01)
00:07.2 0c03: 8086:7112 (rev 01)
00:07.3 0680: 8086:7113 (rev 02)
00:0e.0 0200: 1022:2000 (rev 16)
01:00.0 0300: 1002:4742 (rev 5c)
kolbe34-backup:~# hdparm -i /dev/hdb

/dev/hdb:

Model=ST3300822A, FwRev=3.AAE, SerialNo=5NF24YCN
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 *udma2 udma3 udma4 udma5
AdvancedPM=no WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

* signifies the current active mode


Still running Debian's 2.6.17-2-686, I'm now tracking down the file
size when I start to see this type of corruption. Right now, it seems
I never get it with a 16384 KB (16 MB) large file, but I get it with a
21504 KB (21 MB) file.

Is there something important that changes handling of file contents in
the 16..21 MB range?

dumpe2fs output at http://lug-owl.de/~jbglaw/ext3-dumpe2fs.txt for
that filesystem. I'll now run with a 18.5 MB file...
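
Narrowing down the size by hand (16 MB fine, 21 MB broken, next try 18.5 MB) is a bisection, which could be scripted. A minimal sketch, assuming a hypothetical function reproduces that runs the copy loop for a file of the given size in KB and returns 0 when the corruption shows up. The stand-in body and the FAKE_THRESHOLD_KB variable exist only so the sketch runs on its own; they say nothing about the real threshold.

```shell
#!/bin/sh
# Hypothetical stand-in: replace the body with the real cp/od/rm loop.
# Must return 0 if a test file of $1 KB triggers the corruption.
reproduces() {
    [ "$1" -ge "${FAKE_THRESHOLD_KB:-18000}" ]
}

# bisect_size GOOD_KB BAD_KB
# Prints the smallest failing size, assuming GOOD_KB never fails,
# BAD_KB fails, and every size above a failing size also fails.
bisect_size() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if reproduces "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad"
}

bisect_size 16384 21504   # prints 18000 with the stand-in above
```

Since the problem does not show up on every run, each probe would have to run long enough before a size is counted as good.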

MfG, JBG

--
Jan-Benedict Glaw [email protected] +49-172-7608481
Signature of: Wenn ich wach bin, träume ich.
the second :



2006-10-23 10:40:54

by Jan Kara

Subject: Re: 2.6.18-mm2: ext3 BUG?

Hello,

> On Wed, 2006-10-11 12:42:02 +0200, Jan Kara <[email protected]> wrote:
> > > On Sun, 2006-10-08 08:33:30 +0200, Jan-Benedict Glaw <[email protected]> wrote:
> > > While I could reproduce it with a 200MB file, it seems I can't break
> > > it with a 10MB file.
> > Hmm, I was running the test for several hours without any problem...
> > The kernel is 2.6.17.6, ext3 in ordered data mode, standard SATA disk. I'm
> > now running it again and trying my luck ;). What is your testing environment?
>
> kolbe34-backup:/mnt# uname -a
Thanks for the info. This looks pretty similar to what I have (except
that I have an Athlon).

> Still running Debian's 2.6.17-2-686, I'm now tracking down the file
> size when I start to see this type of corruption. Right now, it seems
> I never get it with a 16384 KB (16 MB) large file, but I get it with a
> 21504 KB (21 MB) file.
>
> Is there something important that changes handling of file contents in
> the 16..21 MB range?
Umm, I've checked and found nothing obvious. We already have to use a
double-indirect block at 16 MB; maybe the reservation code makes some
distinction. Could you mount the filesystem with -o 'noreservation' and
see whether you can still reproduce the problem? It may also be useful
to find out whether you see the failure with some older kernels as well...
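
As a quick check of the double-indirect remark: where the block map switches levels follows from the block size. A sketch, assuming the classic ext2/ext3 layout with 12 direct pointers in the inode and 4-byte block numbers. The block sizes passed in below are assumptions; the real value is in the dumpe2fs output. The function name is made up for illustration.

```shell
#!/bin/sh
# indirect_limits BLOCK_SIZE_BYTES
# Prints the file offsets (in KB) at which single- and double-indirect
# mapping begins, assuming 12 direct pointers and 4-byte block numbers.
indirect_limits() {
    bs=$1
    ptrs=$((bs / 4))                              # pointers per indirect block
    direct_kb=$((12 * bs / 1024))                 # end of the direct blocks
    single_kb=$((direct_kb + ptrs * bs / 1024))   # end of the single-indirect range
    echo "single-indirect from ${direct_kb} KB, double-indirect from ${single_kb} KB"
}

indirect_limits 4096   # assumed block size
indirect_limits 1024   # assumed block size
```

With either block size, both 16384 KB and 21504 KB lie well inside the double-indirect range, so the switch to double-indirect mapping alone does not separate the two sizes.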

Honza
>
> dumpe2fs output at http://lug-owl.de/~jbglaw/ext3-dumpe2fs.txt for
> that filesystem. I'll now run with a 18.5 MB file...
>
> MfG, JBG
>
> --
> Jan-Benedict Glaw [email protected] +49-172-7608481
> Signature of: Wenn ich wach bin, träume ich.
> the second :


--
Jan Kara <[email protected]>
SuSE CR Labs