I just got a new 200GB disk (WDC WD2000JB) for my home machine (Asus A7A266,
Ali chipset). I put some partitions on it like so:
hda1: 100MB - /boot
hda2: 8192MB - /
hda3: 1024MB - swap
hda4: the rest (about 190GB I guess) - /home
I find that when I mkfs on /home, I get massive filesystem corruption on /.
When I fsck / (and restore the deleted files) I get massive filesystem corruption on /home. Luckily all my real data is still on my old disk...
I reduced the size of /home to 40GB and everything was fine.
I see the same behaviour with both 2.6.0test3 and 2.4.22.
My guess is that writes to very high numbered blocks are wrapping round
to lower numbered blocks in some way.
so...anyone else seen this? Is it a known driver problem?
Or is it a hardware issue?
Anyone care to suggest stuff to try? The contents of the disk are toast
(pretty much) so I can do destructive tests if it'll help...
Output from lspci looks like this:
00:00.0 Host bridge: ALi Corporation M1647 Northbridge [MAGiK 1 / MobileMAGiK 1] (rev 04)
00:01.0 PCI bridge: ALi Corporation PCI to AGP Controller
00:02.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
00:04.0 IDE interface: ALi Corporation M5229 IDE (rev c4)
00:05.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:06.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge [Aladdin IV]
00:0a.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)
00:0b.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)
00:0d.0 Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 02)
00:11.0 Bridge: ALi Corporation M7101 PMU
01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF/PRO AGP 4x TMDS
Thanks in advance,
Steve Bennett
On Tue, Sep 02, 2003 at 02:28:16PM +0100, [email protected] wrote:
> I just got a new 200GB disk (WDC WD2000JB) for my home machine (Asus A7A266,
> Ali chipset). I put some partitions on it like so:
> hda1: 100MB - /boot
> hda2: 8192MB - /
> hda3: 1024MB - swap
> hda4: the rest (about 190GB I guess) - /home
>
> I find that when I mkfs on /home, I get massive filesystem corruption on /
> When I fsck / (and restore the deleted files) I get massive filesystem corruption on /home.
>
> so...anyone else seen this? Is it a known driver problem?
No doubt wraparound at 137 GB. (2^28 sectors of 2^9 bytes gives a 2^37-byte,
that is, 128 GiB, limit; to get past this you need LBA48 support.)
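To make the arithmetic above concrete, here is a small sketch (plain Python, not kernel code). The names `LBA28_SECTORS` and `wrap_lba28` are invented for this illustration, and the wraparound model is an assumption: it supposes that a sector number too large for a 28-bit address simply loses its upper bits.

```python
SECTOR_BYTES = 2 ** 9      # 512-byte sectors
LBA28_SECTORS = 2 ** 28    # sectors addressable with a 28-bit LBA

limit_bytes = LBA28_SECTORS * SECTOR_BYTES
print(limit_bytes)              # total bytes reachable with LBA28 (2**37)
print(limit_bytes // 2 ** 30)   # the same limit in GiB
print(limit_bytes / 10 ** 9)    # the same limit in decimal GB ("137 GB")


def wrap_lba28(sector):
    """Assumed failure model: keep only the low 28 bits of a sector number."""
    return sector % LBA28_SECTORS


# A write aimed 300000 sectors past the limit would, under this model,
# land 300000 sectors from the start of the disk instead.
target = LBA28_SECTORS + 300_000
print(wrap_lba28(target))
```

Under that assumption, writes to the upper part of a large last partition would come back around to the partitions at the start of the disk, which would fit the symptom of mkfs on /home damaging /.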
Recently we discussed a case where Linux decided that the hardware
could not handle lba48 but forgot to adapt the total capacity.
That was a Linux bug.
In fact, if I am not mistaken, the idea that that hardware could not
handle lba48 was due to a misunderstanding. That was another Linux bug.
Maybe these have now been fixed in some kernel versions.
So, you must check (i) what Linux thinks your hardware can do, and
(ii) what your hardware can do in reality.
Maybe the former can be seen in /proc/ide/hdX/settings under "address"
or so.
Corruption is fixed in 2.6.0-test4.
Unfortunately it seems your IDE chipset doesn't support LBA48,
so you won't be able to access the full capacity (137GB limit).
If you are ready to take a risk (again ;-) ) you can remove the
"hwif->no_lba48 = ..." line from drivers/ide/pci/alim15x3.c,
recompile and retest without using DMA (add the "ide=nodma"
boot option). Maybe LBA48 will work in PIO mode.
--bartlomiej
On Tue Sep 02, 2003 at 02:28:16PM +0100, [email protected] wrote:
>
> I just got a new 200GB disk (WDC WD2000JB) for my home machine (Asus A7A266,
> Ali chipset). I put some partitions on it like so:
> hda1: 100MB - /boot
> hda2: 8192MB - /
> hda3: 1024MB - swap
> hda4: the rest (about 190GB I guess) - /home
>
> I find that when I mkfs on /home, I get massive filesystem corruption on /
> When I fsck / (and restore the deleted files) I get massive filesystem corruption on /home. Luckily all my real data is still on my old disk...
>
> I reduced the size of /home to 40GB and everything was fine.
> I see the same behaviour with both 2.6.0test3 and 2.4.22.
Known problem. For some reason Marcelo has not yet applied
the fix for this problem to the 2.4.x kernels...
-Erik
--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--
On Wed, 2003-09-03 at 01:55, Bartlomiej Zolnierkiewicz wrote:
> If you are ready to take a risk (again ;-) ) you can remove the
> "hwif->no_lba48 = ..." line from drivers/ide/pci/alim15x3.c,
> recompile and retest without using DMA (add the "ide=nodma"
> boot option). Maybe LBA48 will work in PIO mode.
ALi does support LBA48 in PIO mode. Right now the choice is
DMA and 137GB, or no DMA and 200GB. Ideally it should be DMA
with a fallback to PIO for the top 70GB, but not for a while yet.
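The ideal behaviour described here could be pictured roughly as follows. This is only a sketch in Python, not driver code: `pick_transfer_mode` and its "dma"/"pio" return values are hypothetical, and it assumes the only criterion is whether any sector of a request lies at or above the 2^28 boundary.

```python
LBA28_SECTORS = 2 ** 28  # first sector number that needs a 48-bit address


def pick_transfer_mode(start_sector, sector_count):
    """Hypothetical policy: DMA when the whole request fits below the
    LBA28 boundary, PIO (with LBA48 addressing) otherwise."""
    last_sector = start_sector + sector_count - 1
    return "dma" if last_sector < LBA28_SECTORS else "pio"


print(pick_transfer_mode(0, 256))                   # low on the disk
print(pick_transfer_mode(LBA28_SECTORS - 8, 16))    # straddles the boundary
print(pick_transfer_mode(300_000_000, 256))         # entirely above it
```

A request that straddles the boundary is sent down the PIO path as a whole in this sketch; a real driver might split it instead.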
I've actually not yet found a controller in my testing that cannot
manage LBA48 PIO, including nailing a 160GB drive to a Cyrix box with
a VIA VP2.
On Wed, 2003-09-03 at 19:07, Marcelo Tosatti wrote:
> > Known problem. For some reason Marcelo has not yet applied
> > the fix for this problem to the 2.4.x kernels...
>
> Alan (who has a clue about IDE, unlike me) had complaints about your
> approach, right?
Bart pointed out that the case in question can only occur when you
physically move a disk between interfaces. So the last IDE changes I sent
you included a minimal version of Erik's change.
On Tue, 2 Sep 2003, Erik Andersen wrote:
> On Tue Sep 02, 2003 at 02:28:16PM +0100, [email protected] wrote:
> >
> > I just got a new 200GB disk (WDC WD2000JB) for my home machine (Asus A7A266,
> > Ali chipset). I put some partitions on it like so:
> > hda1: 100MB - /boot
> > hda2: 8192MB - /
> > hda3: 1024MB - swap
> > hda4: the rest (about 190GB I guess) - /home
> >
> > I find that when I mkfs on /home, I get massive filesystem corruption on /
> > When I fsck / (and restore the deleted files) I get massive filesystem corruption on /home. Luckily all my real data is still on my old disk...
> >
> > I reduced the size of /home to 40GB and everything was fine.
> > I see the same behaviour with both 2.6.0test3 and 2.4.22.
>
> Known problem. For some reason Marcelo has not yet applied
> the fix for this problem to the 2.4.x kernels...
Alan (who has a clue about IDE, unlike me) had complaints about your
approach, right?
On Tue, 2 Sep 2003, Erik Andersen wrote:
> Known problem. For some reason Marcelo has not yet applied
> the fix for this problem to the 2.4.x kernels...
So it seems the fix is already in 2.4.23-pre2 (it came in through Alan's IDE
changes).
Steve, it seems 2.4.23-pre2 fixes your problem.
On Wed Sep 03, 2003 at 04:54:28PM -0300, Marcelo Tosatti wrote:
> > > I reduced the size of /home to 40GB and everything was fine.
> > > I see the same behaviour with both 2.6.0test3 and 2.4.22.
> >
> > Known problem. For some reason Marcelo has not yet applied
> > the fix for this problem to the 2.4.x kernels...
>
> So it seems the fix is already in 2.4.23-pre2 (it came in through Alan's IDE
> changes).
>
> Steve, it seems 2.4.23-pre2 fixes your problem.
Marcelo, I think you are mistaken... You have indeed applied
some IDE fixes from Alan. But I just read all the IDE changes
again, and unless I have gone blind, this problem is not yet
fixed.
-Erik
--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--
> ALi does support LBA48 in PIO mode. Right now the choice is
> DMA and 137GB, or no DMA and 200GB. Ideally it should be DMA
> with a fallback to PIO for the top 70GB, but not for a while yet.
OK, having actually read what dmesg says (instead of making assumptions),
I see:
hda: max request size: 128KiB
hda: cannot use LBA48 - full capacity 390721968 sectors (200049 MB)
hda: 268435456 sectors (137438 MB) w/8192KiB Cache, CHS=16709/255/63, UDMA(100)
hda: hda1 hda2 hda3 hda4
and fdisk reports:
# /sbin/fdisk -l
Disk /dev/hda: 137.4 GB, 137438953472 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   83  Linux
/dev/hda2            14      1057   8385930   83  Linux
/dev/hda3          1058      1188   1052257+  82  Linux swap
/dev/hda4          1189      6169  40009882+  83  Linux
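The figures in the dmesg and fdisk output can be cross-checked with a few lines of Python. This is just a sanity-check sketch: the helper names `to_mb` and `end_sector` are made up here, and it assumes 512-byte sectors, decimal megabytes rounded down, and partitions that end on cylinder boundaries.

```python
SECTOR_BYTES = 512
LBA28_SECTORS = 2 ** 28
SECTORS_PER_CYL = 255 * 63   # heads * sectors/track from the fdisk header

full_sectors = 390721968     # "full capacity" reported by dmesg
clipped_sectors = 268435456  # capacity the kernel actually exposes

# End cylinder of each partition, copied from the fdisk listing
end_cylinder = {"hda1": 13, "hda2": 1057, "hda3": 1188, "hda4": 6169}


def to_mb(sectors):
    """Decimal megabytes, rounded down."""
    return sectors * SECTOR_BYTES // 10 ** 6


def end_sector(cylinder):
    """First sector after a partition that ends on this (1-based) cylinder."""
    return cylinder * SECTORS_PER_CYL


print(to_mb(full_sectors), to_mb(clipped_sectors))
print(clipped_sectors == LBA28_SECTORS)        # clipped exactly at 2**28
print(to_mb(full_sectors - clipped_sectors))   # MB out of reach without LBA48

for name, cyl in end_cylinder.items():
    # True means the partition ends below the LBA28 boundary
    print(name, end_sector(cyl), end_sector(cyl) <= LBA28_SECTORS)
```

With the 40GB /home shown in the listing, every partition ends well below the 2^28-sector boundary.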
So the disk is being correctly downgraded to a non-lba48-compatible size.
In which case, why is the disk getting trashed?
Maybe there's a fault on the disk itself? I'll find a system that does lba48
and try it there...
Steve.
Steve Bennett wrote:
> So the disk is being correctly downgraded to a non-lba48-compatible size.
> In which case, why is the disk getting trashed
>
> Maybe there's a fault on the disk itself? I'll find a system that does lba48
> and try it there...
This is a Western Digital 200GB disk? Stop. Back off. Upgrade the
firmware to the latest on their site. Try again.
Dave