2010-08-11 14:44:13

by Cyril Hrubis

Subject: zaurus pata_pcmcia corrupted filesystem

Hi!
Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
pata_pcmcia interface; the sources are here:

http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing

Everything seems fine at first glance, but sometimes the filesystem becomes
inconsistent. A typical example is extracting a tar archive with a large
number of files and directories (I used the ltp-full archive from
http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
present in the filesystem, i.e. you can see them in a directory listing, but
open or fstat returns ENOENT.

Extracting to an SD card seems to work without any problems.

Does anybody have an idea what went wrong?
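
The symptom described above (entries that show up in a directory listing but fail with ENOENT when accessed) can be checked for mechanically. The following is only a minimal sketch, not something from the original report: the function name `find_ghost_entries` is made up for illustration, and it assumes the affected tree can still be listed at all.

```python
import os
import stat


def find_ghost_entries(root):
    """Return paths that are listed in a directory but fail lstat() with ENOENT."""
    ghosts = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except OSError:
            # Directory itself is unreadable; nothing more to check below it.
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                # Listed by readdir, but the lookup says it does not exist.
                ghosts.append(path)
                continue
            except OSError:
                # Other errors (EIO, EACCES, ...) are not the symptom we look for.
                continue
            if stat.S_ISDIR(st.st_mode):
                stack.append(path)
    return ghosts
```

Calling `find_ghost_entries()` on the directory where the archive was extracted (for example a hypothetical mount point such as `/mnt/cf/ltp-full`) returns an empty list on a consistent filesystem. Results may reflect cached directory data, so remounting the filesystem before running the check is probably a good idea.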

--
Cyril Hrubis


2010-08-14 04:16:37

by Marek Vasut

Subject: Re: zaurus pata_pcmcia corrupted filesystem

On Wed, 11 August 2010 16:44:11, Cyril Hrubis wrote:
> Hi!
> Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
> pata_pcmcia interface; the sources are here:
>
> http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing
>
> Everything seems fine at first glance, but sometimes the filesystem becomes
> inconsistent. A typical example is extracting a tar archive with a large
> number of files and directories (I used the ltp-full archive from
> http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
> present in the filesystem, i.e. you can see them in a directory listing, but
> open or fstat returns ENOENT.
>
> Extracting to an SD card seems to work without any problems.
>
> Does anybody have an idea what went wrong?

I use PCMCIA on a few devices and was unable to reproduce it (and I use git
and do some heavy compiling on them too). I'll do some further testing.

Cyril, do you have the PCMCIA timing fix for PXA applied?

2010-08-14 09:41:34

by Kristoffer Ericson

Subject: Re: zaurus pata_pcmcia corrupted filesystem

On Wed, Aug 11, 2010 at 04:44:11PM +0200, Cyril Hrubis wrote:
> Hi!
> Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
> pata_pcmcia interface; the sources are here:
>
> http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing
>
> Everything seems fine at first glance, but sometimes the filesystem becomes
> inconsistent. A typical example is extracting a tar archive with a large
> number of files and directories (I used the ltp-full archive from
> http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
> present in the filesystem, i.e. you can see them in a directory listing, but
> open or fstat returns ENOENT.
>
> Extracting to an SD card seems to work without any problems.
>
> Does anybody have an idea what went wrong?

Roughly how many files (in the tar archive) are we talking about? Do the
affected files differ in size (large ones fail, small ones don't)?
It's hard to say without more data (as always).

>
> --
> Cyril Hrubis

2010-08-14 10:43:10

by Stanislav Brabec

Subject: Re: zaurus pata_pcmcia corrupted filesystem

Kristoffer Ericson wrote:
> On Wed, Aug 11, 2010 at 04:44:11PM +0200, Cyril Hrubis wrote:
> > Hi!
> > Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
> > pata_pcmcia interface; the sources are here:
> >
> > http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing
> >
> > Everything seems fine at first glance, but sometimes the filesystem becomes
> > inconsistent. A typical example is extracting a tar archive with a large
> > number of files and directories (I used the ltp-full archive from
> > http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
> > present in the filesystem, i.e. you can see them in a directory listing, but
> > open or fstat returns ENOENT.
> >
> > Extracting to an SD card seems to work without any problems.
> >
> > Does anybody have an idea what went wrong?
>
> Roughly how many files (in the tar archive) are we talking about? Do the
> affected files differ in size (large ones fail, small ones don't)?
> It's hard to say without more data (as always).

These mysterious failures were discussed several times before in
zaurus-devel[1][2][3][4][5][6]. We suspected PCMCIA code, RAM errors or
bad voltage.

Typical reproducers:

git status on kernel tree (false result)
copy files from CF to SD or USB (errors in files)
gcc while using CF WLAN or CF Bluetooth (segfault)

It happened in all kernels at least since 2.6.26, but probably in all
2.6 kernels.
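
The "copy files from CF to SD or USB" reproducer can be checked by comparing checksums of the source tree and the copy. This is a rough sketch under my own assumptions, not a tool anyone in the earlier discussions used: the names `file_digest` and `compare_trees` are invented here, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(src, dst):
    """Return sorted relative paths that differ between src and dst,
    or that cannot be read on either side."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(src):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            try:
                if file_digest(src_path) != file_digest(dst_path):
                    bad.append(rel)
            except OSError:
                # Missing or unreadable file counts as a failure too.
                bad.append(rel)
    return sorted(bad)
```

An empty result from `compare_trees(src, dst)` means every file under `src` has an identical copy under `dst`. Freshly copied data may still be served from the page cache, so unmounting and remounting both cards before comparing gives a more honest picture of what actually reached the media.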

[1] http://lists.linuxtogo.org/pipermail/zaurus-devel/2010-June/000306.html
[2] http://lists.linuxtogo.org/pipermail/zaurus-devel/2010-June/000305.html
[3] http://lists.linuxtogo.org/pipermail/zaurus-devel/2010-April/000242.html
[4] http://lists.linuxtogo.org/pipermail/zaurus-devel/2010-March/000217.html
[5] http://lists.linuxtogo.org/pipermail/zaurus-devel/2010-March/000227.html

--
________________________________________________________________________
Stanislav Brabec
http://www.penguin.cz/~utx/zaurus

2010-08-14 11:39:40

by Cyril Hrubis

Subject: Re: zaurus pata_pcmcia corrupted filesystem

Hi!
> > > Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
> > > pata_pcmcia interface; the sources are here:
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing
> > >
> > > Everything seems fine at first glance, but sometimes the filesystem becomes
> > > inconsistent. A typical example is extracting a tar archive with a large
> > > number of files and directories (I used the ltp-full archive from
> > > http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
> > > present in the filesystem, i.e. you can see them in a directory listing, but
> > > open or fstat returns ENOENT.
> > >
> > > Extracting to an SD card seems to work without any problems.
> > >
> > > Does anybody have an idea what went wrong?
> >
> > Roughly how many files (in the tar archive) are we talking about? Do the
> > affected files differ in size (large ones fail, small ones don't)?
> > It's hard to say without more data (as always).
>
> These mysterious failures were discussed several times before in
> zaurus-devel[1][2][3][4][5][6]. We suspected PCMCIA code, RAM errors or
> bad voltage.
>
> Typical reproducers:
>
> git status on kernel tree (false result)
> copy files from CF to SD or USB (errors in files)
> gcc while using CF WLAN or CF Bluetooth (segfault)
>
> It happened in all kernels at least since 2.6.26, but probably in all
> 2.6 kernels.
>

This seems to be different; at least it never occurred so often for me. Now
installing Debian packages often fails because it cannot remove temporary
files, whereas the filesystem worked reasonably well before.

The use case is unpacking an archive with a lot of rather small files somewhere
deep in the filesystem tree. E.g. installing gtk2-devel failed miserably when
trying to move all the header files from *.h.dpkg-new to *.h; this works rather
well with the old driver.
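
The rename pattern described above (many small files written as *.h.dpkg-new deep in the tree, then renamed to *.h) could be turned into a standalone stress test. What follows is a sketch only: it imitates the pattern, not dpkg itself, and `rename_stress`, the file names and the default counts are all made up for illustration.

```python
import os


def rename_stress(root, count=1000, depth=5):
    """Write `count` small files deep under `root`, rename them, verify them.

    Returns the paths that could not be read back correctly, plus any
    temporary names still present in the directory after the renames.
    """
    target = os.path.join(root, *[f"d{i}" for i in range(depth)])
    os.makedirs(target, exist_ok=True)

    # Phase 1: write every file under its temporary name.
    for i in range(count):
        with open(os.path.join(target, f"file{i}.h.dpkg-new"), "w") as f:
            f.write(f"/* header {i} */\n")

    # Phase 2: move each temporary file to its final name.
    for i in range(count):
        final = os.path.join(target, f"file{i}.h")
        os.rename(final + ".dpkg-new", final)

    # Phase 3: every final name must open and hold the expected content.
    failures = []
    for i in range(count):
        final = os.path.join(target, f"file{i}.h")
        try:
            with open(final) as f:
                if f.read() != f"/* header {i} */\n":
                    failures.append(final)
        except OSError:
            failures.append(final)

    # Temporary names that still show up in the listing are failures as well.
    for name in sorted(os.listdir(target)):
        if name.endswith(".dpkg-new"):
            failures.append(os.path.join(target, name))
    return failures
```

Running `rename_stress()` on a directory on the CF card should return an empty list if the directory tree survives. Since the verification happens right after the writes, it mostly exercises cached state; running fsck after unmounting is still needed to see what ended up on disk.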

--
Cyril Hrubis

2010-08-14 14:15:58

by Stanislav Brabec

Subject: Re: zaurus pata_pcmcia corrupted filesystem

Cyril Hrubis wrote:

> This seems to be different; at least it never occurred so often for me. Now
> installing Debian packages often fails because it cannot remove temporary
> files, whereas the filesystem worked reasonably well before.
>
> The use case is unpacking an archive with a lot of rather small files somewhere
> deep in the filesystem tree. E.g. installing gtk2-devel failed miserably when
> trying to move all the header files from *.h.dpkg-new to *.h; this works rather
> well with the old driver.

Well, I remember opkg crashes while installing emacs (many small files)
and file system corruption as well. Both with kernel 2.6.26.

--

________________________________________________________________________
Stanislav Brabec
http://www.penguin.cz/~utx/zaurus

2010-08-14 16:48:33

by Cyril Hrubis

Subject: Re: zaurus pata_pcmcia corrupted filesystem

Hi!
> > Marek Vasut ported the deprecated Zaurus PCMCIA disk driver to the
> > pata_pcmcia interface; the sources are here:
> >
> > http://git.kernel.org/?p=linux/kernel/git/marex/pxa-linux-2.6.git;a=shortlog;h=refs/heads/zaurus-haxing
> >
> > Everything seems fine at first glance, but sometimes the filesystem becomes
> > inconsistent. A typical example is extracting a tar archive with a large
> > number of files and directories (I used the ltp-full archive from
> > http://www.sf.net/projects/ltp/). Afterwards, some files or directories are
> > present in the filesystem, i.e. you can see them in a directory listing, but
> > open or fstat returns ENOENT.
> >
> > Extracting to an SD card seems to work without any problems.
> >
> > Does anybody have an idea what went wrong?
>
> I use PCMCIA on a few devices and was unable to reproduce it (and I use git
> and do some heavy compiling on them too). I'll do some further testing.
>
> Cyril, do you have the PCMCIA timing fix for PXA applied?

Well git log says so:

...
pxa2xx/cpufreq: Fix PCMCIA frequency scaling

The MCxx values must be based off memory clock, not CPU core clock.

This also fixes the bug where on some machines the LCD went crazy while using
PCMCIA.

...


At least it's your tree that I've been testing ;).

--
Cyril Hrubis

2010-08-21 09:51:44

by Cyril Hrubis

Subject: Re: zaurus pata_pcmcia corrupted filesystem

Hi!
> > This seems to be different; at least it never occurred so often for me. Now
> > installing Debian packages often fails because it cannot remove temporary
> > files, whereas the filesystem worked reasonably well before.
> >
> > The use case is unpacking an archive with a lot of rather small files somewhere
> > deep in the filesystem tree. E.g. installing gtk2-devel failed miserably when
> > trying to move all the header files from *.h.dpkg-new to *.h; this works rather
> > well with the old driver.
>
> Well, I remember opkg crashes while installing emacs (many small files)
> and file system corruption as well. Both with kernel 2.6.26.
>

The main difference is that now it's 100% reproducible, and I had never before
seen corruption so bad that I could not delete files without running fsck first.

Here is the approximate fsck output from an ext3 filesystem after these
problems appeared:

Pass 2:

Bunch of these:

Invalid HTREE directory inode $NUMBER
$PATH/$FILE
Clear HTree index<y>?

and these:

Problem in HTree directory inode $NUMBER: block #$NR has bad max hash

Pass 3:

Bunch of:

Entry '$FILE' in $PATH ($NUMBER) has a non-unique filename.
Rename to $FILE~0<y>?

Fortunately, the only files and directories affected were those recently
written while running the kernel from the marex tree. I haven't seen any file
contents corrupted (which doesn't mean there was no such corruption), but that
seems less likely than directory tree corruption.

--
metan