IA64, gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-95). Various
kernels from 2.4.9 (with ext3) through 2.4.13-pre1 (with XFS).
The filesystem was created as ext3 but is currently being accessed as
ext2 while I work on XFS and kdb for IA64. After multiple power resets,
several directory entries are corrupt. Attempts to access the files
get an I/O error with nothing in the log. Running e2fsck does not
correct the broken directory entries, and neither does booting a kernel
that supports ext3.
I am surprised that neither ext3 recovery nor e2fsck detected the
broken directory entries. Before I clri the directory entry, does
anybody want more details?
debugfs: ls -l /var/run
686849 40755 0 0 4096 13-Oct-2001 21:41 .
179873 40755 0 0 4096 25-Sep-2001 11:39 ..
507431 40775 0 0 4096 27-Jul-2001 07:32 netreport
686888 100664 0 22 4800 13-Oct-2001 21:45 utmp
376417 40755 75 75 4096 9-Jul-2001 21:40 radvd
654934 40700 0 0 4096 24-Jul-2001 02:52 sudo
686995 100644 0 0 11 13-Oct-2001 21:41 runlevel.dir
687119 100600 0 0 4 13-Oct-2001 21:41 syslogd.pid
687120 100600 0 0 4 13-Oct-2001 21:41 klogd.pid
688413 100644 0 0 4 13-Oct-2001 21:41 sshd.pid
688414 100644 0 0 4 13-Oct-2001 21:41 xinetd.pid
2133571369 --- error --- sendmail.pid
2133571369 --- error --- crond.pid
2133571369 --- error --- xfs.pid
2133571369 --- error --- atd.pid
Dump of the directory.
0000 01 7B 0A 00 0C 00 01 02 2E 00 00 00 A1 BE 02 00 * .{..........?... *
0010 0C 00 02 02 2E 2E 00 00 27 BE 07 00 14 00 09 02 * ........'....... *
0020 6E 65 74 72 65 70 6F 72 74 00 00 00 28 7B 0A 00 * netreport...({.. *
0030 0C 00 04 01 75 74 6D 70 61 BE 05 00 10 00 05 02 * ....utmpa....... *
0040 72 61 64 76 64 00 00 00 56 FE 09 00 0C 00 04 02 * radvd...V....... *
0050 73 75 64 6F 93 7B 0A 00 14 00 0C 01 72 75 6E 6C * sudo.{......runl *
0060 65 76 65 6C 2E 64 69 72 0F 7C 0A 00 14 00 0B 01 * evel.dir.|...... *
0070 73 79 73 6C 6F 67 64 2E 70 69 64 00 10 7C 0A 00 * syslogd.pid..|.. *
0080 14 00 09 01 6B 6C 6F 67 64 2E 70 69 64 00 00 00 * ....klogd.pid... *
0090 1D 81 0A 00 10 00 08 01 73 73 68 64 2E 70 69 64 * ........sshd.pid *
00a0 1E 81 0A 00 14 00 0A 01 78 69 6E 65 74 64 2E 70 * ........xinetd.p *
00b0 69 64 00 00 00 00 00 00 14 00 0C 01 73 65 6E 64 * id..........send *
00c0 6D 61 69 6C 2E 70 69 64 00 00 00 00 14 00 09 01 * mail.pid........ *
00d0 63 72 6F 6E 64 2E 70 69 64 00 00 00 00 00 00 00 * crond.pid....... *
00e0 10 00 07 01 78 66 73 2E 70 69 64 00 00 00 00 00 * ....xfs.pid..... *
00f0 14 0F 07 01 61 74 64 2E 70 69 64 00 00 00 00 00 * ....atd.pid..... *
0100 04 0F 0C 01 73 68 75 74 64 6F 77 6E 2E 70 69 64 * ....shutdown.pid *
0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * ................ *
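For anyone who wants to read the dump without decoding it by hand, here is a minimal sketch that walks it as ext2 directory entries. It assumes the standard on-disk layout (little-endian 32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type, then the name); the function name parse_dirents and the DUMP constant are made up for illustration and are not part of e2fsprogs. Only the first 0x110 bytes shown above are used, so the walk simply stops when rec_len points past the copied data.

```python
import struct

# Bytes 0x000-0x10F of the /var/run directory block, copied from the dump above.
DUMP = """
01 7B 0A 00 0C 00 01 02 2E 00 00 00 A1 BE 02 00
0C 00 02 02 2E 2E 00 00 27 BE 07 00 14 00 09 02
6E 65 74 72 65 70 6F 72 74 00 00 00 28 7B 0A 00
0C 00 04 01 75 74 6D 70 61 BE 05 00 10 00 05 02
72 61 64 76 64 00 00 00 56 FE 09 00 0C 00 04 02
73 75 64 6F 93 7B 0A 00 14 00 0C 01 72 75 6E 6C
65 76 65 6C 2E 64 69 72 0F 7C 0A 00 14 00 0B 01
73 79 73 6C 6F 67 64 2E 70 69 64 00 10 7C 0A 00
14 00 09 01 6B 6C 6F 67 64 2E 70 69 64 00 00 00
1D 81 0A 00 10 00 08 01 73 73 68 64 2E 70 69 64
1E 81 0A 00 14 00 0A 01 78 69 6E 65 74 64 2E 70
69 64 00 00 00 00 00 00 14 00 0C 01 73 65 6E 64
6D 61 69 6C 2E 70 69 64 00 00 00 00 14 00 09 01
63 72 6F 6E 64 2E 70 69 64 00 00 00 00 00 00 00
10 00 07 01 78 66 73 2E 70 69 64 00 00 00 00 00
14 0F 07 01 61 74 64 2E 70 69 64 00 00 00 00 00
04 0F 0C 01 73 68 75 74 64 6F 77 6E 2E 70 69 64
"""


def parse_dirents(block):
    """Yield (offset, inode, rec_len, file_type, name) for each entry.

    Assumes the ext2 directory entry layout with the file_type byte:
    <u32 inode> <u16 rec_len> <u8 name_len> <u8 file_type> <name bytes>.
    """
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from(
            "<IHBB", block, offset
        )
        if rec_len < 8:
            # A record shorter than its own header is malformed; stop walking.
            break
        name = block[offset + 8 : offset + 8 + name_len].decode("ascii", "replace")
        yield offset, inode, rec_len, file_type, name
        # rec_len is the distance to the next entry, padding included.
        offset += rec_len


block = bytes.fromhex("".join(DUMP.split()))
entries = list(parse_dirents(block))

for offset, inode, rec_len, file_type, name in entries:
    print(f"{offset:#06x}  inode={inode:<8}  rec_len={rec_len:<5}  "
          f"type={file_type}  {name}")
```

Read this way, the entries from "." through xinetd.pid carry the inode numbers that debugfs lists, while sendmail.pid, crond.pid, xfs.pid and atd.pid all have an inode field of 0, which is the value ext2 uses for an unused entry. The rec_len of the atd.pid entry (0x0F14) runs from offset 0xEC to 0x1000, so the shutdown.pid name that follows sits inside that record rather than being an entry of its own.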
On Sat, 13 Oct 2001 22:44:34 +1000,
Keith Owens <[email protected]> wrote:
>The filesystem was created as ext3 but is currently being accessed as
>ext2 while I work on XFS and kdb for IA64. After multiple power resets,
>several directory entries are corrupt. Attempts to access the files
>get an I/O error with nothing in the log. Running e2fsck does not
>correct the broken directory entries, and neither does booting a kernel
>that supports ext3.
I forgot to mention that both fsck.ext2 and fsck.ext3 report
1: Entry 'sendmail.pid' in /var/run (686849) has deleted/unused inode 688415. CLEARED.
/1: Entry 'crond.pid' in /var/run (686849) has deleted/unused inode 688416. CLEARED.
/1: Entry 'xfs.pid' in /var/run (686849) has deleted/unused inode 688417. CLEARED.
/1: Entry 'atd.pid' in /var/run (686849) has deleted/unused inode 688418. CLEARED.
but the entries are still corrupt.
> I forgot to mention that both fsck.ext2 and fsck.ext3 report
>
> 1: Entry 'sendmail.pid' in /var/run (686849) has
> deleted/unused inode 688415. CLEARED.
> /1: Entry 'crond.pid' in /var/run (686849) has
> deleted/unused inode 688416. CLEARED.
> /1: Entry 'xfs.pid' in /var/run (686849) has
> deleted/unused inode 688417. CLEARED.
> /1: Entry 'atd.pid' in /var/run (686849) has
> deleted/unused inode 688418. CLEARED.
>
All inodes are in the same sector.
Could you try out if that sector is destroyed?
One of my broken harddisks showed similar behaviour:
* write operations succeeded.
* read operations immediately after the write (write 16 MB including the
damaged sector, then read all 16 MB) sometimes succeeded.
* read operations after 5 minutes always failed.
--
Manfred
On Sat, 13 Oct 2001 16:06:35 +0200,
Manfred Spraul <[email protected]> wrote:
>> I forgot to mention that both fsck.ext2 and fsck.ext3 report
>>
>> 1: Entry 'sendmail.pid' in /var/run (686849) has
>> deleted/unused inode 688415. CLEARED.
>> /1: Entry 'crond.pid' in /var/run (686849) has
>> deleted/unused inode 688416. CLEARED.
>> /1: Entry 'xfs.pid' in /var/run (686849) has
>> deleted/unused inode 688417. CLEARED.
>> /1: Entry 'atd.pid' in /var/run (686849) has
>> deleted/unused inode 688418. CLEARED.
>>
>All inodes are in the same sector.
>Could you try out if that sector is destroyed?
It should not matter which sector the inode is in; the directory entry
should have been cleared regardless of the inode. But I checked
anyway: dd of the entire partition to /dev/null succeeded, with no
disk error messages anywhere in the logs at any time.
On Sun, 14 Oct 2001, Keith Owens wrote:
> On Sat, 13 Oct 2001 16:06:35 +0200,
> Manfred Spraul <[email protected]> wrote:
> >> I forgot to mention that both fsck.ext2 and fsck.ext3 report
> >>
> >> 1: Entry 'sendmail.pid' in /var/run (686849) has
> >> deleted/unused inode 688415. CLEARED.
> >> /1: Entry 'crond.pid' in /var/run (686849) has
> >> deleted/unused inode 688416. CLEARED.
> >> /1: Entry 'xfs.pid' in /var/run (686849) has
> >> deleted/unused inode 688417. CLEARED.
> >> /1: Entry 'atd.pid' in /var/run (686849) has
> >> deleted/unused inode 688418. CLEARED.
> >>
> >All inodes are in the same sector.
> >Could you try out if that sector is destroyed?
>
> It should not matter which sector the inode is in; the directory entry
> should have been cleared regardless of the inode. But I checked
> anyway: dd of the entire partition to /dev/null succeeded, with no
> disk error messages anywhere in the logs at any time.
Is this your root partition perhaps? Fsck of a mounted device might act a
little differently with the new blockdev-in-pagecache approach.
--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
In article <[email protected]> you wrote:
> I am surprised that neither ext3 recovery nor e2fsck detected the
> broken directory entries. Before I clri the directory entry, does
> anybody want more details?
I had problems (the first in years) with 2.4.11-xfs, too. I had illegal
characters in file names on my ext2 /home partition, but e2fsck was
able to clear them. It was due to a kernel oops caused by the openafs
module. Not sure if it is related.
Greetings
Bernd
On Sat, 13 Oct 2001 14:09:32 -0500 (CDT),
Oliver Xymoron <[email protected]> wrote:
>Is this your root partition perhaps? Fsck of a mounted device might act a
>little differently with the new blockdev-in-pagecache approach.
It is root. The problem exists on 2.4.9 kernels as well as
2.4.13-pre1. The 2.4.9 kernels are RH 7.2 IA64 beta so they might
contain code from the -ac tree.