2003-07-29 11:12:06

by Robert L. Harris

Subject: NFS Server running 2.6.0-test2



Just converted my nfs server to 2.6.0-test2 last night. This morning I
found this on my console:

{0}:/>
Message from syslogd@camel at Tue Jul 29 00:02:30 2003 ...
camel kernel: journal commit I/O error


{0}:/>mount
.
.
/dev/md/0 on /mnt/data1 type ext3 (rw)

{0}:/>find /mnt/data1/backups/www/tarballs -name www-\*tgz -mtime +7 -exec rm {} \;
rm: cannot remove `/mnt/data1/backups/www/tarballs/www-20030721.tgz':
Read-only file system

{0}:/>mount -o remount,rw /dev/md0
mount: block device /dev/md/0 is write-protected, mounting read-only
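
The `mount` listing above still reports `(rw)` even though writes fail, which is what you would expect if that listing comes from /etc/mtab rather than from the kernel. A minimal sketch, assuming /proc/mounts reflects the kernel's current view and uses the usual "device mountpoint fstype options ..." layout, of how to list ext3 filesystems the kernel considers read-only (the function name `read_only_mounts` is hypothetical):

```python
def read_only_mounts(mounts_text, fstype="ext3"):
    """Return (device, mountpoint) pairs of `fstype` mounts whose options include 'ro'.

    `mounts_text` is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fs, options = fields[:4]
        # Options are comma-separated; match 'ro' exactly, not as a substring.
        if fs == fstype and "ro" in options.split(","):
            result.append((device, mountpoint))
    return result


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for device, mountpoint in read_only_mounts(f.read()):
            print(f"{device} on {mountpoint} is read-only")
```

Matching on the split option list avoids false hits on options that merely contain the letters "ro", such as `errors=remount-ro`.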


I have NFS Version 3 enabled but TCP disabled (was very laggy).


Robert

:wq!
---------------------------------------------------------------------------
Robert L. Harris | GPG Key ID: E344DA3B
@ x-hkp://pgp.mit.edu
DISCLAIMER:
These are MY OPINIONS ALONE. I speak for no-one else.

Diagnosis: witzelsucht

IPv6 = [email protected] http://ipv6.rdlg.net
IPv4 = [email protected] http://www.rdlg.net



2003-07-29 12:05:56

by NeilBrown

Subject: Re: NFS Server running 2.6.0-test2

On Tuesday July 29, [email protected] wrote:
>
>
> Just converted my nfs server to 2.6.0-test2 last night. This morning I
> found this on my console:
>
> {0}:/>
> Message from syslogd@camel at Tue Jul 29 00:02:30 2003 ...
> camel kernel: journal commit I/O error
>

I'm guessing that the filesystem got an I/O error when writing to the
device.
Anything in the kernel log that might confirm or deny this?

NeilBrown

2003-07-29 13:09:33

by Robert L. Harris

Subject: Re: NFS Server running 2.6.0-test2



The messages file is completely empty of any error messages related to
anything disk or filesystem related from about 6 hours prior to the
error up until the time I rebooted. In addition the actual device
(RAID5 filesystem) is intact.

Robert



Thus spake Neil Brown ([email protected]):

> On Tuesday July 29, [email protected] wrote:
> >
> >
> > Just converted my nfs server to 2.6.0-test2 last night. This morning I
> > found this on my console:
> >
> > {0}:/>
> > Message from syslogd@camel at Tue Jul 29 00:02:30 2003 ...
> > camel kernel: journal commit I/O error
> >
>
> I'm guessing that the filesystem got an I/O error when writing to the
> device.
> Anything in the kernel log that might confirm or deny this?
>
> NeilBrown




2003-07-30 04:29:34

by NeilBrown

Subject: Re: NFS Server running 2.6.0-test2

On Tuesday July 29, [email protected] wrote:
>
>
> The messages file is completely empty of any error messages related to
> anything disk or filesystem related from about 6 hours prior to the
> error up until the time I rebooted. In addition the actual device
> (RAID5 filesystem) is intact.
>

Well, it looks quite unpleasant then.
That error message can only get printed if the JFS_ABORT flag is set
for the ext3 journal, and whenever JFS_ABORT is set, the message:
Aborting journal on device ...
comes first. If you don't have that message, then the impossible has
happened.
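
The ordering described above can be checked mechanically against a saved kernel log. A minimal sketch, assuming the log is available as a list of plain-text lines and that the two message strings appear verbatim as quoted in this thread (the function name `unexplained_commit_errors` is hypothetical):

```python
ABORT_MSG = "Aborting journal on device"
COMMIT_ERR_MSG = "journal commit I/O error"


def unexplained_commit_errors(lines):
    """Return indices of commit-error lines that no abort message precedes.

    An empty result means every 'journal commit I/O error' was preceded by
    an 'Aborting journal on device' line somewhere earlier in the log.
    """
    seen_abort = False
    unexplained = []
    for index, line in enumerate(lines):
        if ABORT_MSG in line:
            seen_abort = True
        elif COMMIT_ERR_MSG in line and not seen_abort:
            unexplained.append(index)
    return unexplained


if __name__ == "__main__":
    import sys

    # Usage (assumed log location): python check_journal.py /var/log/messages
    with open(sys.argv[1], errors="replace") as f:
        for index in unexplained_commit_errors(f.readlines()):
            print(f"line {index + 1}: commit error with no prior journal abort")
```

This does not separate devices or boots, so it only makes sense on a log slice covering a single uptime of one machine.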

The impossible is usually caused either by bad memory (and a
single-bit-error in memory could have caused this problem), or by some
stray pointer corrupting something.

So, I would suggest memtest86 if that is convenient, followed by
reporting the problem to the ext3 developers (see the MAINTAINERS
file).

This problem is unlikely to be related to the machine being an NFS
server, though until we know the cause, nothing should be ruled out.

NeilBrown