2003-03-28 17:28:55

by Bernd Schubert

Subject: nfsd-fh: found a name that I didn't expect

Hi,

Due to hardware problems I just started our fallback server and got these
messages for two files:

nfsd-fh: found a name that I didn't expect: bin/uptime
nfsd-fh: found a name that I didn't expect: bin/uptime
nfsd: last server has exited
nfsd: unexporting all filesystems
nfsd-fh: found a name that I didn't expect: lib/libident.so.0
nfsd-fh: found a name that I didn't expect: lib/libident.so.0

Well, just a short explanation of how our fallback solution works:
The server exports '/' (hda5) via NFS to all clients and via nbd to one of
its clients (the fallback server). The nbd export is used for
mirroring the device via a cron job (by "dd'ing" the device).
The cron job script also runs 'reiserfsck --fix-fixable' and afterwards
'reiserfsck --check', so apart from the partition checks, both devices
should be identical.
However, when I started the fallback server, it showed the messages for these
two files, and the clients got I/O errors for these files.
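
A minimal sketch of the mirroring job described above, for illustration only.
The device paths, the block size, and the function name mirror_commands are
assumptions (the message names only hda5 on the server), and any options
needed to run reiserfsck non-interactively from cron are left out. The sketch
only builds and prints the commands; it does not run them.

```python
import shlex

# Hypothetical device paths; the message names only hda5 on the server side.
SOURCE_DEV = "/dev/nbd0"  # assumed: the server's hda5 as seen over nbd
TARGET_DEV = "/dev/hda5"  # assumed: local partition on the fallback server


def mirror_commands(source=SOURCE_DEV, target=TARGET_DEV, block_size="1M"):
    """Return the commands the cron job would run, in order."""
    return [
        # Raw copy of the exported device onto the local partition.
        ["dd", f"if={source}", f"of={target}", f"bs={block_size}"],
        # First pass repairs what can be repaired safely.
        ["reiserfsck", "--fix-fixable", target],
        # Second pass only checks and reports.
        ["reiserfsck", "--check", target],
    ]


if __name__ == "__main__":
    # Print only: running these against real devices overwrites the target.
    for cmd in mirror_commands():
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it easy to hand each one to
subprocess.run later and stop at the first non-zero exit code.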

The solution was to stop the NFS server, copy the files, delete the old ones
and move the copies back to the old names (just as I'm used to doing when
this happens to directories served by ClusterNFS).
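
The copy, delete and move-back procedure can be sketched as follows. This is
an illustration under assumptions: the helper name replace_with_copy and the
".copy" suffix are hypothetical, and the NFS server is assumed to be stopped
already. After the procedure the old name refers to a newly created file;
why that makes the nfsd-fh message go away is not established in this thread.

```python
import os
import shutil


def replace_with_copy(path):
    """Copy path, delete the original, and move the copy back to the old name."""
    tmp = path + ".copy"     # hypothetical temporary name in the same directory
    shutil.copy2(path, tmp)  # copy data plus timestamps and mode (not ownership)
    os.remove(path)          # delete the old file
    os.rename(tmp, path)     # move the copy back to the old name
```

Because the copy exists alongside the original before the original is
removed, the file that ends up under the old name has a different inode
number from the original.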

Any ideas how we can prevent this in the future?


Thanks,
Bernd


2003-03-29 11:01:45

by Oleg Drokin

Subject: Re: nfsd-fh: found a name that I didn't expect

Hello!

On Fri, Mar 28, 2003 at 06:28:55PM +0100, Bernd Schubert wrote:
> due to hardware problems I just started our fall back server and got these
> messages for 2 files:
> nfsd-fh: found a name that I didn't expect: bin/uptime
> nfsd-fh: found a name that I didn't expect: bin/uptime
> nfsd: last server has exited
> nfsd: unexporting all filesystems
> nfsd-fh: found a name that I didn't expect: lib/libident.so.0
> nfsd-fh: found a name that I didn't expect: lib/libident.so.0

Hm. Does the message appear each time you access those files?
Did reiserfsck find anything?

> Any ideas how we can prevent this in the future ?

We do not yet understand how to reproduce this locally, because this message
indicates pretty strange conditions.

Bye,
Oleg

2003-03-29 11:54:02

by Bernd Schubert

Subject: Re: nfsd-fh: found a name that I didn't expect

On Saturday 29 March 2003 12:01, you wrote:
> Hello!
>
> On Fri, Mar 28, 2003 at 06:28:55PM +0100, Bernd Schubert wrote:
> > due to hardware problems I just started our fall back server and got
> > these messages for 2 files:
> > nfsd-fh: found a name that I didn't expect: bin/uptime
> > nfsd-fh: found a name that I didn't expect: bin/uptime
> > nfsd: last server has exited
> > nfsd: unexporting all filesystems
> > nfsd-fh: found a name that I didn't expect: lib/libident.so.0
> > nfsd-fh: found a name that I didn't expect: lib/libident.so.0
>
> Hm. Does the message appear each time you access those files?

Yes. Probably because uptime and libident.so.0 were being called in an endless
loop from the clients, the NFS server's log was filled with those messages, and
I had to stop nfsd and do the cp and mv procedure for those files.

> Did reiserfsck find anything?

As I said, the backup script runs reiserfsck itself and it found no problems.
I hope there is no corruption after a simple boot, but I will check this later
today.

>
> > Any ideas how we can prevent this in the future ?
>
> We do not yet understand how to reproduce this locally, because this
> message indicates pretty strange conditions.

Is there anything I can do to debug this when I observe this phenomenon the
next time (e.g. enabling NFS debugging via proc, etc.)?

Thanks,
Bernd

2003-03-29 12:04:58

by Bernd Schubert

Subject: Re: nfsd-fh: found a name that I didn't expect

>
> We do not yet understand how to reproduce this locally, because this
> message indicates pretty strange conditions.

Oh, I just logged in from home, checked the logfiles and saw that
/usr/sbin/logrotate is affected as well. However, this time the clients don't
get I/O errors for this file; it's only the server that reports the problem.

Well, if you like, I could provide you with a dd image of this partition,
though it's rather large (19GB).


Bernd