2003-02-01 02:04:24

by David Ford

Subject: NFS problems, 2.5.5x

Synopsis: on an nfsserver:/home/david mount, reading directory entries
loops forever; 2.5.59 on both client and server.

Example: ls -l /home/david

An strace shows the same directory entries flying by over and over
until memory is exhausted or ^C comes along. It worked at first for
about 30 minutes while I finished the new Gentoo install on my desktop,
but then things got weird. The NFS server spat out a big long call
trace (oops) and died hard; I had to reset the power. The looping
started just minutes before that. I've rebooted and tried 2.5.53 on the
client, but no go.

Doing a stat on a single file works fine. Listing the mount directory
itself (ls -l on the mount point) fails. Doing ls -l on any subdirectory
of the mount works fine.

ls -la /home/david/.xinitrc (file, works)
ls -la /home/david/.e (directory, works)
ls -la /home/david (loops forever on all directory entries until memory
is exhausted and ls aborts)

Some things I've noted:

- Not all directory entries are repeated; some appear only once and
never again.
- ls only does this on the mount point of the NFS-mounted directory; all
other directories are fine.

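The repeated-entry symptom described above can be checked without letting the listing run until memory is exhausted. The following is only a minimal sketch, not something from the original report: the names `first_repeat` and `entry_names` and the `limit` parameter are made up for illustration, and it assumes that `os.scandir` (which iterates with readdir underneath) would see the same repeating entries that ls does.

```python
import os
from typing import Iterable, Iterator, Optional, Tuple


def first_repeat(names: Iterable[str],
                 limit: int = 100_000) -> Optional[Tuple[str, int]]:
    """Return (name, index) for the first name seen twice, else None.

    Gives up after `limit` entries, so a directory stream that never
    ends cannot grow the `seen` set without bound.
    """
    seen = set()
    for index, name in enumerate(names):
        if index >= limit:
            break
        if name in seen:
            return name, index
        seen.add(name)
    return None


def entry_names(path: str) -> Iterator[str]:
    """Yield directory entry names lazily instead of building a list."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


# Hypothetical usage against the mount point from the report:
#   hit = first_repeat(entry_names("/home/david"))
#   print("no repeats" if hit is None else "repeat: %r at entry %d" % hit)
```

A healthy directory never returns the same name twice in one pass, so any hit points at the directory stream itself rather than at ls.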
GLIBC 2.3.1, underlying filesystems are reiserfs.

Client:
nfsserver:/home/david on /home/david type nfs
(rw,bg,hard,intr,timeo=7,rsize=16384,wsize=16384,addr=10.0.0.5)

Server (/var/lib/nfs/etab):
/raid/home/david
hb.blue-labs.org(rw,async,wdelay,hide,secure,no_root_squash,no_all_squash,subtree_check,secure_locks,mapping=identity,anonuid=-2,anongid=-2)


David



2003-02-01 02:21:32

by Andrew Morton

Subject: Re: NFS problems, 2.5.5x

David Ford <[email protected]> wrote:
>
> Synopsis: nfsserver:/home/david mount, get dir. entries loops forever,
> 2.5.59 for client and server.

If the server is ext3+htree then you've hit the htree dir cookie bug.

Use `dumpe2fs -h /dev/hda1 | grep index' to see if you're using htree.

Use `tune2fs -O ^dir_index /dev/hda1' to disable it.
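The grep above works because `dumpe2fs -h` prints a "Filesystem features:" line that includes `dir_index` when htree is enabled. The same check might be scripted roughly as follows. This is a sketch under assumptions: the function names are invented for illustration, `SAMPLE` is a hand-written, abbreviated stand-in rather than captured output, and `read_header` needs e2fsprogs installed and usually root.

```python
import subprocess
from typing import List


def filesystem_features(dumpe2fs_header: str) -> List[str]:
    """Return the feature names from the 'Filesystem features:' line."""
    for line in dumpe2fs_header.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "filesystem features":
            return value.split()
    return []


def uses_htree(dumpe2fs_header: str) -> bool:
    """True if the dir_index (htree) feature flag is listed."""
    return "dir_index" in filesystem_features(dumpe2fs_header)


def read_header(device: str) -> str:
    """Run `dumpe2fs -h DEVICE` and return its standard output."""
    result = subprocess.run(["dumpe2fs", "-h", device],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hand-written, abbreviated sample for illustration (not real output).
SAMPLE = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal dir_index filetype sparse_super\n"
)

# Hypothetical usage on the device named in the message:
#   print(uses_htree(read_header("/dev/hda1")))
```

Matching the exact feature name avoids false positives from other header lines that happen to contain the word "index".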


2003-02-01 02:38:28

by David Ford

Subject: Re: NFS problems, 2.5.5x

Underlying filesystems on both client and server are reiserfs.

Got any more quickie suggestions? :)

Thanks,
David


2003-02-01 08:49:07

by Trond Myklebust

Subject: NFS problems, 2.5.5x

>>>>> " " == David Ford <[email protected]> writes:

> Synopsis: nfsserver:/home/david mount, get dir. entries loops
> forever,
> 2.5.59 for client and server.

> Example: ls -l /home/david

> An strace will show the same directory entries flying by over
> and over until memory is exhausted or ^c comes along. It
> worked at first for about 30 minutes while I finished the new
> gentoo install on my desktop, but then things got weird. the
> nfs server spat out a big long callback trace (oops) and died
> hard. Had to reset the power. The looping started just
> minutes before that. I've rebooted, tried 2.5.53 on the client
> but no go.

AFAICR, there have been no changes to the NFS client readdir code since
2.5.30.

Cheers,
Trond

2003-02-01 15:55:09

by David Ford

Subject: Re: NFS problems, 2.5.5x

The last time NFS was working was yesterday, with 2.4.19 and 2.5.53
clients on a 2.5.59 server. I had experienced a slight problem with it
last week, when my 2.5.53 client was booted for the first time on
2.5.5x (it was previously a 2.4 kernel). The server OOPSed repeatedly
in NFS code shortly after bootup, then it never happened again and was
rock solid until today.

David


2003-02-01 16:14:22

by Trond Myklebust

Subject: Re: NFS problems, 2.5.5x

>>>>> " " == David Ford <[email protected]> writes:

> The last time NFS was working, I had 2.4.19 and 2.5.53 clients
> on a
> 2.5.59 server, that was yesterday. I had experienced a slight
> problem
> with it last week when my 2.5.53 client was booted for first
> time on 2.5.5x, it was previously a 2.4 kernel. The server
> OOPSed repeatedly shortly after bootup in NFS stuff then it
> never happened again and was rock solid until today.

So have you tried out the 2.5.53 client since you noticed this
problem?

Cheers,
Trond

2003-02-01 21:40:07

by David Ford

Subject: Re: NFS problems, 2.5.5x

Yes. Today I haven't experienced the loop problem. On the other hand,
when I reboot back and forth between 2.5.53 and 2.5.59, I have to
restart the NFS programs on the server or I get permission denied on the
client and "rpc.mountd: getfh failed: Operation not permitted" on the
server.

I have also had to restart 2.4 clients because NFS silently hangs. I
believe there are a few patches on the list that I need to apply
regarding this.

David





2003-02-03 11:03:10

by Oleg Drokin

Subject: Re: NFS problems, 2.5.5x

Hello!

On Sat, Feb 01, 2003 at 05:23:42PM +0100, Trond Myklebust wrote:
> > The last time NFS was working, I had 2.4.19 and 2.5.53 clients
> > on a
> > 2.5.59 server, that was yesterday. I had experienced a slight
> > problem
> > with it last week when my 2.5.53 client was booted for first
> > time on 2.5.5x, it was previously a 2.4 kernel. The server
> > OOPSed repeatedly shortly after bootup in NFS stuff then it
> > never happened again and was rock solid until today.
> So have you tried out the 2.5.53 client since you noticed this
> problem?

While trying to reproduce this by mounting a reiserfs filesystem over NFS
from a 2.5.59 server, to see whether reiserfs itself has something to do
with it (it worked perfectly with a 2.4.19 client), I decided to mount it
from the same workstation, as I did not have a 2.5.59 client at hand.
This way I learned that mount localhost:/exportedfs /mnt -t nfs
(localhost can be replaced by any local IP) results in mount
hanging in D state:
100 0 775 770 15 0 1532 620 rpc_ex D pts/1 0:00 mount 212.16.7.78:/home /mnt -t nfs

At the same time I can still mount it from external hosts (but, obviously, cannot kill this hung mount).
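Processes stuck in D state, like the mount shown in the ps line above, can also be found by reading the state field of /proc/<pid>/stat. The following is a minimal, Linux-only sketch; the function names are invented for illustration and are not part of any tool mentioned in this thread.

```python
import os
from typing import Iterator, Tuple


def parse_stat(stat_line: str) -> Tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the first "(" and the last ")".
    """
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root: str = "/proc") -> Iterator[Tuple[int, str]]:
    """Yield (pid, comm) for every process currently in state D."""
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            yield pid, comm


# Hypothetical usage:
#   for pid, comm in uninterruptible_processes():
#       print(pid, comm)
```

A process in D state is in uninterruptible sleep and does not act on signals until the wait ends, which is why the hung mount cannot be killed.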

This is on today's 2.5.59 bk snapshot.

Bye,
Oleg