2008-02-07 23:20:18

by Anirban Sinha

Subject: nfs new_cache mechanism and older kernel

Hi:

I am wondering if there is a known issue with using the newer cache
mechanism in NFS (by mounting the nfsd filesystem on /proc/fs/nfsd) on an
older kernel like 2.6.17 built for 64-bit archs. I am observing a
peculiar problem: the moment exportfs writes the time value from the
time() system call to <cache>/flush, the export entries vanish.
However, when I pass smaller arbitrary values (of the order of 100s or
10000s), the tables do not get cleared. This could be (though I am not
completely sure) due to the difference between what sys_time() reports
and what get_seconds() reports: get_seconds() reports the seconds from
the xtime structure, whereas sys_time() uses do_gettimeofday() in the
2.6.17 kernel. But I could be wrong. Any thoughts on this?
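For reference, my rough mental model of the flush logic (just a toy
sketch, not the actual kernel code) is that each cache entry remembers
when it was last refreshed, and writing a value T to <cache>/flush drops
every entry refreshed at or before T. That would explain why writing the
current time clears everything while a small value like 100 or 10000
clears nothing:

/* Toy model of my understanding of the flush logic -- not the real
 * kernel code.  Each cache entry remembers when it was last refreshed;
 * writing a value T to <cache>/flush drops every entry whose
 * last_refresh is <= T.  So writing time() ("now") would drop
 * everything, while a small value like 10000 would drop nothing. */
#include <stdio.h>
#include <time.h>

struct toy_entry {
	const char *path;
	time_t last_refresh;	/* when mountd last filled this entry in */
};

static int is_flushed(const struct toy_entry *e, time_t flush_time)
{
	return e->last_refresh <= flush_time;
}

int main(void)
{
	time_t now = time(NULL);
	struct toy_entry exports[] = {
		{ "/export/root", now - 30 },
		{ "/export/home", now - 300 },
	};

	for (int i = 0; i < 2; i++)
		printf("%s: flush(now)=%d flush(10000)=%d\n",
		       exports[i].path,
		       is_flushed(&exports[i], now),
		       is_flushed(&exports[i], (time_t)10000));
	return 0;
}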

Ani



2008-02-08 00:45:48

by NeilBrown

Subject: Re: nfs new_cache mechanism and older kernel

On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
> Hi:
>
> I am wondering if there is a known issue with using the newer cache
> mechanism in NFS (by mounting the nfsd filesystem on /proc/fs/nfsd) on an
> older kernel like 2.6.17 built for 64-bit archs. I am observing a
> peculiar problem: the moment exportfs writes the time value from the
> time() system call to <cache>/flush, the export entries vanish.
> However, when I pass smaller arbitrary values (of the order of 100s or
> 10000s), the tables do not get cleared. This could be (though I am not
> completely sure) due to the difference between what sys_time() reports
> and what get_seconds() reports: get_seconds() reports the seconds from
> the xtime structure, whereas sys_time() uses do_gettimeofday() in the
> 2.6.17 kernel. But I could be wrong. Any thoughts on this?

2.6.17 isn't that old. The newer cache mechanism should work for any
2.6 kernel.

I think the difference between get_seconds and sys_time should be very
small and not really significant (I hope?). I think there is only a
difference when adjtime is causing time to run a little fast or a
little slow.... I wonder if we should worry about that.

It is normal for exportfs to completely flush the in-kernel cache.
Subsequent NFS requests will then cause an upcall to mountd which will
add the required information to the cache.

NeilBrown

2008-02-08 01:01:59

by Anirban Sinha

Subject: RE: nfs new_cache mechanism and older kernel

Hi Neil:

> -----Original Message-----
> From: Neil Brown [mailto:[email protected]]
> Sent: Thursday, February 07, 2008 4:46 PM
> To: Anirban Sinha
> Cc: [email protected]
> Subject: Re: nfs new_cache mechanism and older kernel
>
> On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
> > Hi:
> >
> > I am wondering if there is a known issue with using the newer cache
> > mechanism in NFS (by mounting the nfsd filesystem on /proc/fs/nfsd) on an
> > older kernel like 2.6.17 built for 64-bit archs. I am observing a
> > peculiar problem: the moment exportfs writes the time value from the
> > time() system call to <cache>/flush, the export entries vanish.
> > However, when I pass smaller arbitrary values (of the order of 100s or
> > 10000s), the tables do not get cleared. This could be (though I am not
> > completely sure) due to the difference between what sys_time() reports
> > and what get_seconds() reports: get_seconds() reports the seconds from
> > the xtime structure, whereas sys_time() uses do_gettimeofday() in the
> > 2.6.17 kernel. But I could be wrong. Any thoughts on this?
>
> 2.6.17 isn't that old. The newer cache mechanism should work for any
> 2.6 kernel.
>
> I think the difference between get_seconds and sys_time should be very
> small and not really significant (I hope?). I think there is only a
> difference when adjtime is causing time to run a little fast or a
> little slow.... I wonder if we should worry about that.

Yeah, not sure if it would make any difference. I think one of them is
the wall clock time (do_gettimeofday) and the xtime is the monotonic
time. One can be obtained from the other by adding/subtracting an offset
value (wall_to_monotonic or something like that), if I recall correctly.
Why not use the same time value for both cases (if you think it could be
of some remote significance)?
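A user-space analogy of what I mean (purely an illustration using the
POSIX clocks, not the code path nfsd actually uses): the wall clock and
the monotonic clock differ by a roughly constant offset, similar in
spirit to the kernel's wall_to_monotonic.

/* User-space analogy only: the wall clock and the monotonic clock
 * differ by a (mostly constant) offset, similar in spirit to the
 * kernel's wall_to_monotonic.  Not the code path nfsd uses. */
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct timespec wall, mono;

	clock_gettime(CLOCK_REALTIME, &wall);	/* wall clock seconds */
	clock_gettime(CLOCK_MONOTONIC, &mono);	/* seconds since a fixed point */

	printf("wall      = %ld s\n", (long)wall.tv_sec);
	printf("monotonic = %ld s\n", (long)mono.tv_sec);
	printf("offset    = %ld s\n", (long)(wall.tv_sec - mono.tv_sec));
	return 0;
}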

>
> It is normal for exportfs to completely flush the in-kernel cache.
> Subsequent NFS requests will then cause an upcall to mountd which will
> add the required information to the cache.

I am wondering ... if it completely flushes the in-kernel cache, as is
happening in my case, shouldn't the clients who have already mounted
their nfs filesystem lose their mounts? In our case, we are mounting
the root fs through nfs, and the moment it flushes the cache, the clients
become dead. If they happen to resend their mount request, will it not,
in that case, involve mountd? Please enlighten me.

Ani




>
> NeilBrown

2008-02-08 01:10:48

by NeilBrown

Subject: RE: nfs new_cache mechanism and older kernel

On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
>
> Yeah, not sure if it would make any difference. I think one of them is
> the wall clock time (do_gettimeofday) and the xtime is the monotonic
> time. One can be obtained from the other by adding/subtracting an offset
> value (wall_to_monotonic or something like that), if I recall correctly.
> Why not use the same time value for both cases (if you think it could be
> of some remote significance)?

They probably should use the same value - I wasn't previously aware
there was a difference.... something for Bruce's todo list :-) (but
not for Bruce to do...)

>
> >
> > It is normal for exportfs to completely flush the in-kernel cache.
> > Subsequent NFS requests will then cause an upcall to mountd which will
> > add the required information to the cache.
>
> I am wondering ... if it completely flushes the in-kernel cache, as is
> happening in my case, shouldn't the clients who have already mounted
> their nfs filesystem lose their mounts? In our case, we are mounting
> the root fs through nfs, and the moment it flushes the cache, the clients
> become dead. If they happen to resend their mount request, will it not,
> in that case, involve mountd? Please enlighten me.

When an NFS request arrives and the cache doesn't contain enough
information to accept or reject it, a message is sent to mountd (via
/proc/net/rpc/XXX/channel) to ask that the information in the cache be
filled in. mountd does the appropriate checks and replies with the
required information.
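In very rough terms, the mountd side of that channel looks like the
sketch below. This is a stripped-down illustration only; the real loop
lives in nfs-utils (utils/mountd/cache.c), and the request and reply
lines are cache-specific, so the reply here is just a placeholder.

/* Stripped-down sketch of the mountd side of the channel -- the real
 * loop is in nfs-utils (utils/mountd/cache.c).  Request parsing and the
 * reply format are cache-specific, so the reply below is only a
 * placeholder, not a valid downcall. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* e.g. /proc/net/rpc/auth.unix.ip/channel or .../nfsd.export/channel */
	int fd = open("/proc/net/rpc/auth.unix.ip/channel", O_RDWR);
	char req[4096];
	ssize_t n;

	if (fd < 0) {
		perror("open channel");
		return 1;
	}

	/* Each read() returns one upcall request as a single text line. */
	while ((n = read(fd, req, sizeof(req) - 1)) > 0) {
		req[n] = '\0';
		fprintf(stderr, "upcall: %s", req);

		/* Here mountd would consult its export tables and write
		 * back a line containing the answer plus an expiry time. */
		if (write(fd, "...\n", 4) < 0)
			perror("write channel");
	}
	return 0;
}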

If this isn't happening for you, then something is wrong with
mountd...

I assume you have checked that mountd is still running, so my best
guess is that mountd was started *before* the 'nfsd' filesystem was
mounted. It needs to be started afterwards.

Just kill mountd, restart it, and see what happens.
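If it helps, the ordering requirement in code form (an illustration of
the sequence only; a real init script would use mount(8) and the
distro's rc machinery rather than mount(2) and exec):

/* Illustration of the ordering only.  Mount the nfsd filesystem first,
 * then start mountd, so mountd sees the new cache interface. */
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

int main(void)
{
	/* 1: make /proc/fs/nfsd (the new cache interface) available */
	if (mount("nfsd", "/proc/fs/nfsd", "nfsd", 0, NULL) < 0)
		perror("mount nfsd (may already be mounted)");

	/* 2: only now start mountd, so it notices the nfsd filesystem */
	execlp("rpc.mountd", "rpc.mountd", (char *)NULL);
	perror("exec rpc.mountd");
	return 1;
}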

NeilBrown

2008-02-08 01:33:27

by Anirban Sinha

Subject: RE: nfs new_cache mechanism and older kernel

Perfect! That was indeed the problem. Thank you so much. Btw, so when
mountd starts, it checks whether or not the new cache mechanism is being
used and acts accordingly, right? (I am being lazy by not going through
the codebase to find that out myself).

Ani


> -----Original Message-----
> From: Neil Brown [mailto:[email protected]]
> Sent: Thursday, February 07, 2008 5:11 PM
> To: Anirban Sinha
> Cc: [email protected]
> Subject: RE: nfs new_cache mechanism and older kernel
>
> On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
> >
> > Yeah, not sure if it would make any difference. I think one of them is
> > the wall clock time (do_gettimeofday) and the xtime is the monotonic
> > time. One can be obtained from the other by adding/subtracting an offset
> > value (wall_to_monotonic or something like that), if I recall correctly.
> > Why not use the same time value for both cases (if you think it could be
> > of some remote significance)?
>
> They probably should use the same value - I wasn't previously aware
> there was a difference.... something for Bruce's todo list :-) (but
> not for Bruce to do...)
>
> >
> > >
> > > It is normal for exportfs to completely flush the in-kernel cache.
> > > Subsequent NFS requests will then cause an upcall to mountd which
> will
> > > add the required information to the cache.
> >
> > I am wondering ... if it completely flushes the in-kernel cache, as is
> > happening in my case, shouldn't the clients who have already mounted
> > their nfs filesystem lose their mounts? In our case, we are mounting
> > the root fs through nfs, and the moment it flushes the cache, the clients
> > become dead. If they happen to resend their mount request, will it not,
> > in that case, involve mountd? Please enlighten me.
>
> When an NFS request arrives and the cache doesn't contain enough
> information to accept or reject it, a message is sent to mountd (via
> /proc/net/rpc/XXX/channel) to ask that the information in the cache be
> filled in. mountd does the appropriate checks and replies with the
> required information.
>
> If this isn't happening for you, then something is wrong with
> mountd...
>
> I assume you have checked that mountd is still running, so my best
> guess is that mountd was started *before* the 'nfsd' filesystem was
> mounted. It needs to be started afterwards.
>
> Just kill mountd, restart it, and see what happens.
>
> NeilBrown

2008-02-08 01:39:06

by NeilBrown

Subject: RE: nfs new_cache mechanism and older kernel

On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
> Perfect! That was indeed the problem. Thank you so much. Btw, so when
> mountd starts, it checks whether or not the new cache mechanism is being
> used and acts accordingly, right? (I am being lazy by not going through
> the codebase to find that out myself).

Correct on both counts :-)
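Roughly, the startup check is just a probe for a file that only exists
once the nfsd filesystem is mounted; something along the lines of the
sketch below, though treat the exact path as from memory (I believe it
is check_new_cache() in nfs-utils support/nfs/cacheio.c).

/* Rough sketch of the kind of probe mountd/exportfs do at startup --
 * I believe nfs-utils looks for a file like /proc/fs/nfsd/filehandle
 * (check_new_cache() in support/nfs/cacheio.c), but treat the exact
 * path as from memory rather than gospel. */
#include <stdio.h>
#include <sys/stat.h>

static int new_cache_available(void)
{
	struct stat st;

	/* Present only once the nfsd filesystem is mounted. */
	return stat("/proc/fs/nfsd/filehandle", &st) == 0;
}

int main(void)
{
	printf("new cache interface: %s\n",
	       new_cache_available() ? "yes" : "no");
	return 0;
}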

NeilBrown

2008-02-08 03:42:29

by J. Bruce Fields

Subject: Re: nfs new_cache mechanism and older kernel

On Fri, Feb 08, 2008 at 12:10:46PM +1100, Neil Brown wrote:
> On Thursday February 7, ASinha-z4qIPS1Syiu/3pe1ocb+swC/[email protected] wrote:
> >
> > Yeah, not sure if it would make any difference. I think one of them is
> > the wall clock time (do_gettimeofday) and the xtime is the monotonic
> > time. One can be obtained from the other by adding/subtracting an offset
> > value (wall_to_monotonic or something like that), if I recall correctly.
> > Why not use the same time value for both cases (if you think it could be
> > of some remote significance)?
>
> They probably should use the same value - I wasn't previously aware
> there was a difference.... something for Bruce's todo list :-) (but
> not for Bruce to do...)

I know nothing about time. I suppose for proc/../flush to be a
reasonable user interface the time source used should be something that
makes sense to userland?

Looks like the number nfs-utils uses in
support/nfs/cacheio.c:cache_flush() is either returned from time() or is
the mtime of /var/lib/nfs/etab.
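Compressed into a sketch, that logic looks roughly like the following
(the set of flush files and the precedence between time() and the etab
mtime are abridged and from memory, so don't treat this as the
authoritative cache_flush()):

/* Compressed sketch of what cache_flush() does, as I read it: pick a
 * timestamp (time() now, or the mtime of /var/lib/nfs/etab) and write
 * it, as decimal seconds, to each cache's flush file. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

static const char *flushfiles[] = {
	"/proc/net/rpc/auth.unix.ip/flush",
	"/proc/net/rpc/nfsd.export/flush",
	"/proc/net/rpc/nfsd.fh/flush",
};

int main(void)
{
	struct stat st;
	time_t stamp;

	/* Prefer the export table's mtime; fall back to "now". */
	if (stat("/var/lib/nfs/etab", &st) == 0)
		stamp = st.st_mtime;
	else
		stamp = time(NULL);

	for (size_t i = 0; i < sizeof(flushfiles) / sizeof(flushfiles[0]); i++) {
		FILE *f = fopen(flushfiles[i], "w");
		if (!f)
			continue;	/* cache not registered; skip it */
		fprintf(f, "%ld\n", (long)stamp);
		fclose(f);
	}
	return 0;
}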

--b.

>
> >
> > >
> > > It is normal for exportfs to completely flush the in-kernel cache.
> > > Subsequent NFS requests will then cause an upcall to mountd which will
> > > add the required information to the cache.
> >
> > I am wondering ... if it completely flushes the in-kernel cache, as is
> > happening in my case, shouldn't the clients who have already mounted
> > their nfs filesystem lose their mounts? In our case, we are mounting
> > the root fs through nfs and the moment it flushes the cache, the clients
> > become dead. If they happen to resend their mount request, will it not,
> > in that case, involve mountd? Please enlighten me.
>
> When an NFS request arrives and the cache doesn't contain enough
> information to accept or reject it, a message is sent to mountd (via
> /proc/net/rpc/XXX/channel) to ask that the information in the cache be
> filled in. mountd does the appropriate checks and replies with the
> required information.
>
> If this isn't happening for you, then something is wrong with
> mountd...
>
> I assume you have checked that mountd is still running, so my best
> guess is that mountd was started *before* the 'nfsd' filesystem was
> mounted. It needs to be started afterwards.
>
> Just kill mountd, restart it, and see what happens.
>
> NeilBrown

2008-02-08 03:46:51

by Anirban Sinha

Subject: RE: nfs new_cache mechanism and older kernel

>
> I know nothing about time. I suppose for proc/../flush to be a
> reasonable user interface the time source used should be something that
> makes sense to userland?

We should also not forget that this timestamp is compared by the kernel
against what get_seconds() returns when deciding which entries are
stale.

Not sure whether this is of any great significance.

Ani



>
> Looks like the number nfs-utils uses in
> support/nfs/cacheio.c:cache_flush() is either returned from time() or
> is the mtime of /var/lib/nfs/etab.
>
> --b.
>
> >
> > >
> > > >
> > > > It is normal for exportfs to completely flush the in-kernel cache.
> > > > Subsequent NFS requests will then cause an upcall to mountd which will
> > > > add the required information to the cache.
> > >
> > > I am wondering ... if it completely flushes the in-kernel cache, as is
> > > happening in my case, shouldn't the clients who have already mounted
> > > their nfs filesystem lose their mounts? In our case, we are mounting
> > > the root fs through nfs, and the moment it flushes the cache, the clients
> > > become dead. If they happen to resend their mount request, will it not,
> > > in that case, involve mountd? Please enlighten me.
> >
> > When an NFS request arrives and the cache doesn't contain enough
> > information to accept or reject it, a message is sent to mountd (via
> > /proc/net/rpc/XXX/channel) to ask that the information in the cache be
> > filled in. mountd does the appropriate checks and replies with the
> > required information.
> >
> > If this isn't happening for you, then something is wrong with
> > mountd...
> >
> > I assume you have checked that mountd is still running, so my best
> > guess is that mountd was started *before* the 'nfsd' filesystem was
> > mounted. It needs to be started afterwards.
> >
> > Just kill mountd, restart it, and see what happens.
> >
> > NeilBrown