LinuxLists.cc - What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

2007-10-03 23:41:26

Subject: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

Aside from the usual updates from Chuck for NFS-over-IPv6 (still
incomplete) and a number of bugfixes for the text-based mount code, the
main news in the NFS tree is the merging of support for the NFS/RDMA
client code from Tom Talpey and the NetApp New England (NANE) team.

We also have the 64-bit inode support from RedHat/Peter Staubach.

There is also the addition of an nfs_vm_page_mkwrite() method in order to
clean up the mmap() write code.

Finally, I've been working on a number of updates for the attribute
revalidation, having pulled apart most of the dentry and attribute
revalidation into separate variables. A number of fixes that address
existing bugs fell out of that review, which should hopefully result in
more efficient dcache behaviour...

The NFS client git tree can be found at

git://git.linux-nfs.org/pub/linux/nfs-2.6.git

or on gitweb at

http://linux-nfs.org/cgi-bin/gitweb.cgi?p=nfs-2.6.git;a=summary

Finally, a full set of patches may be found on

http://client.linux-nfs.org/Linux-2.6.x/2.6.23-rc9/

Cheers
Trond

-------------------

Adrian Bunk (1):
[2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static

Christoph Hellwig (1):
[NFS] [PATCH] nfs: tiny makefile cleanup

Chuck Lever (41):
SUNRPC: Fix a signed v. unsigned comparison in rpcbind's XDR routines
SUNRPC: Fix a signed v. unsigned comparison in net/sunrpc/xprtsock.c
SUNRPC: Use standard macros for printing IP addresses
SUNRPC: Free address buffers in a loop
SUNRPC: Add hex-formatted address support to rpc_peeraddr2str()
SUNRPC: Rename xs_format_peer_addresses
SUNRPC: add a function to format IPv6 addresses
SUNRPC: add support for IPv6 to the kernel's rpcbind client
SUNRPC: Introduce support for setting the port number in IPv6 addresses
SUNRPC: Rename xs_bind() to prepare for IPv6-specific bind method
SUNRPC: create an IPv6-savvy mechanism for binding to a reserved port
SUNRPC: Refactor a part of socket connect logic into a helper function
SUNRPC: Rename IPv4 connect workers
SUNRPC: create connect workers for IPv6
SUNRPC: Add IPv6 address support to net/sunrpc/xprtsock.c
SUNRPC: Add a helper for extracting the address using the correct type
SUNRPC: Split xs_reclassify_socket into an IPv4 and IPv6 version
SUNRPC: Add support for formatted universal addresses
SUNRPC: Fix generation of universal addresses for
SUNRPC: Only one dprintk is needed during client creation
SUNRPC: fix a signed v. unsigned comparison nit in rpc_bind_new_program
SUNRPC: Use correct argument type in memcpy()
SUNRPC: Make sure server name is reasonable before trying to print it
SUNRPC: Clean up in rpc_show_tasks
SUNRPC: Make rpcb_decode_getaddr more picky about universal addresses
SUNRPC: Retry bad rpcbind replies
SUNRPC: Add a new error code for retry waiting for another binder
SUNRPC: Split another new rpcbind retry error code from EACCES
SUNRPC: RPC bind failures should be permanent for NULL requests
NFS: Kernel mount client should use async bind
NFS: Add new 'mountaddr=' mount option
NFS: Convert printk's to dprintk's in fs/nfs/nfs?xdr.c
LOCKD: Convert printk's to dprintk's in lockd XDR routines
NFSD: Convert printk's to dprintk's in NFSD's nfs4xdr
NFS: Verify server address before invoking in-kernel mount client
NFS: Show "nointr" mount option
SUNRPC: Fix bytes-per-op accounting for RPC over UDP
NFS: Don't call nfs_renew_times() in nfs_dentry_iput()
NFS: Eliminate nfs_renew_times()
NFS: Eliminate nfs_refresh_verifier()
SUNRPC: Use correct type in buffer length calculations

Fabio Olive Leite (1):
Re: [NFS] [PATCH] Attribute timeout handling and wrapping u32 jiffies

J. Bruce Fields (2):
nfs: add server port to rpc_pipe info file
SUNRPC: Fix default hostname created in rpc_create()

James Lentini (1):
[NFS] [PATCH] NFS: initialize default port in kernel mount client

Jeff Layton (1):
[NFS] [PATCH] NFS: show addr=ipaddr in /proc/mounts rather than

Jesper Juhl (1):
[23/37] Clean up duplicate includes in

Peter Staubach (1):
64 bit ino support for NFS client

Trond Myklebust (56):
NFS: Add the helper nfs_vm_page_mkwrite
NFS: Clean up write code...
NFS: Clean up nfs_writepages()
VFS: Remove writeback_control->fs_private
NFS: Clean up NFS writeback flush code
NFS: Writeback optimisation
NFS: Fall back to synchronous writes when a background write errors...
SUNRPC: Convert rpc_pipefs to use the generic filesystem notification hooks
NFSv4: Fix a bug in nfs4_validate_mount_data()
NFS: Add a helper to extract the nfs_open_context from a struct file
NFS: Replace file->private_data with calls to nfs_file_open_context()
NFSv4: Simplify _nfs4_do_access()
NFSv4: Make NFSv4 ACCESS calls return attributes too...
NFS: Fix over-conservative attribute invalidation in nfs_update_inode()
NFS: nfs_post_op_update_inode() should call nfs_refresh_inode()
NFS: fix nfs_verify_change_attribute
NFS: Fix dcache revalidation bugs
NFS: nfs_wcc_update_inode: directory caches are always invalidated
NFS: Don't force a dcache revalidation if nfs_wcc_update_inode succeeds
NFSv4: Don't use ctime/mtime for determining when to invalidate the caches
NFS: Don't use readdirplus data if the page cache is invalid
NFS: Fix atime revalidation in readdir()
NFS: Fix atime revalidation in read()
NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode()
NFS: Remove bogus check of cache_change_attribute in nfs_update_inode
NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
NFS: Fix the sign of the return value of nfs_save_change_attribute()
NFS: Fix nfs_verify_change_attribute()
NFS: Ensure nfs_instantiate() invalidates the parent dir on error
NFS: nfs_instantiate() should set the dentry verifier
NFS: Don't hash the negative dentry when optimising for an O_EXCL open
NFS: Fix a bug in nfs_open_revalidate()
NFS: Don't set cache_change_attribute in nfs_revalidate_mapping
NFS: Don't revalidate dentries on directory size or ctime changes
NFS: nfs_post_op_update_inode don't update cache_change_attribute
NFS: nfs_mark_for_revalidate don't update cache_change_attribute
NFS: don't cache the verifer across ->lookup() calls
NFS: Remove bogus nfs_mark_for_revalidate() in nfs_lookup
NFS: NFS_CACHEINV() should not test for nfs_caches_unstable()
NFS: Remove NFS_I(inode)->data_updates
NFS: Remove nfs_begin_data_update/nfs_end_data_update
NFS: Reset nfsi->last_updated only if the attribute changed
NFS: Optimise nfs_lookup_revalidate()
NFSv4: Don't revalidate the directory in nfs_atomic_lookup()
NFSv4: Use NFSv2/v3 rules for negative dentries in nfs_open_revalidate
NFSv4: Fix nfs_atomic_open() to set the verifier on negative dentries too
NFSv3: Always use directory post-op attributes in nfs3_proc_lookup
NFS: Remove the redundant nfs_reval_fsid()
NFS: Don't zap the readdir caches upon error
NFS: Be strict about dentry revalidation when doing exclusive create
NFS: Ensure that nfs_link() returns a hashed dentry
NFS: Simplify filehandle revalidation
NFS: Get rid of some obsolete macros
SUNRPC: Fix buggy UDP transmission
SUNRPC: Don't call xprt_release() if call_allocate fails
SUNRPC: Don't call xprt_release in call refresh

"Talpey, Thomas" (20):
SUNRPC: move per-transport rpcbind netid's
SUNRPC: export per-transport rpcbind netid's
NFS: move nfs_parsed_mount_data structure definition
NFS: use in-kernel mount argument structure for nfsv[23] mounts
NFS: use in-kernel mount argument structure for nfsv4 mounts
SUNRPC: mark bulk read/write data in xdrbuf
SUNRPC: add EXPORT_SYMBOL_GPL for generic transport functions
SUNRPC: Provide a new API for registering transport implementations
SUNRPC: Finish API to load RPC transport implementations dynamically
SUNRPC: rename the rpc_xprtsock_create structure
SUNRPC: rearrange RPC sockets definitions
NFS/SUNRPC: support transport protocol naming
NFS/SUNRPC: use transport protocol naming
NFS - print accurate transport protocol
RPCRDMA: Kconfig and header file with rpcrdma protocol definitions
NFS: support RDMA mounts
RPCRDMA: rpc rdma transport switch
RPCRDMA: rpc rdma protocol implementation
RPCRDMA: rpc rdma verbs interface implementation
SUNRPC: Add RDMA dependency to SUNRPC_XPRT_RDMA

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-10-04 16:43:04

by Pierre Ossman

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 04 Oct 2007 10:00:50 -0400
Trond Myklebust <[email protected]> wrote:

> On Thu, 2007-10-04 at 08:52 +0200, Pierre Ossman wrote:
> > On Wed, 03 Oct 2007 19:41:16 -0400
> > Trond Myklebust <[email protected]> wrote:
> >
> > >
> > > We also have the 64-bit inode support from RedHat/Peter Staubach.
> > >
> >
> > As has been pointed[1] out[2], this will cause regressions for
> > non-LFS applications (of which there are still lots and lots). This
> > change should be in feature-removal (the "feature" being removed is
> > legacy support for non-LFS applications using NFS servers that make
> > full use of the protocol) and preferably accompanied with
> > appropriate user space changes (e.g. compatibility option in glibc).
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
> > [2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2
> >
> > Rgds
>
> How about a boot/module parameter to turn it on or off?
>

That would be perfect. It can even be in non-legacy mode by default,
just as long as you can go back to the old behaviour when/if you run
into a non-LFS application.

> I don't see any point in having a sysctl for something like this:
> either you have legacy applications or you don't. It is not something
> that you switch off as you go off to lunch.
> A compile parameter, OTOH, would be too restrictive since it would
> force distros to choose just one behaviour (which would mean they
> would have to choose the most conservative).
>

Agreed.

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2007-10-04 18:43:36

by Andrew Morton

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 4 Oct 2007 18:43:04 +0200
Pierre Ossman <[email protected]> wrote:

> On Thu, 04 Oct 2007 10:00:50 -0400
> Trond Myklebust <[email protected]> wrote:
>
> > On Thu, 2007-10-04 at 08:52 +0200, Pierre Ossman wrote:
> > > On Wed, 03 Oct 2007 19:41:16 -0400
> > > Trond Myklebust <[email protected]> wrote:
> > >
> > > >
> > > > We also have the 64-bit inode support from RedHat/Peter Staubach.
> > > >
> > >
> > > As has been pointed[1] out[2], this will cause regressions for
> > > non-LFS applications (of which there are still lots and lots). This
> > > change should be in feature-removal (the "feature" being removed is
> > > legacy support for non-LFS applications using NFS servers that make
> > > full use of the protocol) and preferably accompanied with
> > > appropriate user space changes (e.g. compatibility option in glibc).
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
> > > [2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2
> > >
> > > Rgds
> >
> > How about a boot/module parameter to turn it on or off?
> >
>
> That would be perfect. It can even be in non-legacy mode by default,
> just as long as you can go back to the old behaviour when/if you run
> into a non-LFS application.
>

Wouldn't a mount option be better?

2007-10-04 19:16:03

by Trond Myklebust

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 2007-10-04 at 11:42 -0700, Andrew Morton wrote:
> On Thu, 4 Oct 2007 18:43:04 +0200
> Pierre Ossman <[email protected]> wrote:
>
> > On Thu, 04 Oct 2007 10:00:50 -0400
> > Trond Myklebust <[email protected]> wrote:
> >
> > > On Thu, 2007-10-04 at 08:52 +0200, Pierre Ossman wrote:
> > > > On Wed, 03 Oct 2007 19:41:16 -0400
> > > > Trond Myklebust <[email protected]> wrote:
> > > >
> > > > >
> > > > > We also have the 64-bit inode support from RedHat/Peter Staubach.
> > > > >
> > > >
> > > > As has been pointed[1] out[2], this will cause regressions for
> > > > non-LFS applications (of which there are still lots and lots). This
> > > > change should be in feature-removal (the "feature" being removed is
> > > > legacy support for non-LFS applications using NFS servers that make
> > > > full use of the protocol) and preferably accompanied with
> > > > appropriate user space changes (e.g. compatibility option in glibc).
> > > >
> > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
> > > > [2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2
> > > >
> > > > Rgds
> > >
> > > How about a boot/module parameter to turn it on or off?
> > >
> >
> > That would be perfect. It can even be in non-legacy mode by default,
> > just as long as you can go back to the old behaviour when/if you run
> > into a non-LFS application.
> >
>
> Wouldn't a mount option be better?

I suppose that might be OK if you know that the 32-bit legacy
applications will only touch one or two servers, but that sounds like a
niche thing.

On the downside, forcing all those people who have portable 64-bit aware
applications to upgrade their version of mount just in order to have
stat64() work correctly seems unnecessarily complicated. I'd prefer not
to have to do that unless someone comes up with a good reason why we
must.

Cheers
Trond

2007-10-04 19:42:34

by Peter Staubach

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

Trond Myklebust wrote:
> On Thu, 2007-10-04 at 11:42 -0700, Andrew Morton wrote:
>
>> On Thu, 4 Oct 2007 18:43:04 +0200
>> Pierre Ossman <[email protected]> wrote:
>>
>>
>>> On Thu, 04 Oct 2007 10:00:50 -0400
>>> Trond Myklebust <[email protected]> wrote:
>>>
>>>
>>>> On Thu, 2007-10-04 at 08:52 +0200, Pierre Ossman wrote:
>>>>
>>>>> On Wed, 03 Oct 2007 19:41:16 -0400
>>>>> Trond Myklebust <[email protected]> wrote:
>>>>>
>>>>>
>>>>>> We also have the 64-bit inode support from RedHat/Peter Staubach.
>>>>>>
>>>>>>
>>>>> As has been pointed[1] out[2], this will cause regressions for
>>>>> non-LFS applications (of which there are still lots and lots). This
>>>>> change should be in feature-removal (the "feature" being removed is
>>>>> legacy support for non-LFS applications using NFS servers that make
>>>>> full use of the protocol) and preferably accompanied with
>>>>> appropriate user space changes (e.g. compatibility option in glibc).
>>>>>
>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
>>>>> [2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2
>>>>>
>>>>> Rgds
>>>>>
>>>> How about a boot/module parameter to turn it on or off?
>>>>
>>>>
>>> That would be perfect. It can even be in non-legacy mode by default,
>>> just as long as you can go back to the old behaviour when/if you run
>>> into a non-LFS application.
>>>
>>>
>> Wouldn't a mount option be better?
>>
>
> I suppose that might be OK if you know that the 32-bit legacy
> applications will only touch one or two servers, but that sounds like a
> niche thing.
>
> On the downside, forcing all those people who have portable 64-bit aware
> applications to upgrade their version of mount just in order to have
> stat64() work correctly seems unnecessarily complicated. I'd prefer not
> to have to do that unless someone comes up with a good reason why we
> must.

I would agree. The 64 bit fileids will only become visible when
the server is exporting file systems which contain fileids which
are bigger than 32 bits and then only when the application
encounters these files.
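
As a rough illustration of the trade-off discussed in this thread (not the code from the patch), the Python sketch below models what a non-LFS caller sees when a fileid does not fit in a 32-bit inode field, and one possible compatibility mode that folds the 64-bit value into 32 bits by XORing its two halves. The function names and the `fold` switch are hypothetical and exist only for this example; the EOVERFLOW behaviour is assumed to match what a 32-bit stat() reports for an oversized inode number.

```python
import errno

U32_MAX = 0xFFFFFFFF


def fold_fileid(fileid: int) -> int:
    """Fold a 64-bit fileid into 32 bits by XORing the high and low halves."""
    return ((fileid >> 32) ^ fileid) & U32_MAX


def legacy_stat_ino(fileid: int, fold: bool) -> int:
    """Return the inode number a non-LFS stat() caller would see.

    With fold=False, a fileid wider than 32 bits is reported as EOVERFLOW.
    With fold=True, the value is squeezed into 32 bits, which avoids the
    error at the cost of possible collisions between distinct fileids.
    """
    if fold:
        return fold_fileid(fileid)
    if fileid > U32_MAX:
        raise OSError(errno.EOVERFLOW, "fileid does not fit in 32 bits")
    return fileid
```

Small fileids pass through unchanged in both modes, which matches the point above that the difference only shows up once a server actually hands out fileids wider than 32 bits.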

Also, these 32-bit legacy applications are going to have a
problem if they are ever run on a system which contains local
file systems which expose the large fileids.

It would be better to identify these applications and get them
fixed. The world is evolving and it is time for them to do so.

ps

2007-10-04 19:59:16

by Andrew Morton

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 04 Oct 2007 15:16:03 -0400
Trond Myklebust <[email protected]> wrote:

> > >
> > > That would be perfect. It can even be in non-legacy mode by default,
> > > just as long as you can go back to the old behaviour when/if you run
> > > into a non-LFS application.
> > >
> >
> > Wouldn't a mount option be better?
>
> I suppose that might be OK if you know that the 32-bit legacy
> applications will only touch one or two servers, but that sounds like a
> niche thing.
>
> On the downside, forcing all those people who have portable 64-bit aware
> applications to upgrade their version of mount just in order to have
> stat64() work correctly seems unnecessarily complicated. I'd prefer not
> to have to do that unless someone comes up with a good reason why we
> must.

Confused. You don't need to modify mount(8) when adding a new mount option?

2007-10-04 20:18:34

by Chuck Lever III

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

[ Trimming the cc: list... ]

On Oct 3, 2007, at 7:41 PM, Trond Myklebust wrote:
> Aside from the usual updates from Chuck for NFS-over-IPv6 (still
> incomplete) and a number of bugfixes for the text-based mount code, the
> main news in the NFS tree is the merging of support for the NFS/RDMA
> client code from Tom Talpey and the NetApp New England (NANE) team.
>
> We also have the 64-bit inode support from RedHat/Peter Staubach.
>
> There is also the addition of an nfs_vm_page_mkwrite() method in order to
> clean up the mmap() write code.
>
> Finally, I've been working on a number of updates for the attribute
> revalidation, having pulled apart most of the dentry and attribute
> revalidation into separate variables. A number of fixes that address
> existing bugs fell out of that review, which should hopefully result in
> more efficient dcache behaviour...
>
> The NFS client git tree can be found at
>
> git://git.linux-nfs.org/pub/linux/nfs-2.6.git
>
> or on gitweb at
>
> http://linux-nfs.org/cgi-bin/gitweb.cgi?p=nfs-2.6.git;a=summary
>
> Finally, a full set of patches may be found on
>
> http://client.linux-nfs.org/Linux-2.6.x/2.6.23-rc9/

This is a massive effort... boo-yah!

I downloaded this latest set of patches and browsed through them
today in order to get a sense of how my BKL and IPv6 patch sets will
need to be adapted for 2.6.24. I've already reviewed Mr. Talpey's
patches, so I focused mainly on patches I hadn't seen already.

There are some patches I don't include here, either because I didn't
find any issues with them, or because I'm not smart enough to
understand exactly what they are changing. Here are my comments in
"git log" order (so you should start from the bottom of this email at
the oldest change).

commit 37e582fe2194862ba1924e1c31b081f8c866df29
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 12:06:48 2007 -0400

SUNRPC: Don't call xprt_release in call refresh

Call it from call_verify() instead...

Signed-off-by: Trond Myklebust <[email protected]>

Can you explain what you are fixing? Why is it better not to call
xprt_release() in call_refresh() ?

commit 76d0ee01ff975eaa696d682138927dac40e90856
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 12:06:44 2007 -0400

SUNRPC: Don't call xprt_release() if call_allocate fails

It completely fouls up the RPC call statistics, and serves no useful
purpose.

Signed-off-by: Trond Myklebust <[email protected]>

The xprt_release() is needed because we don't want to tie up a slot in the
transport's slot table during the rpc_delay() after the RPC buffer
allocation failed.

Would it make sense to add some conditional processing in
rpc_count_iostats() that skipped some accounting if tk_status < 0? Or by
adding a boolean argument to xprt_release() that indicates if the RPC should
be counted or not? Or by adding a new xprt_release_no_count function that
doesn't count the RPC request?
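
To make the first two alternatives concrete, here is a minimal Python model of them. It is only a sketch of the accounting idea, not kernel code: `IoStats`, `Task`, `count_iostats()` and the `count` argument of `release()` are hypothetical stand-ins for the RPC statistics, the task's tk_status and tk_bytes_sent fields, rpc_count_iostats() and the proposed boolean argument to xprt_release().

```python
from dataclasses import dataclass


@dataclass
class IoStats:
    ops: int = 0
    bytes_sent: int = 0


@dataclass
class Task:
    status: int          # stands in for tk_status; negative means an error
    bytes_sent: int = 0  # stands in for tk_bytes_sent


def count_iostats(stats: IoStats, task: Task) -> None:
    """Account for a finished request, skipping requests that failed."""
    if task.status < 0:
        return
    stats.ops += 1
    stats.bytes_sent += task.bytes_sent


def release(stats: IoStats, task: Task, count: bool = True) -> None:
    """Release the request's slot; 'count' says whether to account for it.

    Only the accounting decision is modelled; the slot release itself is
    left out.
    """
    if count:
        count_iostats(stats, task)
```

Either variant keeps the slot release on the allocation-failure path while keeping the failed attempt out of the statistics.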

commit 373352e2c1de96e2db27201e191ce21fb459bbd7
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 11:43:37 2007 -0400

SUNRPC: Fix buggy UDP transmission

xs_sendpages() may return a negative result. We sure as hell don't want to
add that to the 'tk_bytes_sent' tally...

Signed-off-by: Trond Myklebust <[email protected]>

Heh. Oops.
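
For readers skimming the thread, the class of bug is easy to show in a few lines of Python. This is an illustrative model only: `account_send()` is a hypothetical helper, and the send results are made-up values in which a negative number stands for an error code returned in place of a byte count.

```python
def account_send(bytes_sent_total: int, result: int) -> int:
    """Add a send result to the running tally only if it is a byte count."""
    if result > 0:
        return bytes_sent_total + result
    return bytes_sent_total


# Two successful sends with one failed attempt in between.
results = [512, -11, 256]

naive_total = sum(results)  # the error code is subtracted from the tally

checked_total = 0
for r in results:
    checked_total = account_send(checked_total, r)
```

Here `naive_total` ends up at 757 while `checked_total` is 768, the number of bytes actually sent.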

commit 56ad5672426879452eb2a70e9986e59f4d642243
Author: Trond Myklebust <[email protected]>
Date: Tue Oct 2 23:13:32 2007 -0400

NFS: Simplify filehandle revalidation

Signed-off-by: Trond Myklebust <[email protected]>

I think a little more explanation of what's going on here is needed.
Maybe a sentence or two in the description about why this is not
going to break CTO? Are you just fixing a minor oversight?

commit 36fbae45411bebd6876310009336de58e8ef4a9a
Author: Trond Myklebust <[email protected]>
Date: Tue Oct 2 21:58:05 2007 -0400

NFS: Ensure that nfs_link() returns a hashed dentry

Signed-off-by: Trond Myklebust <[email protected]>

Here, the d_drop() and d_add() are explicitly serialized inside nfs_link()
via the BKL. I'm focusing on that serialization because it will have some
effect on my effort to eliminate the BKL from the NFS client.

For those of us who aren't fluent users of the dentry cache, can you
elaborate on why the d_drop() is needed in these cases?

commit bf1dc509161086950c9563c98c62879260eda056
Author: Trond Myklebust <[email protected]>
Date: Tue Oct 2 19:13:04 2007 -0400

NFS: Be strict about dentry revalidation when doing exclusive create

Signed-off-by: Trond Myklebust <[email protected]>

Can you explain why the extra checking is needed for exclusive create?

commit 92f82103b37c9a6e8a4246de85ff909d734b77c6
Author: Trond Myklebust <[email protected]>
Date: Tue Oct 2 19:02:07 2007 -0400

NFS: Don't zap the readdir caches upon error

If necessary, the caches will get zapped under normal revalidation.

Signed-off-by: Trond Myklebust <[email protected]>

Again, to be clear, you're using the NFS attribute cache flags to detect
out-of-date page cache data, rather than using the generic VFS support,
for directories. Seems reasonable.

commit 389b68610d592e103936801cc449ef3052008fd6
Author: Trond Myklebust <[email protected]>
Date: Tue Oct 2 17:11:54 2007 -0400

NFS: Remove the redundant nfs_reval_fsid()

Signed-off-by: Trond Myklebust <[email protected]>

I'd like to understand why nfs_reval_fsid() is no longer necessary. Is
it because this check is already done by _nfs4_proc_lookup() ?

commit b322d877bb6a58af6f24930bb5ff6122860d61e1
Author: Trond Myklebust <[email protected]>
Date: Sat Sep 29 17:48:19 2007 -0400

NFS: Remove nfs_begin_data_update/nfs_end_data_update

The lower level routines in fs/nfs/proc.c, fs/nfs/nfs3proc.c and
fs/nfs/nfs4proc.c should already be dealing with the
revalidation issues.

Signed-off-by: Trond Myklebust <[email protected]>

commit defee1455a19a43c4b857b55d46085e107337c17
Author: Trond Myklebust <[email protected]>
Date: Sat Sep 29 17:34:46 2007 -0400

NFS: Remove NFS_I(inode)->data_updates

We have no more users...

Signed-off-by: Trond Myklebust <[email protected]>

You have removed nfs_begin_data_update() in this patch... but the next patch
in the series (commit b322d877bb6a58af6f24930bb5ff6122860d61e1) removes a
bunch of calls to it. Are these two patches in reverse order?

commit 40acc380df81ba07d4cd610107b5479b1edb6055
Author: Trond Myklebust <[email protected]>
Date: Sat Sep 29 17:25:43 2007 -0400

NFS: NFS_CACHEINV() should not test for nfs_caches_unstable()

The fact that we're in the process of modifying the inode does not mean
that we should not invalidate the attribute and data caches. The defensive
thing is to always invalidate when we're confronted with inode
mtime/ctime or change_attribute updates that we do not immediately
recognise.

Signed-off-by: Trond Myklebust <[email protected]>

Ugh, a triple negative. I'm having trouble understanding the description
and figuring out exactly what's going on in this patch.

commit f507133ab0f2369194037ae7842dc0895b979627
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 13:54:51 2007 -0400

NFS: Remove bogus nfs_mark_for_revalidate() in nfs_lookup

The parent of the newly materialised dentry has just been
revalidated...

Signed-off-by: Trond Myklebust <[email protected]>

Is there a way to document the revalidation of the parent? Where exactly
does that occur in the nfs_lookup() path?

commit 176c84c19cb7cd7ea1a83a105649b41d18999812
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 10:00:23 2007 -0400

NFS: nfs_mark_for_revalidate don't update cache_change_attribute

Just let the subsequent inode revalidation do the update...

Signed-off-by: Trond Myklebust <[email protected]>

The short description is ungrammatical.

commit d2c7da6c1cb184ea79b765eae848b1a2a8910dbf
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 09:59:15 2007 -0400

NFS: nfs_post_op_update_inode don't update cache_change_attribute

If nfs_post_op_update_inode fails because the server didn't return any
attributes, then we let the subsequent inode revalidation update
cache_change_attribute.

Signed-off-by: Trond Myklebust <[email protected]>

The short description is ungrammatical.

commit fb26bdc63b426cfed53e7f1fc89e5113ceb17a5b
Author: Trond Myklebust <[email protected]>
Date: Mon Oct 1 09:56:59 2007 -0400

NFS: Don't revalidate dentries on directory size or ctime changes

We only need to look at the mtime changes...

Signed-off-by: Trond Myklebust <[email protected]>

It would be friendly to spell out what's going on here in the patch
description.

commit 9dd4607202e6c177cc53035338473db378b98c63
Author: Trond Myklebust <[email protected]>
Date: Sat Sep 29 17:41:33 2007 -0400

NFS: Ensure nfs_instantiate() invalidates the parent dir on error

Also ensure that it drops the dentry in this case.

Signed-off-by: Trond Myklebust <[email protected]>

Perhaps a silly question, but since you've added a d_drop(dentry) right at
the top of nfs_instantiate(), does nfs_instantiate() now rely on some kind
of external serialization (say, the BKL)?

commit c7a8d8a69d807dc026dc016df11f82380d841118
Author: Trond Myklebust <[email protected]>
Date: Sat Sep 29 17:14:03 2007 -0400

NFS: Fix the sign of the return value of nfs_save_change_attribute()

Also fix up the comments.

Signed-off-by: Trond Myklebust <[email protected]>

I would mention that the type of the return value of
nfs_save_change_attribute() has to match the second parameter of
nfs_set_verifier(). Otherwise it's not clear why you are changing this.

commit 9cd8d8475b9ecdca1181eb8dc41a00b8cf84b7bd
Author: Trond Myklebust <[email protected]>
Date: Sun Sep 30 15:21:24 2007 -0400

NFS: Fake up 'wcc' attributes to prevent cache invalidation after write

NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.

Signed-off-by: Trond Myklebust <[email protected]>

See comment on commit 74994c59a5a1def233c2245c3f7ef23cb01c64ce.

commit 71d96564928ac2017c97a4555f818397c49c461c
Author: Trond Myklebust <[email protected]>
Date: Fri Sep 28 19:11:33 2007 -0400

NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode()

For one thing, the test NFS_ATTRTIMEO() == 0 makes no sense: we're
testing whether or not the cache timeout length is zero, which is totally
unrelated to the issue of whether or not we trust the file staleness.

Secondly, we do not want to retry the GETATTR once a file has been declared
stale by the server: we rather want to discard that inode as soon as
possible, since there are broken servers still in use out there that reuse
filehandles on new files.

Signed-off-by: Trond Myklebust <[email protected]>

The NFS_ATTRTIMEO check was made to see if the "noac" or "actimeo=0" mount
options are in effect. Basically you are saying that cache revalidation
should not be different in those cases.

Do you have test cases for all the crazy things one might do with rsync or
file restoration that might result in a stale file handle?

commit a03b09537dd935a4efaedb9bdb56f180e71db6bb
Author: Trond Myklebust <[email protected]>
Date: Fri Sep 28 17:20:07 2007 -0400

NFS: Fix atime revalidation in read()

NFSv3 will correctly update atime on a read() call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <[email protected]>

You say "unless the call to nfs_refresh_inode() fails" but it looks to me
like you're always setting NFS_INO_INVALID_ATIME for all three protocol
versions.

There's also an additional undocumented change in nfs3_read_done: apparently
the server returns post-op attributes whether or not the READ request failed.
In a later patch against the NFS lookup path, this change is broken out and
documented separately.

commit ac67b2f0c9bb90389ec925ca5ecc7fd930eaab3b
Author: Trond Myklebust <[email protected]>
Date: Fri Sep 28 17:11:45 2007 -0400

NFS: Fix atime revalidation in readdir()

NFSv3 will correctly update atime on a readdir call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <[email protected]>

You say "unless the call to nfs_refresh_inode() fails" but it looks to me
like you're always setting NFS_INO_INVALID_ATIME for all three protocol
versions.

commit adbf11d6638fe8e1a8b3075216efa73b8294f17d
Author: Trond Myklebust <[email protected]>
Date: Sun Sep 30 18:01:13 2007 -0400

NFS: Don't use readdirplus data if the page cache is invalid

Signed-off-by: Trond Myklebust <[email protected]>

Ulp. Nasty.

Minor quibble: the page cache isn't being checked here; rather it's
the validity of the NFS data cache for the directory in question. In
other words, you're checking an NFS attribute cache flag here, not a
generic VFS data structure.

Why is the spin lock needed for checking NFS_INO_INVALID_DATA? What do
you anticipate is the race condition here?

commit a7096362f511f24a47788b332636a992a3518da8
Author: Trond Myklebust <[email protected]>
Date: Thu Sep 27 15:57:24 2007 -0400

NFSv4: Don't use ctime/mtime for determining when to invalidate the caches

In NFSv4 we should only be looking at the change attribute.

Signed-off-by: Trond Myklebust <[email protected]>

I know you don't prefer the idea, but I think fs/nfs/inode.c would be made
much cleaner in the face of these kinds of version-related differences if
there were version-specific copies of nfs_update_inode() and friends.

commit a6d2ddd5bf668adb1a700104f846ab664b64ea35
Author: Trond Myklebust <[email protected]>
Date: Thu Sep 27 10:07:31 2007 -0400

NFS: Don't force a dcache revalidation if nfs_wcc_update_inode
succeeds

The reason is that if the weak cache consistency update was
successful,
then we know that our client must be the only one that changed the
directory, and we've already updated the dcache to reflect the
change.

Signed-off-by: Trond Myklebust <[email protected]>

It would help me if there was, say, an nfs_force_dcache_reval() function
that just plugged the current jiffies value into
nfsi->cache_change_attribute. That would clearly document in which cases
dcache revalidation was needed, instead of just the raw code that shows
us saving jiffies in the NFS inode for some unknown reason.

commit a289c003c1ec4d6bad69a9d0d0ad4f138f6ba096
Author: Trond Myklebust <[email protected]>
Date: Sun Sep 30 17:03:25 2007 -0400

NFS: nfs_wcc_update_inode: directory caches are always invalidated

We must ensure that the readdir data is always invalidated
whether or not
the weak cache consistency data update succeeds.

Signed-off-by: Trond Myklebust <[email protected]>

What happens if the directory's data cache is not invalidated? Can you
explain what bug is being addressed here? Are there performance
implications to this change?

commit 93953f229fdf4e8f26c73c596bcca4b386a697f3
Author: Trond Myklebust <[email protected]>
Date: Fri Sep 28 14:20:12 2007 -0400

NFS: fix nfs_verify_change_attribute

We always want to check that the verifier and directory
cache_change_attribute match. This also allows us to remove the
'wraparound
hack' for the cache_change_attribute. If we're only checking for
equality,
then we don't care about wraparound issues.

Signed-off-by: Trond Myklebust <[email protected]>

Nice clean-up, but can you elucidate why "we always want to check that the
verifier and directory cache_change_attribute match"?

commit 74994c59a5a1def233c2245c3f7ef23cb01c64ce
Author: Trond Myklebust <[email protected]>
Date: Wed Aug 15 12:59:12 2007 -0400

NFS: nfs_post_op_update_inode() should call nfs_refresh_inode()

Ensure that we don't clobber the results from a more recent
getattr call...

Signed-off-by: Trond Myklebust <[email protected]>

I notice that block comments in front of nfs_refresh_inode,
nfs_post_op_update_inode, and nfs_post_op_update_inode_force_wcc all claim
that these functions "try to update the inode attribute cache". IMO the
block comments do not help the reader understand how each of these is
different. In other words they don't explain why we need three of these.

commit f0b280597f7967c68356c396673ea645228eb2c6
Author: Trond Myklebust <[email protected]>
Date: Wed Aug 15 12:49:17 2007 -0400

NFS: Fix over-conservative attribute invalidation in
nfs_update_inode()

We should always be declaring the attribute cache as valid after
having
updated it.

Signed-off-by: Trond Myklebust <[email protected]>

Can you explain in your description why it is now considered safe to
ignore outstanding updates while updating an inode's cached attributes?
Does this change prevent an extra GETATTR request after a series of
overlapping async writes?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-10-05 00:58:47

by Trond Myklebust

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 2007-10-04 at 12:59 -0700, Andrew Morton wrote:
> On Thu, 04 Oct 2007 15:16:03 -0400
> Trond Myklebust <[email protected]> wrote:
>
> > > >
> > > > That would be perfect. It can even be in non-legacy mode by default,
> > > > just as long as you can go back to the old behaviour when/if you run
> > > > into a non-LFS application.
> > > >
> > >
> > > Wouldn't a mount option be better?
> >
> > I suppose that might be OK if you know that the 32-bit legacy
> > applications will only touch one or two servers, but that sounds like a
> > niche thing.
> >
> > On the downside, forcing all those people who have portable 64-bit aware
> > applications to upgrade their version of mount just in order to have
> > stat64() work correctly seems unnecessarily complicated. I'd prefer not
> > to have to do that unless someone comes up with a good reason why we
> > must.
>
> Confused. You don't need to modify mount(8) when adding a new mount option?

Prior to 2.6.22, the 'mount' program used a binary blob for passing the
NFS mount options to the kernel.
It is only very recently that we have started doing in-kernel parsing of
text strings, and in order to make use of that, people will need to
upgrade to the latest version of nfs-utils.

Trond

2007-10-05 06:25:16

by Pierre Ossman

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 04 Oct 2007 15:41:57 -0400
Peter Staubach <[email protected]> wrote:

>
> I would agree. The 64 bit fileids will only become visible when
> the server is exporting file systems which contain fileids which
> are bigger than 32 bits and then only when the application
> encounters these files.
>

Or, as has been pointed out, when the server is not the Linux in-kernel
NFS server.

> Also, these 32-bit legacy applications are going to have a
> problem if they are ever run on a system which contains local
> file systems which expose the large fileids.
>

Agreed. And I'd probably like a way around that as well. But local
files have never worked, NFS has. So removing it from NFS (where it is
more likely to occur IMO) would be a regression.

> It would be better to identify these applications and get them
> fixed. The world is evolving and it is time for them to do so.
>

Print a warning or something so that they can be found. Don't go
breaking systems left and right. People have better things to do than
to fix the build systems for every program they use.

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2007-10-05 12:24:47

by Jeff Layton

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 5 Oct 2007 08:25:13 +0200
Pierre Ossman <[email protected]> wrote:
>
> > It would be better to identify these applications and get them
> > fixed. The world is evolving and it is time for them to do so.
> >
>
> Print a warning or something so that they can be found. Don't go
> breaking systems left and right. People have better things to do than
> to fix the build systems for every program they use.
>

Unfortunately, the kernel doesn't have any way to know that the app is
not built with LFS defines. glibc uses the same syscalls regardless of
how it's built. If you want to print a warning you'll have to modify
glibc to do so, and then you have the problem of where to send this
output...

--
Jeff Layton <[email protected]>

2007-10-05 17:31:27

by Valdis Klētnieks

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 04 Oct 2007 10:00:50 EDT, Trond Myklebust said:

> How about a boot/module parameter to turn it on or off?
>
> I don't see any point in having a sysctl for something like this: either
> you have legacy applications or you don't. It is not something that you
> switch off as you go off to lunch.

How does Joe Sysadmin tell if he has an affected legacy app or not?

(The obvious "try it and see what breaks" is a non-starter for many places,
because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
turn it off, fix them, lather, rinse, repeat". Been there, done that, got
the tshirt - a project I got dragged into involves a large storage array that
appears to insist on exporting 64-bit stuff, and a large farm of clients that
are very 64-bit unclean....)

2007-10-05 17:32:13

by Trond Myklebust

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 2007-10-04 at 16:16 -0400, Chuck Lever wrote:

> commit 37e582fe2194862ba1924e1c31b081f8c866df29
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 12:06:48 2007 -0400
>
> SUNRPC: Don't call xprt_release in call refresh
>
> Call it from call_verify() instead...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Can you explain what you are fixing? Why is it better not to call
> xprt_release() in call_refresh() ?

call_refresh can be called before you ever hit call_transmit. It is
pointless to allocate an XID, then release it _before_ you have used it.

> commit 76d0ee01ff975eaa696d682138927dac40e90856
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 12:06:44 2007 -0400
>
> SUNRPC: Don't call xprt_release() if call_allocate fails
>
> It completely fouls up the RPC call statistics, and serves no
> useful
> purpose.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> The xprt_release() is needed because we don't want to tie up a slot
> in the
> transport's slot table during the rpc_delay() after the RPC buffer
> allocation failed.

Which causes:
* XID leakage
* Reordering of the RPC calls
* breakage of the RPC scheduling priority code

...in addition to adding extra latencies to the particular RPC call that
failed. Most of the calls that require the allocation of large buffers
tend to be _synchronous_ calls, such as LOOKUP, OPEN, LINK, SYMLINK.
Whereas nobody really worries too much about the latency of SYMLINK
calls, an extra latency on LOOKUP can add up to a lot of wasted time.

> Would it make sense to add some conditional processing in
> rpc_count_iostats() that skipped some accounting if tk_status < 0 ?
> Or by
> adding a boolean argument to xprt_release() that indicates if the RPC
> should
> be counted or not? Or by adding a new xprt_release_no_count function
> that
> doesn't count the RPC request?

Why not instead call xprt_release() only when it is necessary?

> commit 56ad5672426879452eb2a70e9986e59f4d642243
> Author: Trond Myklebust <[email protected]>
> Date: Tue Oct 2 23:13:32 2007 -0400
>
> NFS: Simplify filehandle revalidation
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I think a little more explanation of what's going on here is needed.
> Maybe a sentence or two in the description about why this is not
> going to break CTO? Are you just fixing a minor oversight?

The patch is clearly _not_ going to break close-to-open cache
consistency. It changes one line immediately after a test that
establishes that we're not doing a lookup for the file that is being
open()ed.

It basically gets rid of a revalidation that was supposed to check if
the filehandle is stale or not, but we've already determined that the
parent directory hasn't changed, so the file hasn't been deleted since
we last revalidated it.

> commit 36fbae45411bebd6876310009336de58e8ef4a9a
> Author: Trond Myklebust <[email protected]>
> Date: Tue Oct 2 21:58:05 2007 -0400
>
> NFS: Ensure that nfs_link() returns a hashed dentry
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Here, the d_drop() and d_add() are explicitly serialized inside
> nfs_link()
> via the BKL. I'm focusing on that serialization because it will have
> some
> effect on my effort to eliminate the BKL from the NFS client.
>
> For those of us who aren't fluent users of the dentry cache, can you
> elaborate on why the d_drop() is needed in these cases?

We don't know a priori if the dentry is hashed or not. nfs_lookup() may
have optimised away the actual lookup (the VFS will set the
LOOKUP_CREATE and O_EXCL flags), in which case an unhashed negative
dentry is created, and so we need a call to d_add() instead of
d_instantiate().
Alternatively, the user may have actually tried to open or stat the
dentry before creating a link, in which case we would have a hashed
negative dentry on our hands, and so we unhash it before calling
d_add().
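
The two cases above can be sketched with a toy model. This is only an
illustration, not the nfs_link() code: struct toy_dentry and the toy_*
helpers are hypothetical stand-ins for struct dentry, d_drop() and
d_add(), and they model nothing beyond the "hashed" state and the
attached inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry: only the two properties discussed above. */
struct toy_dentry {
    bool hashed;        /* reachable by name lookups? */
    const void *inode;  /* NULL means "negative" (no inode attached) */
};

/* Toy d_drop(): unhash the dentry; harmless if it is already unhashed. */
static void toy_d_drop(struct toy_dentry *d)
{
    d->hashed = false;
}

/* Toy d_add(): attach the inode and hash the dentry.
 * In this model it must only be handed an unhashed dentry. */
static void toy_d_add(struct toy_dentry *d, const void *inode)
{
    assert(!d->hashed);
    d->inode = inode;
    d->hashed = true;
}

/* We do not know a priori whether the negative dentry is hashed,
 * so unhash unconditionally and then add it back with the inode. */
static void toy_link_finish(struct toy_dentry *d, const void *inode)
{
    toy_d_drop(d);
    toy_d_add(d, inode);
}
```

Calling toy_link_finish() on either an unhashed or a hashed negative
dentry leaves it hashed with the inode attached, which is the property
the commit title asks for.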

> commit bf1dc509161086950c9563c98c62879260eda056
> Author: Trond Myklebust <[email protected]>
> Date: Tue Oct 2 19:13:04 2007 -0400
>
> NFS: Be strict about dentry revalidation when doing exclusive
> create
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Can you explain why the extra checking is needed for exclusive create?

We shouldn't trust cached values in the case of exclusive create.
Someone may have unlinked the file on the server, in which case we want
the exclusive create to succeed.

> commit 389b68610d592e103936801cc449ef3052008fd6
> Author: Trond Myklebust <[email protected]>
> Date: Tue Oct 2 17:11:54 2007 -0400
>
> NFS: Remove the redundant nfs_reval_fsid()
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I'd like to understand why nfs_reval_fsid() is no longer necessary. Is
> it because this check is already done by _nfs4_proc_lookup() ?

In NFSv3, the directory fsid has already been revalidated by the lookup
call: the post-op attributes return it.
If we need that kind of revalidation in NFSv4 too (I don't see why the
NFSv4 servers would change their fsid willy-nilly given that the NFSv4
spec says that means you are crossing a mountpoint), then we should add
directory post-op attributes to lookup there too.

> commit defee1455a19a43c4b857b55d46085e107337c17
> Author: Trond Myklebust <[email protected]>
> Date: Sat Sep 29 17:34:46 2007 -0400
>
> NFS: Remove NFS_I(inode)->data_updates
>
> We have no more users...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> You have removed nfs_begin_data_update() in this patch... but the
> next patch
> in the series (commit b322d877bb6a58af6f24930bb5ff6122860d61e1)
> removes a
> bunch of calls to it. Are these two patches in reverse order?

Look again. This patch converts nfs_begin_data_update() into an empty
inlined function. The next patch removes it altogether.

> commit 40acc380df81ba07d4cd610107b5479b1edb6055
> Author: Trond Myklebust <[email protected]>
> Date: Sat Sep 29 17:25:43 2007 -0400
>
> NFS: NFS_CACHEINV() should not test for nfs_caches_unstable()
>
> The fact that we're in the process of modifying the inode does
> not mean
> that we should not invalidate the attribute and data caches. The
> defensive
> thing is to always invalidate when we're confronted with inode
> mtime/ctime or change_attribute updates that we do not immediately
> recognise.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Ugh, a triple negative. I'm having trouble understanding the
> description
> and figuring out exactly what's going on this patch.

It is getting rid of an invalid assumption. The last sentence above says
it all: we always invalidate the attribute and data caches whenever we
see an mtime/ctime or change_attribute update that we don't recognise as
being ours (i.e. no pre-op attribute matches).
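
A minimal sketch of that rule, assuming a single timestamp stands in for
mtime/ctime/change_attribute. The struct and function names are
hypothetical; the real logic lives in fs/nfs/inode.c and is more
involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy inode: one cached timestamp plus the two cache-validity bits. */
struct toy_inode {
    uint64_t mtime;         /* value we currently have cached */
    bool attr_cache_valid;
    bool data_cache_valid;
};

/* Apply a reply that carries pre-op and post-op attributes.
 * Returns true if the change was recognised as ours. */
static bool toy_wcc_update(struct toy_inode *inode,
                           uint64_t pre_op_mtime, uint64_t post_op_mtime)
{
    assert(inode != NULL);
    if (pre_op_mtime == inode->mtime) {
        /* Pre-op matches our cache: nobody else got in between,
         * so the update is ours and the caches stay valid. */
        inode->mtime = post_op_mtime;
        return true;
    }
    /* Unrecognised change: be defensive and invalidate both caches,
     * whether or not we are in the middle of modifying the inode. */
    inode->attr_cache_valid = false;
    inode->data_cache_valid = false;
    return false;
}
```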

> commit f507133ab0f2369194037ae7842dc0895b979627
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 13:54:51 2007 -0400
>
> NFS: Remove bogus nfs_mark_for_revalidate() in nfs_lookup
>
> The parent of the newly materialised dentry has just been
> revalidated...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Is there a way to document the revalidation of the parent? Where
> exactly
> does that occur in the nfs_lookup() path?

We just did a LOOKUP from that same parent.

> commit 176c84c19cb7cd7ea1a83a105649b41d18999812
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 10:00:23 2007 -0400
>
> NFS: nfs_mark_for_revalidate don't update cache_change_attribute
>
> Just let the subsequent inode revalidation do the update...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> The short description is ungrammatical.

Don't care as long as it fits on one line and is descriptive.

> commit d2c7da6c1cb184ea79b765eae848b1a2a8910dbf
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 09:59:15 2007 -0400
>
> NFS: nfs_post_op_update_inode don't update cache_change_attribute
>
> If nfs_post_op_update_inode fails because the server didn't
> return any
> attributes, then we let the subsequent inode revalidation update
> cache_change_attribute.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> The short description is ungrammatical.

Don't care as long as it fits on one line and is descriptive.

> commit fb26bdc63b426cfed53e7f1fc89e5113ceb17a5b
> Author: Trond Myklebust <[email protected]>
> Date: Mon Oct 1 09:56:59 2007 -0400
>
> NFS: Don't revalidate dentries on directory size or ctime changes
>
> We only need to look at the mtime changes...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> It would be friendly to spell out what's going on here in the patch
> description.

What is missing?

> commit 9dd4607202e6c177cc53035338473db378b98c63
> Author: Trond Myklebust <[email protected]>
> Date: Sat Sep 29 17:41:33 2007 -0400
>
> NFS: Ensure nfs_instantiate() invalidates the parent dir on error
>
> Also ensure that it drops the dentry in this case.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Perhaps a silly question, but since you've added a d_drop(dentry)
> right at
> the top of nfs_instantiate(), does nfs_instantiate() now rely on some
> kind of
> external serialization (say, the BKL)?

Nothing has changed. The only serialisation we rely on is the usual one
for operations that modify a directory: the callers _must_ be holding
the directory's i_mutex.

> commit c7a8d8a69d807dc026dc016df11f82380d841118
> Author: Trond Myklebust <[email protected]>
> Date: Sat Sep 29 17:14:03 2007 -0400
>
> NFS: Fix the sign of the return value of
> nfs_save_change_attribute()
>
> Also fix up the comments.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I would mention that the type of the return value of
> nfs_save_change_attribute() has to match the second parameter of
> nfs_set_verifier(). Otherwise it's not clear why you are changing this.

The return type of nfs_save_change_attribute() should match the type of
the cache_change_attribute that we're reading.

> commit 9cd8d8475b9ecdca1181eb8dc41a00b8cf84b7bd
> Author: Trond Myklebust <[email protected]>
> Date: Sun Sep 30 15:21:24 2007 -0400
>
> NFS: Fake up 'wcc' attributes to prevent cache invalidation
> after write
>
> NFSv2 and v4 don't offer weak cache consistency attributes on
> WRITE calls. In NFSv3, returning wcc data is optional. In all
> cases, we want to prevent the client from invalidating our cached
> data whenever ->write_done() attempts to update the inode
> attributes.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> See comment on commit 74994c59a5a1def233c2245c3f7ef23cb01c64ce.
>
> commit 71d96564928ac2017c97a4555f818397c49c461c
> Author: Trond Myklebust <[email protected]>
> Date: Fri Sep 28 19:11:33 2007 -0400
>
> NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode()
>
> For one thing, the test NFS_ATTRTIMEO() == 0 makes no sense: we're
> testing whether or not the cache timeout length is zero, which
> is totally
> unrelated to the issue of whether or not we trust the file
> staleness.
>
> Secondly, we do not want to retry the GETATTR once a file has
> been declared
> stale by the server: we rather want to discard that inode as
> soon as
> possible, since there are broken servers still in use out there
> that reuse
> filehandles on new files.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> The NFS_ATTRTIMEO check was made to see if the "noac" or "actimeo=0"
> mount
> options are in effect. Basically you are saying that cache revalidation
> should not be different in those cases.

Damned right!

> Do you have test cases for all the crazy things one might do with
> rsync or
> file restoration that might result in a stale file handle?

Please go back, and read the code and the above comments. Once a _file_
is declared stale, there is no going back! It doesn't matter if you set
'noac' or if you brush your teeth for 2 minutes every morning and night,
we're not going to let you read or write to that inode again.

> commit a03b09537dd935a4efaedb9bdb56f180e71db6bb
> Author: Trond Myklebust <[email protected]>
> Date: Fri Sep 28 17:20:07 2007 -0400
>
> NFS: Fix atime revalidation in read()
>
> NFSv3 will correctly update atime on a read() call, so there is
> no need to
> set the NFS_INO_INVALID_ATIME flag unless the call to
> nfs_refresh_inode()
> fails.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> You say "unless the call to nfs_refresh_inode() fails" but it looks
> to me
> like you're always setting NFS_INO_INVALID_ATIME for all three protocol
> versions.
>
> There's also an additional undocumented change in nfs3_read_done:
> apparently
> the server returns post-op attributes whether or not the READ request
> failed.
> In a later patch against the NFS lookup path, this change is broken
> out and
> documented separately.

...and the NFS_INO_INVALID_ATIME flag will therefore be cleared if the
call to nfs_refresh_inode results in an inode update.
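
The set-then-clear pattern described here can be sketched as follows.
This is a toy model, not the actual read completion path: the types,
TOY_INO_INVALID_ATIME and the toy_* functions are hypothetical stand-ins
for NFS_INO_INVALID_ATIME and nfs_refresh_inode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_INO_INVALID_ATIME 0x1u

struct toy_fattr { bool valid; long atime; };            /* attributes in a reply */
struct toy_inode { unsigned int cache_validity; long atime; };

/* Returns 0 if the inode was updated, -1 if the reply had no attributes. */
static int toy_refresh_inode(struct toy_inode *inode,
                             const struct toy_fattr *fattr)
{
    if (!fattr->valid)
        return -1;
    inode->atime = fattr->atime;
    inode->cache_validity &= ~TOY_INO_INVALID_ATIME;  /* atime is fresh again */
    return 0;
}

/* Completion of a READ: mark atime stale first, then let a successful
 * refresh clear the flag. The flag survives only if the refresh fails. */
static void toy_read_done(struct toy_inode *inode,
                          const struct toy_fattr *fattr)
{
    assert(inode != NULL && fattr != NULL);
    inode->cache_validity |= TOY_INO_INVALID_ATIME;
    (void)toy_refresh_inode(inode, fattr);
}
```

So the flag is set unconditionally in the code, yet the net effect
matches the commit description: it stays set only when the refresh does
not update the inode.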

> commit ac67b2f0c9bb90389ec925ca5ecc7fd930eaab3b
> Author: Trond Myklebust <[email protected]>
> Date: Fri Sep 28 17:11:45 2007 -0400
>
> NFS: Fix atime revalidation in readdir()
>
> NFSv3 will correctly update atime on a readdir call, so there is
> no need to
> set the NFS_INO_INVALID_ATIME flag unless the call to
> nfs_refresh_inode()
> fails.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> You say "unless the call to nfs_refresh_inode() fails" but it looks
> to me
> like you're always setting NFS_INO_INVALID_ATIME for all three protocol
> versions.

...and the NFS_INO_INVALID_ATIME flag will therefore be cleared if the
call to nfs_refresh_inode results in an inode update.

> commit adbf11d6638fe8e1a8b3075216efa73b8294f17d
> Author: Trond Myklebust <[email protected]>
> Date: Sun Sep 30 18:01:13 2007 -0400
>
> NFS: Don't use readdirplus data if the page cache is invalid
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Ulp. Nasty.
>
> Minor quibble: the page cache isn't being checked here; rather it's
> the validity of the NFS data cache for the directory in question. In
> other words, you're checking an NFS attribute cache flag here, not a
> generic VFS data structure.
>
> Why is the spin lock needed for checking NFS_INO_INVALID_DATA? What do
> you anticipate is the race condition here?

It is there to ensure that there is a memory barrier between the read of
the verifier (in nfs_save_change_attribute()) and the check for whether
or not the directory's cache is valid.
That again should ensure that even if we later find out that the cache
was invalid, the dentry verifier is correctly labelled with the
directory verifier that was _prior_ to the cache being labelled as
invalid.

> commit a7096362f511f24a47788b332636a992a3518da8
> Author: Trond Myklebust <[email protected]>
> Date: Thu Sep 27 15:57:24 2007 -0400
>
> NFSv4: Don't use ctime/mtime for determining when to invalidate
> the caches
>
> In NFSv4 we should only be looking at the change attribute.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I know you don't prefer the idea, but I think fs/nfs/inode.c would be
> made
> much cleaner in the face of these kind of version-related differences if
> there were version-specific copies of nfs_update_inode() and friends.

The behaviour of the code right now depends on the contents of the fattr
struct, rather than any assumptions about what kind of attributes each
NFS version returns. The NFSv4 protocol allows for a lot of choices for
what the fattr may contain.

> commit a6d2ddd5bf668adb1a700104f846ab664b64ea35
> Author: Trond Myklebust <[email protected]>
> Date: Thu Sep 27 10:07:31 2007 -0400
>
> NFS: Don't force a dcache revalidation if nfs_wcc_update_inode
> succeeds
>
> The reason is that if the weak cache consistency update was
> successful,
> then we know that our client must be the only one that changed the
> directory, and we've already updated the dcache to reflect the
> change.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> It would help me if there was, say, an nfs_force_dcache_reval() function
> that just plugged the current jiffies value into
> nfsi->cache_change_attribute. That would clearly document in which cases
> dcache revalidation was needed, instead of just the raw code that shows
> us saving jiffies in the NFS inode for some unknown reason.

That is a cleanup that might make sense.
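
For illustration, the proposed helper might look roughly like this. It
is a userspace sketch with hypothetical names: toy_jiffies stands in for
the kernel's jiffies counter and struct toy_nfs_inode for struct
nfs_inode; no such helper exists in the patches under review.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's jiffies tick counter. */
static unsigned long toy_jiffies;

/* Only the field that the suggestion is about. */
struct toy_nfs_inode {
    unsigned long cache_change_attribute;
};

/* Named wrapper for "force dcache revalidation": callers state their
 * intent instead of open-coding an assignment from jiffies. */
static void toy_force_dcache_reval(struct toy_nfs_inode *nfsi)
{
    assert(nfsi != NULL);
    nfsi->cache_change_attribute = toy_jiffies;
}
```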

> commit a289c003c1ec4d6bad69a9d0d0ad4f138f6ba096
> Author: Trond Myklebust <[email protected]>
> Date: Sun Sep 30 17:03:25 2007 -0400
>
> NFS: nfs_wcc_update_inode: directory caches are always invalidated
>
> We must ensure that the readdir data is always invalidated
> whether or not
> the weak cache consistency data update succeeds.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> What happens if the directory's data cache is not invalidated? Can you
> explain what bug is being addressed here? Are there performance
> implications to this change?

Any change in the directory means that our READDIR data needs to get
tossed out. We can't fake up directory entries ourselves.

> commit 93953f229fdf4e8f26c73c596bcca4b386a697f3
> Author: Trond Myklebust <[email protected]>
> Date: Fri Sep 28 14:20:12 2007 -0400
>
> NFS: fix nfs_verify_change_attribute
>
> We always want to check that the verifier and directory
> cache_change_attribute match. This also allows us to remove the
> 'wraparound
> hack' for the cache_change_attribute. If we're only checking for
> equality,
> then we don't care about wraparound issues.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Nice clean-up, but can you elucidate why "we always want to check
> that the
> verifier and directory cache_change_attribute match?"

It is a requirement that we impose. By saving the parent directory's
cache_change_attribute in the dentry->d_time, we're labelling the dentry
as having been revalidated or checked. If ever we see the parent
directory changed in some way that we don't recognise as being due to
our activity, then we change its dir->cache_change_attribute. Every time
a dentry is presented to nfs_lookup_revalidate(), we can then compare
the dentry->d_time to the dir->cache_change_attribute for equality in
order to test whether or not it was revalidated since the last change.
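
A small sketch of the labelling scheme, under the assumption that the
directory value is a plain counter (how the real value is chosen is not
modelled here). All toy_* names are hypothetical; the real fields are
dentry->d_time and the directory's cache_change_attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dir    { unsigned long cache_change_attribute; };
struct toy_dentry { unsigned long d_time; };  /* holds the saved verifier */

/* Label the dentry as checked against the directory's current state. */
static void toy_set_verifier(struct toy_dentry *dentry,
                             const struct toy_dir *dir)
{
    assert(dentry != NULL && dir != NULL);
    dentry->d_time = dir->cache_change_attribute;
}

/* The directory changed in a way we do not recognise as our own.
 * Unsigned arithmetic wraps, which is harmless for an equality test. */
static void toy_dir_changed(struct toy_dir *dir)
{
    dir->cache_change_attribute++;
}

/* True if the dentry was revalidated since the directory last changed. */
static bool toy_verify_change_attribute(const struct toy_dir *dir,
                                        const struct toy_dentry *dentry)
{
    return dentry->d_time == dir->cache_change_attribute;
}
```

Because the test is for equality rather than ordering, a wrap from the
largest value back to zero still reads as "changed", which is why the
wraparound hack is no longer needed.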

> commit 74994c59a5a1def233c2245c3f7ef23cb01c64ce
> Author: Trond Myklebust <[email protected]>
> Date: Wed Aug 15 12:59:12 2007 -0400
>
> NFS: nfs_post_op_update_inode() should call nfs_refresh_inode()
>
> Ensure that we don't clobber the results from a more recent
> getattr call...
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I notice that block comments in front of nfs_refresh_inode,
> nfs_post_op_update_inode, and nfs_post_op_update_inode_force_wcc all
> claim
> that these functions "try to update the inode attribute cache". IMO the
> block comments do not help the reader understand how each of these is
> different. In other words they don't explain why we need three of
> these.

They differ in the circumstances of use. nfs_refresh_inode simply says
that I should check and/or update the inode attributes assuming that my
fattr is not empty.

nfs_post_op_update_inode says that I should call nfs_refresh_inode, or
invalidate my inode's attribute cache if the fattr is empty. It will
normally be called if I know for certain that an RPC call I just made
has changed the inode.

nfs_post_op_update_inode_force_wcc is a variant on
nfs_post_op_update_inode that was specifically made for the case of
writes because they often involve multiple simultaneous updates of the
inode that break the weak cache consistency model because the replies
from the server may be handled in a different order to the update order
on the server.
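
The first two cases can be sketched as below. This is a toy model with
hypothetical types and names, and it deliberately leaves out the
force_wcc variant, whose reply-ordering problem needs more state than a
short example can carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fattr { bool valid; long size; };          /* attributes in a reply */
struct toy_inode { long size; bool attr_cache_valid; };

/* "Check and/or update the inode attributes, assuming the fattr is not
 * empty": with an empty fattr there is nothing to do. */
static int toy_refresh_inode(struct toy_inode *inode,
                             const struct toy_fattr *fattr)
{
    assert(inode != NULL && fattr != NULL);
    if (!fattr->valid)
        return 0;
    inode->size = fattr->size;
    inode->attr_cache_valid = true;
    return 0;
}

/* Used when we know the RPC we just made changed the inode: refresh if
 * the server sent attributes, otherwise invalidate what we have cached. */
static int toy_post_op_update_inode(struct toy_inode *inode,
                                    const struct toy_fattr *fattr)
{
    assert(inode != NULL && fattr != NULL);
    if (fattr->valid)
        return toy_refresh_inode(inode, fattr);
    inode->attr_cache_valid = false;
    return 0;
}
```

The difference shows up only for an empty fattr: the refresh variant
leaves the cache alone, while the post-op variant marks it invalid.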

> commit f0b280597f7967c68356c396673ea645228eb2c6
> Author: Trond Myklebust <[email protected]>
> Date: Wed Aug 15 12:49:17 2007 -0400
>
> NFS: Fix over-conservative attribute invalidation in
> nfs_update_inode()
>
> We should always be declaring the attribute cache as valid after
> having
> updated it.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> Can you explain in your description why it is now considered safe to
> ignore outstanding updates while updating an inode's cached attributes?
> Does this change prevent an extra GETATTR request after a series of
> overlapping async writes?

It is not safe to 'ignore outstanding updates'. All this says is that if
I just updated the inode, then I'm reasonably sure that the attributes
are valid.

Trond

2007-10-05 17:36:19

by Trond Myklebust

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 2007-10-05 at 08:25 +0200, Pierre Ossman wrote:

> Print a warning or something so that they can be found. Don't go
> breaking systems left and right. People have better things to do than
> to fix the build systems for every program they use.

The kernel knows bugger all about what glibc function your program is
calling. The problem here is precisely that newer versions of glibc will
transform legacy 32-bit stat() calls into 64-bit stat64() calls, then
will complain when the result overflows.

If you want to figure out which apps are broken, then you will have to
either do so in glibc or use a preloaded shared library to intercept the
32-bit stat() calls and print out a warning.
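
As an illustration of the overflow being discussed, here is a sketch of
narrowing a 64-bit inode number into a 32-bit field. It is not glibc's
code: toy_narrow_ino() is a hypothetical helper, and the only borrowed
fact is that POSIX specifies EOVERFLOW for stat() when a value cannot be
represented in the caller's structure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Try to fit a 64-bit inode number into a legacy 32-bit field.
 * Returns 0 on success; on overflow returns -1, sets errno to
 * EOVERFLOW and leaves *ino32 untouched. */
static int toy_narrow_ino(uint64_t ino64, uint32_t *ino32)
{
    assert(ino32 != NULL);
    if (ino64 > UINT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    *ino32 = (uint32_t)ino64;
    return 0;
}
```

A check of this kind happens in userspace after the 64-bit call
returns, so it is also the natural place for a warning to be printed,
whether in glibc itself or in a preloaded shim.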

Trond

2007-10-05 17:52:30

by Trond Myklebust

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 2007-10-05 at 13:30 -0400, [email protected] wrote:
> On Thu, 04 Oct 2007 10:00:50 EDT, Trond Myklebust said:
>
> > How about a boot/module parameter to turn it on or off?
> >
> > I don't see any point in having a sysctl for something like this: either
> > you have legacy applications or you don't. It is not something that you
> > switch off as you go off to lunch.
>
> How does Joe Sysadmin tell if he has an affected legacy app or not?
>
> (The obvious "try it and see what breaks" is a non-starter for many places,
> because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
> turn it off, fix them, lather, rinse, repeat". Been there, done that, got
> the tshirt - a project I got dragged into involves a large storage array that
> appears to insist on exporting 64-bit stuff, and a large farm of clients that
> are very 64-bit unclean....)

If you're unsure, then set the bloody boot parameter. That's what it is
for...

Trond

2007-10-05 17:54:51

by Pierre Ossman

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 05 Oct 2007 13:36:19 -0400
Trond Myklebust <[email protected]> wrote:

> On Fri, 2007-10-05 at 08:25 +0200, Pierre Ossman wrote:
>
> > Print a warning or something so that they can be found. Don't go
> > breaking systems left and right. People have better things to do
> > than to fix the build systems for every program they use.
>
> The kernel knows bugger all about what glibc function your program is
> calling. The problem here is precisely that newer versions of glibc
> will transform legacy 32-bit stat() calls into 64-bit stat64() calls,
> then will complain when the result overflows.
>

Right, I didn't suggest that this had to be done in the kernel. My
point was that first you mark something as deprecated, make a lot of
noise when someone uses it so that problems can be identified, and some
time later you remove it. You don't just remove it and let production
systems deal with the fallout.

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2007-10-05 18:01:27

by Jeff Layton

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 05 Oct 2007 13:30:10 -0400
[email protected] wrote:
>
> How does Joe Sysadmin tell if he has an affected legacy app or not?
>
> (The obvious "try it and see what breaks" is a non-starter for many places,
> because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
> turn it off, fix them, lather rinse repeat". Been there, done that, got
> the tshirt - a project I got dragged into involves a large storage array that
> appears to insist on exporting 64-bit stuff, and a large farm of clients that
> are very 64-bit unclean....)
>

In addition to Trond's suggestion, you might be able to use "nm" or
something like it and see if there are references to non-LFS (f)stat
calls in your binaries. For instance, if you see references to stat()
(and not stat64()), then the app is probably not built with 64-bit file
offsets.

This is probably not as reliable as Trond's method, but it might be
less invasive and reasonable for a first pass when looking for these
sorts of apps...
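
A rough sketch of that first pass, in Python, might look like the following. It assumes binutils' "nm -D" is available and that the binaries are 32-bit builds against a glibc of this era, where stat() calls show up as undefined references to symbols such as __xstat (legacy) or __xstat64 (LFS). The symbol lists and the helper names classify_symbols and scan_binary are made up for illustration, and the result is only a hint, not proof that an app is broken.

```python
import subprocess
import sys

# Assumed symbol names: legacy (non-LFS) vs. LFS stat-family entry points.
# Other glibc versions or C libraries may export different names.
LEGACY_STAT = {"stat", "fstat", "lstat", "__xstat", "__fxstat", "__lxstat"}
LFS_STAT = {"stat64", "fstat64", "lstat64",
            "__xstat64", "__fxstat64", "__lxstat64"}


def classify_symbols(nm_output):
    """Classify a binary from the text that `nm -D` printed for it."""
    referenced = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined (imported) symbols look like: "         U __xstat@GLIBC_2.0"
        if len(fields) >= 2 and fields[-2] == "U":
            # Drop any "@VERSION" suffix before comparing names.
            referenced.add(fields[-1].split("@", 1)[0])
    uses_legacy = bool(referenced & LEGACY_STAT)
    uses_lfs = bool(referenced & LFS_STAT)
    if uses_legacy and uses_lfs:
        return "both"
    if uses_legacy:
        return "32-bit only"
    if uses_lfs:
        return "64-bit only"
    return "none"


def scan_binary(path):
    """Run `nm -D` on one file; scripts and non-ELF files come back as "none"."""
    result = subprocess.run(["nm", "-D", path],
                            capture_output=True, text=True, check=False)
    return classify_symbols(result.stdout)


if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(path, scan_binary(path))
```

Anything reported as "32-bit only" or "both" would be a candidate for a closer look. Statically linked binaries have no dynamic symbol table, so this sketch would miss them.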

--
Jeff Layton <[email protected]>

2007-10-05 18:12:22

by Jeff Layton

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, 05 Oct 2007 13:30:10 -0400
[email protected] wrote:

> On Thu, 04 Oct 2007 10:00:50 EDT, Trond Myklebust said:
>
> > How about a boot/module parameter to turn it on or off?
> >
> > I don't see any point in having a sysctl for something like this: either
> > you have legacy applications or you don't. It is not something that you
> > switch off as you go off to lunch.
>
> How does Joe Sysadmin tell if he has an affected legacy app or not?
>
> (The obvious "try it and see what breaks" is a non-starter for many places,
> because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
> turn it off, fix them, lather rinse repeat". Been there, done that, got
> the tshirt - a project I got dragged into involves a large storage array that
> appears to insist on exporting 64-bit stuff, and a large farm of clients that
> are very 64-bit unclean....)
>

Note that "try it and see what breaks" isn't reliable either. If glibc
gets back a 64-bit inode number that just happens to fit in the 32-bit
field, then everything will work. You don't actually get an EOVERFLOW
until st_ino overflows the field, and that may not happen often enough
for testing this way to detect it...
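
To make the failure condition concrete, here is a small Python model of it. It assumes the legacy struct stat carries st_ino as an unsigned 32-bit field, and legacy_stat_ino is a hypothetical name for illustration, not a glibc function.

```python
import errno
import os

UINT32_MAX = 0xFFFFFFFF  # largest value an unsigned 32-bit st_ino can hold


def legacy_stat_ino(ino64):
    """Model squeezing a 64-bit inode number into a 32-bit st_ino field."""
    if ino64 > UINT32_MAX:
        # Only now does the legacy caller see an error.
        raise OSError(errno.EOVERFLOW, os.strerror(errno.EOVERFLOW))
    # The value fits, so the legacy app never notices anything.
    return ino64
```

With this model, legacy_stat_ino(12345) simply returns 12345, while legacy_stat_ino(1 << 32) raises OSError with errno set to EOVERFLOW. A test run that never meets an inode number above 0xFFFFFFFF would therefore pass cleanly.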

--
Jeff Layton <[email protected]>

2007-10-07 22:56:09

by David Chinner

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, Oct 05, 2007 at 02:12:22PM -0400, Jeff Layton wrote:
> On Fri, 05 Oct 2007 13:30:10 -0400
> [email protected] wrote:
>
> > On Thu, 04 Oct 2007 10:00:50 EDT, Trond Myklebust said:
> >
> > > How about a boot/module parameter to turn it on or off?
> > >
> > > I don't see any point in having a sysctl for something like this: either
> > > you have legacy applications or you don't. It is not something that you
> > > switch off as you go off to lunch.
> >
> > How does Joe Sysadmin tell if he has an affected legacy app or not?
> >
> > (The obvious "try it and see what breaks" is a non-starter for many places,
> > because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
> > turn it off, fix them, lather rinse repeat". Been there, done that, got
> > the tshirt - a project I got dragged into involves a large storage array that
> > appears to insist on exporting 64-bit stuff, and a large farm of clients that
> > are very 64-bit unclean....)
> >
>
> Note that "try it and see what breaks" isn't reliable either. If glibc
> gets back a 64-bit inode number that just happens to fit in the 32-bit
> field, then everything will work. You don't actually get an EOVERFLOW
> until st_ino overflows the field, and that may not happen often enough
> for testing this way to detect it...

There's a damn easy way of testing this.

Use XFS on a 64-bit Linux NFS server, mount it with '-o inode64,ino64'
and then export it to your client that is going to have problems.
The "ino64" mount option guarantees that the userspace-visible
inode number is always > 32 bits in length....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-10-08 08:36:16

by Greg Banks

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Fri, Oct 05, 2007 at 02:00:37PM -0400, Jeff Layton wrote:
> On Fri, 05 Oct 2007 13:30:10 -0400
> [email protected] wrote:
> >
> > How does Joe Sysadmin tell if he has an affected legacy app or not?
> >
> > (The obvious "try it and see what breaks" is a non-starter for many places,
> > because you too easily end up in a loop of "enable it, find 4-5 show stoppers,
> > turn it off, fix them, lather rinse repeat". Been there, done that, got
> > the tshirt - a project I got dragged into involves a large storage array that
> > appears to insist on exporting 64-bit stuff, and a large farm of clients that
> > are very 64-bit unclean....)
> >
>
> In addition to Trond's suggestion, you might be able to use "nm" or
> something like it and see if there are references to non-LFS (f)stat
> calls in your binaries. For instance, if you see references to stat()
> (and not stat64()), then the app is probably not built with 64-bit file
> offsets.

Attached is a Perl script I wrote a while back to scan directories
looking for old stat calls in binaries. Here's the output from
my laptop:

# ./summarise-stat64.pl /usr/bin
775 26.8% are scripts (shell, perl, whatever)
1404 48.5% don't use any stat() family calls at all
428 14.8% use 32-bit stat() family interfaces only
278 9.6% use 64-bit stat64() family interfaces only
11 0.4% use both 32-bit and 64-bit stat() family interfaces

# ./summarise-stat64.pl /usr/sbin
164 35.7% are scripts (shell, perl, whatever)
170 37.0% don't use any stat() family calls at all
78 17.0% use 32-bit stat() family interfaces only
46 10.0% use 64-bit stat64() family interfaces only
1 0.2% use both 32-bit and 64-bit stat() family interfaces

# ./summarise-stat64.pl -v /usr/bin
...
/usr/bin/vi use 32-bit stat() family interfaces only
/usr/bin/view use 32-bit stat() family interfaces only
/usr/bin/vim use 32-bit stat() family interfaces only
...
/usr/bin/Mail use 32-bit stat() family interfaces only
/usr/bin/mail use 32-bit stat() family interfaces only
/usr/bin/mailx use 32-bit stat() family interfaces only
...
/usr/bin/gdb use 32-bit stat() family interfaces only
/usr/bin/gdbtui use 32-bit stat() family interfaces only
/usr/bin/rpcgen use 32-bit stat() family interfaces only
...
/usr/bin/cc use 32-bit stat() family interfaces only
/usr/bin/gcc use 32-bit stat() family interfaces only
/usr/bin/gcov use 32-bit stat() family interfaces only
/usr/bin/unprotoize use 32-bit stat() family interfaces only
...
/usr/bin/git use 32-bit stat() family interfaces only
/usr/bin/git-check-ref-format use 32-bit stat() family interfaces only
/usr/bin/git-cat-file use 32-bit stat() family interfaces only
/usr/bin/git-checkout-index use 32-bit stat() family interfaces only
/usr/bin/git-clone-pack use 32-bit stat() family interfaces only
/usr/bin/git-commit-tree use 32-bit stat() family interfaces only
/usr/bin/git-convert-objects use 32-bit stat() family interfaces only
/usr/bin/git-daemon use 32-bit stat() family interfaces only
/usr/bin/git-describe use 32-bit stat() family interfaces only
...

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.

Attachments:

(No filename) (3.18 kB)
summarise-stat64.pl (3.83 kB)

2007-10-03 23:52:33

by Jeff Garzik

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

Trond Myklebust wrote:
> Aside from the usual updates from Chuck for NFS-over-IPv6 (still
> incomplete) and a number of bugfixes for the text-based mount code, the
> main news in the NFS tree is the merging of support for the NFS/RDMA
> client code from Tom Talpey and the NetApp New England (NANE) team.
>
> We also have the 64-bit inode support from RedHat/Peter Staubach.

The marketroids compel me to say: It is Red Hat, not RedHat :)

Jeff, looking forward to NFSv4 over IPv6

2007-10-04 06:52:13

by Pierre Ossman

Subject: Re: What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Wed, 03 Oct 2007 19:41:16 -0400
Trond Myklebust <[email protected]> wrote:

>
> We also have the 64-bit inode support from RedHat/Peter Staubach.
>

As has been pointed[1] out[2], this will cause regressions for non-LFS
applications (of which there are still lots and lots). This change
should be in feature-removal (the "feature" being removed is legacy
support for non-LFS applications using NFS servers that make full use
of the protocol) and preferably accompanied with appropriate user space
changes (e.g. compatibility option in glibc).

[1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
[2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org

2007-10-04 14:00:50

by Trond Myklebust

Subject: Re: [NFS] What's slated for inclusion in 2.6.24-rc1 from the NFS client git tree...

On Thu, 2007-10-04 at 08:52 +0200, Pierre Ossman wrote:
> On Wed, 03 Oct 2007 19:41:16 -0400
> Trond Myklebust <[email protected]> wrote:
>
> >
> > We also have the 64-bit inode support from RedHat/Peter Staubach.
> >
>
> As has been pointed[1] out[2], this will cause regressions for non-LFS
> applications (of which there are still lots and lots). This change
> should be in feature-removal (the "feature" being removed is legacy
> support for non-LFS applications using NFS servers that make full use
> of the protocol) and preferably accompanied with appropriate user space
> changes (e.g. compatibility option in glibc).
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=241348
> [2] http://marc.info/?l=linux-nfs&m=118701088726477&w=2
>
> Rgds

How about a boot/module parameter to turn it on or off?

I don't see any point in having a sysctl for something like this: either
you have legacy applications or you don't. It is not something that you
switch off as you go off to lunch.

A compile parameter, OTOH, would be too restrictive since it would force
distros to choose just one behaviour (which would mean they would have
to choose the most conservative).

Trond