2006-08-10 05:04:29

by Xin Zhao

Subject: Urgent help needed on an NFS question, please help!!!

I just ran into a problem with NFS. It might be a fundamental problem
for my current work, so please help!

I am wondering how NFS guarantees that a client does not get the wrong
file attributes. Consider the following scenario:

Suppose we have an NFS server S and two clients C1 and C2.

C1 needs to access the attributes of file X, so it first does
lookup() to get the file handle of X.

After C1 gets X's file handle and before C1 issues the getattr()
request, C2 cuts in. C2 deletes file X and creates a new file X1,
which has a different name but the same inode number and device ID as
the now-deleted file X.

When C1 issues getattr() with the old file handle, it may get the
attributes of the wrong file, X1. Is this true?

If not, how does NFS avoid this problem? Please point me to the code
that verifies this.

Many many thanks!

-x


2006-08-10 05:11:15

by NeilBrown

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thursday August 10, Xin Zhao wrote:
> I just ran into a problem with NFS. It might be a fundamental problem
> for my current work, so please help!
>
> I am wondering how NFS guarantees that a client does not get the wrong
> file attributes. Consider the following scenario:
>
> Suppose we have an NFS server S and two clients C1 and C2.
>
> C1 needs to access the attributes of file X, so it first does
> lookup() to get the file handle of X.
>
> After C1 gets X's file handle and before C1 issues the getattr()
> request, C2 cuts in. C2 deletes file X and creates a new file X1,
> which has a different name but the same inode number and device ID as
> the now-deleted file X.
>
> When C1 issues getattr() with the old file handle, it may get the
> attributes of the wrong file, X1. Is this true?
>
> If not, how does NFS avoid this problem? Please point me to the code
> that verifies this.

Generation numbers.

When the filesystem creates a new file it assigns a random number
as the 'generation' number and stores that in the inode.
This gets included in the filehandle, and checked when the filehandle
lookup is done.

Look for references to 'i_generation' in fs/ext3/*

Other filesystems may approach this slightly differently, but the
filesystem is responsible for providing a unique-over-time filehandle,
and 'generation number' is the 'standard' way of doing this.

NeilBrown
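
The generation-number idea described above can be illustrated with a toy model. This is only a sketch, not kernel code: `ToyFilesystem`, `Inode`, `encode_handle`, and `decode_handle` are hypothetical names, and the handle is a plain tuple rather than a real NFS file handle. It assumes, as described in the reply, that a fresh random generation is stored in the inode at creation time and compared when the handle is resolved.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Inode:
    ino: int         # inode number; the slot is reused after deletion
    generation: int  # fresh value each time the slot is reallocated
    name: str


class ToyFilesystem:
    """Hypothetical in-memory filesystem with a small pool of inode slots."""

    def __init__(self, slots=1):
        self.slots = [None] * slots
        self.last_generation = [None] * slots

    def create(self, name):
        ino = self.slots.index(None)  # lowest free slot, so numbers get reused
        generation = secrets.randbits(32)
        while generation == self.last_generation[ino]:
            generation = secrets.randbits(32)  # never repeat the previous value
        self.last_generation[ino] = generation
        self.slots[ino] = Inode(ino, generation, name)
        return self.slots[ino]

    def unlink(self, ino):
        self.slots[ino] = None

    def encode_handle(self, inode):
        # The handle carries the generation alongside the inode number.
        return (inode.ino, inode.generation)

    def decode_handle(self, handle):
        ino, generation = handle
        inode = self.slots[ino]
        if inode is None or inode.generation != generation:
            raise LookupError("stale file handle")
        return inode


fs = ToyFilesystem()
x = fs.create("X")
old_handle = fs.encode_handle(x)  # C1: lookup()
fs.unlink(x.ino)                  # C2: delete X
x1 = fs.create("X1")              # C2: create X1, reusing the inode number
```

In this model, resolving `old_handle` after the delete and re-create raises `LookupError` even though X1 occupies the same inode number, which corresponds to the server rejecting a stale handle instead of returning X1's attributes.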

2006-08-10 05:54:35

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

Many thanks for your kind help!

Your answer is what I expected. But what frustrates me is that I
cannot find the code that verifies the generation number in the NFSv3
code. Do you know where the generation number is checked?

Thanks,
-x


2006-08-10 06:03:29

by NeilBrown

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thursday August 10, Xin Zhao wrote:
> Many thanks for your kind help!
>
> Your answer is what I expected. But what frustrates me is that I
> cannot find the code that verifies the generation number in the NFSv3
> code. Do you know where the generation number is checked?

NFSD doesn't. The individual filesystem does. You need to look in
the filesystem code.

Some filesystems use common code from fs/exportfs/expfs.c
See "export_iget".

NeilBrown.

2006-08-10 06:04:31

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

I think nfs_compare_fh() might do the file handle verification.
However, it is still possible that AFTER C1 gets a valid file handle,
BUT BEFORE C1 sends out the getattr() request, C2 deletes file X and
creates a different file X1 which has the same inode number. It looks
like the server side must verify the generation number carried in the
file handle. Unfortunately, I could not find this code on the server
side. Any further insight on this?

Thanks,
Xin


2006-08-10 06:16:07

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

I found where the server checks the generation number. It's in fh_verify(). :)

Many thanks for your help, Neil!

-x


2006-08-10 15:16:01

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

Hi,

I am considering another possibility: suppose client C1 does lookup()
on file X and gets a file handle, which includes the inode number,
generation number, and parent's inode number. Before C1 issues
getattr(), C2 moves the parent directory to a different place, which
changes neither the parent's inode number nor file X's inode number
or i_generation. So when C1 issues a getattr() request with this file
handle, the server seems to have no way to detect that file X no
longer exists at the original path. Instead, the server will return
the moved X's attributes, which are correct but semantically wrong. Is
there any way for the server to deal with this problem?

Thanks a lot!
-x


2006-08-10 16:11:11

by Matthew Wilcox

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, Aug 10, 2006 at 11:15:57AM -0400, Xin Zhao wrote:
> I am considering another possibility: suppose client C1 does lookup()
> on file X and gets a file handle, which includes the inode number,
> generation number, and parent's inode number. Before C1 issues
> getattr(), C2 moves the parent directory to a different place, which
> changes neither the parent's inode number nor file X's inode number
> or i_generation. So when C1 issues a getattr() request with this file
> handle, the server seems to have no way to detect that file X no
> longer exists at the original path. Instead, the server will return
> the moved X's attributes, which are correct but semantically wrong. Is
> there any way for the server to deal with this problem?

It isn't semantically wrong. There is no way for the application to
distinguish between the events:

open()
stat()
mv

and

open()
mv
stat()

As long as the results are consistent with the former case, it doesn't
matter if the latter case actually happened.
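
The local analogue of this argument can be tried directly. The following is a minimal sketch, assuming a POSIX system (renaming a directory that contains an open file is not allowed on every platform); `stat_after_rename` is a hypothetical helper name. It opens a file, stats it through the descriptor, moves the parent directory, and stats it again.

```python
import os
import tempfile


def stat_after_rename():
    """Return fstat() results taken before and after the parent is moved."""
    with tempfile.TemporaryDirectory() as top:
        parent = os.path.join(top, "parent")
        os.mkdir(parent)
        path = os.path.join(parent, "X")
        with open(path, "w") as f:
            f.write("hello")

        fd = os.open(path, os.O_RDONLY)  # plays the role of lookup()
        try:
            before = os.fstat(fd)
            os.rename(parent, os.path.join(top, "moved"))  # the "mv" step
            after = os.fstat(fd)  # same file, although the old path is gone
        finally:
            os.close(fd)
        return before, after


before, after = stat_after_rename()
```

Both results describe the same device and inode, so a program holding the descriptor cannot tell whether the move happened before or after its stat call. The file handle plays the same role for an NFS client.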

2006-08-10 16:23:16

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

That makes sense.

Can we make the following two conclusions?
1. In a single machine, inode+dev ID+i_generation can uniquely identify a file
2. Given a stored file handle and an inode object received from the
server, an NFS client can safely determine whether this inode
corresponds to the file handle by checking the inode+dev+i_generation.

Thanks,
-x



2006-08-10 16:54:35

by Matthew Wilcox

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, Aug 10, 2006 at 12:23:12PM -0400, Xin Zhao wrote:
> That makes sense.
>
> Can we make the following two conclusions?
> 1. In a single machine, inode+dev ID+i_generation can uniquely identify a
> file

sure.

> 2. Given a stored file handle and an inode object received from the
> server, an NFS client can safely determine whether this inode
> corresponds to the file handle by checking the inode+dev+i_generation.

The NFS client makes up its own inode numbers for use on the local
machine. It doesn't know the device+inode+generation numbers on the
server (and indeed, the server may not even have the concepts of
inodes). To quote RFC 1813:

The file handle contains all the information the server needs to
distinguish an individual file. To the client, the file handle is
opaque. The client stores file handles for use in a later request
and can compare two file handles from the same server for equality by
doing a byte-by-byte comparison, but cannot otherwise interpret the
contents of file handles. If two file handles from the same server
are equal, they must refer to the same file, but if they are not
equal, no conclusions can be drawn. Servers should try to maintain
a one-to-one correspondence between file handles and files, but this
is not required. Clients should use file handle comparisons only to
improve performance, not for correct behavior.
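
The rule in the quoted RFC text can be sketched as follows. This is an illustration only: `handles_prove_same_file` and `AttrCache` are hypothetical names, not part of any NFS client. The point is that equality of the raw bytes is the only conclusion a client may draw, and that a cache keyed on those bytes is a performance aid, not a correctness mechanism.

```python
from typing import Dict, Optional


def handles_prove_same_file(a: bytes, b: bytes) -> Optional[bool]:
    """True if the handles are byte-identical, None if nothing can be concluded."""
    return True if a == b else None


class AttrCache:
    """Hypothetical client-side cache keyed on the raw handle bytes."""

    def __init__(self) -> None:
        self._attrs: Dict[bytes, dict] = {}

    def store(self, handle: bytes, attrs: dict) -> None:
        self._attrs[bytes(handle)] = attrs

    def lookup(self, handle: bytes) -> Optional[dict]:
        # A miss is not an error: the caller simply asks the server again.
        return self._attrs.get(bytes(handle))
```

Note that the function never returns `False`: two different handles may still name the same file, so an unequal comparison only means "unknown".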

2006-08-10 17:08:47

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

Well, regular NFS has to consider interoperability, so a client must
treat the file handle as an opaque object.

However, in our case, we essentially derived a VM-based data sharing
infrastructure from NFS. It allows multiple virtual machines on
a single server to share data efficiently. With some tricks, we are
able to export the inode cache from server to client. We also modified
the file handle composer to carry the server-side inode address, inode
number, i_gen, and dev along with a file handle. Upon receiving a file
handle, a client can directly access the inode object in the exported
inode cache and bypass the inter-VM communication.

So, in our case, we don't need to consider interoperability (at least
for now), and we DO know the inode number, generation, as well as
exported device info.

I think this explains why I want to make sure the conclusion is right:

Conclusion: Given a stored file handle and an inode object received from the
server, an NFS client can safely determine whether this inode
corresponds to the file handle by checking the inode+dev+i_generation.

Many thanks for this helpful discussion.

Xin


2006-08-10 17:29:05

by Trond Myklebust

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, 2006-08-10 at 12:23 -0400, Xin Zhao wrote:
> That makes sense.
>
> Can we make the following two conclusions?
> 1. In a single machine, inode+dev ID+i_generation can uniquely identify a file

Not really. The device id is frequently subject to change on server
reboot or device disconnect/reconnect.

> 2. Given a stored file handle and an inode object received from the
> server, an NFS client can safely determine whether this inode
> corresponds to the file handle by checking the inode+dev+i_generation.

No! The file handle is an opaque bag of bytes as far as clients are
concerned. If you change the server, then the filehandle format can and
will change. On Linux, even changing the setting of the subtree_checking
export option will suffice to change the filehandle.

Cheers,
Trond
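
Why a client must not parse the handle can be shown with two made-up layouts. This is a sketch under stated assumptions: `encode_without_parent` and `encode_with_parent` are hypothetical, and the byte layouts are invented for illustration rather than taken from the formats Linux nfsd actually emits. The only claim is that a server option which adds or removes a field changes the bytes for the very same file.

```python
import struct


def encode_without_parent(ino: int, generation: int) -> bytes:
    """Hypothetical layout: inode number and generation only."""
    return struct.pack("<II", ino, generation)


def encode_with_parent(ino: int, generation: int, parent_ino: int) -> bytes:
    """Hypothetical layout that also records the parent directory's inode."""
    return struct.pack("<III", ino, generation, parent_ino)


short_handle = encode_without_parent(1234, 77)
long_handle = encode_with_parent(1234, 77, 42)
```

A client that hard-codes the first layout fails on the second, because `struct.unpack("<II", long_handle)` raises `struct.error` on the 12-byte input.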

2006-08-10 17:38:57

by Trond Myklebust

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, 2006-08-10 at 13:08 -0400, Xin Zhao wrote:
> Well, regular NFS has to consider interoperability, so a client must
> treat the file handle as an opaque object.
>
> However, in our case, we essentially derived a VM-based data sharing
> infrastructure from NFS. It allows multiple virtual machines on
> a single server to share data efficiently. With some tricks, we are
> able to export the inode cache from server to client. We also modified
> the file handle composer to carry the server-side inode address, inode
> number, i_gen, and dev along with a file handle. Upon receiving a file
> handle, a client can directly access the inode object in the exported
> inode cache and bypass the inter-VM communication.

The correct way to do this sort of thing is to use pNFS, which has
protocol support for this sort of thing, and is part of the draft NFSv4
minor version 1 specification. See

http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-04.txt

Cheers,
Trond

2006-08-10 18:02:44

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

Thanks, Trond.

The device ID is subject to change when the server reboots? I don't
quite understand. If the backing device on the server side has not
changed, how can a server reboot cause the device ID to change?

One possibility that can cause the device ID to change is a change of
the exported device AFTER the server reboots. But this can be detected
by adding a server generation number or device generation number. So
maybe we can say: "On a single machine, inode+dev
ID+i_generation+server_generation can uniquely identify a file". Is
this true?

About your comment on the second conclusion, I already explained this
in one of my previous emails. We assume that both server and clients
are under our control. That is, we are not much concerned with
interoperability. The file handle format will stay the same even if the
NFS server is changed. Actually, in our inter-VM inode sharing scheme,
we don't even care about the normal file handle contents. Instead, we
only check our extended fields, which include the server-side inode
address, ino, dev info, i_generation, and server_generation. An NFS
client first uses the server-side inode address to locate the inode
object in the server inode cache (we dynamically remap the inode
cache into the client in order to expedite metadata retrieval and
bypass inter-VM communication). After getting the inode object, the
NFS client has to validate that this inode object corresponds to the
file handle so that it can read the right file attributes stored in the
inode. There are several ways a located inode can hold false
information: the inode has been released because someone on the server
removed the file, or the slot has been filled with another file's
inode (other possibilities?). So we must validate the inode before
using the file attributes retrieved from the mapped inode.

That's why we bring up this question.
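
The validation step described above might look roughly like this. It is a sketch of the scheme as stated in this message, not of any existing NFS code: `ExtendedHandle`, `MappedInode`, and `lookup_mapped_inode` are hypothetical names, the mapped inode cache is modelled as a dictionary keyed by address, and races with concurrent updates on the server are deliberately ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ExtendedHandle:
    """Hypothetical extended fields carried with the file handle."""
    inode_addr: int
    ino: int
    dev: int
    generation: int
    server_generation: int


@dataclass
class MappedInode:
    """Hypothetical view of a server inode seen through the shared mapping."""
    ino: int
    dev: int
    generation: int
    size: int


def lookup_mapped_inode(
    handle: ExtendedHandle,
    inode_cache: Dict[int, MappedInode],
    server_generation: int,
) -> Optional[MappedInode]:
    """Return the mapped inode only if it still matches the handle."""
    if handle.server_generation != server_generation:
        return None  # server restarted or re-exported: nothing can be trusted
    inode = inode_cache.get(handle.inode_addr)
    if inode is None:
        return None  # slot was released
    if (inode.ino, inode.dev, inode.generation) != (
        handle.ino, handle.dev, handle.generation
    ):
        return None  # slot now holds a different file
    return inode
```

A `None` result would mean falling back to an ordinary getattr() request instead of reading attributes from the mapping.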

Also, has anyone compared NFSv4's delegation mechanism with the
speculative execution mechanism proposed at SOSP 2005
http://www.cs.cmu.edu/~dga/15-849/papers/speculator-sosp2005.pdf?

What are the pros and cons of these two mechanisms?


2006-08-10 20:00:08

by Trond Myklebust

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, 2006-08-10 at 14:02 -0400, Xin Zhao wrote:
> Thanks, Trond.
>
> The device ID is subject to change when the server reboots? I don't
> quite understand. If the backing device on the server side has not
> changed, how can a server reboot cause the device ID to change?

Things like USB, firewire, and fibre channel allocate their device ids
on the fly. There is no such thing as a fixed device id in those cases.

> About your comment on the second conclusion, I already explained this
> in one of my previous emails. We assume that both server and clients
> are under our control. That is, we are not much concerned with
> interoperability. The file handle format will stay the same even if the
> NFS server is changed. Actually, in our inter-VM inode sharing scheme,
> we don't even care about the normal file handle contents. Instead, we
> only check our extended fields, which include the server-side inode
> address, ino, dev info, i_generation, and server_generation. An NFS
> client first uses the server-side inode address to locate the inode
> object in the server inode cache (we dynamically remap the inode
> cache into the client in order to expedite metadata retrieval and
> bypass inter-VM communication). After getting the inode object, the
> NFS client has to validate that this inode object corresponds to the
> file handle so that it can read the right file attributes stored in the
> inode. There are several ways a located inode can hold false
> information: the inode has been released because someone on the server
> removed the file, or the slot has been filled with another file's
> inode (other possibilities?). So we must validate the inode before
> using the file attributes retrieved from the mapped inode.
>
> That's why we bring up this question.

Why do this, when people are working on standards and implementations
for doing precisely the above within the NFSv4 protocol?

> Also, has anyone compared NFSv4's delegation mechanism with the
> speculative execution mechanism proposed at SOSP 2005
> http://www.cs.cmu.edu/~dga/15-849/papers/speculator-sosp2005.pdf?
>
> What are the pros and cons of these two mechanisms?

Delegations are all about caching. This paper appears to be about
getting round the bottlenecks due to synchronous operations. How are the
two issues related?

Cheers,
Trond

2006-08-10 21:00:23

by Peter Staubach

Subject: Re: Urgent help needed on an NFS question, please help!!!

Xin Zhao wrote:
> That makes sense.
>
> Can we make the following two conclusions?
> 1. In a single machine, inode+dev ID+i_generation can uniquely
> identify a file
> 2. Given a stored file handle and an inode object received from the
> server, an NFS client can safely determine whether this inode
> corresponds to the file handle by checking the inode+dev+i_generation.
>

#1 seems safe enough to assume.

#2 either doesn't make sense to me or assumes things about the file
handle that the client is not allowed to assume. A file handle is an
opaque string of bytes to the client. The only entity allowed to
interpret the contents is the entity which generated the file handle.

---

Is this situation any different from an application opening a file "A"
and another process then renaming "A" to "B"? The original application
is now reading from and writing to a file called "B" and has no
knowledge of this.

---

The bottom line is that the file handle uniquely identifies a particular
entity on a file system on the server. The name of the entity does not
matter.

Thanx...

ps

2006-08-10 22:25:29

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

The inter-VM inode sharing helps reduce the communication cost of
retrieving file attributes in a VM environment. In a network
environment, it is impossible for a client to directly see the inode
cache of the server. But in a virtual server environment, where both
client and server run on the same physical host, this becomes possible.

If clients have read-only access to the server's inode cache, they can
directly retrieve file attributes without incurring an expensive
getattr() RPC call. Of course, a delegation allows a client to trust
locally cached file attributes without worrying about changes on the
server. But this only works when the file is not shared by multiple
clients, right? Does NFSv4 have other mechanisms that can further
improve performance on metadata access?

Thanks,
-x


2006-08-10 22:28:34

by Xin Zhao

Subject: Re: Urgent help needed on an NFS question, please help!!!

Also, delegations are about caching, that's true. They improve NFS
performance because a client holding a lease does not need to worry
about changes on the server and can manipulate files using its local
cache. But if speculative execution can achieve the same goal without
incurring the cost of lease renewal and revocation, delegation becomes
less useful.

So my question is essentially: if speculative execution is available,
why do we still need delegation? Can delegation do anything better?

Xin

On 8/10/06, Trond Myklebust <[email protected]> wrote:
> On Thu, 2006-08-10 at 14:02 -0400, Xin Zhao wrote:
> > Thanks. Trond.
> >
> > The device is subject to change when server reboot? I don't quite
> > understand. If the backing device at the server side is not changed,
> > how come server reboot will cause device ID change?
>
> Things like USB, firewire, and fibre channel allocate their device ids
> on the fly. There is no such thing as a fixed device id in those cases.
>
> > About your comment on the second conclusion, I already explained this
> > in one of my previous emails. We assume that both the server and the
> > clients are under our control. That is, we are not much concerned
> > with interoperability. The file handle format will stay static even
> > if the NFS server is changed. Actually, in our inter-VM inode sharing
> > scheme, we don't even care about the normal file handle contents.
> > Instead, we only check our extended fields, which include: server-side
> > inode address, ino, dev info, i_generation and server_generation. An
> > NFS client first uses the server-side inode address to locate the
> > inode object in the server inode cache (we dynamically remap the
> > inode cache into the client in order to expedite metadata retrieval
> > and bypass inter-VM communication). After getting the inode object,
> > the NFS client has to validate that this inode object corresponds to
> > the file handle so that it can read the right file attributes stored
> > in the inode. There are several ways a located inode can hold false
> > information: the inode has been released because someone on the
> > server removed the file, or the inode was filled by another file's
> > inode (other possibilities?). So we must validate the inode before
> > using the file attributes retrieved from the mapped inode.
> >
> > That's why we bring up this question.
>
> Why do this, when people are working on standards and implementations
> for doing precisely the above within the NFSv4 protocol?
>
> > Also, has anyone compared NFSv4's delegation mechanism with the
> > speculative execution mechanism proposed at SOSP 2005
> > http://www.cs.cmu.edu/~dga/15-849/papers/speculator-sosp2005.pdf?
> >
> > What are the pros and cons of these two mechanisms?
>
> Delegations are all about caching. This paper appears to be about
> getting round the bottlenecks due to synchronous operations. How are the
> two issues related?
>
> Cheers,
> Trond
>
>
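
The validation step described in the quoted message (locate the inode by
its server-side address, then check it against the extended file handle
fields) can be sketched roughly as follows. This is only an illustration,
not the actual inter-VM code: the class and function names are
hypothetical, the inode cache is modelled as a plain dictionary, and
server_generation is assumed here to be a value that changes whenever
the server restarts, which the thread does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ExtendedHandle:
    """Extended file handle fields named in the thread (layout is assumed)."""
    inode_addr: int         # server-side inode address
    ino: int                # inode number
    dev: int                # device info
    i_generation: int       # per-inode generation number
    server_generation: int  # assumed: changes when the server restarts


@dataclass
class Inode:
    """Stand-in for an inode object found in the remapped inode cache."""
    ino: int
    dev: int
    i_generation: int


def validate(handle: ExtendedHandle,
             inode_cache: Dict[int, Inode],
             current_server_generation: int) -> Optional[Inode]:
    """Return the inode only if it still matches the handle, else None."""
    # A handle issued before a server restart cannot be trusted.
    if handle.server_generation != current_server_generation:
        return None
    # The cache slot may have been released since the handle was issued.
    inode = inode_cache.get(handle.inode_addr)
    if inode is None:
        return None
    # The slot may have been reused: ino/dev can match while the
    # generation number differs, so all three fields are compared.
    if (inode.ino, inode.dev, inode.i_generation) != (
            handle.ino, handle.dev, handle.i_generation):
        return None
    return inode
```

A caller that gets None back would fall back to an ordinary getattr()
request. The sketch ignores the question of the server modifying the
inode while the client is reading it, which a real shared-memory scheme
would also have to address.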

2006-08-11 00:39:07

by Trond Myklebust

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, 2006-08-10 at 18:28 -0400, Xin Zhao wrote:
> Also, delegations are about caching. That's true. They improve NFS
> performance because a client with a lease does not need to worry about
> server-side changes and can manipulate files using its local cache. But
> if speculative execution can achieve the same goal without incurring the
> cost of lease renewal and revocation, delegation becomes less useful.

What am I missing? AFAICS the main purpose of speculative execution
would appear to be to reduce the latency of syscall execution on
clients. That doesn't suffice to replace caching even by a long shot.

Delegations are all about _not_ sending commands to the server when you
don't need to. They make NFS scale to larger numbers of clients.

> So my question is essentially: if speculative execution is there, why
> do we still need delegation? Can delegation do anything better?

Speculative execution is where? I see one academic paper detailing a
couple of lab experiments, but no published code. Do you know of anyone
who has reproduced these results in real life environments?

I'm particularly curious to see how they resolved the requirement that
"...speculative state should never be visible to the user or any
external device.". The fact that they need to discuss having to roll
back operations like "mkdir", which create (very) user-visible state on
the server, is rather telling...

Trond
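
As a rough illustration of the point that delegations are about not
sending commands to the server at all, here is a toy model. It is not
the NFSv4 state machine: the Server and Client classes, their method
names, and the dictionary-based attribute store are all hypothetical and
exist only to count round trips.

```python
class Server:
    """Toy server that counts how many getattr requests it receives."""

    def __init__(self):
        self.attrs = {}          # file handle -> attribute dict
        self.getattr_calls = 0   # number of simulated round trips

    def getattr(self, fh):
        self.getattr_calls += 1
        return dict(self.attrs[fh])


class Client:
    """Toy client that skips the server while it holds a delegation."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.delegations = set()

    def grant_delegation(self, fh):
        # One fetch to fill the cache, after which the cache is trusted.
        self.cache[fh] = self.server.getattr(fh)
        self.delegations.add(fh)

    def recall_delegation(self, fh):
        # Once recalled, the cached attributes may no longer be trusted.
        self.delegations.discard(fh)
        self.cache.pop(fh, None)

    def getattr(self, fh):
        if fh in self.delegations:
            return self.cache[fh]       # no request is sent
        return self.server.getattr(fh)  # one request per call
```

In this model, any number of getattr() calls made while the delegation
is held cost the server a single request, which is the scaling argument
made above. Reducing the latency of each request, by contrast, would
leave the request count unchanged.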

2006-08-11 00:45:05

by Trond Myklebust

Subject: Re: Urgent help needed on an NFS question, please help!!!

On Thu, 2006-08-10 at 18:25 -0400, Xin Zhao wrote:
> The inter-VM inode sharing helps reduce the communication cost of
> retrieving file attributes in a VM environment. In a network
> environment, it is not possible for a client to directly see the
> inode caches of the server. But in a virtual server environment,
> where both client and server run on the same physical host, this
> becomes possible.
>
> If clients have read-only access to the server's inode cache, they can
> directly retrieve file attributes without incurring an expensive
> getattr() RPC call. Of course, delegation allows a client to trust
> locally cached file attributes without worrying about server-side
> changes. But this only works when the file is not shared by multiple
> clients. Right? Does NFSv4 have some other mechanisms that can further
> improve performance on metadata access?

Not metadata access, no. That would require some seriously messy locking
rules.
It improves performance by allowing a client to access the block device
directly for data reads and writes if it has the capability of doing so.

Trond