2008-01-04 16:41:58

by Jeff Garzik

Subject: Re: A new NFSv4 server...

Peter Åstrand wrote:
> Many years ago, before NFSv4 was finished, I felt the same. I was waiting
> for v4 and thought that everything would be so much better. I wanted to
> help and started the "pynfs" project. Today, I have a different opinion. I
> think v3 is a fairly good protocol, if you use it correctly. For example,
> many people don't realize that you don't need the portmapper, that you can
> use a single well-known TCP port, that you can use RPCSEC_GSS and so
> forth, even with v3.

Absolutely... But still, I think integrated mount protocol (aka pseudo
filesystem namespace) and integrated locking were big steps forward.
You really shouldn't need more than one protocol.

Speaking of RPCSEC_GSS: I would love to see a much more straightforward
authentication process, something /not/ buried inside special behaviors
triggered by opcodes found in an opaque cred struct :/ RPCSEC_GSS
context creation, the special casing around the 'null' procedure, and
the overloading of the RPC data portion of things are a huge pain to [...]

Authentication and security should be simple, tough to screw up. I
would tend to prefer an ASCII-based authentication/security negotiation
at the start of a [SCTP|TCP] stream.
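
To make the idea of an ASCII-based negotiation concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `AUTH`/`USE`/`ABORT` verbs, the mechanism names, and both function names are invented for illustration and do not come from any NFS or RPC specification. It only shows how little parsing a line-oriented handshake would need.

```python
def parse_offer(line):
    """Parse a hypothetical server greeting such as b"AUTH tls+sys gss-krb5\r\n"."""
    verb, *mechs = line.decode("ascii").split()
    if verb != "AUTH" or not mechs:
        raise ValueError("malformed offer")
    return mechs


def choose_mechanism(offer_line, client_prefs):
    """Return the client's reply line: the first preferred mechanism the server offers."""
    offered = parse_offer(offer_line)
    for mech in client_prefs:
        if mech in offered:
            return f"USE {mech}\r\n".encode("ascii")
    # No overlap between what the server offers and what the client accepts.
    return b"ABORT no-common-mechanism\r\n"
```

A client that prefers GSS but accepts TLS would call `choose_mechanism(greeting, ["gss-krb5", "tls+sys"])` on the first line read from the stream, before any RPC traffic is exchanged.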

Use TLS to give most people what they want: AUTH_SYS with encryption.
GSSAPI is fine as a "required option" but you shouldn't need GSSAPI to
do simple wire encryption between IP-authenticated hosts.

> I think v4 has a few valuable improvements, but it comes with a very [...]
> price. v3 has a minimalistic beauty which v4 lacks. For example, take a
> look at the OPEN operation with 7 arguments, of which many are complex
> data structures:
> (cfh), seqid, share_access, share_deny, owner, openhow, claim ->
> (cfh), stateid, cinfo, rflags, open_confirm, attrset delegation
> Not pretty...

heh, tell me about it. First I started out using rpcgen, then rewrote
everything to do raw XDR decoding. OPEN is huge.

IMO, OPEN should be split into multiple operations, probably one for
each "OPEN arm". It's not like new opcode numbers are expensive.
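
A rough sketch of what "one operation per OPEN arm" could look like, in Python. The claim types and the fields each arm carries follow the `open_claim4` union in RFC 3530; the per-arm operation names (`OPEN_BY_NAME` and so on) and the `split_open` helper are hypothetical and exist only to illustrate the split.

```python
from enum import IntEnum


class OpenClaimType(IntEnum):
    # open_claim_type4 values as defined in RFC 3530.
    CLAIM_NULL = 0
    CLAIM_PREVIOUS = 1
    CLAIM_DELEGATE_CUR = 2
    CLAIM_DELEGATE_PREV = 3


# Hypothetical mapping: each claim arm becomes its own operation that
# carries only the fields that arm actually uses.
ARM_TO_OP = {
    OpenClaimType.CLAIM_NULL: ("OPEN_BY_NAME", ("file",)),
    OpenClaimType.CLAIM_PREVIOUS: ("OPEN_RECLAIM", ("delegate_type",)),
    OpenClaimType.CLAIM_DELEGATE_CUR: ("OPEN_DELEG_CUR", ("delegate_stateid", "file")),
    OpenClaimType.CLAIM_DELEGATE_PREV: ("OPEN_DELEG_PREV", ("file_delegate_prev",)),
}


def split_open(claim_type, claim_fields):
    """Turn one OPEN claim arm into a (hypothetical op name, args) pair."""
    op_name, wanted = ARM_TO_OP[OpenClaimType(claim_type)]
    missing = [name for name in wanted if name not in claim_fields]
    if missing:
        raise ValueError(f"{op_name} is missing fields: {missing}")
    return op_name, {name: claim_fields[name] for name in wanted}
```

The point of the split is that a decoder for each operation handles a fixed argument list instead of switching on a discriminated union inside one large operation.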

Or, hope of hopes, simplify OPEN in some other manner, like delegating
tasks to other operations.

>> Oh, certainly. I was mainly thinking a replacement of the wire protocol would
>> be an easier step for people to swallow than a new protocol.
> I've been thinking of trying to put together something like NFS v3.5. [...]
> parts of v4 are nice, but the complexity is too high.

Agreed that it's quite complex.

One of my personal desires is for a high level of cache coherence
throughout the system for all clients (though perhaps an admin could
optionally relax this requirement). I'm a fan of Google's "Chubby", a
distributed reliable filesystem that stalls client writes until cache
invalidations for the associated byte range are processed for all
interested clients.
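
The write-stalling behavior described above can be sketched in a few lines of Python. This is a single-process toy under stated assumptions, not Chubby's actual implementation: the `CoherentFile` class, its method names, and the `invalidate` callback are all hypothetical, and "stalling" is modeled simply as a blocking call that must return an acknowledgement before the write is applied.

```python
class CoherentFile:
    """Toy server-side file that applies a write only after every client
    caching an overlapping byte range has acknowledged an invalidation."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.cachers = {}  # client_id -> list of (offset, length) cached ranges

    def register_cache(self, client_id, offset, length):
        self.cachers.setdefault(client_id, []).append((offset, length))

    def write(self, offset, payload, invalidate):
        """invalidate(client_id, offset, length) must return True once the
        client has dropped that cached range; the write waits on it."""
        end = offset + len(payload)
        for client_id, ranges in self.cachers.items():
            overlapping = [r for r in ranges if r[0] < end and offset < r[0] + r[1]]
            for rng in overlapping:
                if not invalidate(client_id, rng[0], rng[1]):
                    raise TimeoutError(f"client {client_id} did not acknowledge")
                ranges.remove(rng)
        # Only now is the new data made visible.
        self.data[offset:end] = payload
```

Clients whose cached ranges do not overlap the write are never contacted, which is where the per-byte-range bookkeeping (and the complexity) comes from.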

And anything approaching cache coherence requires some complexity :/

Another thing I like about NFSv4 is that batching sequences into chunks
of fine-grained operations is generally a useful practice. So while the
end result (COMPOUND) is a bit of a pain, bundling a sequence of
operations into a single unit is useful.
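
For readers unfamiliar with COMPOUND, here is a minimal Python sketch of its evaluation rule: operations run in order, and processing stops at the first one that fails, with results returned up to and including the failure. The status codes match RFC 3530; the `run_compound` function and the handler signatures are hypothetical, chosen only to keep the example self-contained.

```python
NFS4_OK = 0
NFS4ERR_NOENT = 2  # "no such file or directory" in RFC 3530


def run_compound(ops, state):
    """Run (name, handler) pairs in order; each handler takes the shared
    state dict and returns a status code. Stop at the first non-OK status."""
    results = []
    for name, handler in ops:
        status = handler(state)
        results.append((name, status))
        if status != NFS4_OK:
            break  # later operations in the batch are never evaluated
    overall = results[-1][1] if results else NFS4_OK
    return overall, results
```

A typical batch would be PUTROOTFH, LOOKUP, GETFH: if the LOOKUP handler returns `NFS4ERR_NOENT`, the reply contains two results and GETFH is skipped, all in one round trip.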


2008-01-04 20:03:01

by Peter Åstrand

Subject: Re: A new NFSv4 server...
