2012-03-01 22:00:16

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 00/15] For 3.4 (2nd take)

Hi-

Please consider these minor fixes and migration pre-requisites for
3.4. I've ported them to nfs-for-next on top of commit abd96698.
I've tested all but the last one (see patch description).

---

Chuck Lever (13):
NFS: Fix some minor problems with nfs4_deviceids
NFS: Fix some minor problems with nfs4_verifiers
NFS: Request fh_expire_type attribute in "server caps" operation
NFS: Introduce NFS_ATTR_FATTR_V4_LOCATIONS
NFS: Simplify arguments of encode_renew()
NFS: Save root file handle in nfs_server
NFS: Add a client-side function to display NFS file handles
commit 6f38b4ba433ac6494f83cb73dd07dcbde797e1e0
SUNRPC: Add API to acquire source address
NFS: Reduce debugging noise from encode_compound_hdr
NFS: Add debugging messages to NFSv4's CLOSE procedure
NFS: Clean up debugging in decode_pathname()
NFS: Make nfs_cache_array.size a signed integer

Trond Myklebust (2):
SUNRPC: Move clnt->cl_server into struct rpc_xprt
SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field


fs/nfs/blocklayout/blocklayout.c | 2
fs/nfs/blocklayout/blocklayoutdev.c | 8 -
fs/nfs/blocklayout/extents.c | 5 -
fs/nfs/callback.c | 3
fs/nfs/callback_proc.c | 9 +
fs/nfs/callback_xdr.c | 2
fs/nfs/client.c | 32 +++
fs/nfs/dir.c | 2
fs/nfs/getroot.c | 5 +
fs/nfs/inode.c | 45 +++++
fs/nfs/nfs4filelayout.c | 3
fs/nfs/nfs4namespace.c | 2
fs/nfs/nfs4proc.c | 40 +++-
fs/nfs/nfs4state.c | 25 ++-
fs/nfs/nfs4xdr.c | 84 ++++++--
fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 -
fs/nfs/pnfs_dev.c | 7 -
fs/nfs/super.c | 11 +
fs/nfsd/nfs4proc.c | 4
fs/nfsd/nfs4state.c | 2
fs/nfsd/nfs4xdr.c | 28 +--
include/linux/nfs4.h | 12 +
include/linux/nfs_fs.h | 14 +
include/linux/nfs_fs_sb.h | 4
include/linux/nfs_xdr.h | 12 +
include/linux/pnfs_osd_xdr.h | 4
include/linux/sunrpc/clnt.h | 6 -
include/linux/sunrpc/debug.h | 13 +
include/linux/sunrpc/xprt.h | 2
net/sunrpc/auth_gss/auth_gss.c | 4
net/sunrpc/clnt.c | 347 ++++++++++++++++++++++++++++-------
net/sunrpc/rpc_pipe.c | 6 +
net/sunrpc/rpcb_clnt.c | 19 +-
net/sunrpc/stats.c | 6 +
net/sunrpc/xprt.c | 15 +-
35 files changed, 604 insertions(+), 187 deletions(-)

--
Signature


2012-03-02 22:52:18

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

On 03/02/2012 02:11 PM, Chuck Lever wrote:
>
> I still need someone with OSD pNFS set up to test. Can I send you a proposed patch?
>

Yes that would be me. Sure, best is if you have a public git tree I can pull
but if not then patchset in the mail is also fine

I'll run a test and report, tell me when

Boaz


2012-03-02 22:00:28

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

On 03/02/2012 01:58 PM, Boaz Harrosh wrote:
>
> For the same reason we do debug prints at all, for inconvenience.
>

inconvenience => convenience

Thanks
Boaz

2012-03-02 17:17:19

by Steve Dickson

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles



On 03/01/2012 05:28 PM, Myklebust, Trond wrote:
> On Thu, 2012-03-01 at 17:01 -0500, Chuck Lever wrote:
>> For debugging, introduce a simplistic function to print NFS file
>> handles on the system console. The main function is hooked into the
>> dprintk debugging facility, but you can directly call the helper,
>> _nfs_display_fhandle(), if you want to print a handle unconditionally.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>>
>
> One of the things that I really like about wireshark is that it also
> displays a 32-bit hexadecimal hash of the filehandle. That allows you to
> compare 2 filehandles _much_ faster than trying to look at the full
> shebang.
>
>
+1 Being able to easily trace filehandle in both the kernel debug and
wireshark would be huge.... IMHO...

steved.

2012-03-01 22:00:49

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 04/15] NFS: Reduce debugging noise from encode_compound_hdr

Get rid of

encode_compound: tag=

when XDR debugging is enabled. The current Linux client never sets
compound tags.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4xdr.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 4ca87cb..d1b7914d 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -889,7 +889,9 @@ static void encode_compound_hdr(struct xdr_stream *xdr,
* but this is not required as a MUST for the server to do so. */
hdr->replen = RPC_REPHDRSIZE + auth->au_rslack + 3 + hdr->taglen;

+#if 0
dprintk("encode_compound: tag=%.*s\n", (int)hdr->taglen, hdr->tag);
+#endif
BUG_ON(hdr->taglen > NFS4_MAXTAGLEN);
p = reserve_space(xdr, 4 + hdr->taglen + 8);
p = xdr_encode_opaque(p, hdr->tag, hdr->taglen);


2012-03-01 22:55:41

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids


On Mar 1, 2012, at 5:39 PM, Myklebust, Trond wrote:

> On Thu, 2012-03-01 at 17:02 -0500, Chuck Lever wrote:
>> Clean up due to code review.
>>
>> sizeof(struct nfs4_deviceid) is the size of the in-core device ID data
>> structure, but NFS4_DEVICEID4_SIZE is the number of octets in an XDR'd
>> device ID. The two are not interchangeable, even if they happen to
>> have the same value. If struct nfs4_deviceid is padded by the
>> compiler, sizeof(struct nfs4_deviceid) may not be the same as the
>> XDR'd size. Not likely, but still.
>>
>> Ensure that the data field is aligned to the largest pointer type that
>> is used to access it: in this case, that's u64. Type-punning among
>> types with different alignment has been discouraged in user space and
>> the kernel, to avoid unneeded pointer aliasing. The use of a union
>> is preferred instead.
>>
>> Ensure that XDR'ing and comparing is done via one of the union's data
>> fields, and not via the whole struct, again in case the compiler pads
>> the struct.
>>
>> As a micro-optimization, this change also ensures that the compiler
>> may perform memcpy() and memcmp() operations on these fields in
>> larger, more efficient, chunks.
>>
>> This patch needs review and testing by pnfs layout specialists.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>>
>> fs/nfs/blocklayout/blocklayout.c | 2 +-
>> fs/nfs/blocklayout/blocklayoutdev.c | 8 ++++----
>> fs/nfs/blocklayout/extents.c | 5 +++--
>> fs/nfs/callback_xdr.c | 2 +-
>> fs/nfs/nfs4filelayout.c | 3 +--
>> fs/nfs/nfs4xdr.c | 4 ++--
>> fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 ++++----
>> fs/nfs/pnfs_dev.c | 7 ++++---
>> include/linux/nfs4.h | 7 ++++++-
>> include/linux/pnfs_osd_xdr.h | 4 ++--
>> 10 files changed, 28 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
>> index 783ebd5..3b730bf 100644
>> --- a/fs/nfs/blocklayout/blocklayout.c
>> +++ b/fs/nfs/blocklayout/blocklayout.c
>> @@ -901,7 +901,7 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
>> dev->pglen = PAGE_SIZE * max_pages;
>> dev->mincount = 0;
>>
>> - dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.data);
>> + dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.u.data);
>> rc = nfs4_proc_getdeviceinfo(server, dev);
>> dprintk("%s getdevice info returns %d\n", __func__, rc);
>> if (rc) {
>> diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
>> index b48f782..2dafca5 100644
>> --- a/fs/nfs/blocklayout/blocklayoutdev.c
>> +++ b/fs/nfs/blocklayout/blocklayoutdev.c
>> @@ -124,8 +124,8 @@ nfs4_blk_decode_device(struct nfs_server *server,
>> struct nfs_net *nn = net_generic(net, nfs_net_id);
>>
>> dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
>> - dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
>> - dev->mincount);
>> + dprintk("%s: deviceid: %s, mincount: %d\n", __func__,
>> + dev->dev_id.u.data, dev->mincount);
>>
>> memset(&msg, 0, sizeof(msg));
>> msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
>> @@ -206,7 +206,7 @@ static struct block_device *translate_devid(struct pnfs_layout_hdr *lo,
>> mid = BLK_ID(lo);
>> spin_lock(&mid->bm_lock);
>> list_for_each_entry(dev, &mid->bm_devlist, bm_node) {
>> - if (memcmp(id->data, dev->bm_mdevid.data,
>> + if (memcmp(id->u.data, dev->bm_mdevid.u.data,
>> NFS4_DEVICEID4_SIZE) == 0) {
>> rv = dev->bm_mdev;
>> goto out;
>> @@ -323,7 +323,7 @@ nfs4_blk_process_layoutget(struct pnfs_layout_hdr *lo,
>> status = -ENOMEM;
>> goto out_err;
>> }
>> - memcpy(&be->be_devid, p, NFS4_DEVICEID4_SIZE);
>> + memcpy(&be->be_devid.u.data, p, NFS4_DEVICEID4_SIZE);
>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>> be->be_mdev = translate_devid(lo, &be->be_devid);
>> if (!be->be_mdev)
>> diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
>> index 1f9a603..52b747c 100644
>> --- a/fs/nfs/blocklayout/extents.c
>> +++ b/fs/nfs/blocklayout/extents.c
>> @@ -675,10 +675,11 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
>> if (!xdr_start)
>> goto out;
>> list_for_each_entry_safe(lce, save, &bl->bl_commit, bse_node) {
>> - p = xdr_reserve_space(xdr, 7 * 4 + sizeof(lce->bse_devid.data));
>> + p = xdr_reserve_space(xdr, 7 * 4 + NFS4_DEVICEID4_SIZE);
>> if (!p)
>> break;
>> - p = xdr_encode_opaque_fixed(p, lce->bse_devid.data, NFS4_DEVICEID4_SIZE);
>> + p = xdr_encode_opaque_fixed(p, lce->bse_devid.u.data,
>> + NFS4_DEVICEID4_SIZE);
>> p = xdr_encode_hyper(p, lce->bse_f_offset << SECTOR_SHIFT);
>> p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
>> p = xdr_encode_hyper(p, 0LL);
>> diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
>> index 5466829..6061322 100644
>> --- a/fs/nfs/callback_xdr.c
>> +++ b/fs/nfs/callback_xdr.c
>> @@ -347,7 +347,7 @@ __be32 decode_devicenotify_args(struct svc_rqst *rqstp,
>> goto err;
>> }
>> dev->cbd_layout_type = ntohl(*p++);
>> - memcpy(dev->cbd_dev_id.data, p, NFS4_DEVICEID4_SIZE);
>> + memcpy(dev->cbd_dev_id.u.data, p, NFS4_DEVICEID4_SIZE);
>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>
>> if (dev->cbd_layout_type == NOTIFY_DEVICEID4_CHANGE) {
>> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
>> index 47e8f34..f0c6ba0 100644
>> --- a/fs/nfs/nfs4filelayout.c
>> +++ b/fs/nfs/nfs4filelayout.c
>> @@ -550,8 +550,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
>> if (unlikely(!p))
>> goto out_err;
>>
>> - memcpy(id, p, sizeof(*id));
>> - p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>> + p = xdr_decode_opaque_fixed(p, id->u.data, NFS4_DEVICEID4_SIZE);
>> nfs4_print_deviceid(id);
>>
>> nfl_util = be32_to_cpup(p++);
>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>> index 80ba010..91020ea 100644
>> --- a/fs/nfs/nfs4xdr.c
>> +++ b/fs/nfs/nfs4xdr.c
>> @@ -1946,7 +1946,7 @@ encode_getdeviceinfo(struct xdr_stream *xdr,
>>
>> p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
>> *p++ = cpu_to_be32(OP_GETDEVICEINFO);
>> - p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
>> + p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.u.data,
>> NFS4_DEVICEID4_SIZE);
>> *p++ = cpu_to_be32(args->pdev->layout_type);
>> *p++ = cpu_to_be32(args->pdev->pglen); /* gdia_maxcount */
>> @@ -5492,7 +5492,7 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
>> if (unlikely(!p))
>> goto out_overflow;
>> for (i = 0; i < res->num_devs; i++)
>> - p = xdr_decode_opaque_fixed(p, res->dev_id[i].data,
>> + p = xdr_decode_opaque_fixed(p, res->dev_id[i].u.data,
>> NFS4_DEVICEID4_SIZE);
>> res->eof = be32_to_cpup(p);
>> return 0;
>> diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>> index b3918f7..99a1e89 100644
>> --- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>> +++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>> @@ -55,8 +55,8 @@
>> static __be32 *
>> _osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid)
>> {
>> - p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data,
>> - sizeof(objid->oid_device_id.data));
>> + p = xdr_decode_opaque_fixed(p, objid->oid_device_id.u.data,
>> + NFS4_DEVICEID4_SIZE);
>>
>> p = xdr_decode_hyper(p, &objid->oid_partition_id);
>> p = xdr_decode_hyper(p, &objid->oid_object_id);
>> @@ -377,8 +377,8 @@ pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr,
>> static inline __be32 *
>> pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id)
>> {
>> - p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data,
>> - sizeof(object_id->oid_device_id.data));
>> + p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.u.data,
>> + NFS4_DEVICEID4_SIZE);
>> p = xdr_encode_hyper(p, object_id->oid_partition_id);
>> p = xdr_encode_hyper(p, object_id->oid_object_id);
>>
>> diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
>> index 4f359d2..6c6ab81 100644
>> --- a/fs/nfs/pnfs_dev.c
>> +++ b/fs/nfs/pnfs_dev.c
>> @@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(nfs4_deviceid_lock);
>> void
>> nfs4_print_deviceid(const struct nfs4_deviceid *id)
>> {
>> - u32 *p = (u32 *)id;
>> + const u32 *p = id->u.data32;
>>
>> dprintk("%s: device id= [%x%x%x%x]\n", __func__,
>> p[0], p[1], p[2], p[3]);
>> @@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
>> static inline u32
>> nfs4_deviceid_hash(const struct nfs4_deviceid *id)
>> {
>> - unsigned char *cptr = (unsigned char *)id->data;
>> + const unsigned char *cptr = id->u.data;
>> unsigned int nbytes = NFS4_DEVICEID4_SIZE;
>> u32 x = 0;
>>
>> @@ -77,7 +77,8 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,
>>
>> hlist_for_each_entry_rcu(d, n, &nfs4_deviceid_cache[hash], node)
>> if (d->ld == ld && d->nfs_client == clp &&
>> - !memcmp(&d->deviceid, id, sizeof(*id))) {
>> + !memcmp(&d->deviceid.u.data, id->u.data,
>> + NFS4_DEVICEID4_SIZE)) {
>> if (atomic_read(&d->ref))
>> return d;
>> else
>> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
>> index b1e6b64..979f607 100644
>> --- a/include/linux/nfs4.h
>> +++ b/include/linux/nfs4.h
>> @@ -649,9 +649,14 @@ enum filelayout_hint_care4 {
>> #define NFS4_DEVICEID4_SIZE 16
>>
>> struct nfs4_deviceid {
>> - char data[NFS4_DEVICEID4_SIZE];
>> + union {
>> + unsigned char data[NFS4_DEVICEID4_SIZE];
>> + u32 data32[NFS4_DEVICEID4_SIZE / sizeof(u32)];
>> + u64 data64[NFS4_DEVICEID4_SIZE / sizeof(u64)];
>> + } u;
>> };
>>
>
> Ugh... This just looks worse and worse.... Let's rather just keep these
> as unsigned char. I don't care if OSDs have special secret meanings and
> all that crap. They can cast the damned thing if they need to. The rest
> of us should be considering this to be opaque data.

One possibility is that OSD and block layout drivers can construct their device ID in a u64 array on the stack, and then memcpy() the array to the nfs4_deviceid.

But we can't cast pointers to an "unsigned char" field, since it's not guaranteed to be aligned. That's a real bug, one that's hit us before.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-01 22:02:24

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

Clean up due to code review.

sizeof(struct nfs4_deviceid) is the size of the in-core device ID data
structure, but NFS4_DEVICEID4_SIZE is the number of octets in an XDR'd
device ID. The two are not interchangeable, even if they happen to
have the same value. If struct nfs4_deviceid is padded by the
compiler, sizeof(struct nfs4_deviceid) may not be the same as the
XDR'd size. Not likely, but still.

Ensure that the data field is aligned to the largest pointer type that
is used to access it: in this case, that's u64. Type-punning among
types with different alignment has been discouraged in user space and
the kernel, to avoid unneeded pointer aliasing. The use of a union
is preferred instead.

Ensure that XDR'ing and comparing is done via one of the union's data
fields, and not via the whole struct, again in case the compiler pads
the struct.

As a micro-optimization, this change also ensures that the compiler
may perform memcpy() and memcmp() operations on these fields in
larger, more efficient, chunks.

This patch needs review and testing by pnfs layout specialists.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/blocklayout/blocklayout.c | 2 +-
fs/nfs/blocklayout/blocklayoutdev.c | 8 ++++----
fs/nfs/blocklayout/extents.c | 5 +++--
fs/nfs/callback_xdr.c | 2 +-
fs/nfs/nfs4filelayout.c | 3 +--
fs/nfs/nfs4xdr.c | 4 ++--
fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 ++++----
fs/nfs/pnfs_dev.c | 7 ++++---
include/linux/nfs4.h | 7 ++++++-
include/linux/pnfs_osd_xdr.h | 4 ++--
10 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 783ebd5..3b730bf 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -901,7 +901,7 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
dev->pglen = PAGE_SIZE * max_pages;
dev->mincount = 0;

- dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.data);
+ dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.u.data);
rc = nfs4_proc_getdeviceinfo(server, dev);
dprintk("%s getdevice info returns %d\n", __func__, rc);
if (rc) {
diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
index b48f782..2dafca5 100644
--- a/fs/nfs/blocklayout/blocklayoutdev.c
+++ b/fs/nfs/blocklayout/blocklayoutdev.c
@@ -124,8 +124,8 @@ nfs4_blk_decode_device(struct nfs_server *server,
struct nfs_net *nn = net_generic(net, nfs_net_id);

dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
- dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
- dev->mincount);
+ dprintk("%s: deviceid: %s, mincount: %d\n", __func__,
+ dev->dev_id.u.data, dev->mincount);

memset(&msg, 0, sizeof(msg));
msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
@@ -206,7 +206,7 @@ static struct block_device *translate_devid(struct pnfs_layout_hdr *lo,
mid = BLK_ID(lo);
spin_lock(&mid->bm_lock);
list_for_each_entry(dev, &mid->bm_devlist, bm_node) {
- if (memcmp(id->data, dev->bm_mdevid.data,
+ if (memcmp(id->u.data, dev->bm_mdevid.u.data,
NFS4_DEVICEID4_SIZE) == 0) {
rv = dev->bm_mdev;
goto out;
@@ -323,7 +323,7 @@ nfs4_blk_process_layoutget(struct pnfs_layout_hdr *lo,
status = -ENOMEM;
goto out_err;
}
- memcpy(&be->be_devid, p, NFS4_DEVICEID4_SIZE);
+ memcpy(&be->be_devid.u.data, p, NFS4_DEVICEID4_SIZE);
p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
be->be_mdev = translate_devid(lo, &be->be_devid);
if (!be->be_mdev)
diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
index 1f9a603..52b747c 100644
--- a/fs/nfs/blocklayout/extents.c
+++ b/fs/nfs/blocklayout/extents.c
@@ -675,10 +675,11 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
if (!xdr_start)
goto out;
list_for_each_entry_safe(lce, save, &bl->bl_commit, bse_node) {
- p = xdr_reserve_space(xdr, 7 * 4 + sizeof(lce->bse_devid.data));
+ p = xdr_reserve_space(xdr, 7 * 4 + NFS4_DEVICEID4_SIZE);
if (!p)
break;
- p = xdr_encode_opaque_fixed(p, lce->bse_devid.data, NFS4_DEVICEID4_SIZE);
+ p = xdr_encode_opaque_fixed(p, lce->bse_devid.u.data,
+ NFS4_DEVICEID4_SIZE);
p = xdr_encode_hyper(p, lce->bse_f_offset << SECTOR_SHIFT);
p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
p = xdr_encode_hyper(p, 0LL);
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index 5466829..6061322 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -347,7 +347,7 @@ __be32 decode_devicenotify_args(struct svc_rqst *rqstp,
goto err;
}
dev->cbd_layout_type = ntohl(*p++);
- memcpy(dev->cbd_dev_id.data, p, NFS4_DEVICEID4_SIZE);
+ memcpy(dev->cbd_dev_id.u.data, p, NFS4_DEVICEID4_SIZE);
p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);

if (dev->cbd_layout_type == NOTIFY_DEVICEID4_CHANGE) {
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 47e8f34..f0c6ba0 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -550,8 +550,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
if (unlikely(!p))
goto out_err;

- memcpy(id, p, sizeof(*id));
- p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
+ p = xdr_decode_opaque_fixed(p, id->u.data, NFS4_DEVICEID4_SIZE);
nfs4_print_deviceid(id);

nfl_util = be32_to_cpup(p++);
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 80ba010..91020ea 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1946,7 +1946,7 @@ encode_getdeviceinfo(struct xdr_stream *xdr,

p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
*p++ = cpu_to_be32(OP_GETDEVICEINFO);
- p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
+ p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.u.data,
NFS4_DEVICEID4_SIZE);
*p++ = cpu_to_be32(args->pdev->layout_type);
*p++ = cpu_to_be32(args->pdev->pglen); /* gdia_maxcount */
@@ -5492,7 +5492,7 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
if (unlikely(!p))
goto out_overflow;
for (i = 0; i < res->num_devs; i++)
- p = xdr_decode_opaque_fixed(p, res->dev_id[i].data,
+ p = xdr_decode_opaque_fixed(p, res->dev_id[i].u.data,
NFS4_DEVICEID4_SIZE);
res->eof = be32_to_cpup(p);
return 0;
diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
index b3918f7..99a1e89 100644
--- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
+++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
@@ -55,8 +55,8 @@
static __be32 *
_osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid)
{
- p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data,
- sizeof(objid->oid_device_id.data));
+ p = xdr_decode_opaque_fixed(p, objid->oid_device_id.u.data,
+ NFS4_DEVICEID4_SIZE);

p = xdr_decode_hyper(p, &objid->oid_partition_id);
p = xdr_decode_hyper(p, &objid->oid_object_id);
@@ -377,8 +377,8 @@ pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr,
static inline __be32 *
pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id)
{
- p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data,
- sizeof(object_id->oid_device_id.data));
+ p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.u.data,
+ NFS4_DEVICEID4_SIZE);
p = xdr_encode_hyper(p, object_id->oid_partition_id);
p = xdr_encode_hyper(p, object_id->oid_object_id);

diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
index 4f359d2..6c6ab81 100644
--- a/fs/nfs/pnfs_dev.c
+++ b/fs/nfs/pnfs_dev.c
@@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(nfs4_deviceid_lock);
void
nfs4_print_deviceid(const struct nfs4_deviceid *id)
{
- u32 *p = (u32 *)id;
+ const u32 *p = id->u.data32;

dprintk("%s: device id= [%x%x%x%x]\n", __func__,
p[0], p[1], p[2], p[3]);
@@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
static inline u32
nfs4_deviceid_hash(const struct nfs4_deviceid *id)
{
- unsigned char *cptr = (unsigned char *)id->data;
+ const unsigned char *cptr = id->u.data;
unsigned int nbytes = NFS4_DEVICEID4_SIZE;
u32 x = 0;

@@ -77,7 +77,8 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,

hlist_for_each_entry_rcu(d, n, &nfs4_deviceid_cache[hash], node)
if (d->ld == ld && d->nfs_client == clp &&
- !memcmp(&d->deviceid, id, sizeof(*id))) {
+ !memcmp(&d->deviceid.u.data, id->u.data,
+ NFS4_DEVICEID4_SIZE)) {
if (atomic_read(&d->ref))
return d;
else
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index b1e6b64..979f607 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -649,9 +649,14 @@ enum filelayout_hint_care4 {
#define NFS4_DEVICEID4_SIZE 16

struct nfs4_deviceid {
- char data[NFS4_DEVICEID4_SIZE];
+ union {
+ unsigned char data[NFS4_DEVICEID4_SIZE];
+ u32 data32[NFS4_DEVICEID4_SIZE / sizeof(u32)];
+ u64 data64[NFS4_DEVICEID4_SIZE / sizeof(u64)];
+ } u;
};

+
#endif
#endif

diff --git a/include/linux/pnfs_osd_xdr.h b/include/linux/pnfs_osd_xdr.h
index 435dd5f..11b4af9 100644
--- a/include/linux/pnfs_osd_xdr.h
+++ b/include/linux/pnfs_osd_xdr.h
@@ -91,10 +91,10 @@ struct pnfs_osd_objid {
* BE style
*/
#define _DEVID_LO(oid_device_id) \
- (unsigned long long)be64_to_cpup((__be64 *)(oid_device_id)->data)
+ (unsigned long long)be64_to_cpup((__be64 *)(oid_device_id)->u.data64)

#define _DEVID_HI(oid_device_id) \
- (unsigned long long)be64_to_cpup(((__be64 *)(oid_device_id)->data) + 1)
+ (unsigned long long)be64_to_cpup(((__be64 *)(oid_device_id)->u.data64) + 1)

enum pnfs_osd_version {
PNFS_OSD_MISSING = 0,


2012-03-01 22:16:23

by Adamson, Dros

[permalink] [raw]
Subject: Re: [PATCH 04/15] NFS: Reduce debugging noise from encode_compound_hdr

On Mar 1, 2012, at 5:00 PM, Chuck Lever wrote:

> Get rid of
>
> encode_compound: tag=
>
> when XDR debugging is enabled. The current Linux client never sets
> compound tags.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
>
> fs/nfs/nfs4xdr.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 4ca87cb..d1b7914d 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -889,7 +889,9 @@ static void encode_compound_hdr(struct xdr_stream *xdr,
> * but this is not required as a MUST for the server to do so. */
> hdr->replen = RPC_REPHDRSIZE + auth->au_rslack + 3 + hdr->taglen;
>
> +#if 0
> dprintk("encode_compound: tag=%.*s\n", (int)hdr->taglen, hdr->tag);
> +#endif

We probably don't want to #if 0 this out. Either remove the line, or do something like:

if (hdr->taglen > 0)
dprintk(?);

This way, if we ever do have a tag set, it'll be displayed.

Thanks for suggesting this -- getting rid of useless dprintk output is a good thing!

-dros

> BUG_ON(hdr->taglen > NFS4_MAXTAGLEN);
> p = reserve_space(xdr, 4 + hdr->taglen + 8);
> p = xdr_encode_opaque(p, hdr->tag, hdr->taglen);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Attachments:
smime.p7s (1.34 kB)

2012-03-01 22:30:26

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 10/15] NFS: Save root file handle in nfs_server

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDE3OjAxIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
U2F2ZSBlYWNoIEZTSUQncyByb290IGRpcmVjdG9yeSBmaWxlIGhhbmRsZSBpbiB0aGUgRlNJRCdz
IG5mc19zZXJ2ZXINCj4gc3RydWN0dXJlIG9uIHRoZSBjbGllbnQuICBGb3Igbm93LCBvbmx5IE5G
U3Y0IG1vdW50cyBzYXZlIHRoZSByb290IEZILg0KPiBUaGlzIGlzIG5lZWRlZCBmb3IgbWlncmF0
aW9uIHJlY292ZXJ5Lg0KPiANCg0KUGxlYXNlIHJlbWluZCBtZSB3aHkgdGhlIHJvb3QgZGVudHJ5
IGlzbid0IHN1ZmZpY2llbnQ/DQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xp
ZW50IG1haW50YWluZXINCg0KTmV0QXBwDQpUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3
Lm5ldGFwcC5jb20NCg0K

2012-03-01 22:19:42

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 04/15] NFS: Reduce debugging noise from encode_compound_hdr

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDE3OjAwIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
R2V0IHJpZCBvZg0KPiANCj4gICBlbmNvZGVfY29tcG91bmQ6IHRhZz0NCj4gDQo+IHdoZW4gWERS
IGRlYnVnZ2luZyBpcyBlbmFibGVkLiAgVGhlIGN1cnJlbnQgTGludXggY2xpZW50IG5ldmVyIHNl
dHMNCj4gY29tcG91bmQgdGFncy4NCj4gDQo+IFNpZ25lZC1vZmYtYnk6IENodWNrIExldmVyIDxj
aHVjay5sZXZlckBvcmFjbGUuY29tPg0KPiAtLS0NCj4gDQo+ICBmcy9uZnMvbmZzNHhkci5jIHwg
ICAgMiArKw0KPiAgMSBmaWxlcyBjaGFuZ2VkLCAyIGluc2VydGlvbnMoKyksIDAgZGVsZXRpb25z
KC0pDQo+IA0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL25mczR4ZHIuYyBiL2ZzL25mcy9uZnM0eGRy
LmMNCj4gaW5kZXggNGNhODdjYi4uZDFiNzkxNGQgMTAwNjQ0DQo+IC0tLSBhL2ZzL25mcy9uZnM0
eGRyLmMNCj4gKysrIGIvZnMvbmZzL25mczR4ZHIuYw0KPiBAQCAtODg5LDcgKzg4OSw5IEBAIHN0
YXRpYyB2b2lkIGVuY29kZV9jb21wb3VuZF9oZHIoc3RydWN0IHhkcl9zdHJlYW0gKnhkciwNCj4g
IAkgKiBidXQgdGhpcyBpcyBub3QgcmVxdWlyZWQgYXMgYSBNVVNUIGZvciB0aGUgc2VydmVyIHRv
IGRvIHNvLiAqLw0KPiAgCWhkci0+cmVwbGVuID0gUlBDX1JFUEhEUlNJWkUgKyBhdXRoLT5hdV9y
c2xhY2sgKyAzICsgaGRyLT50YWdsZW47DQo+ICANCj4gKyNpZiAwDQo+ICAJZHByaW50aygiZW5j
b2RlX2NvbXBvdW5kOiB0YWc9JS4qc1xuIiwgKGludCloZHItPnRhZ2xlbiwgaGRyLT50YWcpOw0K
PiArI2VuZGlmDQoNCkxldCdzIGp1c3QgZ2V0IHJpZCBvZiB0aGUgbGluZSBhbHRvZ2V0aGVyIGlu
c3RlYWQgb2YgYWRkaW5nIHVnbHkgbWFjcm8NCmNydWZ0Li4uDQoNCj4gIAlCVUdfT04oaGRyLT50
YWdsZW4gPiBORlM0X01BWFRBR0xFTik7DQo+ICAJcCA9IHJlc2VydmVfc3BhY2UoeGRyLCA0ICsg
aGRyLT50YWdsZW4gKyA4KTsNCj4gIAlwID0geGRyX2VuY29kZV9vcGFxdWUocCwgaGRyLT50YWcs
IGhkci0+dGFnbGVuKTsNCj4gDQo+IC0tDQo+IFRvIHVuc3Vic2NyaWJlIGZyb20gdGhpcyBsaXN0
OiBzZW5kIHRoZSBsaW5lICJ1bnN1YnNjcmliZSBsaW51eC1uZnMiIGluDQo+IHRoZSBib2R5IG9m
IGEgbWVzc2FnZSB0byBtYWpvcmRvbW9Admdlci5rZXJuZWwub3JnDQo+IE1vcmUgbWFqb3Jkb21v
IGluZm8gYXQgIGh0dHA6Ly92Z2VyLmtlcm5lbC5vcmcvbWFqb3Jkb21vLWluZm8uaHRtbA0KDQot
LSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFw
cA0KVHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20NCnd3dy5uZXRhcHAuY29tDQoNCg==

2012-03-02 22:06:52

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

On 03/02/2012 02:03 PM, Chuck Lever wrote:
>
> On Mar 2, 2012, at 4:58 PM, Boaz Harrosh wrote:
>> If you want I can use the get_unaligned() macro if you prefer. That should be perfectly safe.
>> (And will do nothing on x86 ARCs)
>
> Basically we would just use get_unaligned() in the _DEVID_LO and _DEVID_HI macros.
>

Perfect yes!

Could you spin that or you need that I'll send it?

>> I agree that it should stay just old plain "char data[NFS4_DEVICEID4_SIZE];"
>>
>> Thanks
>> Boaz

Thanks Chuck, for catching this
Boaz

2012-03-01 22:35:21

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 14/15] NFS: Fix some minor problems with nfs4_verifiers

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDE3OjAyIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
Q2xlYW4gdXAgZHVlIHRvIGNvZGUgcmV2aWV3Lg0KIA0KPiAtdHlwZWRlZiBzdHJ1Y3QgeyBjaGFy
IGRhdGFbTkZTNF9WRVJJRklFUl9TSVpFXTsgfSBuZnM0X3ZlcmlmaWVyOw0KPiArdHlwZWRlZiB1
bmlvbiB7DQo+ICsJdW5zaWduZWQgY2hhcglkYXRhW05GUzRfVkVSSUZJRVJfU0laRV07DQo+ICsJ
dTMyCQlkYXRhMzJbTkZTNF9WRVJJRklFUl9TSVpFIC8gc2l6ZW9mKHUzMildOw0KDQpUaGUgYWJv
dmUgc2hvdWxkIHJlYWxseSBiZSBhIF9fYmUzMiB0eXBlLiBXZSBkb24ndCBldmVyIGNvbnZlcnQN
CnZlcmlmaWVycy4NCg0KPiArfSBuZnM0X3ZlcmlmaWVyOw0KPiAgDQo+ICBzdHJ1Y3QgbmZzNDFf
c3RhdGVpZCB7DQo+ICAJX19iZTMyIHNlcWlkOw0KPiANCj4gLS0NCj4gVG8gdW5zdWJzY3JpYmUg
ZnJvbSB0aGlzIGxpc3Q6IHNlbmQgdGhlIGxpbmUgInVuc3Vic2NyaWJlIGxpbnV4LW5mcyIgaW4N
Cj4gdGhlIGJvZHkgb2YgYSBtZXNzYWdlIHRvIG1ham9yZG9tb0B2Z2VyLmtlcm5lbC5vcmcNCj4g
TW9yZSBtYWpvcmRvbW8gaW5mbyBhdCAgaHR0cDovL3ZnZXIua2VybmVsLm9yZy9tYWpvcmRvbW8t
aW5mby5odG1sDQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50
YWluZXINCg0KTmV0QXBwDQpUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3Lm5ldGFwcC5j
b20NCg0K

2012-03-02 22:11:33

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids


On Mar 2, 2012, at 5:06 PM, Boaz Harrosh wrote:

> On 03/02/2012 02:03 PM, Chuck Lever wrote:
>>
>> On Mar 2, 2012, at 4:58 PM, Boaz Harrosh wrote:
>>> If you want I can use the get_unaligned() macro if you prefer. That should be perfectly safe.
>>> (And will do nothing on x86 ARCs)
>>
>> Basically we would just use get_unaligned() in the _DEVID_LO and _DEVID_HI macros.
>>
>
> Perfect yes!
>
> Could you spin that or you need that I'll send it?

I still need someone with OSD pNFS set up to test. Can I send you a proposed patch?

>
>>> I agree that it should stay just old plain "char data[NFS4_DEVICEID4_SIZE];"
>>>
>>> Thanks
>>> Boaz
>
> Thanks Chuck, for catching this
> Boaz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-02 17:19:09

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles


On Mar 2, 2012, at 12:17 PM, Steve Dickson wrote:

>
>
> On 03/01/2012 05:28 PM, Myklebust, Trond wrote:
>> On Thu, 2012-03-01 at 17:01 -0500, Chuck Lever wrote:
>>> For debugging, introduce a simplistic function to print NFS file
>>> handles on the system console. The main function is hooked into the
>>> dprintk debugging facility, but you can directly call the helper,
>>> _nfs_display_fhandle(), if you want to print a handle unconditionally.
>>>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>> ---
>>>
>>
>> One of the things that I really like about wireshark is that it also
>> displays a 32-bit hexadecimal hash of the filehandle. That allows you to
>> compare 2 filehandles _much_ faster than trying to look at the full
>> shebang.
>>
>>
> +1 Being able to easily trace filehandle in both the kernel debug and
> wireshark would be huge.... IMHO...

Dros mentioned a possible follow-on patch last night at the pub that would add exactly this functionality.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-01 22:09:26

by Jim Rees

[permalink] [raw]
Subject: Re: [PATCH 07/15] SUNRPC: Add API to acquire source address

Chuck Lever wrote:

NFSv4.0 clients must send endpoint information for their callback
service to NFSv4.0 servers during their first contact with a server.
Traditionally on Linux, user space provides the callback endpoint IP
address via the "clientaddr=" mount option.

During an NFSv4 migration event, it is possible that an FSID may be
migrated to a destination server that is accessible via a different
source IP address than the source server was. The client must update
callback endpoint information on the destination server so that it can
maintain leases and allow delegation.

Without a new "clientaddr=" option from user space, however, the
kernel itself must construct an appropriate IP address for the
callback update. Provide an API in the RPC client for upper layer
RPC consumers to acquire a source address for a remote.

The mechanism used by the mount.nfs command is copied: set up a
connected UDP socket to the designated remote, then scrape the source
address off the socket. We are careful to select the correct network
namespace when setting up the temporary UDP socket.

That seems like a lot of work. On OpenBSD it's a one-liner, a call into the
routing system to get the device and source address that would be used to
talk to a given destination. Is there no equivalent in linux?

2012-03-01 22:00:24

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 01/15] NFS: Make nfs_cache_array.size a signed integer

Eliminate a number of implicit type casts in comparisons, and these
compiler warnings:

fs/nfs/dir.c: In function ‘nfs_readdir_clear_array’:
fs/nfs/dir.c:264:16: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c: In function ‘nfs_readdir_search_for_cookie’:
fs/nfs/dir.c:352:16: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c: In function ‘nfs_do_filldir’:
fs/nfs/dir.c:769:38: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c:780:9: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/dir.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index bb132a8..9952170 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -207,7 +207,7 @@ struct nfs_cache_array_entry {
};

struct nfs_cache_array {
- unsigned int size;
+ int size;
int eof_index;
u64 last_cookie;
struct nfs_cache_array_entry array[0];


2012-03-01 22:01:41

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 10/15] NFS: Save root file handle in nfs_server

Save each FSID's root directory file handle in the FSID's nfs_server
structure on the client. For now, only NFSv4 mounts save the root FH.
This is needed for migration recovery.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/client.c | 1 +
fs/nfs/getroot.c | 5 +++++
include/linux/nfs_fs_sb.h | 1 +
3 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index add92d9..c9e1bb4 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1135,6 +1135,7 @@ void nfs_free_server(struct nfs_server *server)
ida_destroy(&server->lockowner_id);
ida_destroy(&server->openowner_id);
nfs_free_iostats(server->io_stats);
+ nfs_free_fhandle(server->rootfh);
bdi_destroy(&server->backing_dev_info);
kfree(server);
nfs_release_automount_timer();
diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c
index dcb6154..8d4fbe1 100644
--- a/fs/nfs/getroot.c
+++ b/fs/nfs/getroot.c
@@ -232,6 +232,11 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh,
ret = ERR_CAST(inode);
goto out;
}
+ server->rootfh = nfs_alloc_fhandle();
+ if (server->rootfh != NULL) {
+ nfs_display_fhandle(mntfh, "nfs_get_root: new root FH");
+ nfs_copy_fh(server->rootfh, mntfh);
+ }

error = nfs_superblock_set_dummy_root(sb, inode);
if (error != 0) {
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 3bf4766..0a2c826 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -159,6 +159,7 @@ struct nfs_server {
struct list_head layouts;
struct list_head delegations;
void (*destroy)(struct nfs_server *);
+ struct nfs_fh *rootfh;

atomic_t active; /* Keep trace of any activity to this server */



2012-03-01 22:00:59

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 05/15] SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field

From: Trond Myklebust <[email protected]>

A migration event will replace the rpc_xprt used by an rpc_clnt. To
ensure this can be done safely, all references to cl_xprt must now use
a form of rcu_dereference().

Special care is taken with rpc_peeraddr2str(), which returns a pointer
to memory whose lifetime is the same as the rpc_xprt.

Signed-off-by: Trond Myklebust <[email protected]>
[ cel: fix lockdep splats and layering violations ]
[ cel: forward ported to 3.4 ]
[ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/callback_proc.c | 9 ++-
fs/nfs/client.c | 16 ++++--
fs/nfs/nfs4namespace.c | 2 -
fs/nfs/nfs4proc.c | 13 ++++-
fs/nfs/nfs4state.c | 25 ++++++---
fs/nfs/super.c | 5 ++
include/linux/sunrpc/clnt.h | 4 +
include/linux/sunrpc/debug.h | 13 +++++
net/sunrpc/auth_gss/auth_gss.c | 4 +
net/sunrpc/clnt.c | 110 ++++++++++++++++++++++++++++++++--------
net/sunrpc/rpc_pipe.c | 3 +
net/sunrpc/rpcb_clnt.c | 15 ++++-
net/sunrpc/stats.c | 6 ++
13 files changed, 175 insertions(+), 50 deletions(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index f71978d..2db26be 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -8,6 +8,7 @@
#include <linux/nfs4.h>
#include <linux/nfs_fs.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>
#include "nfs4_fs.h"
#include "callback.h"
#include "delegation.h"
@@ -33,7 +34,7 @@ __be32 nfs4_callback_getattr(struct cb_getattrargs *args,
res->bitmap[0] = res->bitmap[1] = 0;
res->status = htonl(NFS4ERR_BADHANDLE);

- dprintk("NFS: GETATTR callback request from %s\n",
+ dprintk_rcu("NFS: GETATTR callback request from %s\n",
rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));

inode = nfs_delegation_find_inode(cps->clp, &args->fh);
@@ -73,7 +74,7 @@ __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy,
if (!cps->clp) /* Always set for v4.0. Set in cb_sequence for v4.1 */
goto out;

- dprintk("NFS: RECALL callback request from %s\n",
+ dprintk_rcu("NFS: RECALL callback request from %s\n",
rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));

res = htonl(NFS4ERR_BADHANDLE);
@@ -537,7 +538,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
if (!cps->clp) /* set in cb_sequence */
goto out;

- dprintk("NFS: RECALL_ANY callback request from %s\n",
+ dprintk_rcu("NFS: RECALL_ANY callback request from %s\n",
rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));

status = cpu_to_be32(NFS4ERR_INVAL);
@@ -572,7 +573,7 @@ __be32 nfs4_callback_recallslot(struct cb_recallslotargs *args, void *dummy,
if (!cps->clp) /* set in cb_sequence */
goto out;

- dprintk("NFS: CB_RECALL_SLOT request from %s target max slots %d\n",
+ dprintk_rcu("NFS: CB_RECALL_SLOT request from %s target max slots %d\n",
rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR),
args->crsa_target_max_slots);

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 8563585..994899d 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1282,16 +1282,18 @@ static int nfs4_init_callback(struct nfs_client *clp)
int error;

if (clp->rpc_ops->version == 4) {
+ struct rpc_xprt *xprt;
+
+ xprt = rcu_dereference_raw(clp->cl_rpcclient->cl_xprt);
+
if (nfs4_has_session(clp)) {
- error = xprt_setup_backchannel(
- clp->cl_rpcclient->cl_xprt,
+ error = xprt_setup_backchannel(xprt,
NFS41_BC_MIN_CALLBACKS);
if (error < 0)
return error;
}

- error = nfs_callback_up(clp->cl_mvops->minor_version,
- clp->cl_rpcclient->cl_xprt);
+ error = nfs_callback_up(clp->cl_mvops->minor_version, xprt);
if (error < 0) {
dprintk("%s: failed to start callback. Error = %d\n",
__func__, error);
@@ -1676,7 +1678,7 @@ struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data,
data->addrlen,
parent_client->cl_ipaddr,
data->authflavor,
- parent_server->client->cl_xprt->prot,
+ rpc_protocol(parent_server->client),
parent_server->client->cl_timeout,
parent_client->cl_mvops->minor_version,
parent_client->net);
@@ -1903,12 +1905,14 @@ static int nfs_server_list_show(struct seq_file *m, void *v)
if (clp->cl_cons_state != NFS_CS_READY)
return 0;

+ rcu_read_lock();
seq_printf(m, "v%u %s %s %3d %s\n",
clp->rpc_ops->version,
rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR),
rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_PORT),
atomic_read(&clp->cl_count),
clp->cl_hostname);
+ rcu_read_unlock();

return 0;
}
@@ -1991,6 +1995,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
(unsigned long long) server->fsid.major,
(unsigned long long) server->fsid.minor);

+ rcu_read_lock();
seq_printf(m, "v%u %s %s %-7s %-17s %s\n",
clp->rpc_ops->version,
rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR),
@@ -1998,6 +2003,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
dev,
fsid,
nfs_server_fscache_state(server));
+ rcu_read_unlock();

return 0;
}
diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c
index 667ea74..9c8eca3 100644
--- a/fs/nfs/nfs4namespace.c
+++ b/fs/nfs/nfs4namespace.c
@@ -96,8 +96,8 @@ static int nfs4_validate_fspath(struct dentry *dentry,
static size_t nfs_parse_server_name(char *string, size_t len,
struct sockaddr *sa, size_t salen, struct nfs_server *server)
{
+ struct net *net = rpc_net_ns(server->client);
ssize_t ret;
- struct net *net = server->client->cl_xprt->xprt_net;

ret = rpc_pton(net, string, len, sa, salen);
if (ret == 0) {
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 53daab9..3ec09fb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3833,6 +3833,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
*p = htonl((u32)clp->cl_boot_time.tv_nsec);

for(;;) {
+ rcu_read_lock();
setclientid.sc_name_len = scnprintf(setclientid.sc_name,
sizeof(setclientid.sc_name), "%s/%s %s %s %u",
clp->cl_ipaddr,
@@ -3849,6 +3850,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
setclientid.sc_uaddr_len = scnprintf(setclientid.sc_uaddr,
sizeof(setclientid.sc_uaddr), "%s.%u.%u",
clp->cl_ipaddr, port >> 8, port & 255);
+ rcu_read_unlock();

status = rpc_call_sync(clp->cl_rpcclient, &msg, RPC_TASK_TIMEOUT);
if (status != -NFS4ERR_CLID_INUSE)
@@ -5220,11 +5222,16 @@ struct nfs4_session *nfs4_alloc_session(struct nfs_client *clp)

void nfs4_destroy_session(struct nfs4_session *session)
{
+ struct rpc_xprt *xprt;
+
nfs4_proc_destroy_session(session);
+
+ rcu_read_lock();
+ xprt = rcu_dereference(session->clp->cl_rpcclient->cl_xprt);
+ rcu_read_unlock();
dprintk("%s Destroy backchannel for xprt %p\n",
- __func__, session->clp->cl_rpcclient->cl_xprt);
- xprt_destroy_backchannel(session->clp->cl_rpcclient->cl_xprt,
- NFS41_BC_MIN_CALLBACKS);
+ __func__, xprt);
+ xprt_destroy_backchannel(xprt, NFS41_BC_MIN_CALLBACKS);
nfs4_destroy_slot_tables(session);
kfree(session);
}
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index c1111a3..bae959e 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1037,19 +1037,28 @@ static void nfs4_clear_state_manager_bit(struct nfs_client *clp)
void nfs4_schedule_state_manager(struct nfs_client *clp)
{
struct task_struct *task;
+ char buf[INET6_ADDRSTRLEN + sizeof("-manager") + 1];

if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0)
return;
__module_get(THIS_MODULE);
atomic_inc(&clp->cl_count);
- task = kthread_run(nfs4_run_state_manager, clp, "%s-manager",
- rpc_peeraddr2str(clp->cl_rpcclient,
- RPC_DISPLAY_ADDR));
- if (!IS_ERR(task))
- return;
- nfs4_clear_state_manager_bit(clp);
- nfs_put_client(clp);
- module_put(THIS_MODULE);
+
+ /* The rcu_read_lock() is not strictly necessary, as the state
+ * manager is the only thread that ever changes the rpc_xprt
+ * after it's initialized. At this point, we're single threaded. */
+ rcu_read_lock();
+ snprintf(buf, sizeof(buf), "%s-manager",
+ rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
+ rcu_read_unlock();
+ task = kthread_run(nfs4_run_state_manager, clp, buf);
+ if (IS_ERR(task)) {
+ printk(KERN_ERR "%s: kthread_run: %ld\n",
+ __func__, PTR_ERR(task));
+ nfs4_clear_state_manager_bit(clp);
+ nfs_put_client(clp);
+ module_put(THIS_MODULE);
+ }
}

/*
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 6708f30..72c7f63 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -53,6 +53,7 @@
#include <linux/magic.h>
#include <linux/parser.h>
#include <linux/nsproxy.h>
+#include <linux/rcupdate.h>

#include <asm/system.h>
#include <asm/uaccess.h>
@@ -679,8 +680,10 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
else
seq_puts(m, nfs_infop->nostr);
}
+ rcu_read_lock();
seq_printf(m, ",proto=%s",
rpc_peeraddr2str(nfss->client, RPC_DISPLAY_NETID));
+ rcu_read_unlock();
if (version == 4) {
if (nfss->port != NFS_PORT)
seq_printf(m, ",port=%u", nfss->port);
@@ -729,9 +732,11 @@ static int nfs_show_options(struct seq_file *m, struct dentry *root)

nfs_show_mount_options(m, nfss, 0);

+ rcu_read_lock();
seq_printf(m, ",addr=%s",
rpc_peeraddr2str(nfss->nfs_client->cl_rpcclient,
RPC_DISPLAY_ADDR));
+ rcu_read_unlock();

return 0;
}
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index a4c62e9..e3d12b4 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -35,7 +35,7 @@ struct rpc_clnt {
struct list_head cl_clients; /* Global list of clients */
struct list_head cl_tasks; /* List of tasks */
spinlock_t cl_lock; /* spinlock */
- struct rpc_xprt * cl_xprt; /* transport */
+ struct rpc_xprt __rcu * cl_xprt; /* transport */
struct rpc_procinfo * cl_procinfo; /* procedure info */
u32 cl_prog, /* RPC program number */
cl_vers, /* RPC version number */
@@ -156,6 +156,8 @@ struct rpc_task *rpc_call_null(struct rpc_clnt *clnt, struct rpc_cred *cred,
int rpc_restart_call_prepare(struct rpc_task *);
int rpc_restart_call(struct rpc_task *);
void rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int);
+int rpc_protocol(struct rpc_clnt *);
+struct net * rpc_net_ns(struct rpc_clnt *);
size_t rpc_max_payload(struct rpc_clnt *);
void rpc_force_rebind(struct rpc_clnt *);
size_t rpc_peeraddr(struct rpc_clnt *, struct sockaddr *, size_t);
diff --git a/include/linux/sunrpc/debug.h b/include/linux/sunrpc/debug.h
index b506936..6cb2517 100644
--- a/include/linux/sunrpc/debug.h
+++ b/include/linux/sunrpc/debug.h
@@ -50,19 +50,32 @@ extern unsigned int nlm_debug;
#endif

#define dprintk(args...) dfprintk(FACILITY, ## args)
+#define dprintk_rcu(args...) dfprintk_rcu(FACILITY, ## args)

#undef ifdebug
#ifdef RPC_DEBUG
# define ifdebug(fac) if (unlikely(rpc_debug & RPCDBG_##fac))
+
# define dfprintk(fac, args...) \
do { \
ifdebug(fac) \
printk(KERN_DEFAULT args); \
} while (0)
+
+# define dfprintk_rcu(fac, args...) \
+ do { \
+ ifdebug(fac) { \
+ rcu_read_lock(); \
+ printk(KERN_DEFAULT args); \
+ rcu_read_unlock(); \
+ } \
+ } while (0)
+
# define RPC_IFDEBUG(x) x
#else
# define ifdebug(fac) if (0)
# define dfprintk(fac, args...) do ; while (0)
+# define dfprintk_rcu(fac, args...) do ; while (0)
# define RPC_IFDEBUG(x)
#endif

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index cb2e564..d3ad81f 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -799,7 +799,7 @@ err_unlink_pipe_1:
static void gss_pipes_dentries_destroy_net(struct rpc_clnt *clnt,
struct rpc_auth *auth)
{
- struct net *net = clnt->cl_xprt->xprt_net;
+ struct net *net = rpc_net_ns(clnt);
struct super_block *sb;

sb = rpc_get_sb_net(net);
@@ -813,7 +813,7 @@ static void gss_pipes_dentries_destroy_net(struct rpc_clnt *clnt,
static int gss_pipes_dentries_create_net(struct rpc_clnt *clnt,
struct rpc_auth *auth)
{
- struct net *net = clnt->cl_xprt->xprt_net;
+ struct net *net = rpc_net_ns(clnt);
struct super_block *sb;
int err = 0;

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index bb7ed2f3..f043c7b 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -31,6 +31,7 @@
#include <linux/in.h>
#include <linux/in6.h>
#include <linux/un.h>
+#include <linux/rcupdate.h>

#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/rpc_pipe_fs.h>
@@ -81,7 +82,8 @@ static int rpc_ping(struct rpc_clnt *clnt);

static void rpc_register_client(struct rpc_clnt *clnt)
{
- struct sunrpc_net *sn = net_generic(clnt->cl_xprt->xprt_net, sunrpc_net_id);
+ struct net *net = rpc_net_ns(clnt);
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

spin_lock(&sn->rpc_client_lock);
list_add(&clnt->cl_clients, &sn->all_clients);
@@ -90,7 +92,8 @@ static void rpc_register_client(struct rpc_clnt *clnt)

static void rpc_unregister_client(struct rpc_clnt *clnt)
{
- struct sunrpc_net *sn = net_generic(clnt->cl_xprt->xprt_net, sunrpc_net_id);
+ struct net *net = rpc_net_ns(clnt);
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

spin_lock(&sn->rpc_client_lock);
list_del(&clnt->cl_clients);
@@ -109,12 +112,13 @@ static void __rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)

static void rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)
{
+ struct net *net = rpc_net_ns(clnt);
struct super_block *pipefs_sb;

- pipefs_sb = rpc_get_sb_net(clnt->cl_xprt->xprt_net);
+ pipefs_sb = rpc_get_sb_net(net);
if (pipefs_sb) {
__rpc_clnt_remove_pipedir(clnt);
- rpc_put_sb_net(clnt->cl_xprt->xprt_net);
+ rpc_put_sb_net(net);
}
}

@@ -155,17 +159,18 @@ static struct dentry *rpc_setup_pipedir_sb(struct super_block *sb,
static int
rpc_setup_pipedir(struct rpc_clnt *clnt, const char *dir_name)
{
+ struct net *net = rpc_net_ns(clnt);
struct super_block *pipefs_sb;
struct dentry *dentry;

clnt->cl_dentry = NULL;
if (dir_name == NULL)
return 0;
- pipefs_sb = rpc_get_sb_net(clnt->cl_xprt->xprt_net);
+ pipefs_sb = rpc_get_sb_net(net);
if (!pipefs_sb)
return 0;
dentry = rpc_setup_pipedir_sb(pipefs_sb, clnt, dir_name);
- rpc_put_sb_net(clnt->cl_xprt->xprt_net);
+ rpc_put_sb_net(net);
if (IS_ERR(dentry))
return PTR_ERR(dentry);
clnt->cl_dentry = dentry;
@@ -279,7 +284,7 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
if (clnt->cl_server == NULL)
goto out_no_server;

- clnt->cl_xprt = xprt;
+ rcu_assign_pointer(clnt->cl_xprt, xprt);
clnt->cl_procinfo = version->procs;
clnt->cl_maxproc = version->nrprocs;
clnt->cl_protname = program->name;
@@ -294,7 +299,7 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
INIT_LIST_HEAD(&clnt->cl_tasks);
spin_lock_init(&clnt->cl_lock);

- if (!xprt_bound(clnt->cl_xprt))
+ if (!xprt_bound(xprt))
clnt->cl_autobind = 1;

clnt->cl_timeout = xprt->timeout;
@@ -461,6 +466,7 @@ struct rpc_clnt *
rpc_clone_client(struct rpc_clnt *clnt)
{
struct rpc_clnt *new;
+ struct rpc_xprt *xprt;
int err = -ENOMEM;

new = kmemdup(clnt, sizeof(*new), GFP_KERNEL);
@@ -483,18 +489,25 @@ rpc_clone_client(struct rpc_clnt *clnt)
if (new->cl_principal == NULL)
goto out_no_principal;
}
+ rcu_read_lock();
+ xprt = xprt_get(rcu_dereference(clnt->cl_xprt));
+ rcu_read_unlock();
+ if (xprt == NULL)
+ goto out_no_transport;
+ rcu_assign_pointer(new->cl_xprt, xprt);
atomic_set(&new->cl_count, 1);
err = rpc_setup_pipedir(new, clnt->cl_program->pipe_dir_name);
if (err != 0)
goto out_no_path;
if (new->cl_auth)
atomic_inc(&new->cl_auth->au_count);
- xprt_get(clnt->cl_xprt);
atomic_inc(&clnt->cl_count);
rpc_register_client(new);
rpciod_up();
return new;
out_no_path:
+ xprt_put(xprt);
+out_no_transport:
kfree(new->cl_principal);
out_no_principal:
rpc_free_iostats(new->cl_metrics);
@@ -574,7 +587,7 @@ rpc_free_client(struct rpc_clnt *clnt)
rpc_free_iostats(clnt->cl_metrics);
kfree(clnt->cl_principal);
clnt->cl_metrics = NULL;
- xprt_put(clnt->cl_xprt);
+ xprt_put(rcu_dereference_raw(clnt->cl_xprt));
rpciod_down();
kfree(clnt);
}
@@ -863,13 +876,18 @@ EXPORT_SYMBOL_GPL(rpc_call_start);
size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize)
{
size_t bytes;
- struct rpc_xprt *xprt = clnt->cl_xprt;
+ struct rpc_xprt *xprt;
+
+ rcu_read_lock();
+ xprt = rcu_dereference(clnt->cl_xprt);

- bytes = sizeof(xprt->addr);
+ bytes = xprt->addrlen;
if (bytes > bufsize)
bytes = bufsize;
- memcpy(buf, &clnt->cl_xprt->addr, bytes);
- return xprt->addrlen;
+ memcpy(buf, &xprt->addr, bytes);
+ rcu_read_unlock();
+
+ return bytes;
}
EXPORT_SYMBOL_GPL(rpc_peeraddr);

@@ -878,11 +896,16 @@ EXPORT_SYMBOL_GPL(rpc_peeraddr);
* @clnt: RPC client structure
* @format: address format
*
+ * NB: the lifetime of the memory referenced by the returned pointer is
+ * the same as the rpc_xprt itself. As long as the caller uses this
+ * pointer, it must hold the RCU read lock.
*/
const char *rpc_peeraddr2str(struct rpc_clnt *clnt,
enum rpc_display_format_t format)
{
- struct rpc_xprt *xprt = clnt->cl_xprt;
+ struct rpc_xprt *xprt;
+
+ xprt = rcu_dereference(clnt->cl_xprt);

if (xprt->address_strings[format] != NULL)
return xprt->address_strings[format];
@@ -894,14 +917,51 @@ EXPORT_SYMBOL_GPL(rpc_peeraddr2str);
void
rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize)
{
- struct rpc_xprt *xprt = clnt->cl_xprt;
+ struct rpc_xprt *xprt;
+
+ rcu_read_lock();
+ xprt = rcu_dereference(clnt->cl_xprt);
if (xprt->ops->set_buffer_size)
xprt->ops->set_buffer_size(xprt, sndsize, rcvsize);
+ rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(rpc_setbufsize);

-/*
- * Return size of largest payload RPC client can support, in bytes
+/**
+ * rpc_protocol - Get transport protocol number for an RPC client
+ * @clnt: RPC client to query
+ *
+ */
+int rpc_protocol(struct rpc_clnt *clnt)
+{
+ int protocol;
+
+ rcu_read_lock();
+ protocol = rcu_dereference(clnt->cl_xprt)->prot;
+ rcu_read_unlock();
+ return protocol;
+}
+EXPORT_SYMBOL_GPL(rpc_protocol);
+
+/**
+ * rpc_net_ns - Get the network namespace for this RPC client
+ * @clnt: RPC client to query
+ *
+ */
+struct net *rpc_net_ns(struct rpc_clnt *clnt)
+{
+ struct net *ret;
+
+ rcu_read_lock();
+ ret = rcu_dereference(clnt->cl_xprt)->xprt_net;
+ rcu_read_unlock();
+ return ret;
+}
+EXPORT_SYMBOL_GPL(rpc_net_ns);
+
+/**
+ * rpc_max_payload - Get maximum payload size for a transport, in bytes
+ * @clnt: RPC client to query
*
* For stream transports, this is one RPC record fragment (see RFC
* 1831), as we don't support multi-record requests yet. For datagram
@@ -910,7 +970,12 @@ EXPORT_SYMBOL_GPL(rpc_setbufsize);
*/
size_t rpc_max_payload(struct rpc_clnt *clnt)
{
- return clnt->cl_xprt->max_payload;
+ size_t ret;
+
+ rcu_read_lock();
+ ret = rcu_dereference(clnt->cl_xprt)->max_payload;
+ rcu_read_unlock();
+ return ret;
}
EXPORT_SYMBOL_GPL(rpc_max_payload);

@@ -921,8 +986,11 @@ EXPORT_SYMBOL_GPL(rpc_max_payload);
*/
void rpc_force_rebind(struct rpc_clnt *clnt)
{
- if (clnt->cl_autobind)
- xprt_clear_bound(clnt->cl_xprt);
+ if (clnt->cl_autobind) {
+ rcu_read_lock();
+ xprt_clear_bound(rcu_dereference(clnt->cl_xprt));
+ rcu_read_unlock();
+ }
}
EXPORT_SYMBOL_GPL(rpc_force_rebind);

diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 6873c9b..bbd636b 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -16,6 +16,7 @@
#include <linux/namei.h>
#include <linux/fsnotify.h>
#include <linux/kernel.h>
+#include <linux/rcupdate.h>

#include <asm/ioctls.h>
#include <linux/poll.h>
@@ -383,12 +384,14 @@ rpc_show_info(struct seq_file *m, void *v)
{
struct rpc_clnt *clnt = m->private;

+ rcu_read_lock();
seq_printf(m, "RPC server: %s\n", clnt->cl_server);
seq_printf(m, "service: %s (%d) version %d\n", clnt->cl_protname,
clnt->cl_prog, clnt->cl_vers);
seq_printf(m, "address: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_ADDR));
seq_printf(m, "protocol: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_PROTO));
seq_printf(m, "port: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_PORT));
+ rcu_read_unlock();
return 0;
}

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index b1f08bd..4f8af63 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -620,9 +620,10 @@ static struct rpc_task *rpcb_call_async(struct rpc_clnt *rpcb_clnt, struct rpcbi
static struct rpc_clnt *rpcb_find_transport_owner(struct rpc_clnt *clnt)
{
struct rpc_clnt *parent = clnt->cl_parent;
+ struct rpc_xprt *xprt = rcu_dereference(clnt->cl_xprt);

while (parent != clnt) {
- if (parent->cl_xprt != clnt->cl_xprt)
+ if (rcu_dereference(parent->cl_xprt) != xprt)
break;
if (clnt->cl_autobind)
break;
@@ -653,8 +654,12 @@ void rpcb_getport_async(struct rpc_task *task)
size_t salen;
int status;

- clnt = rpcb_find_transport_owner(task->tk_client);
- xprt = clnt->cl_xprt;
+ rcu_read_lock();
+ do {
+ clnt = rpcb_find_transport_owner(task->tk_client);
+ xprt = xprt_get(rcu_dereference(clnt->cl_xprt));
+ } while (xprt == NULL);
+ rcu_read_unlock();

dprintk("RPC: %5u %s(%s, %u, %u, %d)\n",
task->tk_pid, __func__,
@@ -667,6 +672,7 @@ void rpcb_getport_async(struct rpc_task *task)
if (xprt_test_and_set_binding(xprt)) {
dprintk("RPC: %5u %s: waiting for another binder\n",
task->tk_pid, __func__);
+ xprt_put(xprt);
return;
}

@@ -734,7 +740,7 @@ void rpcb_getport_async(struct rpc_task *task)
switch (bind_version) {
case RPCBVERS_4:
case RPCBVERS_3:
- map->r_netid = rpc_peeraddr2str(clnt, RPC_DISPLAY_NETID);
+ map->r_netid = xprt->address_strings[RPC_DISPLAY_NETID];
map->r_addr = rpc_sockaddr2uaddr(sap, GFP_ATOMIC);
map->r_owner = "";
break;
@@ -763,6 +769,7 @@ bailout_release_client:
bailout_nofree:
rpcb_wake_rpcbind_waiters(xprt, status);
task->tk_status = status;
+ xprt_put(xprt);
}
EXPORT_SYMBOL_GPL(rpcb_getport_async);

diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
index 1eb3304..bc2068e 100644
--- a/net/sunrpc/stats.c
+++ b/net/sunrpc/stats.c
@@ -22,6 +22,7 @@
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/metrics.h>
+#include <linux/rcupdate.h>

#include "netns.h"

@@ -179,7 +180,7 @@ static void _print_name(struct seq_file *seq, unsigned int op,
void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
{
struct rpc_iostats *stats = clnt->cl_metrics;
- struct rpc_xprt *xprt = clnt->cl_xprt;
+ struct rpc_xprt *xprt;
unsigned int op, maxproc = clnt->cl_maxproc;

if (!stats)
@@ -189,8 +190,11 @@ void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
seq_printf(seq, "p/v: %u/%u (%s)\n",
clnt->cl_prog, clnt->cl_vers, clnt->cl_protname);

+ rcu_read_lock();
+ xprt = rcu_dereference(clnt->cl_xprt);
if (xprt)
xprt->ops->print_stats(xprt, seq);
+ rcu_read_unlock();

seq_printf(seq, "\tper-op statistics\n");
for (op = 0; op < maxproc; op++) {


2012-03-01 22:01:24

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 08/15] commit 6f38b4ba433ac6494f83cb73dd07dcbde797e1e0

NFS: Make clientaddr= optional

For NFSv4 mounts, the clientaddr= mount option has always been
required. Now we have rpc_localaddr() in the kernel, which was
modeled after the same logic in the mount.nfs command that constructs
the clientaddr= mount option. If user space doesn't provide a
clientaddr= mount option, the kernel can now construct its own.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/client.c | 15 +++++++++++++++
fs/nfs/super.c | 6 ------
2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 994899d..add92d9 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1344,6 +1344,7 @@ int nfs4_init_client(struct nfs_client *clp,
rpc_authflavor_t authflavour,
int noresvport)
{
+ char buf[INET6_ADDRSTRLEN + 1];
int error;

if (clp->cl_cons_state == NFS_CS_READY) {
@@ -1359,6 +1360,20 @@ int nfs4_init_client(struct nfs_client *clp,
1, noresvport);
if (error < 0)
goto error;
+
+ /* If no clientaddr= option was specified, find a usable cb address */
+ if (ip_addr == NULL) {
+ struct sockaddr_storage cb_addr;
+ struct sockaddr *sap = (struct sockaddr *)&cb_addr;
+
+ error = rpc_localaddr(clp->cl_rpcclient, sap, sizeof(cb_addr));
+ if (error < 0)
+ goto error;
+ error = rpc_ntop(sap, buf, sizeof(buf));
+ if (error < 0)
+ goto error;
+ ip_addr = (const char *)buf;
+ }
strlcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));

error = nfs_idmap_new(clp);
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 72c7f63..8ae8fb2 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2526,12 +2526,6 @@ static int nfs4_validate_text_mount_data(void *options,
return -EINVAL;
}

- if (args->client_address == NULL) {
- dfprintk(MOUNT,
- "NFS4: mount program didn't pass callback address\n");
- return -EINVAL;
- }
-
return nfs_parse_devname(dev_name,
&args->nfs_server.hostname,
NFS4_MAXNAMLEN,


2012-03-01 22:28:03

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDE3OjAxIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
Rm9yIGRlYnVnZ2luZywgaW50cm9kdWNlIGEgc2ltcGxpc3RpYyBmdW5jdGlvbiB0byBwcmludCBO
RlMgZmlsZQ0KPiBoYW5kbGVzIG9uIHRoZSBzeXN0ZW0gY29uc29sZS4gIFRoZSBtYWluIGZ1bmN0
aW9uIGlzIGhvb2tlZCBpbnRvIHRoZQ0KPiBkcHJpbnRrIGRlYnVnZ2luZyBmYWNpbGl0eSwgYnV0
IHlvdSBjYW4gZGlyZWN0bHkgY2FsbCB0aGUgaGVscGVyLA0KPiBfbmZzX2Rpc3BsYXlfZmhhbmRs
ZSgpLCBpZiB5b3Ugd2FudCB0byBwcmludCBhIGhhbmRsZSB1bmNvbmRpdGlvbmFsbHkuDQo+IA0K
PiBTaWduZWQtb2ZmLWJ5OiBDaHVjayBMZXZlciA8Y2h1Y2subGV2ZXJAb3JhY2xlLmNvbT4NCj4g
LS0tDQo+IA0KDQpPbmUgb2YgdGhlIHRoaW5ncyB0aGF0IEkgcmVhbGx5IGxpa2UgYWJvdXQgd2ly
ZXNoYXJrIGlzIHRoYXQgaXQgYWxzbw0KZGlzcGxheXMgYSAzMi1iaXQgaGV4YWRlY2ltYWwgaGFz
aCBvZiB0aGUgZmlsZWhhbmRsZS4gVGhhdCBhbGxvd3MgeW91IHRvDQpjb21wYXJlIDIgZmlsZWhh
bmRsZXMgX211Y2hfIGZhc3RlciB0aGFuIHRyeWluZyB0byBsb29rIGF0IHRoZSBmdWxsDQpzaGVi
YW5nLg0KDQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWlu
ZXINCg0KTmV0QXBwDQpUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3Lm5ldGFwcC5jb20N
Cg0K

2012-03-01 22:02:16

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 14/15] NFS: Fix some minor problems with nfs4_verifiers

Clean up due to code review.

sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier. The two are not interchangeable, even if they happen to
have the same value. If the nfs4_verifier struct is padded by the
compiler, sizeof(nfs4_verifier) may not be the same as the XDR'd
size. Not likely, but still.

Ensure that the data field in the nfs4_verifier structure is aligned
to the largest pointer type that is used to access it: in this case,
that's u32. Type-punning among types with different alignment has
been discouraged in user space and the kernel, to avoid unneeded
pointer aliasing. The use of a union is preferred instead.

As a micro-optimization, this change also ensures that the compiler
may perform memcpy() and memcmp() operations on these fields in
larger, more efficient, chunks.

Pull out some common nfs4_verifier XDR encoding logic into a helper.

I'm sure I missed a few spots.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4proc.c | 6 +++---
fs/nfs/nfs4xdr.c | 27 ++++++++++++++++-----------
fs/nfsd/nfs4proc.c | 4 ++--
fs/nfsd/nfs4state.c | 2 +-
fs/nfsd/nfs4xdr.c | 28 ++++++++++++++--------------
include/linux/nfs4.h | 5 ++++-
6 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 07bd1f3..cb08b1e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -837,7 +837,7 @@ static struct nfs4_opendata *nfs4_opendata_alloc(struct dentry *dentry,

p->o_arg.u.attrs = &p->attrs;
memcpy(&p->attrs, attrs, sizeof(p->attrs));
- s = (u32 *) p->o_arg.u.verifier.data;
+ s = p->o_arg.u.verifier.data32;
s[0] = jiffies;
s[1] = current->pid;
}
@@ -3830,7 +3830,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
int loop = 0;
int status;

- p = (__be32*)sc_verifier.data;
+ p = (__be32*)sc_verifier.data32;
*p++ = htonl((u32)clp->cl_boot_time.tv_sec);
*p = htonl((u32)clp->cl_boot_time.tv_nsec);

@@ -4941,7 +4941,7 @@ int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred)
dprintk("--> %s\n", __func__);
BUG_ON(clp == NULL);

- p = (u32 *)verifier.data;
+ p = verifier.data32;
*p++ = htonl((u32)clp->cl_boot_time.tv_sec);
*p = htonl((u32)clp->cl_boot_time.tv_nsec);
args.verifier = &verifier;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 018bbd7..80ba010 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -906,13 +906,18 @@ static void encode_nops(struct compound_hdr *hdr)
*hdr->nops_p = htonl(hdr->nops);
}

+static __be32 *xdr_encode_nfs4_verifier(__be32 *p, const nfs4_verifier *verf)
+{
+ return xdr_encode_opaque_fixed(p, verf->data, NFS4_VERIFIER_SIZE);
+}
+
static void encode_nfs4_verifier(struct xdr_stream *xdr, const nfs4_verifier *verf)
{
__be32 *p;

p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
BUG_ON(p == NULL);
- xdr_encode_opaque_fixed(p, verf->data, NFS4_VERIFIER_SIZE);
+ xdr_encode_nfs4_verifier(p, verf);
}

static void encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const struct nfs_server *server)
@@ -1571,7 +1576,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
p = reserve_space(xdr, 12+NFS4_VERIFIER_SIZE+20);
*p++ = cpu_to_be32(OP_READDIR);
p = xdr_encode_hyper(p, readdir->cookie);
- p = xdr_encode_opaque_fixed(p, readdir->verifier.data, NFS4_VERIFIER_SIZE);
+ p = xdr_encode_nfs4_verifier(p, &readdir->verifier);
*p++ = cpu_to_be32(dircount);
*p++ = cpu_to_be32(readdir->count);
*p++ = cpu_to_be32(2);
@@ -1583,8 +1588,8 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
dprintk("%s: cookie = %Lu, verifier = %08x:%08x, bitmap = %08x:%08x\n",
__func__,
(unsigned long long)readdir->cookie,
- ((u32 *)readdir->verifier.data)[0],
- ((u32 *)readdir->verifier.data)[1],
+ readdir->verifier.data32[0],
+ readdir->verifier.data32[1],
attrs[0] & readdir->bitmask[0],
attrs[1] & readdir->bitmask[1]);
}
@@ -1693,7 +1698,7 @@ static void encode_setclientid(struct xdr_stream *xdr, const struct nfs4_setclie

p = reserve_space(xdr, 4 + NFS4_VERIFIER_SIZE);
*p++ = cpu_to_be32(OP_SETCLIENTID);
- xdr_encode_opaque_fixed(p, setclientid->sc_verifier->data, NFS4_VERIFIER_SIZE);
+ xdr_encode_nfs4_verifier(p, setclientid->sc_verifier);

encode_string(xdr, setclientid->sc_name_len, setclientid->sc_name);
p = reserve_space(xdr, 4);
@@ -1713,7 +1718,7 @@ static void encode_setclientid_confirm(struct xdr_stream *xdr, const struct nfs4
p = reserve_space(xdr, 12 + NFS4_VERIFIER_SIZE);
*p++ = cpu_to_be32(OP_SETCLIENTID_CONFIRM);
p = xdr_encode_hyper(p, arg->clientid);
- xdr_encode_opaque_fixed(p, arg->confirm.data, NFS4_VERIFIER_SIZE);
+ xdr_encode_nfs4_verifier(p, &arg->confirm);
hdr->nops++;
hdr->replen += decode_setclientid_confirm_maxsz;
}
@@ -1770,9 +1775,9 @@ static void encode_exchange_id(struct xdr_stream *xdr,
{
__be32 *p;

- p = reserve_space(xdr, 4 + sizeof(args->verifier->data));
+ p = reserve_space(xdr, 4 + NFS4_VERIFIER_SIZE);
*p++ = cpu_to_be32(OP_EXCHANGE_ID);
- xdr_encode_opaque_fixed(p, args->verifier->data, sizeof(args->verifier->data));
+ xdr_encode_nfs4_verifier(p, args->verifier);

encode_string(xdr, args->id_len, args->id);

@@ -4205,7 +4210,7 @@ static int decode_close(struct xdr_stream *xdr, struct nfs_closeres *res)

static int decode_verifier(struct xdr_stream *xdr, void *verifier)
{
- return decode_opaque_fixed(xdr, verifier, 8);
+ return decode_opaque_fixed(xdr, verifier, NFS4_VERIFIER_SIZE);
}

static int decode_commit(struct xdr_stream *xdr, struct nfs_writeres *res)
@@ -4905,8 +4910,8 @@ static int decode_readdir(struct xdr_stream *xdr, struct rpc_rqst *req, struct n
return status;
dprintk("%s: verifier = %08x:%08x\n",
__func__,
- ((u32 *)readdir->verifier.data)[0],
- ((u32 *)readdir->verifier.data)[1]);
+ readdir->verifier.data32[0],
+ readdir->verifier.data32[1]);


hdrlen = (char *) xdr->p - (char *) iov->iov_base;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 896da74..73ac8c6 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -221,7 +221,7 @@ do_open_lookup(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_o
status = do_nfsd_create(rqstp, current_fh, open->op_fname.data,
open->op_fname.len, &open->op_iattr,
&resfh, open->op_createmode,
- (u32 *)open->op_verf.data,
+ open->op_verf.data32,
&open->op_truncate, &open->op_created);

/*
@@ -485,7 +485,7 @@ static __be32
nfsd4_commit(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_commit *commit)
{
- u32 *p = (u32 *)commit->co_verf.data;
+ u32 *p = commit->co_verf.data32;
*p++ = nfssvc_boot.tv_sec;
*p++ = nfssvc_boot.tv_usec;

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c5cddd6..c8ec181 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1141,7 +1141,7 @@ static void gen_confirm(struct nfs4_client *clp)
static u32 i;
u32 *p;

- p = (u32 *)clp->cl_confirm.data;
+ p = clp->cl_confirm.data32;
*p++ = get_seconds();
*p++ = i++;
}
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 0ec5a1b..f8dbc80 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -755,14 +755,14 @@ nfsd4_decode_open(struct nfsd4_compoundargs *argp, struct nfsd4_open *open)
goto out;
break;
case NFS4_CREATE_EXCLUSIVE:
- READ_BUF(8);
- COPYMEM(open->op_verf.data, 8);
+ READ_BUF(NFS4_VERIFIER_SIZE);
+ COPYMEM(open->verf.data, NFS4_VERIFIER_SIZE);
break;
case NFS4_CREATE_EXCLUSIVE4_1:
if (argp->minorversion < 1)
goto xdr_error;
- READ_BUF(8);
- COPYMEM(open->op_verf.data, 8);
+ READ_BUF(NFS4_VERIFIER_SIZE);
+ COPYMEM(open->verf.data, NFS4_VERIFIER_SIZE);
status = nfsd4_decode_fattr(argp, open->op_bmval,
&open->op_iattr, &open->op_acl);
if (status)
@@ -994,8 +994,8 @@ nfsd4_decode_setclientid(struct nfsd4_compoundargs *argp, struct nfsd4_setclient
{
DECODE_HEAD;

- READ_BUF(8);
- COPYMEM(setclientid->se_verf.data, 8);
+ READ_BUF(NFS4_VERIFIER_SIZE);
+ COPYMEM(setclientid->se_verf.data, NFS4_VERIFIER_SIZE);

status = nfsd4_decode_opaque(argp, &setclientid->se_name);
if (status)
@@ -1020,9 +1020,9 @@ nfsd4_decode_setclientid_confirm(struct nfsd4_compoundargs *argp, struct nfsd4_s
{
DECODE_HEAD;

- READ_BUF(8 + sizeof(nfs4_verifier));
+ READ_BUF(8 + NFS4_VERIFIER_SIZE);
COPYMEM(&scd_c->sc_clientid, 8);
- COPYMEM(&scd_c->sc_confirm, sizeof(nfs4_verifier));
+ COPYMEM(&scd_c->sc_confirm, NFS4_VERIFIER_SIZE);

DECODE_TAIL;
}
@@ -2661,8 +2661,8 @@ nfsd4_encode_commit(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(8);
- WRITEMEM(commit->co_verf.data, 8);
+ RESERVE_SPACE(NFS4_VERIFIER_SIZE);
+ WRITEMEM(commit->co_verf.data, NFS4_VERIFIER_SIZE);
ADJUST_ARGS();
}
return nfserr;
@@ -3008,7 +3008,7 @@ nfsd4_encode_readdir(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4
if (resp->xbuf->page_len)
return nfserr_resource;

- RESERVE_SPACE(8); /* verifier */
+ RESERVE_SPACE(NFS4_VERIFIER_SIZE);
savep = p;

/* XXX: Following NFSv3, we ignore the READDIR verifier for now. */
@@ -3209,9 +3209,9 @@ nfsd4_encode_setclientid(struct nfsd4_compoundres *resp, __be32 nfserr, struct n
__be32 *p;

if (!nfserr) {
- RESERVE_SPACE(8 + sizeof(nfs4_verifier));
+ RESERVE_SPACE(8 + NFS4_VERIFIER_SIZE);
WRITEMEM(&scd->se_clientid, 8);
- WRITEMEM(&scd->se_confirm, sizeof(nfs4_verifier));
+ WRITEMEM(&scd->se_confirm, NFS4_VERIFIER_SIZE);
ADJUST_ARGS();
}
else if (nfserr == nfserr_clid_inuse) {
@@ -3232,7 +3232,7 @@ nfsd4_encode_write(struct nfsd4_compoundres *resp, __be32 nfserr, struct nfsd4_w
RESERVE_SPACE(16);
WRITE32(write->wr_bytes_written);
WRITE32(write->wr_how_written);
- WRITEMEM(write->wr_verifier.data, 8);
+ WRITEMEM(write->wr_verifier.data, NFS4_VERIFIER_SIZE);
ADJUST_ARGS();
}
return nfserr;
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 32345c2..b1e6b64 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -181,7 +181,10 @@ struct nfs4_acl {
struct nfs4_ace aces[0];
};

-typedef struct { char data[NFS4_VERIFIER_SIZE]; } nfs4_verifier;
+typedef union {
+ unsigned char data[NFS4_VERIFIER_SIZE];
+ u32 data32[NFS4_VERIFIER_SIZE / sizeof(u32)];
+} nfs4_verifier;

struct nfs41_stateid {
__be32 seqid;


2012-03-06 16:55:55

by Adamson, Dros

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles


On Mar 2, 2012, at 1:50 PM, Steve Dickson wrote:

>
>
> On 03/02/2012 12:19 PM, Chuck Lever wrote:
>>
>> On Mar 2, 2012, at 12:17 PM, Steve Dickson wrote:
>>
>>>
>>>
>>> On 03/01/2012 05:28 PM, Myklebust, Trond wrote:
>>>> On Thu, 2012-03-01 at 17:01 -0500, Chuck Lever wrote:
>>>>> For debugging, introduce a simplistic function to print NFS file
>>>>> handles on the system console. The main function is hooked into the
>>>>> dprintk debugging facility, but you can directly call the helper,
>>>>> _nfs_display_fhandle(), if you want to print a handle unconditionally.
>>>>>
>>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>>> ---
>>>>>
>>>>
>>>> One of the things that I really like about wireshark is that it also
>>>> displays a 32-bit hexadecimal hash of the filehandle. That allows you to
>>>> compare 2 filehandles _much_ faster than trying to look at the full
>>>> shebang.
>>>>
>>>>
>>> +1 Being able to easily trace filehandle in both the kernel debug and
>>> wireshark would be huge.... IMHO...
>>
>> Dros mentioned a possible follow-on patch last night at the pub that would add exactly this functionality.
>>
> And they say good things don't come out of debauchery!!! 8-)

Thanks for reminding me! Patch soon.

-dros


Attachments:
smime.p7s (1.34 kB)

2012-03-01 22:01:50

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 11/15] NFS: Simplify arguments of encode_renew()

Clean up: pass just the clientid4 to encode_renew(). This enables it
to be used by callers who might not have an full nfs_client.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4xdr.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index d1b7914d..8cb35f2 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1622,13 +1622,14 @@ static void encode_rename(struct xdr_stream *xdr, const struct qstr *oldname, co
hdr->replen += decode_rename_maxsz;
}

-static void encode_renew(struct xdr_stream *xdr, const struct nfs_client *client_stateid, struct compound_hdr *hdr)
+static void encode_renew(struct xdr_stream *xdr, clientid4 clid,
+ struct compound_hdr *hdr)
{
__be32 *p;

p = reserve_space(xdr, 12);
*p++ = cpu_to_be32(OP_RENEW);
- xdr_encode_hyper(p, client_stateid->cl_clientid);
+ xdr_encode_hyper(p, clid);
hdr->nops++;
hdr->replen += decode_renew_maxsz;
}
@@ -2653,7 +2654,7 @@ static void nfs4_xdr_enc_renew(struct rpc_rqst *req, struct xdr_stream *xdr,
};

encode_compound_hdr(xdr, req, &hdr);
- encode_renew(xdr, clp, &hdr);
+ encode_renew(xdr, clp->cl_clientid, &hdr);
encode_nops(&hdr);
}



2012-03-01 22:27:34

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 07/15] SUNRPC: Add API to acquire source address


On Mar 1, 2012, at 5:09 PM, Jim Rees wrote:

> Chuck Lever wrote:
>
> NFSv4.0 clients must send endpoint information for their callback
> service to NFSv4.0 servers during their first contact with a server.
> Traditionally on Linux, user space provides the callback endpoint IP
> address via the "clientaddr=" mount option.
>
> During an NFSv4 migration event, it is possible that an FSID may be
> migrated to a destination server that is accessible via a different
> source IP address than the source server was. The client must update
> callback endpoint information on the destination server so that it can
> maintain leases and allow delegation.
>
> Without a new "clientaddr=" option from user space, however, the
> kernel itself must construct an appropriate IP address for the
> callback update. Provide an API in the RPC client for upper layer
> RPC consumers to acquire a source address for a remote.
>
> The mechanism used by the mount.nfs command is copied: set up a
> connected UDP socket to the designated remote, then scrape the source
> address off the socket. We are careful to select the correct network
> namespace when setting up the temporary UDP socket.
>
> That seems like a lot of work. On OpenBSD it's a one-liner, a call into the
> routing system to get the device and source address that would be used to
> talk to a given destination.

The goal here is to emulate exactly what is done in user space so that the result is the same whether the address is obtained in the kernel or by the mount.nfs command.

> Is there no equivalent in linux?

No idea.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-01 22:43:48

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 14/15] NFS: Fix some minor problems with nfs4_verifiers

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDIyOjM1ICswMDAwLCBNeWtsZWJ1c3QsIFRyb25kIHdyb3Rl
Og0KPiBPbiBUaHUsIDIwMTItMDMtMDEgYXQgMTc6MDIgLTA1MDAsIENodWNrIExldmVyIHdyb3Rl
Og0KPiA+IENsZWFuIHVwIGR1ZSB0byBjb2RlIHJldmlldy4NCj4gIA0KPiA+IC10eXBlZGVmIHN0
cnVjdCB7IGNoYXIgZGF0YVtORlM0X1ZFUklGSUVSX1NJWkVdOyB9IG5mczRfdmVyaWZpZXI7DQo+
ID4gK3R5cGVkZWYgdW5pb24gew0KPiA+ICsJdW5zaWduZWQgY2hhcglkYXRhW05GUzRfVkVSSUZJ
RVJfU0laRV07DQo+ID4gKwl1MzIJCWRhdGEzMltORlM0X1ZFUklGSUVSX1NJWkUgLyBzaXplb2Yo
dTMyKV07DQo+IA0KPiBUaGUgYWJvdmUgc2hvdWxkIHJlYWxseSBiZSBhIF9fYmUzMiB0eXBlLiBX
ZSBkb24ndCBldmVyIGNvbnZlcnQNCj4gdmVyaWZpZXJzLg0KDQpDb3JyZWN0aW9uLi4uLiBBcyBJ
IHNhaWQgZm9yIDE1LzE1LCB0aGVzZSB0aGluZ3MgYXJlIGRvY3VtZW50ZWQgaW4gdGhlDQpzcGVj
IGFzIGJlaW5nIG9wYXF1ZS4gVGhlIGZhY3QgdGhhdCB3ZSBkb24ndCBhbHdheXMgdHJlYXQgdGhl
bSB0aGF0IHdheQ0KZG9lc24ndCBqdXN0aWZ5IHRoZSBhYm92ZSBjaGFuZ2UuDQoNCi0tIA0KVHJv
bmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWluZXINCg0KTmV0QXBwDQpUcm9u
ZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3Lm5ldGFwcC5jb20NCg0K

2012-03-01 22:01:16

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 07/15] SUNRPC: Add API to acquire source address

NFSv4.0 clients must send endpoint information for their callback
service to NFSv4.0 servers during their first contact with a server.
Traditionally on Linux, user space provides the callback endpoint IP
address via the "clientaddr=" mount option.

During an NFSv4 migration event, it is possible that an FSID may be
migrated to a destination server that is accessible via a different
source IP address than the source server was. The client must update
callback endpoint information on the destination server so that it can
maintain leases and allow delegation.

Without a new "clientaddr=" option from user space, however, the
kernel itself must construct an appropriate IP address for the
callback update. Provide an API in the RPC client for upper layer
RPC consumers to acquire a source address for a remote.

The mechanism used by the mount.nfs command is copied: set up a
connected UDP socket to the designated remote, then scrape the source
address off the socket. We are careful to select the correct network
namespace when setting up the temporary UDP socket.

Signed-off-by: Chuck Lever <[email protected]>
---

include/linux/sunrpc/clnt.h | 1
net/sunrpc/clnt.c | 149 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 150 insertions(+), 0 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index acd5502..523547e 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -161,6 +161,7 @@ size_t rpc_max_payload(struct rpc_clnt *);
void rpc_force_rebind(struct rpc_clnt *);
size_t rpc_peeraddr(struct rpc_clnt *, struct sockaddr *, size_t);
const char *rpc_peeraddr2str(struct rpc_clnt *, enum rpc_display_format_t);
+int rpc_localaddr(struct rpc_clnt *, struct sockaddr *, size_t);

size_t rpc_ntop(const struct sockaddr *, char *, const size_t);
size_t rpc_pton(struct net *, const char *, const size_t,
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 84bebb1..064e55bb 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -898,6 +898,155 @@ const char *rpc_peeraddr2str(struct rpc_clnt *clnt,
}
EXPORT_SYMBOL_GPL(rpc_peeraddr2str);

+static const struct sockaddr_in rpc_inaddr_loopback = {
+ .sin_family = AF_INET,
+ .sin_addr.s_addr = htonl(INADDR_ANY),
+};
+
+static const struct sockaddr_in6 rpc_in6addr_loopback = {
+ .sin6_family = AF_INET6,
+ .sin6_addr = IN6ADDR_ANY_INIT,
+};
+
+/*
+ * Try a getsockname() on a connected datagram socket. Using a
+ * connected datagram socket prevents leaving a socket in TIME_WAIT.
+ * This conserves the ephemeral port number space.
+ *
+ * Returns zero and fills in "buf" if successful; otherwise, a
+ * negative errno is returned.
+ */
+static int rpc_sockname(struct net *net, struct sockaddr *sap, size_t salen,
+ struct sockaddr *buf, int buflen)
+{
+ struct socket *sock;
+ int err;
+
+ err = __sock_create(net, sap->sa_family,
+ SOCK_DGRAM, IPPROTO_UDP, &sock, 1);
+ if (err < 0) {
+ dprintk("RPC: can't create UDP socket (%d)\n", err);
+ goto out;
+ }
+
+ switch (sap->sa_family) {
+ case AF_INET:
+ err = kernel_bind(sock,
+ (struct sockaddr *)&rpc_inaddr_loopback,
+ sizeof(rpc_inaddr_loopback));
+ break;
+ case AF_INET6:
+ err = kernel_bind(sock,
+ (struct sockaddr *)&rpc_in6addr_loopback,
+ sizeof(rpc_in6addr_loopback));
+ break;
+ default:
+ err = -EAFNOSUPPORT;
+ goto out;
+ }
+ if (err < 0) {
+ dprintk("RPC: can't bind UDP socket (%d)\n", err);
+ goto out_release;
+ }
+
+ err = kernel_connect(sock, sap, salen, 0);
+ if (err < 0) {
+ dprintk("RPC: can't connect UDP socket (%d)\n", err);
+ goto out_release;
+ }
+
+ err = kernel_getsockname(sock, buf, &buflen);
+ if (err < 0) {
+ dprintk("RPC: getsockname failed (%d)\n", err);
+ goto out_release;
+ }
+
+ err = 0;
+ if (buf->sa_family == AF_INET6) {
+ struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)buf;
+ sin6->sin6_scope_id = 0;
+ }
+ dprintk("RPC: %s succeeded\n", __func__);
+
+out_release:
+ sock_release(sock);
+out:
+ return err;
+}
+
+/*
+ * Scraping a connected socket failed, so we don't have a useable
+ * local address. Fallback: generate an address that will prevent
+ * the server from calling us back.
+ *
+ * Returns zero and fills in "buf" if successful; otherwise, a
+ * negative errno is returned.
+ */
+static int rpc_anyaddr(int family, struct sockaddr *buf, size_t buflen)
+{
+ switch (family) {
+ case AF_INET:
+ if (buflen < sizeof(rpc_inaddr_loopback))
+ return -EINVAL;
+ memcpy(buf, &rpc_inaddr_loopback,
+ sizeof(rpc_inaddr_loopback));
+ break;
+ case AF_INET6:
+ if (buflen < sizeof(rpc_in6addr_loopback))
+ return -EINVAL;
+ memcpy(buf, &rpc_in6addr_loopback,
+ sizeof(rpc_in6addr_loopback));
+ default:
+ dprintk("RPC: %s: address family not supported\n",
+ __func__);
+ return -EAFNOSUPPORT;
+ }
+ dprintk("RPC: %s: succeeded\n", __func__);
+ return 0;
+}
+
+/**
+ * rpc_localaddr - discover local endpoint address for an RPC client
+ * @clnt: RPC client structure
+ * @buf: target buffer
+ * @buflen: size of target buffer, in bytes
+ *
+ * Returns zero and fills in "buf" and "buflen" if successful;
+ * otherwise, a negative errno is returned.
+ *
+ * This works even if the underlying transport is not currently connected,
+ * or if the upper layer never previously provided a source address.
+ *
+ * The result of this function call is transient: multiple calls in
+ * succession may give different results, depending on how local
+ * networking configuration changes over time.
+ */
+int rpc_localaddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t buflen)
+{
+ struct sockaddr_storage address;
+ struct sockaddr *sap = (struct sockaddr *)&address;
+ struct rpc_xprt *xprt;
+ struct net *net;
+ size_t salen;
+ int err;
+
+ rcu_read_lock();
+ xprt = rcu_dereference(clnt->cl_xprt);
+ salen = xprt->addrlen;
+ memcpy(sap, &xprt->addr, salen);
+ net = get_net(xprt->xprt_net);
+ rcu_read_unlock();
+
+ rpc_set_port(sap, 0);
+ err = rpc_sockname(net, sap, salen, buf, buflen);
+ put_net(net);
+ if (err != 0)
+ /* Couldn't discover local address, return ANYADDR */
+ return rpc_anyaddr(sap->sa_family, buf, buflen);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(rpc_localaddr);
+
void
rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize)
{


2012-03-01 22:01:58

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 12/15] NFS: Introduce NFS_ATTR_FATTR_V4_LOCATIONS

The Linux NFS client must distinguish between referral events (which
it currently supports) and migration events (which it does not yet
support).

In both types of events, an fs_locations array is returned. But upper
layers, not the XDR layer, should make the distinction between a
referral and a migration. There really isn't a way for an XDR decoder
function to distinguish the two, in general.

Slightly adjust the FATTR flags returned by decode_fs_locations()
to set NFS_ATTR_FATTR_V4_LOCATIONS only if a non-empty locations
array was returned from the server. Then have logic in nfs4proc.c
distinguish whether the locations array is for a referral or
something else.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4proc.c | 6 +++---
fs/nfs/nfs4xdr.c | 2 +-
include/linux/nfs_xdr.h | 11 ++++++-----
3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index d43bcda..ced56d0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -79,6 +79,7 @@ static int _nfs4_proc_open(struct nfs4_opendata *data);
static int _nfs4_recover_proc_open(struct nfs4_opendata *data);
static int nfs4_do_fsinfo(struct nfs_server *, struct nfs_fh *, struct nfs_fsinfo *);
static int nfs4_async_handle_error(struct rpc_task *, const struct nfs_server *, struct nfs4_state *);
+static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr);
static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, struct nfs_fattr *fattr);
static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
struct nfs_fattr *fattr, struct iattr *sattr,
@@ -2340,7 +2341,6 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
return nfs4_map_errors(status);
}

-static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr);
/*
* Get locations and (maybe) other attributes of a referral.
* Note that we'll actually follow the referral later when
@@ -4797,11 +4797,11 @@ static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr)
if (!(((fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID) ||
(fattr->valid & NFS_ATTR_FATTR_FILEID)) &&
(fattr->valid & NFS_ATTR_FATTR_FSID) &&
- (fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL)))
+ (fattr->valid & NFS_ATTR_FATTR_V4_LOCATIONS)))
return;

fattr->valid |= NFS_ATTR_FATTR_TYPE | NFS_ATTR_FATTR_MODE |
- NFS_ATTR_FATTR_NLINK;
+ NFS_ATTR_FATTR_NLINK | NFS_ATTR_FATTR_V4_REFERRAL;
fattr->mode = S_IFDIR | S_IRUGO | S_IXUGO;
fattr->nlink = 2;
}
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 8cb35f2..bec7227 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -3621,7 +3621,7 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
res->nlocations++;
}
if (res->nlocations != 0)
- status = NFS_ATTR_FATTR_V4_REFERRAL;
+ status = NFS_ATTR_FATTR_V4_LOCATIONS;
out:
dprintk("%s: fs_locations done, error = %d\n", __func__, status);
return status;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index adbc84a..07eb9c9 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -88,11 +88,12 @@ struct nfs_fattr {
#define NFS_ATTR_FATTR_PRECTIME (1U << 16)
#define NFS_ATTR_FATTR_CHANGE (1U << 17)
#define NFS_ATTR_FATTR_PRECHANGE (1U << 18)
-#define NFS_ATTR_FATTR_V4_REFERRAL (1U << 19) /* NFSv4 referral */
-#define NFS_ATTR_FATTR_MOUNTPOINT (1U << 20) /* Treat as mountpoint */
-#define NFS_ATTR_FATTR_MOUNTED_ON_FILEID (1U << 21)
-#define NFS_ATTR_FATTR_OWNER_NAME (1U << 22)
-#define NFS_ATTR_FATTR_GROUP_NAME (1U << 23)
+#define NFS_ATTR_FATTR_V4_LOCATIONS (1U << 19)
+#define NFS_ATTR_FATTR_V4_REFERRAL (1U << 20)
+#define NFS_ATTR_FATTR_MOUNTPOINT (1U << 21)
+#define NFS_ATTR_FATTR_MOUNTED_ON_FILEID (1U << 22)
+#define NFS_ATTR_FATTR_OWNER_NAME (1U << 23)
+#define NFS_ATTR_FATTR_GROUP_NAME (1U << 24)

#define NFS_ATTR_FATTR (NFS_ATTR_FATTR_TYPE \
| NFS_ATTR_FATTR_MODE \


2012-03-02 22:03:23

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids


On Mar 2, 2012, at 4:58 PM, Boaz Harrosh wrote:

> On 03/02/2012 10:10 AM, Chuck Lever wrote:
>>
>> On Mar 1, 2012, at 5:55 PM, Chuck Lever wrote:
>>
>>>
>>> On Mar 1, 2012, at 5:39 PM, Myklebust, Trond wrote:
>>>
>>>> On Thu, 2012-03-01 at 17:02 -0500, Chuck Lever wrote:
>>>>> Clean up due to code review.
>>>>>
>>>>> sizeof(struct nfs4_deviceid) is the size of the in-core device ID data
>>>>> structure, but NFS4_DEVICEID4_SIZE is the number of octets in an XDR'd
>>>>> device ID. The two are not interchangeable, even if they happen to
>>>>> have the same value. If struct nfs4_deviceid is padded by the
>>>>> compiler, sizeof(struct nfs4_deviceid) may not be the same as the
>>>>> XDR'd size. Not likely, but still.
>>>>>
>>>>> Ensure that the data field is aligned to the largest pointer type that
>>>>> is used to access it: in this case, that's u64. Type-punning among
>>>>> types with different alignment has been discouraged in user space and
>>>>> the kernel, to avoid unneeded pointer aliasing. The use of a union
>>>>> is preferred instead.
>>>>>
>>>>> Ensure that XDR'ing and comparing is done via one of the union's data
>>>>> fields, and not via the whole struct, again in case the compiler pads
>>>>> the struct.
>>>>>
>>>>> As a micro-optimization, this change also ensures that the compiler
>>>>> may perform memcpy() and memcmp() operations on these fields in
>>>>> larger, more efficient, chunks.
>>>>>
>>>>> This patch needs review and testing by pnfs layout specialists.
>>>>>
>>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>>> ---
>>>>>
>>>>> fs/nfs/blocklayout/blocklayout.c | 2 +-
>>>>> fs/nfs/blocklayout/blocklayoutdev.c | 8 ++++----
>>>>> fs/nfs/blocklayout/extents.c | 5 +++--
>>>>> fs/nfs/callback_xdr.c | 2 +-
>>>>> fs/nfs/nfs4filelayout.c | 3 +--
>>>>> fs/nfs/nfs4xdr.c | 4 ++--
>>>>> fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 ++++----
>>>>> fs/nfs/pnfs_dev.c | 7 ++++---
>>>>> include/linux/nfs4.h | 7 ++++++-
>>>>> include/linux/pnfs_osd_xdr.h | 4 ++--
>>>>> 10 files changed, 28 insertions(+), 22 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
>>>>> index 783ebd5..3b730bf 100644
>>>>> --- a/fs/nfs/blocklayout/blocklayout.c
>>>>> +++ b/fs/nfs/blocklayout/blocklayout.c
>>>>> @@ -901,7 +901,7 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
>>>>> dev->pglen = PAGE_SIZE * max_pages;
>>>>> dev->mincount = 0;
>>>>>
>>>>> - dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.data);
>>>>> + dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.u.data);
>>>>> rc = nfs4_proc_getdeviceinfo(server, dev);
>>>>> dprintk("%s getdevice info returns %d\n", __func__, rc);
>>>>> if (rc) {
>>>>> diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
>>>>> index b48f782..2dafca5 100644
>>>>> --- a/fs/nfs/blocklayout/blocklayoutdev.c
>>>>> +++ b/fs/nfs/blocklayout/blocklayoutdev.c
>>>>> @@ -124,8 +124,8 @@ nfs4_blk_decode_device(struct nfs_server *server,
>>>>> struct nfs_net *nn = net_generic(net, nfs_net_id);
>>>>>
>>>>> dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
>>>>> - dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
>>>>> - dev->mincount);
>>>>> + dprintk("%s: deviceid: %s, mincount: %d\n", __func__,
>>>>> + dev->dev_id.u.data, dev->mincount);
>>>>>
>>>>> memset(&msg, 0, sizeof(msg));
>>>>> msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
>>>>> @@ -206,7 +206,7 @@ static struct block_device *translate_devid(struct pnfs_layout_hdr *lo,
>>>>> mid = BLK_ID(lo);
>>>>> spin_lock(&mid->bm_lock);
>>>>> list_for_each_entry(dev, &mid->bm_devlist, bm_node) {
>>>>> - if (memcmp(id->data, dev->bm_mdevid.data,
>>>>> + if (memcmp(id->u.data, dev->bm_mdevid.u.data,
>>>>> NFS4_DEVICEID4_SIZE) == 0) {
>>>>> rv = dev->bm_mdev;
>>>>> goto out;
>>>>> @@ -323,7 +323,7 @@ nfs4_blk_process_layoutget(struct pnfs_layout_hdr *lo,
>>>>> status = -ENOMEM;
>>>>> goto out_err;
>>>>> }
>>>>> - memcpy(&be->be_devid, p, NFS4_DEVICEID4_SIZE);
>>>>> + memcpy(&be->be_devid.u.data, p, NFS4_DEVICEID4_SIZE);
>>>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>>> be->be_mdev = translate_devid(lo, &be->be_devid);
>>>>> if (!be->be_mdev)
>>>>> diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
>>>>> index 1f9a603..52b747c 100644
>>>>> --- a/fs/nfs/blocklayout/extents.c
>>>>> +++ b/fs/nfs/blocklayout/extents.c
>>>>> @@ -675,10 +675,11 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
>>>>> if (!xdr_start)
>>>>> goto out;
>>>>> list_for_each_entry_safe(lce, save, &bl->bl_commit, bse_node) {
>>>>> - p = xdr_reserve_space(xdr, 7 * 4 + sizeof(lce->bse_devid.data));
>>>>> + p = xdr_reserve_space(xdr, 7 * 4 + NFS4_DEVICEID4_SIZE);
>>>>> if (!p)
>>>>> break;
>>>>> - p = xdr_encode_opaque_fixed(p, lce->bse_devid.data, NFS4_DEVICEID4_SIZE);
>>>>> + p = xdr_encode_opaque_fixed(p, lce->bse_devid.u.data,
>>>>> + NFS4_DEVICEID4_SIZE);
>>>>> p = xdr_encode_hyper(p, lce->bse_f_offset << SECTOR_SHIFT);
>>>>> p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
>>>>> p = xdr_encode_hyper(p, 0LL);
>>>>> diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
>>>>> index 5466829..6061322 100644
>>>>> --- a/fs/nfs/callback_xdr.c
>>>>> +++ b/fs/nfs/callback_xdr.c
>>>>> @@ -347,7 +347,7 @@ __be32 decode_devicenotify_args(struct svc_rqst *rqstp,
>>>>> goto err;
>>>>> }
>>>>> dev->cbd_layout_type = ntohl(*p++);
>>>>> - memcpy(dev->cbd_dev_id.data, p, NFS4_DEVICEID4_SIZE);
>>>>> + memcpy(dev->cbd_dev_id.u.data, p, NFS4_DEVICEID4_SIZE);
>>>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>>>
>>>>> if (dev->cbd_layout_type == NOTIFY_DEVICEID4_CHANGE) {
>>>>> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
>>>>> index 47e8f34..f0c6ba0 100644
>>>>> --- a/fs/nfs/nfs4filelayout.c
>>>>> +++ b/fs/nfs/nfs4filelayout.c
>>>>> @@ -550,8 +550,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
>>>>> if (unlikely(!p))
>>>>> goto out_err;
>>>>>
>>>>> - memcpy(id, p, sizeof(*id));
>>>>> - p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>>> + p = xdr_decode_opaque_fixed(p, id->u.data, NFS4_DEVICEID4_SIZE);
>>>>> nfs4_print_deviceid(id);
>>>>>
>>>>> nfl_util = be32_to_cpup(p++);
>>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>>> index 80ba010..91020ea 100644
>>>>> --- a/fs/nfs/nfs4xdr.c
>>>>> +++ b/fs/nfs/nfs4xdr.c
>>>>> @@ -1946,7 +1946,7 @@ encode_getdeviceinfo(struct xdr_stream *xdr,
>>>>>
>>>>> p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
>>>>> *p++ = cpu_to_be32(OP_GETDEVICEINFO);
>>>>> - p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
>>>>> + p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.u.data,
>>>>> NFS4_DEVICEID4_SIZE);
>>>>> *p++ = cpu_to_be32(args->pdev->layout_type);
>>>>> *p++ = cpu_to_be32(args->pdev->pglen); /* gdia_maxcount */
>>>>> @@ -5492,7 +5492,7 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
>>>>> if (unlikely(!p))
>>>>> goto out_overflow;
>>>>> for (i = 0; i < res->num_devs; i++)
>>>>> - p = xdr_decode_opaque_fixed(p, res->dev_id[i].data,
>>>>> + p = xdr_decode_opaque_fixed(p, res->dev_id[i].u.data,
>>>>> NFS4_DEVICEID4_SIZE);
>>>>> res->eof = be32_to_cpup(p);
>>>>> return 0;
>>>>> diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>>> index b3918f7..99a1e89 100644
>>>>> --- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>>> +++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>>> @@ -55,8 +55,8 @@
>>>>> static __be32 *
>>>>> _osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid)
>>>>> {
>>>>> - p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data,
>>>>> - sizeof(objid->oid_device_id.data));
>>>>> + p = xdr_decode_opaque_fixed(p, objid->oid_device_id.u.data,
>>>>> + NFS4_DEVICEID4_SIZE);
>>>>>
>>>>> p = xdr_decode_hyper(p, &objid->oid_partition_id);
>>>>> p = xdr_decode_hyper(p, &objid->oid_object_id);
>>>>> @@ -377,8 +377,8 @@ pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr,
>>>>> static inline __be32 *
>>>>> pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id)
>>>>> {
>>>>> - p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data,
>>>>> - sizeof(object_id->oid_device_id.data));
>>>>> + p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.u.data,
>>>>> + NFS4_DEVICEID4_SIZE);
>>>>> p = xdr_encode_hyper(p, object_id->oid_partition_id);
>>>>> p = xdr_encode_hyper(p, object_id->oid_object_id);
>>>>>
>>>>> diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
>>>>> index 4f359d2..6c6ab81 100644
>>>>> --- a/fs/nfs/pnfs_dev.c
>>>>> +++ b/fs/nfs/pnfs_dev.c
>>>>> @@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(nfs4_deviceid_lock);
>>>>> void
>>>>> nfs4_print_deviceid(const struct nfs4_deviceid *id)
>>>>> {
>>>>> - u32 *p = (u32 *)id;
>>>>> + const u32 *p = id->u.data32;
>>>>>
>>>>> dprintk("%s: device id= [%x%x%x%x]\n", __func__,
>>>>> p[0], p[1], p[2], p[3]);
>>>>> @@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
>>>>> static inline u32
>>>>> nfs4_deviceid_hash(const struct nfs4_deviceid *id)
>>>>> {
>>>>> - unsigned char *cptr = (unsigned char *)id->data;
>>>>> + const unsigned char *cptr = id->u.data;
>>>>> unsigned int nbytes = NFS4_DEVICEID4_SIZE;
>>>>> u32 x = 0;
>>>>>
>>>>> @@ -77,7 +77,8 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,
>>>>>
>>>>> hlist_for_each_entry_rcu(d, n, &nfs4_deviceid_cache[hash], node)
>>>>> if (d->ld == ld && d->nfs_client == clp &&
>>>>> - !memcmp(&d->deviceid, id, sizeof(*id))) {
>>>>> + !memcmp(&d->deviceid.u.data, id->u.data,
>>>>> + NFS4_DEVICEID4_SIZE)) {
>>>>> if (atomic_read(&d->ref))
>>>>> return d;
>>>>> else
>>>>> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
>>>>> index b1e6b64..979f607 100644
>>>>> --- a/include/linux/nfs4.h
>>>>> +++ b/include/linux/nfs4.h
>>>>> @@ -649,9 +649,14 @@ enum filelayout_hint_care4 {
>>>>> #define NFS4_DEVICEID4_SIZE 16
>>>>>
>>>>> struct nfs4_deviceid {
>>>>> - char data[NFS4_DEVICEID4_SIZE];
>>>>> + union {
>>>>> + unsigned char data[NFS4_DEVICEID4_SIZE];
>>>>> + u32 data32[NFS4_DEVICEID4_SIZE / sizeof(u32)];
>>>>> + u64 data64[NFS4_DEVICEID4_SIZE / sizeof(u64)];
>>>>> + } u;
>>>>> };
>>>>>
>>>>
>>>> Ugh... This just looks worse and worse.... Let's rather just keep these
>>>> as unsigned char. I don't care if OSDs have special secret meanings and
>>>> all that crap. They can cast the damned thing if they need to. The rest
>>>> of us should be considering this to be opaque data.
>>>
>>> One possibility is that OSD and block layout drivers can construct their device ID in a u64 array on the stack, and then memcpy() the array to the nfs4_deviceid.
>>>
>>> But we can't cast pointers to an "unsigned char" field, since it's not guaranteed to be aligned. That's a real bug, one that's hit us before.
>>
>> Accessing a "char x[16]" field by casting x to a (u64 *) is considered harmful. It will work when x happens to be aligned to 64-bits, but otherwise accessing a field that way will cause an oops.
>>
>> It's not clear why the OSD layout driver needs to print device IDs as a pair of unsigned long longs. Why can't nfs4_print_deviceid() be used ?
>>
>
> For the same reason we do debug prints at all, for inconvenience.
>
> If you want I can use the get_unaligned() macro if you prefer. That should be perfectly safe.
> (And will do nothing on x86 ARCs)

Basically we would just use get_unaligned() in the _DEVID_LO and _DEVID_HI macros.

> I agree that it should stay just old plain "char data[NFS4_DEVICEID4_SIZE];"
>
> Thanks
> Boaz
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-01 22:01:07

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 06/15] SUNRPC: Move clnt->cl_server into struct rpc_xprt

From: Trond Myklebust <[email protected]>

When the cl_xprt field is updated, the cl_server field will also have
to change. Since the contents of cl_server follow the remote endpoint
of cl_xprt, just move that field to the rpc_xprt.

Signed-off-by: Trond Myklebust <[email protected]>
[ cel: simplify check_gss_callback_principal(), whitespace changes ]
[ cel: forward ported to 3.4 ]
Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/callback.c | 3 -
fs/nfs/nfs4proc.c | 3 +
include/linux/sunrpc/clnt.h | 1
include/linux/sunrpc/xprt.h | 2 +
net/sunrpc/clnt.c | 88 +++++++++++++++++++++----------------------
net/sunrpc/rpc_pipe.c | 3 +
net/sunrpc/rpcb_clnt.c | 4 +-
net/sunrpc/xprt.c | 15 +++++++
8 files changed, 66 insertions(+), 53 deletions(-)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 4a122ae..2afe233 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -332,7 +332,6 @@ void nfs_callback_down(int minorversion)
int
check_gss_callback_principal(struct nfs_client *clp, struct svc_rqst *rqstp)
{
- struct rpc_clnt *r = clp->cl_rpcclient;
char *p = svc_gss_principal(rqstp);

if (rqstp->rq_authop->flavour != RPC_AUTH_GSS)
@@ -353,7 +352,7 @@ check_gss_callback_principal(struct nfs_client *clp, struct svc_rqst *rqstp)
if (memcmp(p, "nfs@", 4) != 0)
return 0;
p += 4;
- if (strcmp(p, r->cl_server) != 0)
+ if (strcmp(p, clp->cl_hostname) != 0)
return 0;
return 1;
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3ec09fb..d43bcda 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1100,6 +1100,7 @@ static struct nfs4_state *nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data
if (state == NULL)
goto err_put_inode;
if (data->o_res.delegation_type != 0) {
+ struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
int delegation_flags = 0;

rcu_read_lock();
@@ -1111,7 +1112,7 @@ static struct nfs4_state *nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data
pr_err_ratelimited("NFS: Broken NFSv4 server %s is "
"returning a delegation for "
"OPEN(CLAIM_DELEGATE_CUR)\n",
- NFS_CLIENT(inode)->cl_server);
+ clp->cl_hostname);
} else if ((delegation_flags & 1UL<<NFS_DELEGATION_NEED_RECLAIM) == 0)
nfs_inode_set_delegation(state->inode,
data->owner->so_cred,
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index e3d12b4..acd5502 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -41,7 +41,6 @@ struct rpc_clnt {
cl_vers, /* RPC version number */
cl_maxproc; /* max procedure number */

- const char * cl_server; /* server machine name */
const char * cl_protname; /* protocol name */
struct rpc_auth * cl_auth; /* authenticator */
struct rpc_stat * cl_stats; /* per-program statistics */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index ea712f9..77d278d 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -229,6 +229,7 @@ struct rpc_xprt {
} stat;

struct net *xprt_net;
+ const char *servername;
const char *address_strings[RPC_DISPLAY_MAX];
};

@@ -258,6 +259,7 @@ struct xprt_create {
struct sockaddr * srcaddr; /* optional local address */
struct sockaddr * dstaddr; /* remote peer address */
size_t addrlen;
+ const char *servername;
struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f043c7b..84bebb1 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -249,15 +249,8 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
struct rpc_clnt *clnt = NULL;
struct rpc_auth *auth;
int err;
- size_t len;

/* sanity check the name before trying to print it */
- err = -EINVAL;
- len = strlen(args->servername);
- if (len > RPC_MAXNETNAMELEN)
- goto out_no_rpciod;
- len++;
-
dprintk("RPC: creating %s client for %s (xprt %p)\n",
program->name, args->servername, xprt);

@@ -280,10 +273,6 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
goto out_err;
clnt->cl_parent = clnt;

- clnt->cl_server = kstrdup(args->servername, GFP_KERNEL);
- if (clnt->cl_server == NULL)
- goto out_no_server;
-
rcu_assign_pointer(clnt->cl_xprt, xprt);
clnt->cl_procinfo = version->procs;
clnt->cl_maxproc = version->nrprocs;
@@ -347,8 +336,6 @@ out_no_path:
out_no_principal:
rpc_free_iostats(clnt->cl_metrics);
out_no_stats:
- kfree(clnt->cl_server);
-out_no_server:
kfree(clnt);
out_err:
xprt_put(xprt);
@@ -378,6 +365,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
.srcaddr = args->saddress,
.dstaddr = args->address,
.addrlen = args->addrsize,
+ .servername = args->servername,
.bc_xprt = args->bc_xprt,
};
char servername[48];
@@ -386,7 +374,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
* If the caller chooses not to specify a hostname, whip
* up a string representation of the passed-in address.
*/
- if (args->servername == NULL) {
+ if (xprtargs.servername == NULL) {
struct sockaddr_un *sun =
(struct sockaddr_un *)args->address;
struct sockaddr_in *sin =
@@ -413,7 +401,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
* address family isn't recognized. */
return ERR_PTR(-EINVAL);
}
- args->servername = servername;
+ xprtargs.servername = servername;
}

xprt = xprt_create_transport(&xprtargs);
@@ -472,9 +460,6 @@ rpc_clone_client(struct rpc_clnt *clnt)
new = kmemdup(clnt, sizeof(*new), GFP_KERNEL);
if (!new)
goto out_no_clnt;
- new->cl_server = kstrdup(clnt->cl_server, GFP_KERNEL);
- if (new->cl_server == NULL)
- goto out_no_server;
new->cl_parent = clnt;
/* Turn off autobind on clones */
new->cl_autobind = 0;
@@ -512,8 +497,6 @@ out_no_transport:
out_no_principal:
rpc_free_iostats(new->cl_metrics);
out_no_stats:
- kfree(new->cl_server);
-out_no_server:
kfree(new);
out_no_clnt:
dprintk("RPC: %s: returned error %d\n", __func__, err);
@@ -558,8 +541,9 @@ EXPORT_SYMBOL_GPL(rpc_killall_tasks);
*/
void rpc_shutdown_client(struct rpc_clnt *clnt)
{
- dprintk("RPC: shutting down %s client for %s\n",
- clnt->cl_protname, clnt->cl_server);
+ dprintk_rcu("RPC: shutting down %s client for %s\n",
+ clnt->cl_protname,
+ rcu_dereference(clnt->cl_xprt)->servername);

while (!list_empty(&clnt->cl_tasks)) {
rpc_killall_tasks(clnt);
@@ -577,11 +561,11 @@ EXPORT_SYMBOL_GPL(rpc_shutdown_client);
static void
rpc_free_client(struct rpc_clnt *clnt)
{
- dprintk("RPC: destroying %s client for %s\n",
- clnt->cl_protname, clnt->cl_server);
+ dprintk_rcu("RPC: destroying %s client for %s\n",
+ clnt->cl_protname,
+ rcu_dereference(clnt->cl_xprt)->servername);
if (clnt->cl_parent != clnt)
rpc_release_client(clnt->cl_parent);
- kfree(clnt->cl_server);
rpc_unregister_client(clnt);
rpc_clnt_remove_pipedir(clnt);
rpc_free_iostats(clnt->cl_metrics);
@@ -1669,8 +1653,11 @@ call_timeout(struct rpc_task *task)
}
if (RPC_IS_SOFT(task)) {
if (clnt->cl_chatty)
+ rcu_read_lock();
printk(KERN_NOTICE "%s: server %s not responding, timed out\n",
- clnt->cl_protname, clnt->cl_server);
+ clnt->cl_protname,
+ rcu_dereference(clnt->cl_xprt)->servername);
+ rcu_read_unlock();
if (task->tk_flags & RPC_TASK_TIMEOUT)
rpc_exit(task, -ETIMEDOUT);
else
@@ -1680,9 +1667,13 @@ call_timeout(struct rpc_task *task)

if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) {
task->tk_flags |= RPC_CALL_MAJORSEEN;
- if (clnt->cl_chatty)
+ if (clnt->cl_chatty) {
+ rcu_read_lock();
printk(KERN_NOTICE "%s: server %s not responding, still trying\n",
- clnt->cl_protname, clnt->cl_server);
+ clnt->cl_protname,
+ rcu_dereference(clnt->cl_xprt)->servername);
+ rcu_read_unlock();
+ }
}
rpc_force_rebind(clnt);
/*
@@ -1711,9 +1702,13 @@ call_decode(struct rpc_task *task)
dprint_status(task);

if (task->tk_flags & RPC_CALL_MAJORSEEN) {
- if (clnt->cl_chatty)
+ if (clnt->cl_chatty) {
+ rcu_read_lock();
printk(KERN_NOTICE "%s: server %s OK\n",
- clnt->cl_protname, clnt->cl_server);
+ clnt->cl_protname,
+ rcu_dereference(clnt->cl_xprt)->servername);
+ rcu_read_unlock();
+ }
task->tk_flags &= ~RPC_CALL_MAJORSEEN;
}

@@ -1791,6 +1786,7 @@ rpc_encode_header(struct rpc_task *task)
static __be32 *
rpc_verify_header(struct rpc_task *task)
{
+ struct rpc_clnt *clnt = task->tk_client;
struct kvec *iov = &task->tk_rqstp->rq_rcv_buf.head[0];
int len = task->tk_rqstp->rq_rcv_buf.len >> 2;
__be32 *p = iov->iov_base;
@@ -1863,8 +1859,11 @@ rpc_verify_header(struct rpc_task *task)
task->tk_action = call_bind;
goto out_retry;
case RPC_AUTH_TOOWEAK:
+ rcu_read_lock();
printk(KERN_NOTICE "RPC: server %s requires stronger "
- "authentication.\n", task->tk_client->cl_server);
+ "authentication.\n",
+ rcu_dereference(clnt->cl_xprt)->servername);
+ rcu_read_unlock();
break;
default:
dprintk("RPC: %5u %s: unknown auth error: %x\n",
@@ -1887,28 +1886,27 @@ rpc_verify_header(struct rpc_task *task)
case RPC_SUCCESS:
return p;
case RPC_PROG_UNAVAIL:
- dprintk("RPC: %5u %s: program %u is unsupported by server %s\n",
- task->tk_pid, __func__,
- (unsigned int)task->tk_client->cl_prog,
- task->tk_client->cl_server);
+ dprintk_rcu("RPC: %5u %s: program %u is unsupported "
+ "by server %s\n", task->tk_pid, __func__,
+ (unsigned int)clnt->cl_prog,
+ rcu_dereference(clnt->cl_xprt)->servername);
error = -EPFNOSUPPORT;
goto out_err;
case RPC_PROG_MISMATCH:
- dprintk("RPC: %5u %s: program %u, version %u unsupported by "
- "server %s\n", task->tk_pid, __func__,
- (unsigned int)task->tk_client->cl_prog,
- (unsigned int)task->tk_client->cl_vers,
- task->tk_client->cl_server);
+ dprintk_rcu("RPC: %5u %s: program %u, version %u unsupported "
+ "by server %s\n", task->tk_pid, __func__,
+ (unsigned int)clnt->cl_prog,
+ (unsigned int)clnt->cl_vers,
+ rcu_dereference(clnt->cl_xprt)->servername);
error = -EPROTONOSUPPORT;
goto out_err;
case RPC_PROC_UNAVAIL:
- dprintk("RPC: %5u %s: proc %s unsupported by program %u, "
+ dprintk_rcu("RPC: %5u %s: proc %s unsupported by program %u, "
"version %u on server %s\n",
task->tk_pid, __func__,
rpc_proc_name(task),
- task->tk_client->cl_prog,
- task->tk_client->cl_vers,
- task->tk_client->cl_server);
+ clnt->cl_prog, clnt->cl_vers,
+ rcu_dereference(clnt->cl_xprt)->servername);
error = -EOPNOTSUPP;
goto out_err;
case RPC_GARBAGE_ARGS:
@@ -1922,7 +1920,7 @@ rpc_verify_header(struct rpc_task *task)
}

out_garbage:
- task->tk_client->cl_stats->rpcgarbage++;
+ clnt->cl_stats->rpcgarbage++;
if (task->tk_garb_retry) {
task->tk_garb_retry--;
dprintk("RPC: %5u %s: retrying\n",
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index bbd636b..3c9b16f 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -385,7 +385,8 @@ rpc_show_info(struct seq_file *m, void *v)
struct rpc_clnt *clnt = m->private;

rcu_read_lock();
- seq_printf(m, "RPC server: %s\n", clnt->cl_server);
+ seq_printf(m, "RPC server: %s\n",
+ rcu_dereference(clnt->cl_xprt)->servername);
seq_printf(m, "service: %s (%d) version %d\n", clnt->cl_protname,
clnt->cl_prog, clnt->cl_vers);
seq_printf(m, "address: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_ADDR));
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 4f8af63..e699ff0 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -663,7 +663,7 @@ void rpcb_getport_async(struct rpc_task *task)

dprintk("RPC: %5u %s(%s, %u, %u, %d)\n",
task->tk_pid, __func__,
- clnt->cl_server, clnt->cl_prog, clnt->cl_vers, xprt->prot);
+ xprt->servername, clnt->cl_prog, clnt->cl_vers, xprt->prot);

/* Put self on the wait queue to ensure we get notified if
* some other task is already attempting to bind the port */
@@ -714,7 +714,7 @@ void rpcb_getport_async(struct rpc_task *task)
dprintk("RPC: %5u %s: trying rpcbind version %u\n",
task->tk_pid, __func__, bind_version);

- rpcb_clnt = rpcb_create(xprt->xprt_net, clnt->cl_server, sap, salen,
+ rpcb_clnt = rpcb_create(xprt->xprt_net, xprt->servername, sap, salen,
xprt->prot, bind_version);
if (IS_ERR(rpcb_clnt)) {
status = PTR_ERR(rpcb_clnt);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 32e3794..0cbcd1a 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -66,6 +66,7 @@ static void xprt_init(struct rpc_xprt *xprt, struct net *net);
static void xprt_request_init(struct rpc_task *, struct rpc_xprt *);
static void xprt_connect_status(struct rpc_task *task);
static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
+static void xprt_destroy(struct rpc_xprt *xprt);

static DEFINE_SPINLOCK(xprt_list_lock);
static LIST_HEAD(xprt_list);
@@ -751,7 +752,7 @@ static void xprt_connect_status(struct rpc_task *task)
default:
dprintk("RPC: %5u xprt_connect_status: error %d connecting to "
"server %s\n", task->tk_pid, -task->tk_status,
- task->tk_client->cl_server);
+ xprt->servername);
xprt_release_write(xprt, task);
task->tk_status = -EIO;
}
@@ -1229,6 +1230,17 @@ found:
(unsigned long)xprt);
else
init_timer(&xprt->timer);
+
+ if (strlen(args->servername) > RPC_MAXNETNAMELEN) {
+ xprt_destroy(xprt);
+ return ERR_PTR(-EINVAL);
+ }
+ xprt->servername = kstrdup(args->servername, GFP_KERNEL);
+ if (xprt->servername == NULL) {
+ xprt_destroy(xprt);
+ return ERR_PTR(-ENOMEM);
+ }
+
dprintk("RPC: created transport %p with %u slots\n", xprt,
xprt->max_reqs);
out:
@@ -1251,6 +1263,7 @@ static void xprt_destroy(struct rpc_xprt *xprt)
rpc_destroy_wait_queue(&xprt->sending);
rpc_destroy_wait_queue(&xprt->backlog);
cancel_work_sync(&xprt->task_cleanup);
+ kfree(xprt->servername);
/*
* Tear down transport state and free the rpc_xprt
*/


2012-03-01 22:02:07

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 13/15] NFS: Request fh_expire_type attribute in "server caps" operation

The fh_expire_type file attribute is a filesystem wide attribute that
consists of flags that indicate what characteristics file handles
on this FSID have.

Our client doesn't support volatile file handles. It should find
out early (say, at mount time) whether the server is going to play
shenanighans with file handles during a migration.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4proc.c | 1 +
fs/nfs/nfs4xdr.c | 26 ++++++++++++++++++++++++++
include/linux/nfs_fs_sb.h | 3 +++
include/linux/nfs_xdr.h | 1 +
4 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ced56d0..07bd1f3 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2220,6 +2220,7 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
server->cache_consistency_bitmask[0] &= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
server->cache_consistency_bitmask[1] &= FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
server->acl_bitmask = res.acl_bitmask;
+ server->fh_expire_type = res.fh_expire_type;
}

return status;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index bec7227..018bbd7 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2637,6 +2637,7 @@ static void nfs4_xdr_enc_server_caps(struct rpc_rqst *req,
encode_sequence(xdr, &args->seq_args, &hdr);
encode_putfh(xdr, args->fhandle, &hdr);
encode_getattr_one(xdr, FATTR4_WORD0_SUPPORTED_ATTRS|
+ FATTR4_WORD0_FH_EXPIRE_TYPE|
FATTR4_WORD0_LINK_SUPPORT|
FATTR4_WORD0_SYMLINK_SUPPORT|
FATTR4_WORD0_ACLSUPPORT, &hdr);
@@ -3184,6 +3185,28 @@ out_overflow:
return -EIO;
}

+static int decode_attr_fh_expire_type(struct xdr_stream *xdr,
+ uint32_t *bitmap, uint32_t *type)
+{
+ __be32 *p;
+
+ *type = 0;
+ if (unlikely(bitmap[0] & (FATTR4_WORD0_FH_EXPIRE_TYPE - 1U)))
+ return -EIO;
+ if (likely(bitmap[0] & FATTR4_WORD0_FH_EXPIRE_TYPE)) {
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ goto out_overflow;
+ *type = be32_to_cpup(p);
+ bitmap[0] &= ~FATTR4_WORD0_FH_EXPIRE_TYPE;
+ }
+ dprintk("%s: expire type=0x%x\n", __func__, *type);
+ return 0;
+out_overflow:
+ print_overflow_msg(__func__, xdr);
+ return -EIO;
+}
+
static int decode_attr_change(struct xdr_stream *xdr, uint32_t *bitmap, uint64_t *change)
{
__be32 *p;
@@ -4232,6 +4255,9 @@ static int decode_server_caps(struct xdr_stream *xdr, struct nfs4_server_caps_re
goto xdr_error;
if ((status = decode_attr_supported(xdr, bitmap, res->attr_bitmask)) != 0)
goto xdr_error;
+ if ((status = decode_attr_fh_expire_type(xdr, bitmap,
+ &res->fh_expire_type)) != 0)
+ goto xdr_error;
if ((status = decode_attr_link_support(xdr, bitmap, &res->has_links)) != 0)
goto xdr_error;
if ((status = decode_attr_symlink_support(xdr, bitmap, &res->has_symlinks)) != 0)
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 0a2c826..4f72b53 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -146,6 +146,9 @@ struct nfs_server {
u32 acl_bitmask; /* V4 bitmask representing the ACEs
that are supported on this
filesystem */
+ u32 fh_expire_type; /* V4 bitmask representing file
+ handle volatility type for
+ this filesystem */
struct pnfs_layoutdriver_type *pnfs_curr_ld; /* Active layout driver */
struct rpc_wait_queue roc_rpcwaitq;
void *pnfs_ld_data; /* per mount point data */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 07eb9c9..211b280 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -977,6 +977,7 @@ struct nfs4_server_caps_res {
u32 acl_bitmask;
u32 has_links;
u32 has_symlinks;
+ u32 fh_expire_type;
struct nfs4_sequence_res seq_res;
};



2012-03-01 22:00:41

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 03/15] NFS: Add debugging messages to NFSv4's CLOSE procedure

CLOSE is new with NFSv4. Sometimes it's important to know the timing
of this operation compared to things like lease renewal.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4proc.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 87c584d..53daab9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1983,6 +1983,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
struct nfs4_state *state = calldata->state;
struct nfs_server *server = NFS_SERVER(calldata->inode);

+ dprintk("%s: begin!\n", __func__);
if (!nfs4_sequence_done(task, &calldata->res.seq_res))
return;
/* hmm. we are done with the inode, and in the process of freeing
@@ -2010,6 +2011,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
}
nfs_release_seqid(calldata->arg.seqid);
nfs_refresh_inode(calldata->inode, calldata->res.fattr);
+ dprintk("%s: done, ret = %d!\n", __func__, task->tk_status);
}

static void nfs4_close_prepare(struct rpc_task *task, void *data)
@@ -2018,6 +2020,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
struct nfs4_state *state = calldata->state;
int call_close = 0;

+ dprintk("%s: begin!\n", __func__);
if (nfs_wait_on_sequence(calldata->arg.seqid, task) != 0)
return;

@@ -2042,7 +2045,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
if (!call_close) {
/* Note: exit _without_ calling nfs4_close_done */
task->tk_action = NULL;
- return;
+ goto out;
}

if (calldata->arg.fmode == 0) {
@@ -2051,7 +2054,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
pnfs_roc_drain(calldata->inode, &calldata->roc_barrier)) {
rpc_sleep_on(&NFS_SERVER(calldata->inode)->roc_rpcwaitq,
task, NULL);
- return;
+ goto out;
}
}

@@ -2061,8 +2064,10 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
&calldata->arg.seq_args,
&calldata->res.seq_res,
task))
- return;
+ goto out;
rpc_call_start(task);
+out:
+ dprintk("%s: done!\n", __func__);
}

static const struct rpc_call_ops nfs4_close_ops = {


2012-03-02 21:59:22

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

On 03/02/2012 10:10 AM, Chuck Lever wrote:
>
> On Mar 1, 2012, at 5:55 PM, Chuck Lever wrote:
>
>>
>> On Mar 1, 2012, at 5:39 PM, Myklebust, Trond wrote:
>>
>>> On Thu, 2012-03-01 at 17:02 -0500, Chuck Lever wrote:
>>>> Clean up due to code review.
>>>>
>>>> sizeof(struct nfs4_deviceid) is the size of the in-core device ID data
>>>> structure, but NFS4_DEVICEID4_SIZE is the number of octets in an XDR'd
>>>> device ID. The two are not interchangeable, even if they happen to
>>>> have the same value. If struct nfs4_deviceid is padded by the
>>>> compiler, sizeof(struct nfs4_deviceid) may not be the same as the
>>>> XDR'd size. Not likely, but still.
>>>>
>>>> Ensure that the data field is aligned to the largest pointer type that
>>>> is used to access it: in this case, that's u64. Type-punning among
>>>> types with different alignment has been discouraged in user space and
>>>> the kernel, to avoid unneeded pointer aliasing. The use of a union
>>>> is preferred instead.
>>>>
>>>> Ensure that XDR'ing and comparing is done via one of the union's data
>>>> fields, and not via the whole struct, again in case the compiler pads
>>>> the struct.
>>>>
>>>> As a micro-optimization, this change also ensures that the compiler
>>>> may perform memcpy() and memcmp() operations on these fields in
>>>> larger, more efficient, chunks.
>>>>
>>>> This patch needs review and testing by pnfs layout specialists.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>>
>>>> fs/nfs/blocklayout/blocklayout.c | 2 +-
>>>> fs/nfs/blocklayout/blocklayoutdev.c | 8 ++++----
>>>> fs/nfs/blocklayout/extents.c | 5 +++--
>>>> fs/nfs/callback_xdr.c | 2 +-
>>>> fs/nfs/nfs4filelayout.c | 3 +--
>>>> fs/nfs/nfs4xdr.c | 4 ++--
>>>> fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 ++++----
>>>> fs/nfs/pnfs_dev.c | 7 ++++---
>>>> include/linux/nfs4.h | 7 ++++++-
>>>> include/linux/pnfs_osd_xdr.h | 4 ++--
>>>> 10 files changed, 28 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
>>>> index 783ebd5..3b730bf 100644
>>>> --- a/fs/nfs/blocklayout/blocklayout.c
>>>> +++ b/fs/nfs/blocklayout/blocklayout.c
>>>> @@ -901,7 +901,7 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
>>>> dev->pglen = PAGE_SIZE * max_pages;
>>>> dev->mincount = 0;
>>>>
>>>> - dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.data);
>>>> + dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.u.data);
>>>> rc = nfs4_proc_getdeviceinfo(server, dev);
>>>> dprintk("%s getdevice info returns %d\n", __func__, rc);
>>>> if (rc) {
>>>> diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
>>>> index b48f782..2dafca5 100644
>>>> --- a/fs/nfs/blocklayout/blocklayoutdev.c
>>>> +++ b/fs/nfs/blocklayout/blocklayoutdev.c
>>>> @@ -124,8 +124,8 @@ nfs4_blk_decode_device(struct nfs_server *server,
>>>> struct nfs_net *nn = net_generic(net, nfs_net_id);
>>>>
>>>> dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
>>>> - dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
>>>> - dev->mincount);
>>>> + dprintk("%s: deviceid: %s, mincount: %d\n", __func__,
>>>> + dev->dev_id.u.data, dev->mincount);
>>>>
>>>> memset(&msg, 0, sizeof(msg));
>>>> msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
>>>> @@ -206,7 +206,7 @@ static struct block_device *translate_devid(struct pnfs_layout_hdr *lo,
>>>> mid = BLK_ID(lo);
>>>> spin_lock(&mid->bm_lock);
>>>> list_for_each_entry(dev, &mid->bm_devlist, bm_node) {
>>>> - if (memcmp(id->data, dev->bm_mdevid.data,
>>>> + if (memcmp(id->u.data, dev->bm_mdevid.u.data,
>>>> NFS4_DEVICEID4_SIZE) == 0) {
>>>> rv = dev->bm_mdev;
>>>> goto out;
>>>> @@ -323,7 +323,7 @@ nfs4_blk_process_layoutget(struct pnfs_layout_hdr *lo,
>>>> status = -ENOMEM;
>>>> goto out_err;
>>>> }
>>>> - memcpy(&be->be_devid, p, NFS4_DEVICEID4_SIZE);
>>>> + memcpy(&be->be_devid.u.data, p, NFS4_DEVICEID4_SIZE);
>>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>> be->be_mdev = translate_devid(lo, &be->be_devid);
>>>> if (!be->be_mdev)
>>>> diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
>>>> index 1f9a603..52b747c 100644
>>>> --- a/fs/nfs/blocklayout/extents.c
>>>> +++ b/fs/nfs/blocklayout/extents.c
>>>> @@ -675,10 +675,11 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
>>>> if (!xdr_start)
>>>> goto out;
>>>> list_for_each_entry_safe(lce, save, &bl->bl_commit, bse_node) {
>>>> - p = xdr_reserve_space(xdr, 7 * 4 + sizeof(lce->bse_devid.data));
>>>> + p = xdr_reserve_space(xdr, 7 * 4 + NFS4_DEVICEID4_SIZE);
>>>> if (!p)
>>>> break;
>>>> - p = xdr_encode_opaque_fixed(p, lce->bse_devid.data, NFS4_DEVICEID4_SIZE);
>>>> + p = xdr_encode_opaque_fixed(p, lce->bse_devid.u.data,
>>>> + NFS4_DEVICEID4_SIZE);
>>>> p = xdr_encode_hyper(p, lce->bse_f_offset << SECTOR_SHIFT);
>>>> p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
>>>> p = xdr_encode_hyper(p, 0LL);
>>>> diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
>>>> index 5466829..6061322 100644
>>>> --- a/fs/nfs/callback_xdr.c
>>>> +++ b/fs/nfs/callback_xdr.c
>>>> @@ -347,7 +347,7 @@ __be32 decode_devicenotify_args(struct svc_rqst *rqstp,
>>>> goto err;
>>>> }
>>>> dev->cbd_layout_type = ntohl(*p++);
>>>> - memcpy(dev->cbd_dev_id.data, p, NFS4_DEVICEID4_SIZE);
>>>> + memcpy(dev->cbd_dev_id.u.data, p, NFS4_DEVICEID4_SIZE);
>>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>>
>>>> if (dev->cbd_layout_type == NOTIFY_DEVICEID4_CHANGE) {
>>>> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
>>>> index 47e8f34..f0c6ba0 100644
>>>> --- a/fs/nfs/nfs4filelayout.c
>>>> +++ b/fs/nfs/nfs4filelayout.c
>>>> @@ -550,8 +550,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
>>>> if (unlikely(!p))
>>>> goto out_err;
>>>>
>>>> - memcpy(id, p, sizeof(*id));
>>>> - p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>> + p = xdr_decode_opaque_fixed(p, id->u.data, NFS4_DEVICEID4_SIZE);
>>>> nfs4_print_deviceid(id);
>>>>
>>>> nfl_util = be32_to_cpup(p++);
>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>> index 80ba010..91020ea 100644
>>>> --- a/fs/nfs/nfs4xdr.c
>>>> +++ b/fs/nfs/nfs4xdr.c
>>>> @@ -1946,7 +1946,7 @@ encode_getdeviceinfo(struct xdr_stream *xdr,
>>>>
>>>> p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
>>>> *p++ = cpu_to_be32(OP_GETDEVICEINFO);
>>>> - p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
>>>> + p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.u.data,
>>>> NFS4_DEVICEID4_SIZE);
>>>> *p++ = cpu_to_be32(args->pdev->layout_type);
>>>> *p++ = cpu_to_be32(args->pdev->pglen); /* gdia_maxcount */
>>>> @@ -5492,7 +5492,7 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
>>>> if (unlikely(!p))
>>>> goto out_overflow;
>>>> for (i = 0; i < res->num_devs; i++)
>>>> - p = xdr_decode_opaque_fixed(p, res->dev_id[i].data,
>>>> + p = xdr_decode_opaque_fixed(p, res->dev_id[i].u.data,
>>>> NFS4_DEVICEID4_SIZE);
>>>> res->eof = be32_to_cpup(p);
>>>> return 0;
>>>> diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>> index b3918f7..99a1e89 100644
>>>> --- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>> +++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>>> @@ -55,8 +55,8 @@
>>>> static __be32 *
>>>> _osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid)
>>>> {
>>>> - p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data,
>>>> - sizeof(objid->oid_device_id.data));
>>>> + p = xdr_decode_opaque_fixed(p, objid->oid_device_id.u.data,
>>>> + NFS4_DEVICEID4_SIZE);
>>>>
>>>> p = xdr_decode_hyper(p, &objid->oid_partition_id);
>>>> p = xdr_decode_hyper(p, &objid->oid_object_id);
>>>> @@ -377,8 +377,8 @@ pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr,
>>>> static inline __be32 *
>>>> pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id)
>>>> {
>>>> - p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data,
>>>> - sizeof(object_id->oid_device_id.data));
>>>> + p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.u.data,
>>>> + NFS4_DEVICEID4_SIZE);
>>>> p = xdr_encode_hyper(p, object_id->oid_partition_id);
>>>> p = xdr_encode_hyper(p, object_id->oid_object_id);
>>>>
>>>> diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
>>>> index 4f359d2..6c6ab81 100644
>>>> --- a/fs/nfs/pnfs_dev.c
>>>> +++ b/fs/nfs/pnfs_dev.c
>>>> @@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(nfs4_deviceid_lock);
>>>> void
>>>> nfs4_print_deviceid(const struct nfs4_deviceid *id)
>>>> {
>>>> - u32 *p = (u32 *)id;
>>>> + const u32 *p = id->u.data32;
>>>>
>>>> dprintk("%s: device id= [%x%x%x%x]\n", __func__,
>>>> p[0], p[1], p[2], p[3]);
>>>> @@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
>>>> static inline u32
>>>> nfs4_deviceid_hash(const struct nfs4_deviceid *id)
>>>> {
>>>> - unsigned char *cptr = (unsigned char *)id->data;
>>>> + const unsigned char *cptr = id->u.data;
>>>> unsigned int nbytes = NFS4_DEVICEID4_SIZE;
>>>> u32 x = 0;
>>>>
>>>> @@ -77,7 +77,8 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,
>>>>
>>>> hlist_for_each_entry_rcu(d, n, &nfs4_deviceid_cache[hash], node)
>>>> if (d->ld == ld && d->nfs_client == clp &&
>>>> - !memcmp(&d->deviceid, id, sizeof(*id))) {
>>>> + !memcmp(&d->deviceid.u.data, id->u.data,
>>>> + NFS4_DEVICEID4_SIZE)) {
>>>> if (atomic_read(&d->ref))
>>>> return d;
>>>> else
>>>> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
>>>> index b1e6b64..979f607 100644
>>>> --- a/include/linux/nfs4.h
>>>> +++ b/include/linux/nfs4.h
>>>> @@ -649,9 +649,14 @@ enum filelayout_hint_care4 {
>>>> #define NFS4_DEVICEID4_SIZE 16
>>>>
>>>> struct nfs4_deviceid {
>>>> - char data[NFS4_DEVICEID4_SIZE];
>>>> + union {
>>>> + unsigned char data[NFS4_DEVICEID4_SIZE];
>>>> + u32 data32[NFS4_DEVICEID4_SIZE / sizeof(u32)];
>>>> + u64 data64[NFS4_DEVICEID4_SIZE / sizeof(u64)];
>>>> + } u;
>>>> };
>>>>
>>>
>>> Ugh... This just looks worse and worse.... Let's rather just keep these
>>> as unsigned char. I don't care if OSDs have special secret meanings and
>>> all that crap. They can cast the damned thing if they need to. The rest
>>> of us should be considering this to be opaque data.
>>
>> One possibility is that OSD and block layout drivers can construct their device ID in a u64 array on the stack, and then memcpy() the array to the nfs4_deviceid.
>>
>> But we can't cast pointers to an "unsigned char" field, since it's not guaranteed to be aligned. That's a real bug, one that's hit us before.
>
> Accessing a "char x[16]" field by casting x to a (u64 *) is considered harmful. It will work when x happens to be aligned to 64-bits, but otherwise accessing a field that way will cause an oops.
>
> It's not clear why the OSD layout driver needs to print device IDs as a pair of unsigned long longs. Why can't nfs4_print_deviceid() be used ?
>

For the same reason we do debug prints at all, for inconvenience.

If you want I can use the get_unaligned() macro if you prefer. That should be perfectly safe.
(And will do nothing on x86 ARCs)

I agree that it should stay just old plain "char data[NFS4_DEVICEID4_SIZE];"

Thanks
Boaz





2012-03-01 22:00:33

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 02/15] NFS: Clean up debugging in decode_pathname()

I noticed recently that decode_attr_fs_locations() is not generating
very pretty debugging output. The pathname components each appear on
a separate line of output, though that does not appear to be the
intended display behavior. The preferred way to generate continued
lines of output on the console is to use pr_cont().

Note that incoming pathname4 components contain a string that is not
necessarily NUL-terminated. I did actually see some trailing garbage
on the console. In addition to correcting the line continuation
problem, add a string precision format specifier to ensure that each
component string is displayed properly, and that vsnprintf() does
not Oops.

Someone pointed out that allowing incoming network data to possibly
generate a console line of unbounded length may not be such a good
idea. Since this output will rarely be enabled, and there is a hard
upper bound (NFS4_PATHNAME_MAXCOMPONENTS) in our implementation, this
is probably not a major concern.

It might be useful to additionally sanity-check the length of each
incoming component, however. RFC 3530bis15 does not suggest a maximum
number of UTF-8 characters per component for either the pathname4 or
component4 types. However, we could invent one that is appropriate
for our implementation.

Another possibility is to scrap all of this and print these pathnames
in upper layers after a reasonable amount of sanity checking in the
XDR layer. This would give us an opportunity to allocate a full
buffer so that the whole pathname would be output via a single
dprintk.

Introduced by commit 7aaa0b3b: "NFSv4: convert fs-locations-components
to conform to RFC3530," (June 9, 2006).

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/nfs4xdr.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index ae78343..4ca87cb 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -3514,16 +3514,17 @@ static int decode_pathname(struct xdr_stream *xdr, struct nfs4_pathname *path)
n = be32_to_cpup(p);
if (n == 0)
goto root_path;
- dprintk("path ");
+ dprintk("pathname4: ");
path->ncomponents = 0;
while (path->ncomponents < n) {
struct nfs4_string *component = &path->components[path->ncomponents];
status = decode_opaque_inline(xdr, &component->len, &component->data);
if (unlikely(status != 0))
goto out_eio;
- if (path->ncomponents != n)
- dprintk("/");
- dprintk("%s", component->data);
+ if (unlikely(nfs_debug & NFSDBG_XDR))
+ pr_cont("%s%.*s ",
+ (path->ncomponents != n ? "/ " : ""),
+ component->len, component->data);
if (path->ncomponents < NFS4_PATHNAME_MAXCOMPONENTS)
path->ncomponents++;
else {
@@ -3532,14 +3533,13 @@ static int decode_pathname(struct xdr_stream *xdr, struct nfs4_pathname *path)
}
}
out:
- dprintk("\n");
return status;
root_path:
/* a root pathname is sent as a zero component4 */
path->ncomponents = 1;
path->components[0].len=0;
path->components[0].data=NULL;
- dprintk("path /\n");
+ dprintk("pathname4: /\n");
goto out;
out_eio:
dprintk(" status %d", status);
@@ -3565,7 +3565,7 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
/* Ignore borken servers that return unrequested attrs */
if (unlikely(res == NULL))
goto out;
- dprintk("%s: fsroot ", __func__);
+ dprintk("%s: fsroot:\n", __func__);
status = decode_pathname(xdr, &res->fs_path);
if (unlikely(status != 0))
goto out;
@@ -3586,7 +3586,7 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
m = be32_to_cpup(p);

loc->nservers = 0;
- dprintk("%s: servers ", __func__);
+ dprintk("%s: servers:\n", __func__);
while (loc->nservers < m) {
struct nfs4_string *server = &loc->servers[loc->nservers];
status = decode_opaque_inline(xdr, &server->len, &server->data);


2012-03-01 22:01:33

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles

For debugging, introduce a simplistic function to print NFS file
handles on the system console. The main function is hooked into the
dprintk debugging facility, but you can directly call the helper,
_nfs_display_fhandle(), if you want to print a handle unconditionally.

Signed-off-by: Chuck Lever <[email protected]>
---

fs/nfs/inode.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/nfs_fs.h | 14 ++++++++++++++
2 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 6c66259..99a4f52 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1046,6 +1046,51 @@ struct nfs_fh *nfs_alloc_fhandle(void)
}

/**
+ * _nfs_display_fhandle - display an NFS file handle on the console
+ *
+ * @fh: file handle to display
+ * @caption: display caption
+ *
+ * For debugging only.
+ */
+#ifdef RPC_DEBUG
+void _nfs_display_fhandle(const struct nfs_fh *fh, const char *caption)
+{
+ unsigned short i;
+
+ if (fh->size == 0 || fh == NULL) {
+ printk(KERN_DEFAULT "%s at %p is empty\n", caption, fh);
+ return;
+ }
+
+ printk(KERN_DEFAULT "%s at %p is %u bytes:\n", caption, fh, fh->size);
+ for (i = 0; i < fh->size; i += 16) {
+ __be32 *pos = (__be32 *)&fh->data[i];
+
+ switch ((fh->size - i - 1) >> 2) {
+ case 0:
+ printk(KERN_DEFAULT " %08x\n",
+ be32_to_cpup(pos));
+ break;
+ case 1:
+ printk(KERN_DEFAULT " %08x %08x\n",
+ be32_to_cpup(pos), be32_to_cpup(pos + 1));
+ break;
+ case 2:
+ printk(KERN_DEFAULT " %08x %08x %08x\n",
+ be32_to_cpup(pos), be32_to_cpup(pos + 1),
+ be32_to_cpup(pos + 2));
+ break;
+ default:
+ printk(KERN_DEFAULT " %08x %08x %08x %08x\n",
+ be32_to_cpup(pos), be32_to_cpup(pos + 1),
+ be32_to_cpup(pos + 2), be32_to_cpup(pos + 3));
+ }
+ }
+}
+#endif
+
+/**
* nfs_inode_attrs_need_update - check if the inode attributes need updating
* @inode - pointer to inode
* @fattr - attributes
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 8c29950..c07a757 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -395,6 +395,20 @@ static inline void nfs_free_fhandle(const struct nfs_fh *fh)
kfree(fh);
}

+#ifdef RPC_DEBUG
+extern void _nfs_display_fhandle(const struct nfs_fh *fh, const char *caption);
+#define nfs_display_fhandle(fh, caption) \
+ do { \
+ if (unlikely(nfs_debug & NFSDBG_FACILITY)) \
+ _nfs_display_fhandle(fh, caption); \
+ } while (0)
+#else
+static inline void nfs_display_fhandle(const struct nfs_fh *fh,
+ const char *caption)
+{
+}
+#endif
+
/*
* linux/fs/nfs/nfsroot.c
*/


2012-03-02 18:11:08

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids


On Mar 1, 2012, at 5:55 PM, Chuck Lever wrote:

>
> On Mar 1, 2012, at 5:39 PM, Myklebust, Trond wrote:
>
>> On Thu, 2012-03-01 at 17:02 -0500, Chuck Lever wrote:
>>> Clean up due to code review.
>>>
>>> sizeof(struct nfs4_deviceid) is the size of the in-core device ID data
>>> structure, but NFS4_DEVICEID4_SIZE is the number of octets in an XDR'd
>>> device ID. The two are not interchangeable, even if they happen to
>>> have the same value. If struct nfs4_deviceid is padded by the
>>> compiler, sizeof(struct nfs4_deviceid) may not be the same as the
>>> XDR'd size. Not likely, but still.
>>>
>>> Ensure that the data field is aligned to the largest pointer type that
>>> is used to access it: in this case, that's u64. Type-punning among
>>> types with different alignment has been discouraged in user space and
>>> the kernel, to avoid unneeded pointer aliasing. The use of a union
>>> is preferred instead.
>>>
>>> Ensure that XDR'ing and comparing is done via one of the union's data
>>> fields, and not via the whole struct, again in case the compiler pads
>>> the struct.
>>>
>>> As a micro-optimization, this change also ensures that the compiler
>>> may perform memcpy() and memcmp() operations on these fields in
>>> larger, more efficient, chunks.
>>>
>>> This patch needs review and testing by pnfs layout specialists.
>>>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>> ---
>>>
>>> fs/nfs/blocklayout/blocklayout.c | 2 +-
>>> fs/nfs/blocklayout/blocklayoutdev.c | 8 ++++----
>>> fs/nfs/blocklayout/extents.c | 5 +++--
>>> fs/nfs/callback_xdr.c | 2 +-
>>> fs/nfs/nfs4filelayout.c | 3 +--
>>> fs/nfs/nfs4xdr.c | 4 ++--
>>> fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 8 ++++----
>>> fs/nfs/pnfs_dev.c | 7 ++++---
>>> include/linux/nfs4.h | 7 ++++++-
>>> include/linux/pnfs_osd_xdr.h | 4 ++--
>>> 10 files changed, 28 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
>>> index 783ebd5..3b730bf 100644
>>> --- a/fs/nfs/blocklayout/blocklayout.c
>>> +++ b/fs/nfs/blocklayout/blocklayout.c
>>> @@ -901,7 +901,7 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
>>> dev->pglen = PAGE_SIZE * max_pages;
>>> dev->mincount = 0;
>>>
>>> - dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.data);
>>> + dprintk("%s: dev_id: %s\n", __func__, dev->dev_id.u.data);
>>> rc = nfs4_proc_getdeviceinfo(server, dev);
>>> dprintk("%s getdevice info returns %d\n", __func__, rc);
>>> if (rc) {
>>> diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
>>> index b48f782..2dafca5 100644
>>> --- a/fs/nfs/blocklayout/blocklayoutdev.c
>>> +++ b/fs/nfs/blocklayout/blocklayoutdev.c
>>> @@ -124,8 +124,8 @@ nfs4_blk_decode_device(struct nfs_server *server,
>>> struct nfs_net *nn = net_generic(net, nfs_net_id);
>>>
>>> dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
>>> - dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
>>> - dev->mincount);
>>> + dprintk("%s: deviceid: %s, mincount: %d\n", __func__,
>>> + dev->dev_id.u.data, dev->mincount);
>>>
>>> memset(&msg, 0, sizeof(msg));
>>> msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
>>> @@ -206,7 +206,7 @@ static struct block_device *translate_devid(struct pnfs_layout_hdr *lo,
>>> mid = BLK_ID(lo);
>>> spin_lock(&mid->bm_lock);
>>> list_for_each_entry(dev, &mid->bm_devlist, bm_node) {
>>> - if (memcmp(id->data, dev->bm_mdevid.data,
>>> + if (memcmp(id->u.data, dev->bm_mdevid.u.data,
>>> NFS4_DEVICEID4_SIZE) == 0) {
>>> rv = dev->bm_mdev;
>>> goto out;
>>> @@ -323,7 +323,7 @@ nfs4_blk_process_layoutget(struct pnfs_layout_hdr *lo,
>>> status = -ENOMEM;
>>> goto out_err;
>>> }
>>> - memcpy(&be->be_devid, p, NFS4_DEVICEID4_SIZE);
>>> + memcpy(&be->be_devid.u.data, p, NFS4_DEVICEID4_SIZE);
>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>> be->be_mdev = translate_devid(lo, &be->be_devid);
>>> if (!be->be_mdev)
>>> diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
>>> index 1f9a603..52b747c 100644
>>> --- a/fs/nfs/blocklayout/extents.c
>>> +++ b/fs/nfs/blocklayout/extents.c
>>> @@ -675,10 +675,11 @@ encode_pnfs_block_layoutupdate(struct pnfs_block_layout *bl,
>>> if (!xdr_start)
>>> goto out;
>>> list_for_each_entry_safe(lce, save, &bl->bl_commit, bse_node) {
>>> - p = xdr_reserve_space(xdr, 7 * 4 + sizeof(lce->bse_devid.data));
>>> + p = xdr_reserve_space(xdr, 7 * 4 + NFS4_DEVICEID4_SIZE);
>>> if (!p)
>>> break;
>>> - p = xdr_encode_opaque_fixed(p, lce->bse_devid.data, NFS4_DEVICEID4_SIZE);
>>> + p = xdr_encode_opaque_fixed(p, lce->bse_devid.u.data,
>>> + NFS4_DEVICEID4_SIZE);
>>> p = xdr_encode_hyper(p, lce->bse_f_offset << SECTOR_SHIFT);
>>> p = xdr_encode_hyper(p, lce->bse_length << SECTOR_SHIFT);
>>> p = xdr_encode_hyper(p, 0LL);
>>> diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
>>> index 5466829..6061322 100644
>>> --- a/fs/nfs/callback_xdr.c
>>> +++ b/fs/nfs/callback_xdr.c
>>> @@ -347,7 +347,7 @@ __be32 decode_devicenotify_args(struct svc_rqst *rqstp,
>>> goto err;
>>> }
>>> dev->cbd_layout_type = ntohl(*p++);
>>> - memcpy(dev->cbd_dev_id.data, p, NFS4_DEVICEID4_SIZE);
>>> + memcpy(dev->cbd_dev_id.u.data, p, NFS4_DEVICEID4_SIZE);
>>> p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>>
>>> if (dev->cbd_layout_type == NOTIFY_DEVICEID4_CHANGE) {
>>> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
>>> index 47e8f34..f0c6ba0 100644
>>> --- a/fs/nfs/nfs4filelayout.c
>>> +++ b/fs/nfs/nfs4filelayout.c
>>> @@ -550,8 +550,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
>>> if (unlikely(!p))
>>> goto out_err;
>>>
>>> - memcpy(id, p, sizeof(*id));
>>> - p += XDR_QUADLEN(NFS4_DEVICEID4_SIZE);
>>> + p = xdr_decode_opaque_fixed(p, id->u.data, NFS4_DEVICEID4_SIZE);
>>> nfs4_print_deviceid(id);
>>>
>>> nfl_util = be32_to_cpup(p++);
>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>> index 80ba010..91020ea 100644
>>> --- a/fs/nfs/nfs4xdr.c
>>> +++ b/fs/nfs/nfs4xdr.c
>>> @@ -1946,7 +1946,7 @@ encode_getdeviceinfo(struct xdr_stream *xdr,
>>>
>>> p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
>>> *p++ = cpu_to_be32(OP_GETDEVICEINFO);
>>> - p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
>>> + p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.u.data,
>>> NFS4_DEVICEID4_SIZE);
>>> *p++ = cpu_to_be32(args->pdev->layout_type);
>>> *p++ = cpu_to_be32(args->pdev->pglen); /* gdia_maxcount */
>>> @@ -5492,7 +5492,7 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
>>> if (unlikely(!p))
>>> goto out_overflow;
>>> for (i = 0; i < res->num_devs; i++)
>>> - p = xdr_decode_opaque_fixed(p, res->dev_id[i].data,
>>> + p = xdr_decode_opaque_fixed(p, res->dev_id[i].u.data,
>>> NFS4_DEVICEID4_SIZE);
>>> res->eof = be32_to_cpup(p);
>>> return 0;
>>> diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>> index b3918f7..99a1e89 100644
>>> --- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>> +++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
>>> @@ -55,8 +55,8 @@
>>> static __be32 *
>>> _osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid)
>>> {
>>> - p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data,
>>> - sizeof(objid->oid_device_id.data));
>>> + p = xdr_decode_opaque_fixed(p, objid->oid_device_id.u.data,
>>> + NFS4_DEVICEID4_SIZE);
>>>
>>> p = xdr_decode_hyper(p, &objid->oid_partition_id);
>>> p = xdr_decode_hyper(p, &objid->oid_object_id);
>>> @@ -377,8 +377,8 @@ pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr,
>>> static inline __be32 *
>>> pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id)
>>> {
>>> - p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data,
>>> - sizeof(object_id->oid_device_id.data));
>>> + p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.u.data,
>>> + NFS4_DEVICEID4_SIZE);
>>> p = xdr_encode_hyper(p, object_id->oid_partition_id);
>>> p = xdr_encode_hyper(p, object_id->oid_object_id);
>>>
>>> diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
>>> index 4f359d2..6c6ab81 100644
>>> --- a/fs/nfs/pnfs_dev.c
>>> +++ b/fs/nfs/pnfs_dev.c
>>> @@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(nfs4_deviceid_lock);
>>> void
>>> nfs4_print_deviceid(const struct nfs4_deviceid *id)
>>> {
>>> - u32 *p = (u32 *)id;
>>> + const u32 *p = id->u.data32;
>>>
>>> dprintk("%s: device id= [%x%x%x%x]\n", __func__,
>>> p[0], p[1], p[2], p[3]);
>>> @@ -56,7 +56,7 @@ EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
>>> static inline u32
>>> nfs4_deviceid_hash(const struct nfs4_deviceid *id)
>>> {
>>> - unsigned char *cptr = (unsigned char *)id->data;
>>> + const unsigned char *cptr = id->u.data;
>>> unsigned int nbytes = NFS4_DEVICEID4_SIZE;
>>> u32 x = 0;
>>>
>>> @@ -77,7 +77,8 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,
>>>
>>> hlist_for_each_entry_rcu(d, n, &nfs4_deviceid_cache[hash], node)
>>> if (d->ld == ld && d->nfs_client == clp &&
>>> - !memcmp(&d->deviceid, id, sizeof(*id))) {
>>> + !memcmp(&d->deviceid.u.data, id->u.data,
>>> + NFS4_DEVICEID4_SIZE)) {
>>> if (atomic_read(&d->ref))
>>> return d;
>>> else
>>> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
>>> index b1e6b64..979f607 100644
>>> --- a/include/linux/nfs4.h
>>> +++ b/include/linux/nfs4.h
>>> @@ -649,9 +649,14 @@ enum filelayout_hint_care4 {
>>> #define NFS4_DEVICEID4_SIZE 16
>>>
>>> struct nfs4_deviceid {
>>> - char data[NFS4_DEVICEID4_SIZE];
>>> + union {
>>> + unsigned char data[NFS4_DEVICEID4_SIZE];
>>> + u32 data32[NFS4_DEVICEID4_SIZE / sizeof(u32)];
>>> + u64 data64[NFS4_DEVICEID4_SIZE / sizeof(u64)];
>>> + } u;
>>> };
>>>
>>
>> Ugh... This just looks worse and worse.... Let's rather just keep these
>> as unsigned char. I don't care if OSDs have special secret meanings and
>> all that crap. They can cast the damned thing if they need to. The rest
>> of us should be considering this to be opaque data.
>
> One possibility is that OSD and block layout drivers can construct their device ID in a u64 array on the stack, and then memcpy() the array to the nfs4_deviceid.
>
> But we can't cast pointers to an "unsigned char" field, since it's not guaranteed to be aligned. That's a real bug, one that's hit us before.

Accessing a "char x[16]" field by casting x to a (u64 *) is considered harmful. It will work when x happens to be aligned to 64-bits, but otherwise accessing a field that way will cause an oops.

It's not clear why the OSD layout driver needs to print device IDs as a pair of unsigned long longs. Why can't nfs4_print_deviceid() be used ?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-01 22:40:04

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 15/15] NFS: Fix some minor problems with nfs4_deviceids

T24gVGh1LCAyMDEyLTAzLTAxIGF0IDE3OjAyIC0wNTAwLCBDaHVjayBMZXZlciB3cm90ZToNCj4g
Q2xlYW4gdXAgZHVlIHRvIGNvZGUgcmV2aWV3Lg0KPiANCj4gc2l6ZW9mKHN0cnVjdCBuZnM0X2Rl
dmljZWlkKSBpcyB0aGUgc2l6ZSBvZiB0aGUgaW4tY29yZSBkZXZpY2UgSUQgZGF0YQ0KPiBzdHJ1
Y3R1cmUsIGJ1dCBORlM0X0RFVklDRUlENF9TSVpFIGlzIHRoZSBudW1iZXIgb2Ygb2N0ZXRzIGlu
IGFuIFhEUidkDQo+IGRldmljZSBJRC4gIFRoZSB0d28gYXJlIG5vdCBpbnRlcmNoYW5nZWFibGUs
IGV2ZW4gaWYgdGhleSBoYXBwZW4gdG8NCj4gaGF2ZSB0aGUgc2FtZSB2YWx1ZS4gIElmIHN0cnVj
dCBuZnM0X2RldmljZWlkIGlzIHBhZGRlZCBieSB0aGUNCj4gY29tcGlsZXIsIHNpemVvZihzdHJ1
Y3QgbmZzNF9kZXZpY2VpZCkgbWF5IG5vdCBiZSB0aGUgc2FtZSBhcyB0aGUNCj4gWERSJ2Qgc2l6
ZS4gIE5vdCBsaWtlbHksIGJ1dCBzdGlsbC4NCj4gDQo+IEVuc3VyZSB0aGF0IHRoZSBkYXRhIGZp
ZWxkIGlzIGFsaWduZWQgdG8gdGhlIGxhcmdlc3QgcG9pbnRlciB0eXBlIHRoYXQNCj4gaXMgdXNl
ZCB0byBhY2Nlc3MgaXQ6IGluIHRoaXMgY2FzZSwgdGhhdCdzIHU2NC4gIFR5cGUtcHVubmluZyBh
bW9uZw0KPiB0eXBlcyB3aXRoIGRpZmZlcmVudCBhbGlnbm1lbnQgaGFzIGJlZW4gZGlzY291cmFn
ZWQgaW4gdXNlciBzcGFjZSBhbmQNCj4gdGhlIGtlcm5lbCwgdG8gYXZvaWQgdW5uZWVkZWQgcG9p
bnRlciBhbGlhc2luZy4gIFRoZSB1c2Ugb2YgYSB1bmlvbg0KPiBpcyBwcmVmZXJyZWQgaW5zdGVh
ZC4NCj4gDQo+IEVuc3VyZSB0aGF0IFhEUidpbmcgYW5kIGNvbXBhcmluZyBpcyBkb25lIHZpYSBv
bmUgb2YgdGhlIHVuaW9uJ3MgZGF0YQ0KPiBmaWVsZHMsIGFuZCBub3QgdmlhIHRoZSB3aG9sZSBz
dHJ1Y3QsIGFnYWluIGluIGNhc2UgdGhlIGNvbXBpbGVyIHBhZHMNCj4gdGhlIHN0cnVjdC4NCj4g
DQo+IEFzIGEgbWljcm8tb3B0aW1pemF0aW9uLCB0aGlzIGNoYW5nZSBhbHNvIGVuc3VyZXMgdGhh
dCB0aGUgY29tcGlsZXINCj4gbWF5IHBlcmZvcm0gbWVtY3B5KCkgYW5kIG1lbWNtcCgpIG9wZXJh
dGlvbnMgb24gdGhlc2UgZmllbGRzIGluDQo+IGxhcmdlciwgbW9yZSBlZmZpY2llbnQsIGNodW5r
cy4NCj4gDQo+IFRoaXMgcGF0Y2ggbmVlZHMgcmV2aWV3IGFuZCB0ZXN0aW5nIGJ5IHBuZnMgbGF5
b3V0IHNwZWNpYWxpc3RzLg0KPiANCj4gU2lnbmVkLW9mZi1ieTogQ2h1Y2sgTGV2ZXIgPGNodWNr
LmxldmVyQG9yYWNsZS5jb20+DQo+IC0tLQ0KPiANCj4gIGZzL25mcy9ibG9ja2xheW91dC9ibG9j
a2xheW91dC5jICAgIHwgICAgMiArLQ0KPiAgZnMvbmZzL2Jsb2NrbGF5b3V0L2Jsb2NrbGF5b3V0
ZGV2LmMgfCAgICA4ICsrKystLS0tDQo+ICBmcy9uZnMvYmxvY2tsYXlvdXQvZXh0ZW50cy5jICAg
ICAgICB8ICAgIDUgKysrLS0NCj4gIGZzL25mcy9jYWxsYmFja194ZHIuYyAgICAgICAgICAgICAg
IHwgICAgMiArLQ0KPiAgZnMvbmZzL25mczRmaWxlbGF5b3V0LmMgICAgICAgICAgICAgfCAgICAz
ICstLQ0KPiAgZnMvbmZzL25mczR4ZHIuYyAgICAgICAgICAgICAgICAgICAgfCAgICA0ICsrLS0N
Cj4gIGZzL25mcy9vYmpsYXlvdXQvcG5mc19vc2RfeGRyX2NsaS5jIHwgICAgOCArKysrLS0tLQ0K
PiAgZnMvbmZzL3BuZnNfZGV2LmMgICAgICAgICAgICAgICAgICAgfCAgICA3ICsrKystLS0NCj4g
IGluY2x1ZGUvbGludXgvbmZzNC5oICAgICAgICAgICAgICAgIHwgICAgNyArKysrKystDQo+ICBp
bmNsdWRlL2xpbnV4L3BuZnNfb3NkX3hkci5oICAgICAgICB8ICAgIDQgKystLQ0KPiAgMTAgZmls
ZXMgY2hhbmdlZCwgMjggaW5zZXJ0aW9ucygrKSwgMjIgZGVsZXRpb25zKC0pDQo+IA0KPiBkaWZm
IC0tZ2l0IGEvZnMvbmZzL2Jsb2NrbGF5b3V0L2Jsb2NrbGF5b3V0LmMgYi9mcy9uZnMvYmxvY2ts
YXlvdXQvYmxvY2tsYXlvdXQuYw0KPiBpbmRleCA3ODNlYmQ1Li4zYjczMGJmIDEwMDY0NA0KPiAt
LS0gYS9mcy9uZnMvYmxvY2tsYXlvdXQvYmxvY2tsYXlvdXQuYw0KPiArKysgYi9mcy9uZnMvYmxv
Y2tsYXlvdXQvYmxvY2tsYXlvdXQuYw0KPiBAQCAtOTAxLDcgKzkwMSw3IEBAIG5mczRfYmxrX2dl
dF9kZXZpY2VpbmZvKHN0cnVjdCBuZnNfc2VydmVyICpzZXJ2ZXIsIGNvbnN0IHN0cnVjdCBuZnNf
ZmggKmZoLA0KPiAgCWRldi0+cGdsZW4gPSBQQUdFX1NJWkUgKiBtYXhfcGFnZXM7DQo+ICAJZGV2
LT5taW5jb3VudCA9IDA7DQo+ICANCj4gLQlkcHJpbnRrKCIlczogZGV2X2lkOiAlc1xuIiwgX19m
dW5jX18sIGRldi0+ZGV2X2lkLmRhdGEpOw0KPiArCWRwcmludGsoIiVzOiBkZXZfaWQ6ICVzXG4i
LCBfX2Z1bmNfXywgZGV2LT5kZXZfaWQudS5kYXRhKTsNCj4gIAlyYyA9IG5mczRfcHJvY19nZXRk
ZXZpY2VpbmZvKHNlcnZlciwgZGV2KTsNCj4gIAlkcHJpbnRrKCIlcyBnZXRkZXZpY2UgaW5mbyBy
ZXR1cm5zICVkXG4iLCBfX2Z1bmNfXywgcmMpOw0KPiAgCWlmIChyYykgew0KPiBkaWZmIC0tZ2l0
IGEvZnMvbmZzL2Jsb2NrbGF5b3V0L2Jsb2NrbGF5b3V0ZGV2LmMgYi9mcy9uZnMvYmxvY2tsYXlv
dXQvYmxvY2tsYXlvdXRkZXYuYw0KPiBpbmRleCBiNDhmNzgyLi4yZGFmY2E1IDEwMDY0NA0KPiAt
LS0gYS9mcy9uZnMvYmxvY2tsYXlvdXQvYmxvY2tsYXlvdXRkZXYuYw0KPiArKysgYi9mcy9uZnMv
YmxvY2tsYXlvdXQvYmxvY2tsYXlvdXRkZXYuYw0KPiBAQCAtMTI0LDggKzEyNCw4IEBAIG5mczRf
YmxrX2RlY29kZV9kZXZpY2Uoc3RydWN0IG5mc19zZXJ2ZXIgKnNlcnZlciwNCj4gIAlzdHJ1Y3Qg
bmZzX25ldCAqbm4gPSBuZXRfZ2VuZXJpYyhuZXQsIG5mc19uZXRfaWQpOw0KPiAgDQo+ICAJZHBy
aW50aygiJXMgQ1JFQVRJTkcgUElQRUZTIE1FU1NBR0VcbiIsIF9fZnVuY19fKTsNCj4gLQlkcHJp
bnRrKCIlczogZGV2aWNlaWQ6ICVzLCBtaW5jb3VudDogJWRcbiIsIF9fZnVuY19fLCBkZXYtPmRl
dl9pZC5kYXRhLA0KPiAtCQlkZXYtPm1pbmNvdW50KTsNCj4gKwlkcHJpbnRrKCIlczogZGV2aWNl
aWQ6ICVzLCBtaW5jb3VudDogJWRcbiIsIF9fZnVuY19fLA0KPiArCQlkZXYtPmRldl9pZC51LmRh
dGEsIGRldi0+bWluY291bnQpOw0KPiAgDQo+ICAJbWVtc2V0KCZtc2csIDAsIHNpemVvZihtc2cp
KTsNCj4gIAltc2cuZGF0YSA9IGt6YWxsb2Moc2l6ZW9mKGJsX21zZykgKyBkZXYtPm1pbmNvdW50
LCBHRlBfTk9GUyk7DQo+IEBAIC0yMDYsNyArMjA2LDcgQEAgc3RhdGljIHN0cnVjdCBibG9ja19k
ZXZpY2UgKnRyYW5zbGF0ZV9kZXZpZChzdHJ1Y3QgcG5mc19sYXlvdXRfaGRyICpsbywNCj4gIAlt
aWQgPSBCTEtfSUQobG8pOw0KPiAgCXNwaW5fbG9jaygmbWlkLT5ibV9sb2NrKTsNCj4gIAlsaXN0
X2Zvcl9lYWNoX2VudHJ5KGRldiwgJm1pZC0+Ym1fZGV2bGlzdCwgYm1fbm9kZSkgew0KPiAtCQlp
ZiAobWVtY21wKGlkLT5kYXRhLCBkZXYtPmJtX21kZXZpZC5kYXRhLA0KPiArCQlpZiAobWVtY21w
KGlkLT51LmRhdGEsIGRldi0+Ym1fbWRldmlkLnUuZGF0YSwNCj4gIAkJCSAgIE5GUzRfREVWSUNF
SUQ0X1NJWkUpID09IDApIHsNCj4gIAkJCXJ2ID0gZGV2LT5ibV9tZGV2Ow0KPiAgCQkJZ290byBv
dXQ7DQo+IEBAIC0zMjMsNyArMzIzLDcgQEAgbmZzNF9ibGtfcHJvY2Vzc19sYXlvdXRnZXQoc3Ry
dWN0IHBuZnNfbGF5b3V0X2hkciAqbG8sDQo+ICAJCQlzdGF0dXMgPSAtRU5PTUVNOw0KPiAgCQkJ
Z290byBvdXRfZXJyOw0KPiAgCQl9DQo+IC0JCW1lbWNweSgmYmUtPmJlX2RldmlkLCBwLCBORlM0
X0RFVklDRUlENF9TSVpFKTsNCj4gKwkJbWVtY3B5KCZiZS0+YmVfZGV2aWQudS5kYXRhLCBwLCBO
RlM0X0RFVklDRUlENF9TSVpFKTsNCj4gIAkJcCArPSBYRFJfUVVBRExFTihORlM0X0RFVklDRUlE
NF9TSVpFKTsNCj4gIAkJYmUtPmJlX21kZXYgPSB0cmFuc2xhdGVfZGV2aWQobG8sICZiZS0+YmVf
ZGV2aWQpOw0KPiAgCQlpZiAoIWJlLT5iZV9tZGV2KQ0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL2Js
b2NrbGF5b3V0L2V4dGVudHMuYyBiL2ZzL25mcy9ibG9ja2xheW91dC9leHRlbnRzLmMNCj4gaW5k
ZXggMWY5YTYwMy4uNTJiNzQ3YyAxMDA2NDQNCj4gLS0tIGEvZnMvbmZzL2Jsb2NrbGF5b3V0L2V4
dGVudHMuYw0KPiArKysgYi9mcy9uZnMvYmxvY2tsYXlvdXQvZXh0ZW50cy5jDQo+IEBAIC02NzUs
MTAgKzY3NSwxMSBAQCBlbmNvZGVfcG5mc19ibG9ja19sYXlvdXR1cGRhdGUoc3RydWN0IHBuZnNf
YmxvY2tfbGF5b3V0ICpibCwNCj4gIAlpZiAoIXhkcl9zdGFydCkNCj4gIAkJZ290byBvdXQ7DQo+
ICAJbGlzdF9mb3JfZWFjaF9lbnRyeV9zYWZlKGxjZSwgc2F2ZSwgJmJsLT5ibF9jb21taXQsIGJz
ZV9ub2RlKSB7DQo+IC0JCXAgPSB4ZHJfcmVzZXJ2ZV9zcGFjZSh4ZHIsIDcgKiA0ICsgc2l6ZW9m
KGxjZS0+YnNlX2RldmlkLmRhdGEpKTsNCj4gKwkJcCA9IHhkcl9yZXNlcnZlX3NwYWNlKHhkciwg
NyAqIDQgKyBORlM0X0RFVklDRUlENF9TSVpFKTsNCj4gIAkJaWYgKCFwKQ0KPiAgCQkJYnJlYWs7
DQo+IC0JCXAgPSB4ZHJfZW5jb2RlX29wYXF1ZV9maXhlZChwLCBsY2UtPmJzZV9kZXZpZC5kYXRh
LCBORlM0X0RFVklDRUlENF9TSVpFKTsNCj4gKwkJcCA9IHhkcl9lbmNvZGVfb3BhcXVlX2ZpeGVk
KHAsIGxjZS0+YnNlX2RldmlkLnUuZGF0YSwNCj4gKwkJCQkJICAgIE5GUzRfREVWSUNFSUQ0X1NJ
WkUpOw0KPiAgCQlwID0geGRyX2VuY29kZV9oeXBlcihwLCBsY2UtPmJzZV9mX29mZnNldCA8PCBT
RUNUT1JfU0hJRlQpOw0KPiAgCQlwID0geGRyX2VuY29kZV9oeXBlcihwLCBsY2UtPmJzZV9sZW5n
dGggPDwgU0VDVE9SX1NISUZUKTsNCj4gIAkJcCA9IHhkcl9lbmNvZGVfaHlwZXIocCwgMExMKTsN
Cj4gZGlmZiAtLWdpdCBhL2ZzL25mcy9jYWxsYmFja194ZHIuYyBiL2ZzL25mcy9jYWxsYmFja194
ZHIuYw0KPiBpbmRleCA1NDY2ODI5Li42MDYxMzIyIDEwMDY0NA0KPiAtLS0gYS9mcy9uZnMvY2Fs
bGJhY2tfeGRyLmMNCj4gKysrIGIvZnMvbmZzL2NhbGxiYWNrX3hkci5jDQo+IEBAIC0zNDcsNyAr
MzQ3LDcgQEAgX19iZTMyIGRlY29kZV9kZXZpY2Vub3RpZnlfYXJncyhzdHJ1Y3Qgc3ZjX3Jxc3Qg
KnJxc3RwLA0KPiAgCQkJZ290byBlcnI7DQo+ICAJCX0NCj4gIAkJZGV2LT5jYmRfbGF5b3V0X3R5
cGUgPSBudG9obCgqcCsrKTsNCj4gLQkJbWVtY3B5KGRldi0+Y2JkX2Rldl9pZC5kYXRhLCBwLCBO
RlM0X0RFVklDRUlENF9TSVpFKTsNCj4gKwkJbWVtY3B5KGRldi0+Y2JkX2Rldl9pZC51LmRhdGEs
IHAsIE5GUzRfREVWSUNFSUQ0X1NJWkUpOw0KPiAgCQlwICs9IFhEUl9RVUFETEVOKE5GUzRfREVW
SUNFSUQ0X1NJWkUpOw0KPiAgDQo+ICAJCWlmIChkZXYtPmNiZF9sYXlvdXRfdHlwZSA9PSBOT1RJ
RllfREVWSUNFSUQ0X0NIQU5HRSkgew0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL25mczRmaWxlbGF5
b3V0LmMgYi9mcy9uZnMvbmZzNGZpbGVsYXlvdXQuYw0KPiBpbmRleCA0N2U4ZjM0Li5mMGM2YmEw
IDEwMDY0NA0KPiAtLS0gYS9mcy9uZnMvbmZzNGZpbGVsYXlvdXQuYw0KPiArKysgYi9mcy9uZnMv
bmZzNGZpbGVsYXlvdXQuYw0KPiBAQCAtNTUwLDggKzU1MCw3IEBAIGZpbGVsYXlvdXRfZGVjb2Rl
X2xheW91dChzdHJ1Y3QgcG5mc19sYXlvdXRfaGRyICpmbG8sDQo+ICAJaWYgKHVubGlrZWx5KCFw
KSkNCj4gIAkJZ290byBvdXRfZXJyOw0KPiAgDQo+IC0JbWVtY3B5KGlkLCBwLCBzaXplb2YoKmlk
KSk7DQo+IC0JcCArPSBYRFJfUVVBRExFTihORlM0X0RFVklDRUlENF9TSVpFKTsNCj4gKwlwID0g
eGRyX2RlY29kZV9vcGFxdWVfZml4ZWQocCwgaWQtPnUuZGF0YSwgTkZTNF9ERVZJQ0VJRDRfU0la
RSk7DQo+ICAJbmZzNF9wcmludF9kZXZpY2VpZChpZCk7DQo+ICANCj4gIAluZmxfdXRpbCA9IGJl
MzJfdG9fY3B1cChwKyspOw0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL25mczR4ZHIuYyBiL2ZzL25m
cy9uZnM0eGRyLmMNCj4gaW5kZXggODBiYTAxMC4uOTEwMjBlYSAxMDA2NDQNCj4gLS0tIGEvZnMv
bmZzL25mczR4ZHIuYw0KPiArKysgYi9mcy9uZnMvbmZzNHhkci5jDQo+IEBAIC0xOTQ2LDcgKzE5
NDYsNyBAQCBlbmNvZGVfZ2V0ZGV2aWNlaW5mbyhzdHJ1Y3QgeGRyX3N0cmVhbSAqeGRyLA0KPiAg
DQo+ICAJcCA9IHJlc2VydmVfc3BhY2UoeGRyLCAxNiArIE5GUzRfREVWSUNFSUQ0X1NJWkUpOw0K
PiAgCSpwKysgPSBjcHVfdG9fYmUzMihPUF9HRVRERVZJQ0VJTkZPKTsNCj4gLQlwID0geGRyX2Vu
Y29kZV9vcGFxdWVfZml4ZWQocCwgYXJncy0+cGRldi0+ZGV2X2lkLmRhdGEsDQo+ICsJcCA9IHhk
cl9lbmNvZGVfb3BhcXVlX2ZpeGVkKHAsIGFyZ3MtPnBkZXYtPmRldl9pZC51LmRhdGEsDQo+ICAJ
CQkJICAgIE5GUzRfREVWSUNFSUQ0X1NJWkUpOw0KPiAgCSpwKysgPSBjcHVfdG9fYmUzMihhcmdz
LT5wZGV2LT5sYXlvdXRfdHlwZSk7DQo+ICAJKnArKyA9IGNwdV90b19iZTMyKGFyZ3MtPnBkZXYt
PnBnbGVuKTsJCS8qIGdkaWFfbWF4Y291bnQgKi8NCj4gQEAgLTU0OTIsNyArNTQ5Miw3IEBAIHN0
YXRpYyBpbnQgZGVjb2RlX2dldGRldmljZWxpc3Qoc3RydWN0IHhkcl9zdHJlYW0gKnhkciwNCj4g
IAlpZiAodW5saWtlbHkoIXApKQ0KPiAgCQlnb3RvIG91dF9vdmVyZmxvdzsNCj4gIAlmb3IgKGkg
PSAwOyBpIDwgcmVzLT5udW1fZGV2czsgaSsrKQ0KPiAtCQlwID0geGRyX2RlY29kZV9vcGFxdWVf
Zml4ZWQocCwgcmVzLT5kZXZfaWRbaV0uZGF0YSwNCj4gKwkJcCA9IHhkcl9kZWNvZGVfb3BhcXVl
X2ZpeGVkKHAsIHJlcy0+ZGV2X2lkW2ldLnUuZGF0YSwNCj4gIAkJCQkJICAgIE5GUzRfREVWSUNF
SUQ0X1NJWkUpOw0KPiAgCXJlcy0+ZW9mID0gYmUzMl90b19jcHVwKHApOw0KPiAgCXJldHVybiAw
Ow0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL29iamxheW91dC9wbmZzX29zZF94ZHJfY2xpLmMgYi9m
cy9uZnMvb2JqbGF5b3V0L3BuZnNfb3NkX3hkcl9jbGkuYw0KPiBpbmRleCBiMzkxOGY3Li45OWEx
ZTg5IDEwMDY0NA0KPiAtLS0gYS9mcy9uZnMvb2JqbGF5b3V0L3BuZnNfb3NkX3hkcl9jbGkuYw0K
PiArKysgYi9mcy9uZnMvb2JqbGF5b3V0L3BuZnNfb3NkX3hkcl9jbGkuYw0KPiBAQCAtNTUsOCAr
NTUsOCBAQA0KPiAgc3RhdGljIF9fYmUzMiAqDQo+ICBfb3NkX3hkcl9kZWNvZGVfb2JqaWQoX19i
ZTMyICpwLCBzdHJ1Y3QgcG5mc19vc2Rfb2JqaWQgKm9iamlkKQ0KPiAgew0KPiAtCXAgPSB4ZHJf
ZGVjb2RlX29wYXF1ZV9maXhlZChwLCBvYmppZC0+b2lkX2RldmljZV9pZC5kYXRhLA0KPiAtCQkJ
CSAgICBzaXplb2Yob2JqaWQtPm9pZF9kZXZpY2VfaWQuZGF0YSkpOw0KPiArCXAgPSB4ZHJfZGVj
b2RlX29wYXF1ZV9maXhlZChwLCBvYmppZC0+b2lkX2RldmljZV9pZC51LmRhdGEsDQo+ICsJCQkJ
ICAgIE5GUzRfREVWSUNFSUQ0X1NJWkUpOw0KPiAgDQo+ICAJcCA9IHhkcl9kZWNvZGVfaHlwZXIo
cCwgJm9iamlkLT5vaWRfcGFydGl0aW9uX2lkKTsNCj4gIAlwID0geGRyX2RlY29kZV9oeXBlcihw
LCAmb2JqaWQtPm9pZF9vYmplY3RfaWQpOw0KPiBAQCAtMzc3LDggKzM3Nyw4IEBAIHBuZnNfb3Nk
X3hkcl9lbmNvZGVfbGF5b3V0dXBkYXRlKHN0cnVjdCB4ZHJfc3RyZWFtICp4ZHIsDQo+ICBzdGF0
aWMgaW5saW5lIF9fYmUzMiAqDQo+ICBwbmZzX29zZF94ZHJfZW5jb2RlX29iamlkKF9fYmUzMiAq
cCwgc3RydWN0IHBuZnNfb3NkX29iamlkICpvYmplY3RfaWQpDQo+ICB7DQo+IC0JcCA9IHhkcl9l
bmNvZGVfb3BhcXVlX2ZpeGVkKHAsICZvYmplY3RfaWQtPm9pZF9kZXZpY2VfaWQuZGF0YSwNCj4g
LQkJCQkgICAgc2l6ZW9mKG9iamVjdF9pZC0+b2lkX2RldmljZV9pZC5kYXRhKSk7DQo+ICsJcCA9
IHhkcl9lbmNvZGVfb3BhcXVlX2ZpeGVkKHAsICZvYmplY3RfaWQtPm9pZF9kZXZpY2VfaWQudS5k
YXRhLA0KPiArCQkJCSAgICBORlM0X0RFVklDRUlENF9TSVpFKTsNCj4gIAlwID0geGRyX2VuY29k
ZV9oeXBlcihwLCBvYmplY3RfaWQtPm9pZF9wYXJ0aXRpb25faWQpOw0KPiAgCXAgPSB4ZHJfZW5j
b2RlX2h5cGVyKHAsIG9iamVjdF9pZC0+b2lkX29iamVjdF9pZCk7DQo+ICANCj4gZGlmZiAtLWdp
dCBhL2ZzL25mcy9wbmZzX2Rldi5jIGIvZnMvbmZzL3BuZnNfZGV2LmMNCj4gaW5kZXggNGYzNTlk
Mi4uNmM2YWI4MSAxMDA2NDQNCj4gLS0tIGEvZnMvbmZzL3BuZnNfZGV2LmMNCj4gKysrIGIvZnMv
bmZzL3BuZnNfZGV2LmMNCj4gQEAgLTQ2LDcgKzQ2LDcgQEAgc3RhdGljIERFRklORV9TUElOTE9D
SyhuZnM0X2RldmljZWlkX2xvY2spOw0KPiAgdm9pZA0KPiAgbmZzNF9wcmludF9kZXZpY2VpZChj
b25zdCBzdHJ1Y3QgbmZzNF9kZXZpY2VpZCAqaWQpDQo+ICB7DQo+IC0JdTMyICpwID0gKHUzMiAq
KWlkOw0KPiArCWNvbnN0IHUzMiAqcCA9IGlkLT51LmRhdGEzMjsNCj4gIA0KPiAgCWRwcmludGso
IiVzOiBkZXZpY2UgaWQ9IFsleCV4JXgleF1cbiIsIF9fZnVuY19fLA0KPiAgCQlwWzBdLCBwWzFd
LCBwWzJdLCBwWzNdKTsNCj4gQEAgLTU2LDcgKzU2LDcgQEAgRVhQT1JUX1NZTUJPTF9HUEwobmZz
NF9wcmludF9kZXZpY2VpZCk7DQo+ICBzdGF0aWMgaW5saW5lIHUzMg0KPiAgbmZzNF9kZXZpY2Vp
ZF9oYXNoKGNvbnN0IHN0cnVjdCBuZnM0X2RldmljZWlkICppZCkNCj4gIHsNCj4gLQl1bnNpZ25l
ZCBjaGFyICpjcHRyID0gKHVuc2lnbmVkIGNoYXIgKilpZC0+ZGF0YTsNCj4gKwljb25zdCB1bnNp
Z25lZCBjaGFyICpjcHRyID0gaWQtPnUuZGF0YTsNCj4gIAl1bnNpZ25lZCBpbnQgbmJ5dGVzID0g
TkZTNF9ERVZJQ0VJRDRfU0laRTsNCj4gIAl1MzIgeCA9IDA7DQo+ICANCj4gQEAgLTc3LDcgKzc3
LDggQEAgX2xvb2t1cF9kZXZpY2VpZChjb25zdCBzdHJ1Y3QgcG5mc19sYXlvdXRkcml2ZXJfdHlw
ZSAqbGQsDQo+ICANCj4gIAlobGlzdF9mb3JfZWFjaF9lbnRyeV9yY3UoZCwgbiwgJm5mczRfZGV2
aWNlaWRfY2FjaGVbaGFzaF0sIG5vZGUpDQo+ICAJCWlmIChkLT5sZCA9PSBsZCAmJiBkLT5uZnNf
Y2xpZW50ID09IGNscCAmJg0KPiAtCQkgICAgIW1lbWNtcCgmZC0+ZGV2aWNlaWQsIGlkLCBzaXpl
b2YoKmlkKSkpIHsNCj4gKwkJICAgICFtZW1jbXAoJmQtPmRldmljZWlkLnUuZGF0YSwgaWQtPnUu
ZGF0YSwNCj4gKwkJCQkJCU5GUzRfREVWSUNFSUQ0X1NJWkUpKSB7DQo+ICAJCQlpZiAoYXRvbWlj
X3JlYWQoJmQtPnJlZikpDQo+ICAJCQkJcmV0dXJuIGQ7DQo+ICAJCQllbHNlDQo+IGRpZmYgLS1n
aXQgYS9pbmNsdWRlL2xpbnV4L25mczQuaCBiL2luY2x1ZGUvbGludXgvbmZzNC5oDQo+IGluZGV4
IGIxZTZiNjQuLjk3OWY2MDcgMTAwNjQ0DQo+IC0tLSBhL2luY2x1ZGUvbGludXgvbmZzNC5oDQo+
ICsrKyBiL2luY2x1ZGUvbGludXgvbmZzNC5oDQo+IEBAIC02NDksOSArNjQ5LDE0IEBAIGVudW0g
ZmlsZWxheW91dF9oaW50X2NhcmU0IHsNCj4gICNkZWZpbmUgTkZTNF9ERVZJQ0VJRDRfU0laRSAx
Ng0KPiAgDQo+ICBzdHJ1Y3QgbmZzNF9kZXZpY2VpZCB7DQo+IC0JY2hhciBkYXRhW05GUzRfREVW
SUNFSUQ0X1NJWkVdOw0KPiArCXVuaW9uIHsNCj4gKwkJdW5zaWduZWQgY2hhcglkYXRhW05GUzRf
REVWSUNFSUQ0X1NJWkVdOw0KPiArCQl1MzIJCWRhdGEzMltORlM0X0RFVklDRUlENF9TSVpFIC8g
c2l6ZW9mKHUzMildOw0KPiArCQl1NjQJCWRhdGE2NFtORlM0X0RFVklDRUlENF9TSVpFIC8gc2l6
ZW9mKHU2NCldOw0KPiArCX0gdTsNCj4gIH07DQo+ICANCg0KVWdoLi4uIFRoaXMganVzdCBsb29r
cyB3b3JzZSBhbmQgd29yc2UuLi4uIExldCdzIHJhdGhlciBqdXN0IGtlZXAgdGhlc2UNCmFzIHVu
c2lnbmVkIGNoYXIuIEkgZG9uJ3QgY2FyZSBpZiBPU0RzIGhhdmUgc3BlY2lhbCBzZWNyZXQgbWVh
bmluZ3MgYW5kDQphbGwgdGhhdCBjcmFwLiBUaGV5IGNhbiBjYXN0IHRoZSBkYW1uZWQgdGhpbmcg
aWYgdGhleSBuZWVkIHRvLiBUaGUgcmVzdA0Kb2YgdXMgc2hvdWxkIGJlIGNvbnNpZGVyaW5nIHRo
aXMgdG8gYmUgb3BhcXVlIGRhdGEuDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNs
aWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0KVHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20NCnd3
dy5uZXRhcHAuY29tDQoNCg==

2012-03-01 22:32:40

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles


On Mar 1, 2012, at 5:28 PM, Myklebust, Trond wrote:

> On Thu, 2012-03-01 at 17:01 -0500, Chuck Lever wrote:
>> For debugging, introduce a simplistic function to print NFS file
>> handles on the system console. The main function is hooked into the
>> dprintk debugging facility, but you can directly call the helper,
>> _nfs_display_fhandle(), if you want to print a handle unconditionally.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>>
>
> One of the things that I really like about wireshark is that it also
> displays a 32-bit hexadecimal hash of the filehandle. That allows you to
> compare 2 filehandles _much_ faster than trying to look at the full
> shebang.

It should be easy to add a displayed hash in a subsequent patch.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-03-02 18:50:30

by Steve Dickson

[permalink] [raw]
Subject: Re: [PATCH 09/15] NFS: Add a client-side function to display NFS file handles



On 03/02/2012 12:19 PM, Chuck Lever wrote:
>
> On Mar 2, 2012, at 12:17 PM, Steve Dickson wrote:
>
>>
>>
>> On 03/01/2012 05:28 PM, Myklebust, Trond wrote:
>>> On Thu, 2012-03-01 at 17:01 -0500, Chuck Lever wrote:
>>>> For debugging, introduce a simplistic function to print NFS file
>>>> handles on the system console. The main function is hooked into the
>>>> dprintk debugging facility, but you can directly call the helper,
>>>> _nfs_display_fhandle(), if you want to print a handle unconditionally.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>>
>>>
>>> One of the things that I really like about wireshark is that it also
>>> displays a 32-bit hexadecimal hash of the filehandle. That allows you to
>>> compare 2 filehandles _much_ faster than trying to look at the full
>>> shebang.
>>>
>>>
>> +1 Being able to easily trace filehandle in both the kernel debug and
>> wireshark would be huge.... IMHO...
>
> Dros mentioned a possible follow-on patch last night at the pub that would add exactly this functionality.
>
And they say good things don't come out of debauchery!!! 8-)