2016-09-15 18:40:51

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v4 resend 0/2] pnfs: allow client to support servers that send multiple layout types

Resending as I noticed that Anna had merged the first of the two
patches into her linux-next branch, but not the second one. Were
there any further concerns with it?

v4:
- pass around fsinfo instead of separate array and length args

v3:
- use sort() from lib/sort.c to sort the list

v2:
- rework Tigran's patch to preserve existing selection behaviour
- simplify layout driver selection by sorting the list

This is v4 of the patchset. The main change is the change to pass
around the fsinfo instead of separate array pointer and length args.

Only lightly tested by mounting a server that sends both flexfiles and
block layouts. The client successfully selected the block layout in most
cases, but if I blacklist blocklayoutdriver then it selects flexfiles
instead.

Jeff Layton (2):
pnfs: track multiple layout types in fsinfo structure
pnfs: add a new mechanism to select a layout driver according to an
ordered list

fs/nfs/client.c | 3 ++-
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4xdr.c | 40 ++++++++++++++--------------
fs/nfs/pnfs.c | 69 ++++++++++++++++++++++++++++++++++++++++---------
fs/nfs/pnfs.h | 5 ++--
include/linux/nfs_xdr.h | 8 +++++-
6 files changed, 91 insertions(+), 36 deletions(-)

--
2.7.4



2016-09-15 18:40:53

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v4 resend 2/2] pnfs: add a new mechanism to select a layout driver according to an ordered list

Currently, the layout driver selection code always chooses the first one
from the list. That's not really ideal however, as the server can send
the list of layout types in any order that it likes. It's up to the
client to select the best one for its needs.

This patch adds an ordered list of preferred driver types and has the
selection code sort the list of available layout drivers according to it.
Any unrecognized layout type is sorted to the end of the list.

For now, the order of preference is hardcoded, but it should be possible
to make this configurable in the future.

Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: J. Bruce Fields <[email protected]>
---
fs/nfs/client.c | 1 +
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4xdr.c | 31 +++++++++++++++------------
fs/nfs/pnfs.c | 56 ++++++++++++++++++++++++++++++++++++++++++-------
fs/nfs/pnfs.h | 5 +++--
include/linux/nfs_xdr.h | 1 +
6 files changed, 72 insertions(+), 24 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 0c1b3a002dc2..31b03375a039 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -785,6 +785,7 @@ int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, struct nfs
}

fsinfo.fattr = fattr;
+ fsinfo.nlayouttypes = 0;
memset(fsinfo.layouttype, 0, sizeof(fsinfo.layouttype));
error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo);
if (error < 0)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dd19ab986b80..1e45952e44f6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4312,7 +4312,7 @@ static int nfs4_proc_fsinfo(struct nfs_server *server, struct nfs_fh *fhandle, s
if (error == 0) {
/* block layout checks this! */
server->pnfs_blksize = fsinfo->blksize;
- set_pnfs_layoutdriver(server, fhandle, fsinfo->layouttype);
+ set_pnfs_layoutdriver(server, fhandle, fsinfo);
}

return error;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 41a02f994976..17b4e059c588 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4728,29 +4728,34 @@ static int decode_getfattr(struct xdr_stream *xdr, struct nfs_fattr *fattr,
* Decode potentially multiple layout types.
*/
static int decode_pnfs_layout_types(struct xdr_stream *xdr,
- uint32_t *layouttype)
+ struct nfs_fsinfo *fsinfo)
{
__be32 *p;
- uint32_t num, i;
+ uint32_t i;

p = xdr_inline_decode(xdr, 4);
if (unlikely(!p))
goto out_overflow;
- num = be32_to_cpup(p);
+ fsinfo->nlayouttypes = be32_to_cpup(p);

/* pNFS is not supported by the underlying file system */
- if (num == 0) {
+ if (fsinfo->nlayouttypes == 0)
return 0;
- }
- if (num > NFS_MAX_LAYOUT_TYPES)
- printk(KERN_INFO "NFS: %s: Warning: Too many (%d) pNFS layout types\n", __func__, num);

/* Decode and set first layout type, move xdr->p past unused types */
- p = xdr_inline_decode(xdr, num * 4);
+ p = xdr_inline_decode(xdr, fsinfo->nlayouttypes * 4);
if (unlikely(!p))
goto out_overflow;
- for(i = 0; i < num && i < NFS_MAX_LAYOUT_TYPES; i++)
- layouttype[i] = be32_to_cpup(p++);
+
+ /* If we get too many, then just cap it at the max */
+ if (fsinfo->nlayouttypes > NFS_MAX_LAYOUT_TYPES) {
+ printk(KERN_INFO "NFS: %s: Warning: Too many (%u) pNFS layout types\n",
+ __func__, fsinfo->nlayouttypes);
+ fsinfo->nlayouttypes = NFS_MAX_LAYOUT_TYPES;
+ }
+
+ for(i = 0; i < fsinfo->nlayouttypes; ++i)
+ fsinfo->layouttype[i] = be32_to_cpup(p++);
return 0;
out_overflow:
print_overflow_msg(__func__, xdr);
@@ -4762,7 +4767,7 @@ out_overflow:
* Note we must ensure that layouttype is set in any non-error case.
*/
static int decode_attr_pnfstype(struct xdr_stream *xdr, uint32_t *bitmap,
- uint32_t *layouttype)
+ struct nfs_fsinfo *fsinfo)
{
int status = 0;

@@ -4770,7 +4775,7 @@ static int decode_attr_pnfstype(struct xdr_stream *xdr, uint32_t *bitmap,
if (unlikely(bitmap[1] & (FATTR4_WORD1_FS_LAYOUT_TYPES - 1U)))
return -EIO;
if (bitmap[1] & FATTR4_WORD1_FS_LAYOUT_TYPES) {
- status = decode_pnfs_layout_types(xdr, layouttype);
+ status = decode_pnfs_layout_types(xdr, fsinfo);
bitmap[1] &= ~FATTR4_WORD1_FS_LAYOUT_TYPES;
}
return status;
@@ -4853,7 +4858,7 @@ static int decode_fsinfo(struct xdr_stream *xdr, struct nfs_fsinfo *fsinfo)
status = decode_attr_time_delta(xdr, bitmap, &fsinfo->time_delta);
if (status != 0)
goto xdr_error;
- status = decode_attr_pnfstype(xdr, bitmap, fsinfo->layouttype);
+ status = decode_attr_pnfstype(xdr, bitmap, fsinfo);
if (status != 0)
goto xdr_error;

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a6a683fbd230..b588ccf05045 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -30,6 +30,7 @@
#include <linux/nfs_fs.h>
#include <linux/nfs_page.h>
#include <linux/module.h>
+#include <linux/sort.h>
#include "internal.h"
#include "pnfs.h"
#include "iostat.h"
@@ -99,6 +100,39 @@ unset_pnfs_layoutdriver(struct nfs_server *nfss)
}

/*
+ * When the server sends a list of layout types, we choose one in the order
+ * given in the list below.
+ *
+ * FIXME: should this list be configurable in some fashion? module param?
+ * mount option? something else?
+ */
+static const u32 ld_prefs[] = {
+ LAYOUT_SCSI,
+ LAYOUT_BLOCK_VOLUME,
+ LAYOUT_OSD2_OBJECTS,
+ LAYOUT_FLEX_FILES,
+ LAYOUT_NFSV4_1_FILES,
+ 0
+};
+
+static int
+ld_cmp(const void *e1, const void *e2)
+{
+ u32 ld1 = *((u32 *)e1);
+ u32 ld2 = *((u32 *)e2);
+ int i;
+
+ for (i = 0; ld_prefs[i] != 0; i++) {
+ if (ld1 == ld_prefs[i])
+ return -1;
+
+ if (ld2 == ld_prefs[i])
+ return 1;
+ }
+ return 0;
+}
+
+/*
* Try to set the server's pnfs module to the pnfs layout type specified by id.
* Currently only one pNFS layout driver per filesystem is supported.
*
@@ -106,10 +140,11 @@ unset_pnfs_layoutdriver(struct nfs_server *nfss)
*/
void
set_pnfs_layoutdriver(struct nfs_server *server, const struct nfs_fh *mntfh,
- u32 *ids)
+ struct nfs_fsinfo *fsinfo)
{
struct pnfs_layoutdriver_type *ld_type = NULL;
u32 id;
+ int i;

if (!(server->nfs_client->cl_exchange_flags &
(EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_MDS))) {
@@ -118,18 +153,23 @@ set_pnfs_layoutdriver(struct nfs_server *server, const struct nfs_fh *mntfh,
goto out_no_driver;
}

- id = ids[0];
- if (!id)
- goto out_no_driver;
+ sort(fsinfo->layouttype, fsinfo->nlayouttypes,
+ sizeof(*fsinfo->layouttype), ld_cmp, NULL);

- ld_type = find_pnfs_driver(id);
- if (!ld_type) {
- request_module("%s-%u", LAYOUT_NFSV4_1_MODULE_PREFIX, id);
+ for (i = 0; i < fsinfo->nlayouttypes; i++) {
+ id = fsinfo->layouttype[i];
ld_type = find_pnfs_driver(id);
+ if (!ld_type) {
+ request_module("%s-%u", LAYOUT_NFSV4_1_MODULE_PREFIX,
+ id);
+ ld_type = find_pnfs_driver(id);
+ }
+ if (ld_type)
+ break;
}

if (!ld_type) {
- dprintk("%s: No pNFS module found for %u.\n", __func__, id);
+ dprintk("%s: No pNFS module found!\n", __func__);
goto out_no_driver;
}

diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index be515e6a3823..5c295512c967 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -236,7 +236,7 @@ void pnfs_get_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_put_lseg(struct pnfs_layout_segment *lseg);
void pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg);

-void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, u32 *);
+void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, struct nfs_fsinfo *);
void unset_pnfs_layoutdriver(struct nfs_server *);
void pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *, struct nfs_page *);
int pnfs_generic_pg_readpages(struct nfs_pageio_descriptor *desc);
@@ -657,7 +657,8 @@ pnfs_wait_on_layoutreturn(struct inode *ino, struct rpc_task *task)
}

static inline void set_pnfs_layoutdriver(struct nfs_server *s,
- const struct nfs_fh *mntfh, u32 *ids)
+ const struct nfs_fh *mntfh,
+ struct nfs_fsinfo *fsinfo)
{
}

diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index f11b26ed001b..beb1e10f446e 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -144,6 +144,7 @@ struct nfs_fsinfo {
__u64 maxfilesize;
struct timespec time_delta; /* server time granularity */
__u32 lease_time; /* in seconds */
+ __u32 nlayouttypes; /* number of layouttypes */
__u32 layouttype[NFS_MAX_LAYOUT_TYPES]; /* supported pnfs layout driver */
__u32 blksize; /* preferred pnfs io block size */
__u32 clone_blksize; /* granularity of a CLONE operation */
--
2.7.4


2016-09-15 18:40:53

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v4 resend 1/2] pnfs: track multiple layout types in fsinfo structure

Current NFSv4.1/pNFS client assumes that MDS supports only one layout
type. While it's true for most existing servers, this may not always be
the case.

For now, this patch just plumbs in the ability to track a list of
layouts in the fsinfo structure. The existing behavior of the client
is preserved, by having it just select the first entry in the list.

Signed-off-by: Tigran Mkrtchyan <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: J. Bruce Fields <[email protected]>
---
fs/nfs/client.c | 2 +-
fs/nfs/nfs4xdr.c | 23 ++++++++++-------------
fs/nfs/pnfs.c | 27 ++++++++++++++++-----------
fs/nfs/pnfs.h | 4 ++--
include/linux/nfs_xdr.h | 7 ++++++-
5 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 1e106780a237..0c1b3a002dc2 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -785,7 +785,7 @@ int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, struct nfs
}

fsinfo.fattr = fattr;
- fsinfo.layouttype = 0;
+ memset(fsinfo.layouttype, 0, sizeof(fsinfo.layouttype));
error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo);
if (error < 0)
goto out_error;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 7bd3a5c09d31..41a02f994976 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4725,14 +4725,13 @@ static int decode_getfattr(struct xdr_stream *xdr, struct nfs_fattr *fattr,
}

/*
- * Decode potentially multiple layout types. Currently we only support
- * one layout driver per file system.
+ * Decode potentially multiple layout types.
*/
-static int decode_first_pnfs_layout_type(struct xdr_stream *xdr,
+static int decode_pnfs_layout_types(struct xdr_stream *xdr,
uint32_t *layouttype)
{
__be32 *p;
- int num;
+ uint32_t num, i;

p = xdr_inline_decode(xdr, 4);
if (unlikely(!p))
@@ -4741,18 +4740,17 @@ static int decode_first_pnfs_layout_type(struct xdr_stream *xdr,

/* pNFS is not supported by the underlying file system */
if (num == 0) {
- *layouttype = 0;
return 0;
}
- if (num > 1)
- printk(KERN_INFO "NFS: %s: Warning: Multiple pNFS layout "
- "drivers per filesystem not supported\n", __func__);
+ if (num > NFS_MAX_LAYOUT_TYPES)
+ printk(KERN_INFO "NFS: %s: Warning: Too many (%d) pNFS layout types\n", __func__, num);

/* Decode and set first layout type, move xdr->p past unused types */
p = xdr_inline_decode(xdr, num * 4);
if (unlikely(!p))
goto out_overflow;
- *layouttype = be32_to_cpup(p);
+ for(i = 0; i < num && i < NFS_MAX_LAYOUT_TYPES; i++)
+ layouttype[i] = be32_to_cpup(p++);
return 0;
out_overflow:
print_overflow_msg(__func__, xdr);
@@ -4772,10 +4770,9 @@ static int decode_attr_pnfstype(struct xdr_stream *xdr, uint32_t *bitmap,
if (unlikely(bitmap[1] & (FATTR4_WORD1_FS_LAYOUT_TYPES - 1U)))
return -EIO;
if (bitmap[1] & FATTR4_WORD1_FS_LAYOUT_TYPES) {
- status = decode_first_pnfs_layout_type(xdr, layouttype);
+ status = decode_pnfs_layout_types(xdr, layouttype);
bitmap[1] &= ~FATTR4_WORD1_FS_LAYOUT_TYPES;
- } else
- *layouttype = 0;
+ }
return status;
}

@@ -4856,7 +4853,7 @@ static int decode_fsinfo(struct xdr_stream *xdr, struct nfs_fsinfo *fsinfo)
status = decode_attr_time_delta(xdr, bitmap, &fsinfo->time_delta);
if (status != 0)
goto xdr_error;
- status = decode_attr_pnfstype(xdr, bitmap, &fsinfo->layouttype);
+ status = decode_attr_pnfstype(xdr, bitmap, fsinfo->layouttype);
if (status != 0)
goto xdr_error;

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 2c93a85eda51..a6a683fbd230 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -102,32 +102,37 @@ unset_pnfs_layoutdriver(struct nfs_server *nfss)
* Try to set the server's pnfs module to the pnfs layout type specified by id.
* Currently only one pNFS layout driver per filesystem is supported.
*
- * @id layout type. Zero (illegal layout type) indicates pNFS not in use.
+ * @ids array of layout types supported by MDS.
*/
void
set_pnfs_layoutdriver(struct nfs_server *server, const struct nfs_fh *mntfh,
- u32 id)
+ u32 *ids)
{
struct pnfs_layoutdriver_type *ld_type = NULL;
+ u32 id;

- if (id == 0)
- goto out_no_driver;
if (!(server->nfs_client->cl_exchange_flags &
(EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_MDS))) {
- printk(KERN_ERR "NFS: %s: id %u cl_exchange_flags 0x%x\n",
- __func__, id, server->nfs_client->cl_exchange_flags);
+ printk(KERN_ERR "NFS: %s: cl_exchange_flags 0x%x\n",
+ __func__, server->nfs_client->cl_exchange_flags);
goto out_no_driver;
}
+
+ id = ids[0];
+ if (!id)
+ goto out_no_driver;
+
ld_type = find_pnfs_driver(id);
if (!ld_type) {
request_module("%s-%u", LAYOUT_NFSV4_1_MODULE_PREFIX, id);
ld_type = find_pnfs_driver(id);
- if (!ld_type) {
- dprintk("%s: No pNFS module found for %u.\n",
- __func__, id);
- goto out_no_driver;
- }
}
+
+ if (!ld_type) {
+ dprintk("%s: No pNFS module found for %u.\n", __func__, id);
+ goto out_no_driver;
+ }
+
server->pnfs_curr_ld = ld_type;
if (ld_type->set_layoutdriver
&& ld_type->set_layoutdriver(server, mntfh)) {
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 31d99b2927b0..be515e6a3823 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -236,7 +236,7 @@ void pnfs_get_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_put_lseg(struct pnfs_layout_segment *lseg);
void pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg);

-void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, u32);
+void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, u32 *);
void unset_pnfs_layoutdriver(struct nfs_server *);
void pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *, struct nfs_page *);
int pnfs_generic_pg_readpages(struct nfs_pageio_descriptor *desc);
@@ -657,7 +657,7 @@ pnfs_wait_on_layoutreturn(struct inode *ino, struct rpc_task *task)
}

static inline void set_pnfs_layoutdriver(struct nfs_server *s,
- const struct nfs_fh *mntfh, u32 id)
+ const struct nfs_fh *mntfh, u32 *ids)
{
}

diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 7cc0deee5bde..f11b26ed001b 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -125,6 +125,11 @@ struct nfs_fattr {
| NFS_ATTR_FATTR_V4_SECURITY_LABEL)

/*
+ * Maximal number of supported layout drivers.
+ */
+#define NFS_MAX_LAYOUT_TYPES 8
+
+/*
* Info on the file system
*/
struct nfs_fsinfo {
@@ -139,7 +144,7 @@ struct nfs_fsinfo {
__u64 maxfilesize;
struct timespec time_delta; /* server time granularity */
__u32 lease_time; /* in seconds */
- __u32 layouttype; /* supported pnfs layout driver */
+ __u32 layouttype[NFS_MAX_LAYOUT_TYPES]; /* supported pnfs layout driver */
__u32 blksize; /* preferred pnfs io block size */
__u32 clone_blksize; /* granularity of a CLONE operation */
};
--
2.7.4


2016-09-19 17:13:47

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH v4 resend 0/2] pnfs: allow client to support servers that send multiple layout types

On 09/15/2016 02:40 PM, Jeff Layton wrote:
> Resending as I noticed that Anna had merged the first of the two
> patches into her linux-next branch, but not the second one. Were
> there any further concerns with it?

Nope, no concerns. I updated a few of the later patchsets in my branch before pushing everything out, so it must have gotten lost in the shuffle. I've applied the patch again, so it should be there now. Thanks for catching!

Sorry about the mistake, and thanks for catching it!
Anna

>
> v4:
> - pass around fsinfo instead of separate array and length args
>
> v3:
> - use sort() from lib/sort.c to sort the list
>
> v2:
> - rework Tigran's patch to preserve existing selection behaviour
> - simplify layout driver selection by sorting the list
>
> This is v4 of the patchset. The main change is the change to pass
> around the fsinfo instead of separate array pointer and length args.
>
> Only lightly tested by mounting a server that sends both flexfiles and
> block layouts. The client successfully selected the block layout in most
> cases, but if I blacklist blocklayoutdriver then it selects flexfiles
> instead.
>
> Jeff Layton (2):
> pnfs: track multiple layout types in fsinfo structure
> pnfs: add a new mechanism to select a layout driver according to an
> ordered list
>
> fs/nfs/client.c | 3 ++-
> fs/nfs/nfs4proc.c | 2 +-
> fs/nfs/nfs4xdr.c | 40 ++++++++++++++--------------
> fs/nfs/pnfs.c | 69 ++++++++++++++++++++++++++++++++++++++++---------
> fs/nfs/pnfs.h | 5 ++--
> include/linux/nfs_xdr.h | 8 +++++-
> 6 files changed, 91 insertions(+), 36 deletions(-)
>