2021-03-12 21:19:50

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

From: Anna Schumaker <[email protected]>

It's possible for an NFS server to go down but come back up with a
different IP address. These patches provide a way for administrators to
handle this issue by providing a new IP address for xprt sockets to
connect to.

Chuck has suggested some ideas for future work that could also use this
interface, such as:
- srcaddr: To move between network devices on the client
- type: "tcp", "rdma", "local"
- bound: 0 for autobind, or the result of the most recent rpcbind query
- connected: either true or false
- last: read-only timestamp of the last operation to use the transport
- device: A symlink to the physical network device

Changes in v3:
- Rename functions and objects to make future expansion easier
- Put files under /sys/kernel/sunrpc/client/ instead of
/sys/kernel/sunrpc/net/, again for future expansions
- Clean up use of WARN_ON_ONCE() in xs_connect()
- Fix up locking, reference counting, and RCU usage
- Unconditionally create files so userspace tools don't need to guess
what is supported (We return an error message now instead)

Changes in v2:
- Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
- Rename file from "address" to "dstaddr"

Thoughts?
Anna


Anna Schumaker (5):
sunrpc: Create a sunrpc directory under /sys/kernel/
sunrpc: Create a client/ subdirectory in the sunrpc sysfs
sunrpc: Create per-rpc_clnt sysfs kobjects
sunrpc: Prepare xs_connect() for taking NULL tasks
sunrpc: Create a per-rpc_clnt file for managing the destination IP
address

include/linux/sunrpc/clnt.h | 1 +
net/sunrpc/Makefile | 2 +-
net/sunrpc/clnt.c | 5 +
net/sunrpc/sunrpc_syms.c | 8 ++
net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
net/sunrpc/sysfs.h | 20 ++++
net/sunrpc/xprtsock.c | 2 +-
7 files changed, 227 insertions(+), 2 deletions(-)
create mode 100644 net/sunrpc/sysfs.c
create mode 100644 net/sunrpc/sysfs.h

--
2.29.2


2021-03-12 21:20:07

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v3 3/5] sunrpc: Create per-rpc_clnt sysfs kobjects

From: Anna Schumaker <[email protected]>

These will eventually have files placed under them for sysfs operations.

Signed-off-by: Anna Schumaker <[email protected]>
---
v3: s/netns/sysfs/
---
include/linux/sunrpc/clnt.h | 1 +
net/sunrpc/clnt.c | 5 ++++
net/sunrpc/sysfs.c | 60 +++++++++++++++++++++++++++++++++++++
net/sunrpc/sysfs.h | 8 +++++
4 files changed, 74 insertions(+)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 02e7a5863d28..503653720e18 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -71,6 +71,7 @@ struct rpc_clnt {
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
struct dentry *cl_debugfs; /* debugfs directory */
#endif
+ void *cl_sysfs; /* sysfs directory */
/* cl_work is only needed after cl_xpi is no longer used,
* and that are of similar size
*/
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 612f0a641f4c..ceb8d19d4cb4 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -41,6 +41,7 @@
#include <trace/events/sunrpc.h>

#include "sunrpc.h"
+#include "sysfs.h"
#include "netns.h"

#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
@@ -300,6 +301,7 @@ static int rpc_client_register(struct rpc_clnt *clnt,
int err;

rpc_clnt_debugfs_register(clnt);
+ rpc_sysfs_client_setup(clnt, net);

pipefs_sb = rpc_get_sb_net(net);
if (pipefs_sb) {
@@ -327,6 +329,7 @@ static int rpc_client_register(struct rpc_clnt *clnt,
out:
if (pipefs_sb)
rpc_put_sb_net(net);
+ rpc_sysfs_client_destroy(clnt);
rpc_clnt_debugfs_unregister(clnt);
return err;
}
@@ -733,6 +736,7 @@ int rpc_switch_client_transport(struct rpc_clnt *clnt,

rpc_unregister_client(clnt);
__rpc_clnt_remove_pipedir(clnt);
+ rpc_sysfs_client_destroy(clnt);
rpc_clnt_debugfs_unregister(clnt);

/*
@@ -879,6 +883,7 @@ static void rpc_free_client_work(struct work_struct *work)
* so they cannot be called in rpciod, so they are handled separately
* here.
*/
+ rpc_sysfs_client_destroy(clnt);
rpc_clnt_debugfs_unregister(clnt);
rpc_free_clid(clnt);
rpc_clnt_remove_pipedir(clnt);
diff --git a/net/sunrpc/sysfs.c b/net/sunrpc/sysfs.c
index 350b2b1628e7..abfe8c0b3108 100644
--- a/net/sunrpc/sysfs.c
+++ b/net/sunrpc/sysfs.c
@@ -4,6 +4,7 @@
*/
#include <linux/sunrpc/clnt.h>
#include <linux/kobject.h>
+#include "sysfs.h"

static struct kset *rpc_sunrpc_kset;
static struct kobject *rpc_sunrpc_client_kobj;
@@ -54,8 +55,67 @@ int rpc_sysfs_init(void)
return 0;
}

+static void rpc_sysfs_client_release(struct kobject *kobj)
+{
+ struct rpc_sysfs_client *c;
+
+ c = container_of(kobj, struct rpc_sysfs_client, kobject);
+ kfree(c);
+}
+
+static const void *rpc_sysfs_client_namespace(struct kobject *kobj)
+{
+ return container_of(kobj, struct rpc_sysfs_client, kobject)->net;
+}
+
+static struct kobj_type rpc_sysfs_client_type = {
+ .release = rpc_sysfs_client_release,
+ .sysfs_ops = &kobj_sysfs_ops,
+ .namespace = rpc_sysfs_client_namespace,
+};
+
void rpc_sysfs_exit(void)
{
kobject_put(rpc_sunrpc_client_kobj);
kset_unregister(rpc_sunrpc_kset);
}
+
+static struct rpc_sysfs_client *rpc_sysfs_client_alloc(struct kobject *parent,
+ struct net *net, int clid)
+{
+ struct rpc_sysfs_client *p;
+
+ p = kzalloc(sizeof(*p), GFP_KERNEL);
+ if (p) {
+ p->net = net;
+ p->kobject.kset = rpc_sunrpc_kset;
+ if (kobject_init_and_add(&p->kobject, &rpc_sysfs_client_type,
+ parent, "%d", clid) == 0)
+ return p;
+ kobject_put(&p->kobject);
+ }
+ return NULL;
+}
+
+void rpc_sysfs_client_setup(struct rpc_clnt *clnt, struct net *net)
+{
+ struct rpc_sysfs_client *rpc_client;
+
+ rpc_client = rpc_sysfs_client_alloc(rpc_sunrpc_client_kobj, net, clnt->cl_clid);
+ if (rpc_client) {
+ clnt->cl_sysfs = rpc_client;
+ kobject_uevent(&rpc_client->kobject, KOBJ_ADD);
+ }
+}
+
+void rpc_sysfs_client_destroy(struct rpc_clnt *clnt)
+{
+ struct rpc_sysfs_client *rpc_client = clnt->cl_sysfs;
+
+ if (rpc_client) {
+ kobject_uevent(&rpc_client->kobject, KOBJ_REMOVE);
+ kobject_del(&rpc_client->kobject);
+ kobject_put(&rpc_client->kobject);
+ clnt->cl_sysfs = NULL;
+ }
+}
diff --git a/net/sunrpc/sysfs.h b/net/sunrpc/sysfs.h
index c9d9918e87ee..443679d71f38 100644
--- a/net/sunrpc/sysfs.h
+++ b/net/sunrpc/sysfs.h
@@ -5,7 +5,15 @@
#ifndef __SUNRPC_SYSFS_H
#define __SUNRPC_SYSFS_H

+struct rpc_sysfs_client {
+ struct kobject kobject;
+ struct net *net;
+};
+
extern int rpc_sysfs_init(void);
extern void rpc_sysfs_exit(void);

+void rpc_sysfs_client_setup(struct rpc_clnt *clnt, struct net *net);
+void rpc_sysfs_client_destroy(struct rpc_clnt *clnt);
+
#endif
--
2.29.2

2021-03-12 21:20:27

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v3 4/5] sunrpc: Prepare xs_connect() for taking NULL tasks

From: Anna Schumaker <[email protected]>

We won't have a task structure when we go to change IP addresses, so
check for one before calling the WARN_ON() to avoid crashing.

Signed-off-by: Anna Schumaker <[email protected]>
---
v3: Clean up WARN_ON_ONCE() to avoid a new if
---
net/sunrpc/xprtsock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index e35760f238a4..eb8df10aac9c 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2310,7 +2310,7 @@ static void xs_connect(struct rpc_xprt *xprt, struct rpc_task *task)
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
unsigned long delay = 0;

- WARN_ON_ONCE(!xprt_lock_connect(xprt, task, transport));
+ WARN_ON_ONCE(task && !xprt_lock_connect(xprt, task, transport));

if (transport->sock != NULL) {
dprintk("RPC: xs_connect delayed xprt %p for %lu "
--
2.29.2

2021-03-12 21:20:31

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v3 5/5] sunrpc: Create a per-rpc_clnt file for managing the destination IP address

From: Anna Schumaker <[email protected]>

Reading the file displays the current destination address, and writing
to it allows users to change the address.

And since we're using IP here, restrict to only creating sysfs files for
TCP and RDMA connections.

Signed-off-by: Anna Schumaker <[email protected]>
---
v3: Fix suspicious RCU usage warnings
Fix up xprt locking and reference counting
s/netns/sysfs/
Unconditionally create files instead of looking at protocol type
during client setup (this makes it easier for userspace tools)
Add a newline character to the buffer returned in "show" handler
v2: Change filename and related functions from "address" to "dstaddr"
Combine patches for reading and writing to this file
---
net/sunrpc/sysfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
net/sunrpc/sysfs.h | 1 +
2 files changed, 71 insertions(+)

diff --git a/net/sunrpc/sysfs.c b/net/sunrpc/sysfs.c
index abfe8c0b3108..1ae15c4729c0 100644
--- a/net/sunrpc/sysfs.c
+++ b/net/sunrpc/sysfs.c
@@ -3,6 +3,8 @@
* Copyright (c) 2020 Anna Schumaker <[email protected]>
*/
#include <linux/sunrpc/clnt.h>
+#include <linux/sunrpc/addr.h>
+#include <linux/sunrpc/xprtsock.h>
#include <linux/kobject.h>
#include "sysfs.h"

@@ -55,6 +57,64 @@ int rpc_sysfs_init(void)
return 0;
}

+static inline struct rpc_xprt *rpc_sysfs_client_kobj_get_xprt(struct kobject *kobj)
+{
+ struct rpc_sysfs_client *c = container_of(kobj,
+ struct rpc_sysfs_client, kobject);
+ struct rpc_xprt *xprt;
+
+ rcu_read_lock();
+ xprt = xprt_get(rcu_dereference(c->clnt->cl_xprt));
+ rcu_read_unlock();
+ return xprt;
+}
+
+static ssize_t rpc_sysfs_dstaddr_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct rpc_xprt *xprt = rpc_sysfs_client_kobj_get_xprt(kobj);
+ ssize_t ret;
+
+ if (!xprt)
+ return 0;
+ if (!(xprt->prot & (IPPROTO_TCP | XPRT_TRANSPORT_RDMA)))
+ ret = sprintf(buf, "Not Supported");
+ else
+ ret = rpc_ntop((struct sockaddr *)&xprt->addr, buf, PAGE_SIZE);
+ buf[ret] = '\n';
+ xprt_put(xprt);
+ return ret + 1;
+}
+
+static ssize_t rpc_sysfs_dstaddr_store(struct kobject *kobj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ struct rpc_xprt *xprt = rpc_sysfs_client_kobj_get_xprt(kobj);
+ struct sockaddr *saddr;
+ int port;
+
+ if (!xprt)
+ return 0;
+ if (!(xprt->prot & (IPPROTO_TCP | XPRT_TRANSPORT_RDMA))) {
+ xprt_put(xprt);
+ return -ENOTSUPP;
+ }
+
+ wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_KILLABLE);
+ saddr = (struct sockaddr *)&xprt->addr;
+ port = rpc_get_port(saddr);
+
+ kfree(xprt->address_strings[RPC_DISPLAY_ADDR]);
+ xprt->address_strings[RPC_DISPLAY_ADDR] = kstrndup(buf, count - 1, GFP_KERNEL);
+ xprt->addrlen = rpc_pton(xprt->xprt_net, buf, count - 1, saddr, sizeof(*saddr));
+ rpc_set_port(saddr, port);
+
+ xprt->ops->connect(xprt, NULL);
+ clear_bit(XPRT_LOCKED, &xprt->state);
+ xprt_put(xprt);
+ return count;
+}
+
static void rpc_sysfs_client_release(struct kobject *kobj)
{
struct rpc_sysfs_client *c;
@@ -68,8 +128,17 @@ static const void *rpc_sysfs_client_namespace(struct kobject *kobj)
return container_of(kobj, struct rpc_sysfs_client, kobject)->net;
}

+static struct kobj_attribute rpc_sysfs_client_dstaddr = __ATTR(dstaddr,
+ 0644, rpc_sysfs_dstaddr_show, rpc_sysfs_dstaddr_store);
+
+static struct attribute *rpc_sysfs_client_attrs[] = {
+ &rpc_sysfs_client_dstaddr.attr,
+ NULL,
+};
+
static struct kobj_type rpc_sysfs_client_type = {
.release = rpc_sysfs_client_release,
+ .default_attrs = rpc_sysfs_client_attrs,
.sysfs_ops = &kobj_sysfs_ops,
.namespace = rpc_sysfs_client_namespace,
};
@@ -104,6 +173,7 @@ void rpc_sysfs_client_setup(struct rpc_clnt *clnt, struct net *net)
rpc_client = rpc_sysfs_client_alloc(rpc_sunrpc_client_kobj, net, clnt->cl_clid);
if (rpc_client) {
clnt->cl_sysfs = rpc_client;
+ rpc_client->clnt = clnt;
kobject_uevent(&rpc_client->kobject, KOBJ_ADD);
}
}
diff --git a/net/sunrpc/sysfs.h b/net/sunrpc/sysfs.h
index 443679d71f38..5093f93a83cb 100644
--- a/net/sunrpc/sysfs.h
+++ b/net/sunrpc/sysfs.h
@@ -8,6 +8,7 @@
struct rpc_sysfs_client {
struct kobject kobject;
struct net *net;
+ struct rpc_clnt *clnt;
};

extern int rpc_sysfs_init(void);
--
2.29.2

2021-03-12 21:22:34

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v3 2/5] sunrpc: Create a client/ subdirectory in the sunrpc sysfs

From: Anna Schumaker <[email protected]>

For network namespace separation.

Signed-off-by: Anna Schumaker <[email protected]>
---
v3: Rename net/ directory to client/
Rename rpc_client_kobj to rpc_sunrpc_client_kobj
s/netns/sysfs/
---
net/sunrpc/sysfs.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)

diff --git a/net/sunrpc/sysfs.c b/net/sunrpc/sysfs.c
index 27eda180ac5e..350b2b1628e7 100644
--- a/net/sunrpc/sysfs.c
+++ b/net/sunrpc/sysfs.c
@@ -2,19 +2,60 @@
/*
* Copyright (c) 2020 Anna Schumaker <[email protected]>
*/
+#include <linux/sunrpc/clnt.h>
#include <linux/kobject.h>

static struct kset *rpc_sunrpc_kset;
+static struct kobject *rpc_sunrpc_client_kobj;
+
+static void rpc_sysfs_object_release(struct kobject *kobj)
+{
+ kfree(kobj);
+}
+
+static const struct kobj_ns_type_operations *rpc_sysfs_object_child_ns_type(
+ struct kobject *kobj)
+{
+ return &net_ns_type_operations;
+}
+
+static struct kobj_type rpc_sysfs_object_type = {
+ .release = rpc_sysfs_object_release,
+ .sysfs_ops = &kobj_sysfs_ops,
+ .child_ns_type = rpc_sysfs_object_child_ns_type,
+};
+
+static struct kobject *rpc_sysfs_object_alloc(const char *name,
+ struct kset *kset, struct kobject *parent)
+{
+ struct kobject *kobj;
+ kobj = kzalloc(sizeof(*kobj), GFP_KERNEL);
+ if (kobj) {
+ kobj->kset = kset;
+ if (kobject_init_and_add(kobj, &rpc_sysfs_object_type,
+ parent, "%s", name) == 0)
+ return kobj;
+ kobject_put(kobj);
+ }
+ return NULL;
+}

int rpc_sysfs_init(void)
{
rpc_sunrpc_kset = kset_create_and_add("sunrpc", NULL, kernel_kobj);
if (!rpc_sunrpc_kset)
return -ENOMEM;
+ rpc_sunrpc_client_kobj = rpc_sysfs_object_alloc("client", rpc_sunrpc_kset, NULL);
+ if (!rpc_sunrpc_client_kobj) {
+ kset_unregister(rpc_sunrpc_kset);
+ rpc_sunrpc_client_kobj = NULL;
+ return -ENOMEM;
+ }
return 0;
}

void rpc_sysfs_exit(void)
{
+ kobject_put(rpc_sunrpc_client_kobj);
kset_unregister(rpc_sunrpc_kset);
}
--
2.29.2

2021-03-20 01:27:10

by Nagendra Tomar

[permalink] [raw]
Subject: RE: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

Hi Anna,
We have a similar but slightly different requirement.
You change allows a user to force a xprt's remote address to anything, allowing
it to connect to a different address than what it originally had.
The original server/xprt address starts as the one that userspace mount program
provides, possibly after resolving the servername used in the mount command.

Our requirement is that that server name remains same but its address changes,
aka, DNS failover.
Such cases (which I believe are more common) can be handled fully automatically,
by resolving the server name before every xprt reconnect. CIFS does this.
NFS also has fs/nfs/dns_resolve.c which we can use to do the name resolution,
though it's currently not being used for this specific use.

Did you have a similar requirement in mind, and/or did you consider the above?
Would like to know your thoughts.

Thanks,
Tomar

-----Original Message-----
From: Anna Schumaker <[email protected]> On Behalf Of [email protected]
Sent: 13 March 2021 02:48
To: [email protected]
Cc: [email protected]
Subject: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

From: Anna Schumaker <[email protected]>

It's possible for an NFS server to go down but come back up with a
different IP address. These patches provide a way for administrators to
handle this issue by providing a new IP address for xprt sockets to
connect to.

Chuck has suggested some ideas for future work that could also use this
interface, such as:
- srcaddr: To move between network devices on the client
- type: "tcp", "rdma", "local"
- bound: 0 for autobind, or the result of the most recent rpcbind query
- connected: either true or false
- last: read-only timestamp of the last operation to use the transport
- device: A symlink to the physical network device

Changes in v3:
- Rename functions and objects to make future expansion easier
- Put files under /sys/kernel/sunrpc/client/ instead of
/sys/kernel/sunrpc/net/, again for future expansions
- Clean up use of WARN_ON_ONCE() in xs_connect()
- Fix up locking, reference counting, and RCU usage
- Unconditionally create files so userspace tools don't need to guess
what is supported (We return an error message now instead)

Changes in v2:
- Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
- Rename file from "address" to "dstaddr"

Thoughts?
Anna


Anna Schumaker (5):
sunrpc: Create a sunrpc directory under /sys/kernel/
sunrpc: Create a client/ subdirectory in the sunrpc sysfs
sunrpc: Create per-rpc_clnt sysfs kobjects
sunrpc: Prepare xs_connect() for taking NULL tasks
sunrpc: Create a per-rpc_clnt file for managing the destination IP
address

include/linux/sunrpc/clnt.h | 1 +
net/sunrpc/Makefile | 2 +-
net/sunrpc/clnt.c | 5 +
net/sunrpc/sunrpc_syms.c | 8 ++
net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
net/sunrpc/sysfs.h | 20 ++++
net/sunrpc/xprtsock.c | 2 +-
7 files changed, 227 insertions(+), 2 deletions(-)
create mode 100644 net/sunrpc/sysfs.c
create mode 100644 net/sunrpc/sysfs.h

--
2.29.2


2021-03-24 17:32:03

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

On Fri, Mar 12, 2021 at 4:19 PM <[email protected]> wrote:
>
> From: Anna Schumaker <[email protected]>
>
> It's possible for an NFS server to go down but come back up with a
> different IP address. These patches provide a way for administrators to
> handle this issue by providing a new IP address for xprt sockets to
> connect to.
>
> Chuck has suggested some ideas for future work that could also use this
> interface, such as:
> - srcaddr: To move between network devices on the client
> - type: "tcp", "rdma", "local"
> - bound: 0 for autobind, or the result of the most recent rpcbind query
> - connected: either true or false
> - last: read-only timestamp of the last operation to use the transport
> - device: A symlink to the physical network device
>
> Changes in v3:
> - Rename functions and objects to make future expansion easier
> - Put files under /sys/kernel/sunrpc/client/ instead of
> /sys/kernel/sunrpc/net/, again for future expansions
> - Clean up use of WARN_ON_ONCE() in xs_connect()
> - Fix up locking, reference counting, and RCU usage
> - Unconditionally create files so userspace tools don't need to guess
> what is supported (We return an error message now instead)
>
> Changes in v2:
> - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> - Rename file from "address" to "dstaddr"
>
> Thoughts?

Reviewed-by/Tested-by this version. Works OK for me.

I would like to note that the interface doesn't or rather perhaps
cannot do any error checking. So if the "user" were to echo a
nonsensical data into the sysfs (echo foobar > <sysfspath>), that
breaks the existing connection. However, if a proper IP were to be
entered to correct it, things will go back to normal.

> Anna
>
>
> Anna Schumaker (5):
> sunrpc: Create a sunrpc directory under /sys/kernel/
> sunrpc: Create a client/ subdirectory in the sunrpc sysfs
> sunrpc: Create per-rpc_clnt sysfs kobjects
> sunrpc: Prepare xs_connect() for taking NULL tasks
> sunrpc: Create a per-rpc_clnt file for managing the destination IP
> address
>
> include/linux/sunrpc/clnt.h | 1 +
> net/sunrpc/Makefile | 2 +-
> net/sunrpc/clnt.c | 5 +
> net/sunrpc/sunrpc_syms.c | 8 ++
> net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
> net/sunrpc/sysfs.h | 20 ++++
> net/sunrpc/xprtsock.c | 2 +-
> 7 files changed, 227 insertions(+), 2 deletions(-)
> create mode 100644 net/sunrpc/sysfs.c
> create mode 100644 net/sunrpc/sysfs.h
>
> --
> 2.29.2
>

2021-03-24 19:15:25

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

Hi Tomar,

SUNRPC layer only deals with resolved addresses not dns names (on
purpose). DNS resolution code that does exists at the NFS layer is for
referrals/migration -- an event that happens at a point in time and
doesn't repeat (but it does implement a simple caching strategy). At
the SUNRPC layer, a connection can be dropped and re-established
numerous times and thus, to require DNS resolution on each attempt
will interfere with performance because a DNS resolution requires an
upcall into the userland (implementing DNS caching at the SUNRPC layer
is a non-starter probably, since policy based features should be in
userland).

You mention that you are interested in using the "same" server only
changing the IP. DNS failover is no way an authority in "sameness" of
the server. NFSv4.x have definitions of what it means to call 2
servers the same. When doing an IP change via a SUNRPC layer, we are
relying on the fact that an administrator will be pointing it to the
"same" server.

Given that a failover is an "event", an administrator can do a DNS
query and then use the provided IP to supply into the sysfs to direct
the IO to the new server IP. Maybe the sysfs interface can support
receiving a DNS name and doing a DNS lookup then (but that probably
should be an addition onto these patches not in the initial version)?


On Fri, Mar 19, 2021 at 9:27 PM Nagendra Tomar
<[email protected]> wrote:
>
> Hi Anna,
> We have a similar but slightly different requirement.
> You change allows a user to force a xprt's remote address to anything, allowing
> it to connect to a different address than what it originally had.
> The original server/xprt address starts as the one that userspace mount program
> provides, possibly after resolving the servername used in the mount command.
>
> Our requirement is that that server name remains same but its address changes,
> aka, DNS failover.
> Such cases (which I believe are more common) can be handled fully automatically,
> by resolving the server name before every xprt reconnect. CIFS does this.
> NFS also has fs/nfs/dns_resolve.c which we can use to do the name resolution,
> though it's currently not being used for this specific use.
>
> Did you have a similar requirement in mind, and/or did you consider the above?
> Would like to know your thoughts.
>
> Thanks,
> Tomar
>
> -----Original Message-----
> From: Anna Schumaker <[email protected]> On Behalf Of [email protected]
> Sent: 13 March 2021 02:48
> To: [email protected]
> Cc: [email protected]
> Subject: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address
>
> From: Anna Schumaker <[email protected]>
>
> It's possible for an NFS server to go down but come back up with a
> different IP address. These patches provide a way for administrators to
> handle this issue by providing a new IP address for xprt sockets to
> connect to.
>
> Chuck has suggested some ideas for future work that could also use this
> interface, such as:
> - srcaddr: To move between network devices on the client
> - type: "tcp", "rdma", "local"
> - bound: 0 for autobind, or the result of the most recent rpcbind query
> - connected: either true or false
> - last: read-only timestamp of the last operation to use the transport
> - device: A symlink to the physical network device
>
> Changes in v3:
> - Rename functions and objects to make future expansion easier
> - Put files under /sys/kernel/sunrpc/client/ instead of
> /sys/kernel/sunrpc/net/, again for future expansions
> - Clean up use of WARN_ON_ONCE() in xs_connect()
> - Fix up locking, reference counting, and RCU usage
> - Unconditionally create files so userspace tools don't need to guess
> what is supported (We return an error message now instead)
>
> Changes in v2:
> - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> - Rename file from "address" to "dstaddr"
>
> Thoughts?
> Anna
>
>
> Anna Schumaker (5):
> sunrpc: Create a sunrpc directory under /sys/kernel/
> sunrpc: Create a client/ subdirectory in the sunrpc sysfs
> sunrpc: Create per-rpc_clnt sysfs kobjects
> sunrpc: Prepare xs_connect() for taking NULL tasks
> sunrpc: Create a per-rpc_clnt file for managing the destination IP
> address
>
> include/linux/sunrpc/clnt.h | 1 +
> net/sunrpc/Makefile | 2 +-
> net/sunrpc/clnt.c | 5 +
> net/sunrpc/sunrpc_syms.c | 8 ++
> net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
> net/sunrpc/sysfs.h | 20 ++++
> net/sunrpc/xprtsock.c | 2 +-
> 7 files changed, 227 insertions(+), 2 deletions(-)
> create mode 100644 net/sunrpc/sysfs.c
> create mode 100644 net/sunrpc/sysfs.h
>
> --
> 2.29.2
>
>

2021-03-25 05:11:29

by Nagendra Tomar

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

> From: Olga Kornievskaia <[email protected]>
> Sent: 25 March 2021 00:43
>
> Hi Tomar,

Hi Olga,
Thanks for your comments!

>
> SUNRPC layer only deals with resolved addresses not dns names (on
> purpose). DNS resolution code that does exists at the NFS layer is for
> referrals/migration -- an event that happens at a point in time and
> doesn't repeat (but it does implement a simple caching strategy). At
> the SUNRPC layer, a connection can be dropped and re-established
> numerous times and thus, to require DNS resolution on each attempt
> will interfere with performance because a DNS resolution requires an
> upcall into the userland (implementing DNS caching at the SUNRPC layer
> is a non-starter probably, since policy based features should be in
> userland).

A SUNRPC reconnect is not a fastpath operation. Usually, it's done after
1+ consecutive RPC timeouts which would easily be at least 10's
of seconds to few minutes. Having a DNS resolution which might take
additional few 10/100's msecs doesn't look like too much extra work IMHO,
given that one of the reasons for server not responding on the old address
could well be because it has moved to a new address (Zone/Geo failovers
being more likely with cloud based NFS servers).
Also, if user does not have a server that uses DNS failover, she can
simply provide the IP at the time of mount (IP:/share vs Hostname:/share),
and that can act as a hint for the rpc layer to not do resolution on reconnect.

>
> You mention that you are interested in using the "same" server only
> changing the IP. DNS failover is no way an authority in "sameness" of
> the server. NFSv4.x have definitions of what it means to call 2
> servers the same. When doing an IP change via a SUNRPC layer, we are
> relying on the fact that an administrator will be pointing it to the
> "same" server.

I don't think we are depending on the admin setting "same" server.
I mean we just take the IP on face value and go about our usual RPC state
machine. If the IP happens to point to a different server it will eventually
hit auth failure (for reasonable auth policies) and then we will try to re-setup
auth and fail if we cannot, so admin setting wrong IP cannot break the security
provided by the RPC layer (which is a good thing).
The same should simply work for IP obtained from DNS resolution.

>
> Given that a failover is an "event", an administrator can do a DNS
> query and then use the provided IP to supply into the sysfs to direct
> the IO to the new server IP. Maybe the sysfs interface can support
> receiving a DNS name and doing a DNS lookup then (but that probably
> should be an addition onto these patches not in the initial version)?

I feel that auto DNS resolution just makes the process more smooth and
intuitive. That's something which to me looks like a more common usage
scenario. The write-ip-to-sysfs approach is definitely generic, but I would
love if it solves the more common DNS failover usecase too.

>
>
> On Fri, Mar 19, 2021 at 9:27 PM Nagendra Tomar
> <[email protected]> wrote:
> >
> > Hi Anna,
> > We have a similar but slightly different requirement.
> > You change allows a user to force a xprt's remote address to anything,
> allowing
> > it to connect to a different address than what it originally had.
> > The original server/xprt address starts as the one that userspace mount
> program
> > provides, possibly after resolving the servername used in the mount
> command.
> >
> > Our requirement is that that server name remains same but its address
> changes,
> > aka, DNS failover.
> > Such cases (which I believe are more common) can be handled fully
> automatically,
> > by resolving the server name before every xprt reconnect. CIFS does this.
> > NFS also has fs/nfs/dns_resolve.c which we can use to do the name
> resolution,
> > though it's currently not being used for this specific use.
> >
> > Did you have a similar requirement in mind, and/or did you consider the
> above?
> > Would like to know your thoughts.
> >
> > Thanks,
> > Tomar
> >
> > -----Original Message-----
> > From: Anna Schumaker <[email protected]> On Behalf Of
> [email protected]
> > Sent: 13 March 2021 02:48
> > To: [email protected]
> > Cc: [email protected]
> > Subject: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address
> >
> > From: Anna Schumaker <[email protected]>
> >
> > It's possible for an NFS server to go down but come back up with a
> > different IP address. These patches provide a way for administrators to
> > handle this issue by providing a new IP address for xprt sockets to
> > connect to.
> >
> > Chuck has suggested some ideas for future work that could also use this
> > interface, such as:
> > - srcaddr: To move between network devices on the client
> > - type: "tcp", "rdma", "local"
> > - bound: 0 for autobind, or the result of the most recent rpcbind query
> > - connected: either true or false
> > - last: read-only timestamp of the last operation to use the transport
> > - device: A symlink to the physical network device
> >
> > Changes in v3:
> > - Rename functions and objects to make future expansion easier
> > - Put files under /sys/kernel/sunrpc/client/ instead of
> > /sys/kernel/sunrpc/net/, again for future expansions
> > - Clean up use of WARN_ON_ONCE() in xs_connect()
> > - Fix up locking, reference counting, and RCU usage
> > - Unconditionally create files so userspace tools don't need to guess
> > what is supported (We return an error message now instead)
> >
> > Changes in v2:
> > - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> > - Rename file from "address" to "dstaddr"
> >
> > Thoughts?
> > Anna
> >
> >
> > Anna Schumaker (5):
> > sunrpc: Create a sunrpc directory under /sys/kernel/
> > sunrpc: Create a client/ subdirectory in the sunrpc sysfs
> > sunrpc: Create per-rpc_clnt sysfs kobjects
> > sunrpc: Prepare xs_connect() for taking NULL tasks
> > sunrpc: Create a per-rpc_clnt file for managing the destination IP
> > address
> >
> > include/linux/sunrpc/clnt.h | 1 +
> > net/sunrpc/Makefile | 2 +-
> > net/sunrpc/clnt.c | 5 +
> > net/sunrpc/sunrpc_syms.c | 8 ++
> > net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
> > net/sunrpc/sysfs.h | 20 ++++
> > net/sunrpc/xprtsock.c | 2 +-
> > 7 files changed, 227 insertions(+), 2 deletions(-)
> > create mode 100644 net/sunrpc/sysfs.c
> > create mode 100644 net/sunrpc/sysfs.h
> >
> > --
> > 2.29.2
> >
> >

2021-03-25 13:57:43

by Anna Schumaker

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

Hi Tomar,

Sorry for not chiming in earlier, as Olga said: I'm on leave (and will
be for several more weeks) after my wife and I had a baby.

On Thu, Mar 25, 2021 at 1:11 AM Nagendra Tomar
<[email protected]> wrote:
>
> > From: Olga Kornievskaia <[email protected]>
> > Sent: 25 March 2021 00:43
> >
> > Hi Tomar,
>
> Hi Olga,
> Thanks for your comments!
>
> >
> > SUNRPC layer only deals with resolved addresses not dns names (on
> > purpose). DNS resolution code that does exists at the NFS layer is for
> > referrals/migration -- an event that happens at a point in time and
> > doesn't repeat (but it does implement a simple caching strategy). At
> > the SUNRPC layer, a connection can be dropped and re-established
> > numerous times and thus, to require DNS resolution on each attempt
> > will interfere with performance because a DNS resolution requires an
> > upcall into the userland (implementing DNS caching at the SUNRPC layer
> > is a non-starter probably, since policy based features should be in
> > userland).
>
> A SUNRPC reconnect is not a fastpath operation. Usually, it's done after
> 1+ consecutive RPC timeouts which would easily be at least 10's
> of seconds to few minutes. Having a DNS resolution which might take
> additional few 10/100's msecs doesn't look like too much extra work IMHO,
> given that one of the reasons for server not responding on the old address
> could well be because it has moved to a new address (Zone/Geo failovers
> being more likely with cloud based NFS servers).
> Also, if user does not have a server that uses DNS failover, she can
> simply provide the IP at the time of mount (IP:/share vs Hostname:/share),
> and that can act as a hint for the rpc layer to not do resolution on reconnect.
>
> >
> > You mention that you are interested in using the "same" server only
> > changing the IP. DNS failover is no way an authority in "sameness" of
> > the server. NFSv4.x have definitions of what it means to call 2
> > servers the same. When doing an IP change via a SUNRPC layer, we are
> > relying on the fact that an administrator will be pointing it to the
> > "same" server.
>
> I don't think we are depending on the admin setting "same" server.
> I mean we just take the IP on face value and go about our usual RPC state
> machine. If the IP happens to point to a different server it will eventually
> hit auth failure (for reasonable auth policies) and then we will try to re-setup
> auth and fail if we cannot, so admin setting wrong IP cannot break the security
> provided by the RPC layer (which is a good thing).
> The same should simply work for IP obtained from DNS resolution.
>
> >
> > Given that a failover is an "event", an administrator can do a DNS
> > query and then use the provided IP to supply into the sysfs to direct
> > the IO to the new server IP. Maybe the sysfs interface can support
> > receiving a DNS name and doing a DNS lookup then (but that probably
> > should be an addition onto these patches not in the initial version)?
>
> I feel that auto DNS resolution just makes the process more smooth and
> intuitive. That's something which to me looks like a more common usage
> scenario. The write-ip-to-sysfs approach is definitely generic, but I would
> love if it solves the more common DNS failover usecase too.

I've been thinking of the write-to-sysfs approach as just the kernel
interface. I would expect there to be some future userspace tool built
on the sysfs files that is easier for admins to use. This future tool
could probably be coded to handle dns resolutions and write the result
to the sysfs file.

Thanks,
Anna

>
> >
> >
> > On Fri, Mar 19, 2021 at 9:27 PM Nagendra Tomar
> > <[email protected]> wrote:
> > >
> > > Hi Anna,
> > > We have a similar but slightly different requirement.
> > > You change allows a user to force a xprt's remote address to anything,
> > allowing
> > > it to connect to a different address than what it originally had.
> > > The original server/xprt address starts as the one that userspace mount
> > program
> > > provides, possibly after resolving the servername used in the mount
> > command.
> > >
> > > Our requirement is that that server name remains same but its address
> > changes,
> > > aka, DNS failover.
> > > Such cases (which I believe are more common) can be handled fully
> > automatically,
> > > by resolving the server name before every xprt reconnect. CIFS does this.
> > > NFS also has fs/nfs/dns_resolve.c which we can use to do the name
> > resolution,
> > > though it's currently not being used for this specific use.
> > >
> > > Did you have a similar requirement in mind, and/or did you consider the
> > above?
> > > Would like to know your thoughts.
> > >
> > > Thanks,
> > > Tomar
> > >
> > > -----Original Message-----
> > > From: Anna Schumaker <[email protected]> On Behalf Of
> > [email protected]
> > > Sent: 13 March 2021 02:48
> > > To: [email protected]
> > > Cc: [email protected]
> > > Subject: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address
> > >
> > > From: Anna Schumaker <[email protected]>
> > >
> > > It's possible for an NFS server to go down but come back up with a
> > > different IP address. These patches provide a way for administrators to
> > > handle this issue by providing a new IP address for xprt sockets to
> > > connect to.
> > >
> > > Chuck has suggested some ideas for future work that could also use this
> > > interface, such as:
> > > - srcaddr: To move between network devices on the client
> > > - type: "tcp", "rdma", "local"
> > > - bound: 0 for autobind, or the result of the most recent rpcbind query
> > > - connected: either true or false
> > > - last: read-only timestamp of the last operation to use the transport
> > > - device: A symlink to the physical network device
> > >
> > > Changes in v3:
> > > - Rename functions and objects to make future expansion easier
> > > - Put files under /sys/kernel/sunrpc/client/ instead of
> > > /sys/kernel/sunrpc/net/, again for future expansions
> > > - Clean up use of WARN_ON_ONCE() in xs_connect()
> > > - Fix up locking, reference counting, and RCU usage
> > > - Unconditionally create files so userspace tools don't need to guess
> > > what is supported (We return an error message now instead)
> > >
> > > Changes in v2:
> > > - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> > > - Rename file from "address" to "dstaddr"
> > >
> > > Thoughts?
> > > Anna
> > >
> > >
> > > Anna Schumaker (5):
> > > sunrpc: Create a sunrpc directory under /sys/kernel/
> > > sunrpc: Create a client/ subdirectory in the sunrpc sysfs
> > > sunrpc: Create per-rpc_clnt sysfs kobjects
> > > sunrpc: Prepare xs_connect() for taking NULL tasks
> > > sunrpc: Create a per-rpc_clnt file for managing the destination IP
> > > address
> > >
> > > include/linux/sunrpc/clnt.h | 1 +
> > > net/sunrpc/Makefile | 2 +-
> > > net/sunrpc/clnt.c | 5 +
> > > net/sunrpc/sunrpc_syms.c | 8 ++
> > > net/sunrpc/sysfs.c | 191 ++++++++++++++++++++++++++++++++++++
> > > net/sunrpc/sysfs.h | 20 ++++
> > > net/sunrpc/xprtsock.c | 2 +-
> > > 7 files changed, 227 insertions(+), 2 deletions(-)
> > > create mode 100644 net/sunrpc/sysfs.c
> > > create mode 100644 net/sunrpc/sysfs.h
> > >
> > > --
> > > 2.29.2
> > >
> > >

2021-03-25 15:07:19

by Nagendra Tomar

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address

> From: Anna Schumaker <[email protected]>
> Sent: 25 March 2021 19:26
> Hi Tomar,
>
> Sorry for not chiming in earlier, as Olga said: I'm on leave (and will
> be for several more weeks) after my wife and I had a baby.

No problem Anna, and congratulations in advance to you and your wife!

> I've been thinking of the write-to-sysfs approach as just the kernel
> interface. I would expect there to be some future userspace tool built
> on the sysfs files that is easier for admins to use. This future tool
> could probably be coded to handle dns resolutions and write the result
> to the sysfs file.

Yeah, I got the idea. I was hoping to avoid any additional userspace tool.
IIUC the userspace tool would be more like an always-running program that
periodically does the name resolution and updates the sysfs xprt addr
file. It can be done using mount helpers which take the NFS server's hostname
and start the process to periodically resolve the hostname and write to sysfs.
Then on unmount the program has to be stopped.
With the dns resolution upcall on reconnect, all of this could be avoided.
Infact as Olga said, if we extend your patch to treat hostnames specially -
do dns resolution upcall if the sysfs has the hostname instead of IP address,
that will serve my usecase well too.

Thanks,
Tomar

2021-03-25 16:04:41

by Chuck Lever III

[permalink] [raw]
Subject: Re: [EXTERNAL] [PATCH v3 0/5] SUNRPC: Create sysfs files for changing IP address



> On Mar 25, 2021, at 11:03 AM, Nagendra Tomar <[email protected]> wrote:
>
>> I've been thinking of the write-to-sysfs approach as just the kernel
>> interface. I would expect there to be some future userspace tool built
>> on the sysfs files that is easier for admins to use. This future tool
>> could probably be coded to handle dns resolutions and write the result
>> to the sysfs file.
>
> Yeah, I got the idea. I was hoping to avoid any additional userspace tool.
> IIUC the userspace tool would be more like an always-running program that
> periodically does the name resolution and updates the sysfs xprt addr
> file. It can be done using mount helpers which take the NFS server's hostname
> and start the process to periodically resolve the hostname and write to sysfs.
> Then on unmount the program has to be stopped.

Well there are variations on this idea. One variation might be
a single orchestrator daemon that would manage the sysfs files
using inotify, so it wouldn't have to track the operation of
mount/umount.

The sysfs files are meant to be only a conduit. User space is
meant to dynamically provide settings based on policy. The
opportunity here is to make this something that can be active
and rules-based instead of a static configuration (ie, more
like udev and less like /etc/fstab).

And note that any solution has to be container-aware. mount
certainly is, and that's another reason why the DNS query is
done at that time. An upcall would have to be done in the same
namespace(s) as the mount was done.

(I'm not advocating any particular approach, merely pointing
out some things that need consideration).


> With the dns resolution upcall on reconnect, all of this could be avoided.
> Infact as Olga said, if we extend your patch to treat hostnames specially -
> do dns resolution upcall if the sysfs has the hostname instead of IP address,
> that will serve my usecase well too.
>
> Thanks,
> Tomar

--
Chuck Lever