2020-10-16 13:14:25

by Sargun Dhillon

Subject: [PATCH v2 0/3] NFS User Namespaces

This patchset adds functionality that allows NFS to be used from
user namespaces (containers).

Changes since v1:
* Added samples

Sargun Dhillon (3):
NFS: Use cred from fscontext during fsmount
samples/vfs: Split out common code for new syscall APIs
samples/vfs: Add example leveraging NFS with new APIs and user
namespaces

fs/nfs/client.c | 2 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 1 +
fs/nfs/nfs4client.c | 2 +-
samples/vfs/.gitignore | 2 +
samples/vfs/Makefile | 5 +-
samples/vfs/test-fsmount.c | 86 +-----------
samples/vfs/test-nfs-userns.c | 181 +++++++++++++++++++++++++
samples/vfs/vfs-helper.c | 43 ++++++
samples/vfs/vfs-helper.h | 55 ++++++++
9 files changed, 289 insertions(+), 88 deletions(-)
create mode 100644 samples/vfs/test-nfs-userns.c
create mode 100644 samples/vfs/vfs-helper.c
create mode 100644 samples/vfs/vfs-helper.h

--
2.25.1


2020-10-16 13:16:12

by Sargun Dhillon

Subject: [PATCH v2 1/3] NFS: Use cred from fscontext during fsmount

In several patches, support was introduced to NFS for user namespaces:

ccfe51a5161c: SUNRPC: Fix the server AUTH_UNIX userspace mappings
e6667c73a27d: SUNRPC: rsi_parse() should use the current user namespace
1a58e8a0e5c1: NFS: Store the credential of the mount process in the nfs_server
283ebe3ec415: SUNRPC: Use the client user namespace when encoding creds
ac83228a7101: SUNRPC: Use namespace of listening daemon in the client AUTH_GSS upcall
264d948ce7d0: NFS: Convert NFSv3 to use the container user namespace
58002399da65: NFSv4: Convert the NFS client idmapper to use the container user namespace
c207db2f5da5: NFS: Convert NFSv2 to use the container user namespace
3b7eb5e35d0f: NFS: When mounting, don't share filesystems between different user namespaces

All of these commits are predicated on the NFS server being created with
credentials that are in the user namespace of interest. The new VFS
mount APIs help with this[1]: creating the fsfd (fsopen) captures a set
of credentials at that time.

Normally, the new file system API users automatically get their
super block's user_ns set to the fc->user_ns in sget_fc, but since
NFS has to do special manipulation of UIDs / GIDs on the wire,
it keeps track of credentials itself.

Unfortunately, the credentials that NFS uses are the current_cred() at
the time FSCONFIG_CMD_CREATE is called. FSCONFIG_CMD_CREATE also checks
mount_capable(), and because NFS does not set FS_USERNS_MOUNT, that
check requires the caller to have CAP_SYS_ADMIN in init_user_ns. As a
result, the process that issues the create must be privileged on the
host, and its credentials, not the container's, end up in the
nfs_server.
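
The capability check described above can be modelled in user space
roughly as follows. This is a simplified sketch, not kernel source: the
struct, the MODEL_FS_USERNS_MOUNT constant and model_mount_capable() are
hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's FS_USERNS_MOUNT fs_flags bit. */
#define MODEL_FS_USERNS_MOUNT 0x1

/* Hypothetical summary of what the calling process is allowed to do. */
struct model_caller {
	bool cap_sys_admin_in_init_ns;    /* CAP_SYS_ADMIN in init_user_ns */
	bool cap_sys_admin_in_fc_user_ns; /* CAP_SYS_ADMIN in fc->user_ns */
};

/*
 * Model of the decision: a filesystem without the userns-mount flag
 * requires CAP_SYS_ADMIN in the initial user namespace; otherwise the
 * capability is checked against the namespace of the fs_context.
 */
static bool model_mount_capable(unsigned int fs_flags,
				const struct model_caller *c)
{
	if (!(fs_flags & MODEL_FS_USERNS_MOUNT))
		return c->cap_sys_admin_in_init_ns;
	return c->cap_sys_admin_in_fc_user_ns;
}
```

With fs_flags of 0 (the NFS case in this model), a caller that is only
root inside its own user namespace is rejected, which is why the create
step has to be issued by a host-privileged process.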

This patch makes a subtle change so that the struct cred captured by
fsopen is used instead. The fs_context is available at server
creation time and carries those credentials, so we can just use
them.

This roughly allows a privileged user to mount on behalf of an
unprivileged user namespace: it forks a child that calls fsopen inside
the unprivileged user namespace and passes the resulting fsfd back to
the privileged process. The privileged process configures the NFS
mount and calls FSCONFIG_CMD_CREATE, then switches into the mount
namespace of the container to finish the mount with fsmount and
move_mount.
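
The description above does not say how the fsfd travels between the two
processes. One common mechanism is SCM_RIGHTS over a Unix domain socket;
a minimal sketch follows, assuming a connected socket pair shared by
parent and child. The send_fd() and recv_fd() names are hypothetical and
are not part of this series.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send one fd over a connected Unix socket. */
static int send_fd(int sock, int fd)
{
	char byte = 0; /* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the flow described here, the child would call send_fd() with the fd
returned by fsopen, and the privileged parent would call recv_fd()
before issuing its fsconfig calls. The received descriptor refers to the
same fs_context, so the credentials captured at fsopen time are
unchanged by the transfer.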

This is a small user-visible change, but only for users who go through
this elaborate process of passing around file descriptors and switching
namespaces. There may be a better way to go about this, such as enabling
FS_USERNS_MOUNT on NFS, but this seems like the safest and most
straightforward approach.

[1]: https://lore.kernel.org/linux-fsdevel/155059610368.17079.2220554006494174417.stgit@warthog.procyon.org.uk/

Signed-off-by: Sargun Dhillon <[email protected]>
Cc: J. Bruce Fields <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: Trond Myklebust <[email protected]>
Cc: Anna Schumaker <[email protected]>
Cc: David Howells <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Kyle Anderson <[email protected]>
---
fs/nfs/client.c | 2 +-
fs/nfs/nfs4client.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index f1ff3076e4a4..fdefcc649884 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -967,7 +967,7 @@ struct nfs_server *nfs_create_server(struct fs_context *fc)
if (!server)
return ERR_PTR(-ENOMEM);

- server->cred = get_cred(current_cred());
+ server->cred = get_cred(fc->cred);

error = -ENOMEM;
fattr = nfs_alloc_fattr();
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 0bd77cc1f639..92ff6fb8e324 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -1120,7 +1120,7 @@ struct nfs_server *nfs4_create_server(struct fs_context *fc)
if (!server)
return ERR_PTR(-ENOMEM);

- server->cred = get_cred(current_cred());
+ server->cred = get_cred(fc->cred);

auth_probe = ctx->auth_info.flavor_len < 1;

--
2.25.1

2020-10-16 13:16:13

by Sargun Dhillon

Subject: [PATCH v2 2/3] samples/vfs: Split out common code for new syscall APIs

A number of helper functions make the new mount APIs much easier to
use. As we add more examples that leverage the new APIs, it makes
sense to share that code rather than duplicate it.
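
As an illustration of what the shared check_messages() helper does with
each record it reads from the fsfd, the sketch below formats a single
record. It assumes, as the helper does, that a record starts with a
level character ('e', 'w' or 'i'), then a separator byte, then the
text. The format_fs_message() name is hypothetical and is not part of
this patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: turn one record such as "e bad option" into
 * "Error: bad option". Returns the formatted length, or -1 if the
 * record is too short or its level character is not recognised.
 */
static int format_fs_message(const char *rec, int len,
			     char *out, size_t outsz)
{
	const char *level;

	if (len < 2)
		return -1;

	switch (rec[0]) {
	case 'e':
		level = "Error";
		break;
	case 'w':
		level = "Warning";
		break;
	case 'i':
		level = "Info";
		break;
	default:
		return -1;
	}

	/* Skip the level character and the separator, as the helper does. */
	return snprintf(out, outsz, "%s: %.*s", level, len - 2, rec + 2);
}
```

Unlike check_messages(), this variant rejects records shorter than two
bytes instead of printing them, which is a design choice of the sketch
rather than behaviour of the sample code.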

Cc: David Howells <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Kyle Anderson <[email protected]>
---
samples/vfs/Makefile | 2 +
samples/vfs/test-fsmount.c | 86 +-------------------------------------
samples/vfs/vfs-helper.c | 43 +++++++++++++++++++
samples/vfs/vfs-helper.h | 55 ++++++++++++++++++++++++
4 files changed, 101 insertions(+), 85 deletions(-)
create mode 100644 samples/vfs/vfs-helper.c
create mode 100644 samples/vfs/vfs-helper.h

diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile
index 00b6824f9237..7f76875eaa70 100644
--- a/samples/vfs/Makefile
+++ b/samples/vfs/Makefile
@@ -1,5 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
+test-fsmount-objs := test-fsmount.o vfs-helper.o
userprogs := test-fsmount test-statx
+
always-y := $(userprogs)

userccflags += -I usr/include
diff --git a/samples/vfs/test-fsmount.c b/samples/vfs/test-fsmount.c
index 50f47b72e85f..36a4fa886200 100644
--- a/samples/vfs/test-fsmount.c
+++ b/samples/vfs/test-fsmount.c
@@ -14,91 +14,7 @@
#include <sys/wait.h>
#include <linux/mount.h>
#include <linux/unistd.h>
-
-#define E(x) do { if ((x) == -1) { perror(#x); exit(1); } } while(0)
-
-static void check_messages(int fd)
-{
- char buf[4096];
- int err, n;
-
- err = errno;
-
- for (;;) {
- n = read(fd, buf, sizeof(buf));
- if (n < 0)
- break;
- n -= 2;
-
- switch (buf[0]) {
- case 'e':
- fprintf(stderr, "Error: %*.*s\n", n, n, buf + 2);
- break;
- case 'w':
- fprintf(stderr, "Warning: %*.*s\n", n, n, buf + 2);
- break;
- case 'i':
- fprintf(stderr, "Info: %*.*s\n", n, n, buf + 2);
- break;
- }
- }
-
- errno = err;
-}
-
-static __attribute__((noreturn))
-void mount_error(int fd, const char *s)
-{
- check_messages(fd);
- fprintf(stderr, "%s: %m\n", s);
- exit(1);
-}
-
-/* Hope -1 isn't a syscall */
-#ifndef __NR_fsopen
-#define __NR_fsopen -1
-#endif
-#ifndef __NR_fsmount
-#define __NR_fsmount -1
-#endif
-#ifndef __NR_fsconfig
-#define __NR_fsconfig -1
-#endif
-#ifndef __NR_move_mount
-#define __NR_move_mount -1
-#endif
-
-
-static inline int fsopen(const char *fs_name, unsigned int flags)
-{
- return syscall(__NR_fsopen, fs_name, flags);
-}
-
-static inline int fsmount(int fsfd, unsigned int flags, unsigned int ms_flags)
-{
- return syscall(__NR_fsmount, fsfd, flags, ms_flags);
-}
-
-static inline int fsconfig(int fsfd, unsigned int cmd,
- const char *key, const void *val, int aux)
-{
- return syscall(__NR_fsconfig, fsfd, cmd, key, val, aux);
-}
-
-static inline int move_mount(int from_dfd, const char *from_pathname,
- int to_dfd, const char *to_pathname,
- unsigned int flags)
-{
- return syscall(__NR_move_mount,
- from_dfd, from_pathname,
- to_dfd, to_pathname, flags);
-}
-
-#define E_fsconfig(fd, cmd, key, val, aux) \
- do { \
- if (fsconfig(fd, cmd, key, val, aux) == -1) \
- mount_error(fd, key ?: "create"); \
- } while (0)
+#include "vfs-helper.h"

int main(int argc, char *argv[])
{
diff --git a/samples/vfs/vfs-helper.c b/samples/vfs/vfs-helper.c
new file mode 100644
index 000000000000..bae2bc03c923
--- /dev/null
+++ b/samples/vfs/vfs-helper.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+#include "vfs-helper.h"
+
+void check_messages(int fd)
+{
+ char buf[4096];
+ int err, n;
+
+ err = errno;
+
+ for (;;) {
+ n = read(fd, buf, sizeof(buf));
+ if (n < 0)
+ break;
+ n -= 2;
+
+ switch (buf[0]) {
+ case 'e':
+ fprintf(stderr, "Error: %*.*s\n", n, n, buf + 2);
+ break;
+ case 'w':
+ fprintf(stderr, "Warning: %*.*s\n", n, n, buf + 2);
+ break;
+ case 'i':
+ fprintf(stderr, "Info: %*.*s\n", n, n, buf + 2);
+ break;
+ }
+ }
+
+ errno = err;
+}
+
+__attribute__((noreturn))
+void mount_error(int fd, const char *s)
+{
+ check_messages(fd);
+ fprintf(stderr, "%s: %m\n", s);
+ exit(1);
+}
\ No newline at end of file
diff --git a/samples/vfs/vfs-helper.h b/samples/vfs/vfs-helper.h
new file mode 100644
index 000000000000..be460ab48247
--- /dev/null
+++ b/samples/vfs/vfs-helper.h
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/unistd.h>
+#include <linux/mount.h>
+#include <sys/syscall.h>
+
+#define E(x) do { if ((x) == -1) { perror(#x); exit(1); } } while(0)
+
+/* Hope -1 isn't a syscall */
+#ifndef __NR_fsopen
+#define __NR_fsopen -1
+#endif
+#ifndef __NR_fsmount
+#define __NR_fsmount -1
+#endif
+#ifndef __NR_fsconfig
+#define __NR_fsconfig -1
+#endif
+#ifndef __NR_move_mount
+#define __NR_move_mount -1
+#endif
+
+#define E_fsconfig(fd, cmd, key, val, aux) \
+ do { \
+ if (fsconfig(fd, cmd, key, val, aux) == -1) \
+ mount_error(fd, key ?: "create"); \
+ } while (0)
+
+static inline int fsopen(const char *fs_name, unsigned int flags)
+{
+ return syscall(__NR_fsopen, fs_name, flags);
+}
+
+static inline int fsmount(int fsfd, unsigned int flags, unsigned int ms_flags)
+{
+ return syscall(__NR_fsmount, fsfd, flags, ms_flags);
+}
+
+static inline int fsconfig(int fsfd, unsigned int cmd,
+ const char *key, const void *val, int aux)
+{
+ return syscall(__NR_fsconfig, fsfd, cmd, key, val, aux);
+}
+
+static inline int move_mount(int from_dfd, const char *from_pathname,
+ int to_dfd, const char *to_pathname,
+ unsigned int flags)
+{
+ return syscall(__NR_move_mount,
+ from_dfd, from_pathname,
+ to_dfd, to_pathname, flags);
+}
+
+__attribute__((noreturn))
+void mount_error(int fd, const char *s);
+void check_messages(int fd);
--
2.25.1