2023-04-18 09:39:21

by Richard Weinberger

Subject: [PATCH 0/8 v3] nfs-utils: Improving NFS re-export wrt. crossmnt

After a longer hiatus I'm sending the next iteration of my re-export
improvement patch series. While the kernel side is upstream since v6.2,
the nfs-utils parts are still missing.
This patch series aims to solve this.

The core idea is adding a new export option, reexport=.
Using reexport= it is possible to mark an export entry in the exports
file explicitly as NFS re-export and select a strategy on how unique
identifiers should be provided. This makes the crossmnt feature work
in the re-export case.
Currently two strategies are supported, "auto-fsidnum" and
"predefined-fsidnum".

In my earlier series a sqlite database was mandatory to keep track of
generated fsids.
This series follows a different approach: instead of directly using
sqlite in all nfs-utils components (linking libsqlite), a new daemon,
fsidd, manages the database.
fsidd offers a simple (but stupid?) text based interface over a unix domain
socket which can be queried by mountd, exportfs, etc. for fsidnums.
The main idea behind fsidd is allowing users to implement their own
fsidd which keeps global state across load balancers.
I'm still not happy with fsidd, there is room for improvement but first
I'd like to know whether you like or hate this approach.
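
To make the text interface a bit more concrete, here is a minimal sketch of
how a client could build a request and interpret a reply. It assumes the
framing used by fsidd in patch 7/8 ("<command> <argument>" as request,
"+ <value>" on success, "+ " for "no entry", "- <reason>" on failure). The
helper names fsidd_format_request() and fsidd_parse_fsidnum_reply() are
made up for illustration and are not part of this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a request such as "get_fsidnum /nfs/vol1". */
static int fsidd_format_request(char *buf, size_t len, const char *cmd, const char *arg)
{
	int n = snprintf(buf, len, "%s %s", cmd, arg);

	/* Treat truncation as an error instead of sending a cut-off path. */
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: parse the reply to a fsidnum query.
 * Returns false if the server reported an error ("- reason") or the
 * reply is malformed. On success, *found tells whether a number was sent.
 */
static bool fsidd_parse_fsidnum_reply(const char *reply, uint32_t *fsidnum, bool *found)
{
	unsigned long val;
	char *endp;

	*found = false;

	if (strlen(reply) < 2 || reply[0] != '+' || reply[1] != ' ')
		return false;

	/* "+ " alone means: command worked, but there is no entry. */
	if (reply[2] == '\0')
		return true;

	/* Only accept plain decimal digits, no sign or leading blanks. */
	if (reply[2] < '0' || reply[2] > '9')
		return false;

	errno = 0;
	val = strtoul(reply + 2, &endp, 10);
	if (errno != 0 || *endp != '\0' || val > UINT32_MAX)
		return false;

	*fsidnum = (uint32_t)val;
	*found = true;
	return true;
}
```

Keeping "command failed" and "no entry" apart matters for the
predefined-fsidnum case, where a missing entry is a normal answer and not
an error.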

A typical export entry on a re-exporting server looks like:
/nfs *(rw,no_root_squash,no_subtree_check,crossmnt,reexport=auto-fsidnum)
reexport=auto-fsidnum will automatically assign an fsid= to /nfs and all
uncovered subvolumes.

Changes since v2, https://lore.kernel.org/linux-nfs/[email protected]/
- Split patch series
- Add improved fsidd system unit file
- Rebased to nfs-utils master as of today
- Dropped init code from exportd

Changes since v1, https://lore.kernel.org/linux-nfs/[email protected]/
- Factor out Sqlite and put it into a daemon
- Add fsidd
- Basically re-implemented the patch series
- Lots of fixes (e.g. nfs v4 root export)


Richard Weinberger (8):
Add reexport helper library
Implement reexport= export option
export: Wireup reexport mechanism
export: Uncover NFS subvolume after reboot
exports.man: Document reexport= option
reexport: Add sqlite backend
export: Add fsidd
Add fsid systemd service file

configure.ac | 1 +
support/Makefile.am | 2 +-
support/export/Makefile.am | 2 +
support/export/cache.c | 74 ++++++-
support/export/export.c | 20 ++
support/include/nfslib.h | 1 +
support/nfs/Makefile.am | 1 +
support/nfs/exports.c | 62 ++++++
support/reexport/Makefile.am | 18 ++
support/reexport/backend_sqlite.c | 267 +++++++++++++++++++++++
support/reexport/fsidd.c | 198 +++++++++++++++++
support/reexport/reexport.c | 326 ++++++++++++++++++++++++++++
support/reexport/reexport.h | 18 ++
support/reexport/reexport_backend.h | 47 ++++
systemd/Makefile.am | 5 +-
systemd/fsidd.service | 10 +
utils/exportd/Makefile.am | 4 +-
utils/exportfs/Makefile.am | 3 +
utils/exportfs/exportfs.c | 11 +
utils/exportfs/exports.man | 31 +++
utils/mount/Makefile.am | 3 +-
utils/mountd/Makefile.am | 2 +
22 files changed, 1096 insertions(+), 10 deletions(-)
create mode 100644 support/reexport/Makefile.am
create mode 100644 support/reexport/backend_sqlite.c
create mode 100644 support/reexport/fsidd.c
create mode 100644 support/reexport/reexport.c
create mode 100644 support/reexport/reexport.h
create mode 100644 support/reexport/reexport_backend.h
create mode 100644 systemd/fsidd.service

--
2.31.1


2023-04-18 09:40:20

by Richard Weinberger

Subject: [PATCH 7/8] export: Add fsidd

The fsidnum daemon offers a local UNIX domain socket interface
for all NFS userspace to query the reexport database.
Currently fsidd just uses the SQLite backend.

fsidd also serves as an example of how to implement more complex
backends for the load balancing use case.
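
For anyone who wants to talk to fsidd (or to a replacement daemon) from
their own tooling, a request/reply round trip over the SOCK_SEQPACKET
socket could look roughly like the sketch below. The helper names
fsidd_connect() and fsidd_roundtrip() and the buffer sizes are assumptions
for illustration. The socket path has to be whatever fsidd was configured
with.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to the fsidd socket, returns an fd or -1. */
static int fsidd_connect(const char *sock_path)
{
	struct sockaddr_un addr;
	int fd;

	if (strlen(sock_path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, sock_path);

	fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
	if (fd == -1)
		return -1;

	if (connect(fd, (const struct sockaddr *)&addr, sizeof(addr)) == -1) {
		close(fd);	/* do not leak the fd when fsidd is not running */
		return -1;
	}

	return fd;
}

/*
 * Hypothetical helper: send "<cmd> <arg>" as one packet and read one
 * packet back. Returns the reply length or -1. The reply is
 * NUL-terminated here because the daemon sends only the payload bytes.
 */
static ssize_t fsidd_roundtrip(int fd, const char *cmd, const char *arg,
			       char *reply, size_t reply_len)
{
	char req[4096];
	ssize_t n;
	int len;

	len = snprintf(req, sizeof(req), "%s %s", cmd, arg);
	if (len < 0 || (size_t)len >= sizeof(req) || reply_len == 0)
		return -1;

	if (send(fd, req, (size_t)len, 0) != len)
		return -1;

	n = recv(fd, reply, reply_len - 1, 0);
	if (n <= 0)
		return -1;

	reply[n] = '\0';
	return n;
}
```

SOCK_SEQPACKET keeps message boundaries, so one send() maps to one recv()
on the other side and no extra framing is needed. Note that the "version"
command takes no argument and would need a variant without the trailing
" <arg>".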

Signed-off-by: Richard Weinberger <[email protected]>
---
support/reexport/Makefile.am | 12 +++
support/reexport/fsidd.c | 198 +++++++++++++++++++++++++++++++++++
2 files changed, 210 insertions(+)
create mode 100644 support/reexport/fsidd.c

diff --git a/support/reexport/Makefile.am b/support/reexport/Makefile.am
index 9d544a8f..fbd66a20 100644
--- a/support/reexport/Makefile.am
+++ b/support/reexport/Makefile.am
@@ -3,4 +3,16 @@
noinst_LIBRARIES = libreexport.a
libreexport_a_SOURCES = reexport.c

+sbin_PROGRAMS = fsidd
+
+fsidd_SOURCES = fsidd.c backend_sqlite.c
+
+fsidd_LDADD = ../../support/misc/libmisc.a \
+ ../../support/nfs/libnfs.la \
+ $(LIBPTHREAD) $(LIBEVENT) $(LIBSQLITE) \
+ $(OPTLIBS)
+
+fsidd_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) \
+ -I$(top_builddir)/support/include
+
MAINTAINERCLEANFILES = Makefile.in
diff --git a/support/reexport/fsidd.c b/support/reexport/fsidd.c
new file mode 100644
index 00000000..410b3a37
--- /dev/null
+++ b/support/reexport/fsidd.c
@@ -0,0 +1,198 @@
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <event2/event.h>
+#include <limits.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <sys/random.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/un.h>
+#include <sys/vfs.h>
+#include <unistd.h>
+
+#include "conffile.h"
+#include "reexport_backend.h"
+#include "xcommon.h"
+#include "xlog.h"
+
+#define FSID_SOCKET_NAME "fsid.sock"
+
+static struct event_base *evbase;
+static struct reexpdb_backend_plugin *dbbackend = &sqlite_plug_ops;
+
+static void client_cb(evutil_socket_t cl, short ev, void *d)
+{
+ struct event *me = d;
+ char buf[PATH_MAX * 2];
+ int n;
+
+ (void)ev;
+
+ n = recv(cl, buf, sizeof(buf) - 1, 0);
+ if (n <= 0) {
+ event_del(me);
+ event_free(me);
+ close(cl);
+ return;
+ }
+
+ buf[n] = '\0';
+
+ if (strncmp(buf, "get_fsidnum ", strlen("get_fsidnum ")) == 0) {
+ char *req_path = buf + strlen("get_fsidnum ");
+ uint32_t fsidnum;
+ char *answer = NULL;
+ bool found;
+
+ assert(req_path < buf + n );
+
+ printf("client asks for %s\n", req_path);
+
+ if (dbbackend->fsidnum_by_path(req_path, &fsidnum, false, &found)) {
+ if (found)
+ assert(asprintf(&answer, "+ %u", fsidnum) != -1);
+ else
+ assert(asprintf(&answer, "+ ") != -1);
+
+ } else {
+ assert(asprintf(&answer, "- %s", "Command failed") != -1);
+ }
+
+ (void)send(cl, answer, strlen(answer), 0);
+
+ free(answer);
+ } else if (strncmp(buf, "get_or_create_fsidnum ", strlen("get_or_create_fsidnum ")) == 0) {
+ char *req_path = buf + strlen("get_or_create_fsidnum ");
+ uint32_t fsidnum;
+ char *answer = NULL;
+ bool found;
+
+ assert(req_path < buf + n );
+
+
+ if (dbbackend->fsidnum_by_path(req_path, &fsidnum, true, &found)) {
+ if (found) {
+ assert(asprintf(&answer, "+ %u", fsidnum) != -1);
+ } else {
+ assert(asprintf(&answer, "+ ") != -1);
+ }
+
+ } else {
+ assert(asprintf(&answer, "- %s", "Command failed") != -1);
+ }
+
+ (void)send(cl, answer, strlen(answer), 0);
+
+ free(answer);
+ } else if (strncmp(buf, "get_path ", strlen("get_path ")) == 0) {
+ char *req_fsidnum = buf + strlen("get_path ");
+ char *path = NULL, *answer = NULL, *endp;
+ bool bad_input = true;
+ uint32_t fsidnum;
+ bool found;
+
+ assert(req_fsidnum < buf + n );
+
+ errno = 0;
+ fsidnum = strtoul(req_fsidnum, &endp, 10);
+ if (errno == 0 && *endp == '\0') {
+ bad_input = false;
+ }
+
+ if (bad_input) {
+ assert(asprintf(&answer, "- %s", "Command failed: Bad input") != -1);
+ } else {
+ if (dbbackend->path_by_fsidnum(fsidnum, &path, &found)) {
+ if (found)
+ assert(asprintf(&answer, "+ %s", path) != -1);
+ else
+ assert(asprintf(&answer, "+ ") != -1);
+ } else {
+ assert(asprintf(&answer, "- %s", "Command failed") != -1);
+ }
+ }
+
+ (void)send(cl, answer, strlen(answer), 0);
+
+ free(path);
+ free(answer);
+ } else if (strcmp(buf, "version") == 0) {
+ char answer[] = "+ 1";
+
+ (void)send(cl, answer, strlen(answer), 0);
+ } else {
+ char *answer = NULL;
+
+ assert(asprintf(&answer, "- bad command") != -1);
+ (void)send(cl, answer, strlen(answer), 0);
+
+ free(answer);
+ }
+}
+
+static void srv_cb(evutil_socket_t fd, short ev, void *d)
+{
+ int cl = accept4(fd, NULL, NULL, SOCK_NONBLOCK);
+ struct event *client_ev;
+
+ (void)ev;
+ (void)d;
+
+ client_ev = event_new(evbase, cl, EV_READ | EV_PERSIST | EV_CLOSED, client_cb, event_self_cbarg());
+ event_add(client_ev, NULL);
+}
+
+int main(void)
+{
+ struct event *srv_ev;
+ struct sockaddr_un addr;
+ char *sock_file;
+ int srv;
+
+ conf_init_file(NFS_CONFFILE);
+
+ if (!dbbackend->initdb()) {
+ return 1;
+ }
+
+ sock_file = conf_get_str_with_def("reexport", "fsidd_socket", FSID_SOCKET_NAME);
+
+ unlink(sock_file);
+
+ memset(&addr, 0, sizeof(struct sockaddr_un));
+ addr.sun_family = AF_UNIX;
+ strncpy(addr.sun_path, sock_file, sizeof(addr.sun_path) - 1);
+
+ srv = socket(AF_UNIX, SOCK_SEQPACKET | SOCK_NONBLOCK, 0);
+ if (srv == -1) {
+ xlog(L_WARNING, "Unable to create AF_UNIX socket for %s: %m\n", sock_file);
+ return 1;
+ }
+
+ if (bind(srv, (const struct sockaddr *)&addr, sizeof(struct sockaddr_un)) == -1) {
+ xlog(L_WARNING, "Unable to bind %s: %m\n", sock_file);
+ return 1;
+ }
+
+ if (listen(srv, 5) == -1) {
+ xlog(L_WARNING, "Unable to listen on %s: %m\n", sock_file);
+ return 1;
+ }
+
+ evbase = event_base_new();
+
+ srv_ev = event_new(evbase, srv, EV_READ | EV_PERSIST, srv_cb, NULL);
+ event_add(srv_ev, NULL);
+
+ event_base_dispatch(evbase);
+
+ dbbackend->destroydb();
+
+ return 0;
+}
--
2.31.1

2023-04-18 09:41:05

by Richard Weinberger

Subject: [PATCH 2/8] Implement reexport= export option

When re-exporting an NFS volume it is mandatory to specify
either a UUID or numerical fsid= option because nfsd is unable
to derive an identifier on its own.

For NFS cross mounts this becomes a problem because nfsd also
needs an identifier for every crossed mount.
A common workaround is stating every single subvolume in the
exports list too.
But this defeats the purpose of the crossmnt option and is tedious.
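
To illustrate, the workaround and the intended replacement could look like
the exports fragments below. The paths /nfs/vol1 and /nfs/vol2 and the fsid
numbers are made-up examples and not taken from a real setup.

```
# Workaround: every re-exported subvolume is listed with its own fsid=
/nfs       *(rw,no_subtree_check,fsid=1)
/nfs/vol1  *(rw,no_subtree_check,fsid=2)
/nfs/vol2  *(rw,no_subtree_check,fsid=3)

# With this series: one entry, subvolumes are covered via crossmnt
/nfs       *(rw,no_subtree_check,crossmnt,reexport=auto-fsidnum)
```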

This is where the reexport= option tries to help.
It offers various strategies to automatically derive an identifier
for NFS volumes and subvolumes.

Currently two strategies are implemented:

1. auto-fsidnum
In this mode mountd/exportd will create a new numerical fsid
for an NFS volume and subvolume. The numbers are stored in a database,
via fsidd, such that the server will always use the same fsid.
The entry in the exports file is allowed to omit the fsid= option, but
stating a UUID is still allowed if needed.

This mode has the obvious downside that load balancing is by default not
possible since multiple re-exporting NFS servers would generate
different ids.
It is possible if all load balancers use the same database.
This can be achieved by using nfs-utils' fsidd and placing its SQLite
database on a network share which supports file locks or by implementing
your own fsidd which is able to provide consistent fsids across multiple
re-exporting NFS servers.

2. predefined-fsidnum
This mode works just like auto-fsidnum but does not generate ids
for you. It helps in the load balancing case. A system administrator
has to manually maintain the database and install it on all re-exporting
NFS servers. If you have a massive number of subvolumes this mode
will help because you don't have to bloat the exports list.
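
The difference between the two strategies boils down to whether a missing
entry may be allocated on demand. A minimal sketch of that decision is
below. The enum names follow the ones used in this patch, but the numeric
values (apart from REEXP_NONE being zero) and both helper functions are
assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Names as used in the patch; the concrete values are an assumption. */
enum reexp_strategy {
	REEXP_NONE = 0,
	REEXP_AUTO_FSIDNUM,
	REEXP_PREDEFINED_FSIDNUM,
};

/* Hypothetical helper: map the text after "reexport=" to a strategy. */
static bool parse_reexport_strategy(const char *value, enum reexp_strategy *out)
{
	if (strcmp(value, "auto-fsidnum") == 0)
		*out = REEXP_AUTO_FSIDNUM;
	else if (strcmp(value, "predefined-fsidnum") == 0)
		*out = REEXP_PREDEFINED_FSIDNUM;
	else if (strcmp(value, "none") == 0)
		*out = REEXP_NONE;
	else
		return false;	/* unknown strategy, caller should reject the option */

	return true;
}

/*
 * Hypothetical helper: only auto-fsidnum may allocate a new fsid.
 * predefined-fsidnum is lookup-only, the administrator maintains the
 * database by hand.
 */
static bool strategy_may_create(enum reexp_strategy s)
{
	return s == REEXP_AUTO_FSIDNUM;
}
```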

Signed-off-by: Richard Weinberger <[email protected]>
---
support/export/export.c | 20 ++++++++++++
support/nfs/Makefile.am | 1 +
support/nfs/exports.c | 62 ++++++++++++++++++++++++++++++++++++++
systemd/Makefile.am | 2 ++
utils/exportd/Makefile.am | 4 ++-
utils/exportfs/Makefile.am | 3 ++
utils/exportfs/exportfs.c | 11 +++++++
utils/mount/Makefile.am | 3 +-
utils/mountd/Makefile.am | 2 ++
9 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/support/export/export.c b/support/export/export.c
index 03390dfc..3e48c42d 100644
--- a/support/export/export.c
+++ b/support/export/export.c
@@ -25,6 +25,7 @@
#include "exportfs.h"
#include "nfsd_path.h"
#include "xlog.h"
+#include "reexport.h"

exp_hash_table exportlist[MCL_MAXTYPES] = {{NULL, {{NULL,NULL}, }}, };
static int export_hash(char *);
@@ -115,6 +116,7 @@ export_read(char *fname, int ignore_hosts)
nfs_export *exp;

int volumes = 0;
+ int reexport_found = 0;

setexportent(fname, "r");
while ((eep = getexportent(0,1)) != NULL) {
@@ -126,7 +128,25 @@ export_read(char *fname, int ignore_hosts)
}
else
warn_duplicated_exports(exp, eep);
+
+ if (eep->e_reexport)
+ reexport_found = 1;
}
+
+ if (reexport_found) {
+ for (int i = 0; i < MCL_MAXTYPES; i++) {
+ for (exp = exportlist[i].p_head; exp; exp = exp->m_next) {
+ if (exp->m_export.e_reexport)
+ continue;
+
+ if (exp->m_export.e_flags & NFSEXP_FSID) {
+ xlog(L_ERROR, "When a reexport= option is present no manually assigned numerical fsid= options are allowed");
+ return -1;
+ }
+ }
+ }
+ }
+
endexportent();

return volumes;
diff --git a/support/nfs/Makefile.am b/support/nfs/Makefile.am
index 67e3a8e1..2e1577cc 100644
--- a/support/nfs/Makefile.am
+++ b/support/nfs/Makefile.am
@@ -9,6 +9,7 @@ libnfs_la_SOURCES = exports.c rmtab.c xio.c rpcmisc.c rpcdispatch.c \
svc_socket.c cacheio.c closeall.c nfs_mntent.c \
svc_create.c atomicio.c strlcat.c strlcpy.c
libnfs_la_LIBADD = libnfsconf.la
+libnfs_la_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) -I$(top_srcdir)/support/reexport

libnfsconf_la_SOURCES = conffile.c xlog.c

diff --git a/support/nfs/exports.c b/support/nfs/exports.c
index da8ace3a..72e632f4 100644
--- a/support/nfs/exports.c
+++ b/support/nfs/exports.c
@@ -31,6 +31,7 @@
#include "xlog.h"
#include "xio.h"
#include "pseudoflavors.h"
+#include "reexport.h"

#define EXPORT_DEFAULT_FLAGS \
(NFSEXP_READONLY|NFSEXP_ROOTSQUASH|NFSEXP_GATHERED_WRITES|NFSEXP_NOSUBTREECHECK)
@@ -104,6 +105,7 @@ static void init_exportent (struct exportent *ee, int fromkernel)
ee->e_nsqgids = 0;
ee->e_uuid = NULL;
ee->e_ttl = default_ttl;
+ ee->e_reexport = REEXP_NONE;
}

struct exportent *
@@ -313,6 +315,23 @@ putexportent(struct exportent *ep)
}
if (ep->e_uuid)
fprintf(fp, "fsid=%s,", ep->e_uuid);
+
+ if (ep->e_reexport) {
+ fprintf(fp, "reexport=");
+ switch (ep->e_reexport) {
+ case REEXP_AUTO_FSIDNUM:
+ fprintf(fp, "auto-fsidnum");
+ break;
+ case REEXP_PREDEFINED_FSIDNUM:
+ fprintf(fp, "predefined-fsidnum");
+ break;
+ default:
+ xlog(L_ERROR, "unknown reexport method %i", ep->e_reexport);
+ fprintf(fp, "none");
+ }
+ fprintf(fp, ",");
+ }
+
if (ep->e_mountpoint)
fprintf(fp, "mountpoint%s%s,",
ep->e_mountpoint[0]?"=":"", ep->e_mountpoint);
@@ -619,6 +638,7 @@ parseopts(char *cp, struct exportent *ep, int warn, int *had_subtree_opt_ptr)
char *flname = efname?efname:"command line";
int flline = efp?efp->x_line:0;
unsigned int active = 0;
+ int saw_reexport = 0;

squids = ep->e_squids; nsquids = ep->e_nsquids;
sqgids = ep->e_sqgids; nsqgids = ep->e_nsqgids;
@@ -725,6 +745,13 @@ bad_option:
}
} else if (strncmp(opt, "fsid=", 5) == 0) {
char *oe;
+
+ if (saw_reexport) {
+ xlog(L_ERROR, "%s:%d: 'fsid=' has to be before 'reexport=' %s\n",
+ flname, flline, opt);
+ goto bad_option;
+ }
+
if (strcmp(opt+5, "root") == 0) {
ep->e_fsid = 0;
setflags(NFSEXP_FSID, active, ep);
@@ -772,6 +799,41 @@ bad_option:
} else if (strncmp(opt, "xprtsec=", 8) == 0) {
if (!parse_xprtsec(opt + 8, ep))
goto bad_option;
+ } else if (strncmp(opt, "reexport=", 9) == 0) {
+ char *strategy = strchr(opt, '=');
+
+ if (!strategy) {
+ xlog(L_ERROR, "%s:%d: bad option %s\n",
+ flname, flline, opt);
+ goto bad_option;
+ }
+ strategy++;
+
+ if (saw_reexport) {
+ xlog(L_ERROR, "%s:%d: only one 'reexport=' is allowed: %s\n",
+ flname, flline, opt);
+ goto bad_option;
+ }
+
+ if (strcmp(strategy, "auto-fsidnum") == 0) {
+ ep->e_reexport = REEXP_AUTO_FSIDNUM;
+ } else if (strcmp(strategy, "predefined-fsidnum") == 0) {
+ ep->e_reexport = REEXP_PREDEFINED_FSIDNUM;
+ } else if (strcmp(strategy, "none") == 0) {
+ ep->e_reexport = REEXP_NONE;
+ } else {
+ xlog(L_ERROR, "%s:%d: bad option %s\n",
+ flname, flline, strategy);
+ goto bad_option;
+ }
+
+ if (reexpdb_apply_reexport_settings(ep, flname, flline) != 0)
+ goto bad_option;
+
+ if (ep->e_fsid)
+ setflags(NFSEXP_FSID, active, ep);
+
+ saw_reexport = 1;
} else {
xlog(L_ERROR, "%s:%d: unknown keyword \"%s\"\n",
flname, flline, opt);
diff --git a/systemd/Makefile.am b/systemd/Makefile.am
index 577c6a22..2e250dca 100644
--- a/systemd/Makefile.am
+++ b/systemd/Makefile.am
@@ -70,8 +70,10 @@ rpc_pipefs_generator_SOURCES = $(COMMON_SRCS) rpc-pipefs-generator.c
nfs_server_generator_LDADD = ../support/export/libexport.a \
../support/nfs/libnfs.la \
../support/misc/libmisc.a \
+ ../support/reexport/libreexport.a \
$(LIBPTHREAD)

+
rpc_pipefs_generator_LDADD = ../support/nfs/libnfs.la

if INSTALL_SYSTEMD
diff --git a/utils/exportd/Makefile.am b/utils/exportd/Makefile.am
index c95bdee7..83024958 100644
--- a/utils/exportd/Makefile.am
+++ b/utils/exportd/Makefile.am
@@ -16,7 +16,9 @@ exportd_SOURCES = exportd.c
exportd_LDADD = ../../support/export/libexport.a \
../../support/nfs/libnfs.la \
../../support/misc/libmisc.a \
- $(OPTLIBS) $(LIBBLKID) $(LIBPTHREAD) -luuid
+ ../../support/reexport/libreexport.a \
+ $(OPTLIBS) $(LIBBLKID) $(LIBPTHREAD) \
+ -luuid

exportd_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) \
-I$(top_srcdir)/support/export
diff --git a/utils/exportfs/Makefile.am b/utils/exportfs/Makefile.am
index 96524c72..7f8ce9fa 100644
--- a/utils/exportfs/Makefile.am
+++ b/utils/exportfs/Makefile.am
@@ -10,6 +10,9 @@ exportfs_SOURCES = exportfs.c
exportfs_LDADD = ../../support/export/libexport.a \
../../support/nfs/libnfs.la \
../../support/misc/libmisc.a \
+ ../../support/reexport/libreexport.a \
$(LIBWRAP) $(LIBNSL) $(LIBPTHREAD)

+exportfs_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) -I$(top_srcdir)/support/reexport
+
MAINTAINERCLEANFILES = Makefile.in
diff --git a/utils/exportfs/exportfs.c b/utils/exportfs/exportfs.c
index 37b9e4b3..b03a047b 100644
--- a/utils/exportfs/exportfs.c
+++ b/utils/exportfs/exportfs.c
@@ -38,6 +38,7 @@
#include "exportfs.h"
#include "xlog.h"
#include "conffile.h"
+#include "reexport.h"

static void export_all(int verbose);
static void exportfs(char *arg, char *options, int verbose);
@@ -719,6 +720,16 @@ dump(int verbose, int export_format)
c = dumpopt(c, "fsid=%d", ep->e_fsid);
if (ep->e_uuid)
c = dumpopt(c, "fsid=%s", ep->e_uuid);
+ if (ep->e_reexport) {
+ switch (ep->e_reexport) {
+ case REEXP_AUTO_FSIDNUM:
+ c = dumpopt(c, "reexport=%s", "auto-fsidnum");
+ break;
+ case REEXP_PREDEFINED_FSIDNUM:
+ c = dumpopt(c, "reexport=%s", "predefined-fsidnum");
+ break;
+ }
+ }
if (ep->e_mountpoint)
c = dumpopt(c, "mountpoint%s%s",
ep->e_mountpoint[0]?"=":"",
diff --git a/utils/mount/Makefile.am b/utils/mount/Makefile.am
index 3101f7ab..5ff1148c 100644
--- a/utils/mount/Makefile.am
+++ b/utils/mount/Makefile.am
@@ -29,8 +29,9 @@ endif

mount_nfs_LDADD = ../../support/nfs/libnfs.la \
../../support/export/libexport.a \
+ ../../support/reexport/libreexport.a \
../../support/misc/libmisc.a \
- $(LIBTIRPC)
+ $(LIBTIRPC) $(LIBPTHREAD)

mount_nfs_SOURCES = $(mount_common)

diff --git a/utils/mountd/Makefile.am b/utils/mountd/Makefile.am
index 13b25c90..197ef29b 100644
--- a/utils/mountd/Makefile.am
+++ b/utils/mountd/Makefile.am
@@ -17,9 +17,11 @@ mountd_SOURCES = mountd.c mount_dispatch.c rmtab.c \
mountd_LDADD = ../../support/export/libexport.a \
../../support/nfs/libnfs.la \
../../support/misc/libmisc.a \
+ ../../support/reexport/libreexport.a \
$(OPTLIBS) \
$(LIBBSD) $(LIBWRAP) $(LIBNSL) $(LIBBLKID) -luuid $(LIBTIRPC) \
$(LIBPTHREAD)
+
mountd_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) \
-I$(top_builddir)/support/include \
-I$(top_srcdir)/support/export
--
2.31.1

2023-04-18 09:42:53

by Richard Weinberger

Subject: [PATCH 6/8] reexport: Add sqlite backend

The reexport database code is designed to support multiple ways
to store the state.
So far only SQLite is implemented.

Signed-off-by: Richard Weinberger <[email protected]>
---
support/reexport/backend_sqlite.c | 267 ++++++++++++++++++++++++++++
support/reexport/reexport_backend.h | 47 +++++
2 files changed, 314 insertions(+)
create mode 100644 support/reexport/backend_sqlite.c
create mode 100644 support/reexport/reexport_backend.h

diff --git a/support/reexport/backend_sqlite.c b/support/reexport/backend_sqlite.c
new file mode 100644
index 00000000..132f30c4
--- /dev/null
+++ b/support/reexport/backend_sqlite.c
@@ -0,0 +1,267 @@
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <sqlite3.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/random.h>
+#include <unistd.h>
+
+#include "conffile.h"
+#include "reexport_backend.h"
+#include "xlog.h"
+
+#define REEXPDB_DBFILE NFS_STATEDIR "/reexpdb.sqlite3"
+#define REEXPDB_DBFILE_WAIT_USEC (5000)
+
+static sqlite3 *db;
+static int init_done;
+
+static int prng_init(void)
+{
+ int seed;
+
+ if (getrandom(&seed, sizeof(seed), 0) != sizeof(seed)) {
+ xlog(L_ERROR, "Unable to obtain seed for PRNG via getrandom()");
+ return -1;
+ }
+
+ srand(seed);
+ return 0;
+}
+
+static void wait_for_dbaccess(void)
+{
+ usleep(REEXPDB_DBFILE_WAIT_USEC + (rand() % REEXPDB_DBFILE_WAIT_USEC));
+}
+
+static bool sqlite_plug_init(void)
+{
+ char *sqlerr;
+ int ret;
+
+ if (init_done)
+ return true;
+
+ if (prng_init() != 0)
+ return false;
+
+ ret = sqlite3_open_v2(conf_get_str_with_def("reexport", "sqlitedb", REEXPDB_DBFILE),
+ &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX,
+ NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to open reexport database: %s", sqlite3_errstr(ret));
+ return false;
+ }
+
+again:
+ ret = sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS fsidnums (num INTEGER PRIMARY KEY CHECK (num > 0 AND num < 4294967296), path TEXT UNIQUE); CREATE INDEX IF NOT EXISTS idx_ids_path ON fsidnums (path);", NULL, NULL, &sqlerr);
+ switch (ret) {
+ case SQLITE_OK:
+ init_done = 1;
+ ret = 0;
+ break;
+ case SQLITE_BUSY:
+ case SQLITE_LOCKED:
+ wait_for_dbaccess();
+ goto again;
+ default:
+ xlog(L_ERROR, "Unable to init reexport database: %s", sqlite3_errstr(ret));
+ sqlite3_free(sqlerr);
+ sqlite3_close_v2(db);
+ ret = -1;
+ }
+
+ return ret == 0 ? true : false;
+}
+
+static void sqlite_plug_destroy(void)
+{
+ if (!init_done)
+ return;
+
+ sqlite3_close_v2(db);
+}
+
+static bool get_fsidnum_by_path(char *path, uint32_t *fsidnum, bool *found)
+{
+ static const char fsidnum_by_path_sql[] = "SELECT num FROM fsidnums WHERE path = ?1;";
+ sqlite3_stmt *stmt = NULL;
+ bool success = false;
+ int ret;
+
+ *found = false;
+
+ ret = sqlite3_prepare_v2(db, fsidnum_by_path_sql, sizeof(fsidnum_by_path_sql), &stmt, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to prepare SQL query '%s': %s", fsidnum_by_path_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+ ret = sqlite3_bind_text(stmt, 1, path, -1, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to bind SQL query '%s': %s", fsidnum_by_path_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+again:
+ ret = sqlite3_step(stmt);
+ switch (ret) {
+ case SQLITE_ROW:
+ *fsidnum = sqlite3_column_int(stmt, 0);
+ success = true;
+ *found = true;
+ break;
+ case SQLITE_DONE:
+ /* No hit */
+ success = true;
+ *found = false;
+ break;
+ case SQLITE_BUSY:
+ case SQLITE_LOCKED:
+ wait_for_dbaccess();
+ goto again;
+ default:
+ xlog(L_WARNING, "Error while looking up '%s' in database: %s", path, sqlite3_errstr(ret));
+ }
+
+out:
+ sqlite3_finalize(stmt);
+ return success;
+}
+
+static bool sqlite_plug_path_by_fsidnum(uint32_t fsidnum, char **path, bool *found)
+{
+ static const char path_by_fsidnum_sql[] = "SELECT path FROM fsidnums WHERE num = ?1;";
+ sqlite3_stmt *stmt = NULL;
+ bool success = false;
+ int ret;
+
+ *found = false;
+
+ ret = sqlite3_prepare_v2(db, path_by_fsidnum_sql, sizeof(path_by_fsidnum_sql), &stmt, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to prepare SQL query '%s': %s", path_by_fsidnum_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+ ret = sqlite3_bind_int(stmt, 1, fsidnum);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to bind SQL query '%s': %s", path_by_fsidnum_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+again:
+ ret = sqlite3_step(stmt);
+ switch (ret) {
+ case SQLITE_ROW:
+ *path = strdup((char *)sqlite3_column_text(stmt, 0));
+ if (*path) {
+ *found = true;
+ success = true;
+ } else {
+ xlog(L_WARNING, "Out of memory");
+ }
+ break;
+ case SQLITE_DONE:
+ /* No hit */
+ *found = false;
+ success = true;
+ break;
+ case SQLITE_BUSY:
+ case SQLITE_LOCKED:
+ wait_for_dbaccess();
+ goto again;
+ default:
+ xlog(L_WARNING, "Error while looking up '%i' in database: %s", fsidnum, sqlite3_errstr(ret));
+ }
+
+out:
+ sqlite3_finalize(stmt);
+ return success;
+}
+
+static bool new_fsidnum_by_path(char *path, uint32_t *fsidnum)
+{
+ /*
+ * This query is a little tricky. We use SQL to find and claim the smallest free fsid number.
+ * To find a free fsid the fsidnums is left joined to itself but with an offset of 1.
+ * Everything after the UNION statement is to handle the corner case where fsidnums
+ * is empty. In this case we want 1 as first fsid number.
+ */
+ static const char new_fsidnum_by_path_sql[] = "INSERT INTO fsidnums VALUES ((SELECT ids1.num + 1 FROM fsidnums AS ids1 LEFT JOIN fsidnums AS ids2 ON ids2.num = ids1.num + 1 WHERE ids2.num IS NULL UNION SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM fsidnums WHERE num = 1) LIMIT 1), ?1) RETURNING num;";
+
+ sqlite3_stmt *stmt = NULL;
+ int ret, check = 0;
+ bool success = false;
+
+ ret = sqlite3_prepare_v2(db, new_fsidnum_by_path_sql, sizeof(new_fsidnum_by_path_sql), &stmt, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to prepare SQL query '%s': %s", new_fsidnum_by_path_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+ ret = sqlite3_bind_text(stmt, 1, path, -1, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_WARNING, "Unable to bind SQL query '%s': %s", new_fsidnum_by_path_sql, sqlite3_errstr(ret));
+ goto out;
+ }
+
+again:
+ ret = sqlite3_step(stmt);
+ switch (ret) {
+ case SQLITE_ROW:
+ *fsidnum = sqlite3_column_int(stmt, 0);
+ success = true;
+ break;
+ case SQLITE_CONSTRAINT:
+ /* Maybe we lost the race against another writer and the path is now present. */
+ check = 1;
+ break;
+ case SQLITE_BUSY:
+ case SQLITE_LOCKED:
+ wait_for_dbaccess();
+ goto again;
+ default:
+ xlog(L_WARNING, "Error while looking up '%s' in database: %s", path, sqlite3_errstr(ret));
+ }
+
+out:
+ sqlite3_finalize(stmt);
+
+ if (check) {
+ bool found = false;
+
+ get_fsidnum_by_path(path, fsidnum, &found);
+ if (!found)
+ xlog(L_WARNING, "SQLITE_CONSTRAINT error while inserting '%s' in database", path);
+ }
+
+ return success;
+}
+
+static bool sqlite_plug_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create, bool *found)
+{
+ bool success;
+
+ success = get_fsidnum_by_path(path, fsidnum, found);
+ if (success) {
+ if (!*found && may_create) {
+ success = new_fsidnum_by_path(path, fsidnum);
+ if (success)
+ *found = true;
+ }
+ }
+
+ return success;
+}
+
+struct reexpdb_backend_plugin sqlite_plug_ops = {
+ .fsidnum_by_path = sqlite_plug_fsidnum_by_path,
+ .path_by_fsidnum = sqlite_plug_path_by_fsidnum,
+ .initdb = sqlite_plug_init,
+ .destroydb = sqlite_plug_destroy,
+};
diff --git a/support/reexport/reexport_backend.h b/support/reexport/reexport_backend.h
new file mode 100644
index 00000000..4940f06f
--- /dev/null
+++ b/support/reexport/reexport_backend.h
@@ -0,0 +1,47 @@
+#ifndef REEXPORT_BACKEND_H
+#define REEXPORT_BACKEND_H
+
+extern struct reexpdb_backend_plugin sqlite_plug_ops;
+
+struct reexpdb_backend_plugin {
+ /*
+ * Find or allocate a fsidnum for a given path.
+ *
+ * @path: Path to look for
+ * @fsidnum: Pointer to a uint32_t variable
+ * @may_create: If non-zero, a fsidnum will be allocated if none was found
+ * @found: Set to true if a fsidnum was found or allocated
+ *
+ * Returns true if the lookup (and allocation, if any) succeeded.
+ * If @found is true, the fsidnum is stored into @fsidnum.
+ * Upon errors, false is returned and errors are logged.
+ */
+ bool (*fsidnum_by_path)(char *path, uint32_t *fsidnum, int may_create, bool *found);
+
+ /*
+ * Lookup path by a given fsidnum
+ *
+ * @fsidnum: fsidnum to look for
+ * @path: address of a char pointer
+ * @found: set to true if a path was found, false otherwise
+ *
+ * Returns true if the lookup succeeded, even when nothing was found.
+ * Upon errors, false is returned and errors are logged.
+ * If @found is true, @path points to a freshly allocated buffer
+ * which is free()'able.
+ */
+ bool (*path_by_fsidnum)(uint32_t fsidnum, char **path, bool *found);
+
+ /*
+ * Init database connection, can get called multiple times.
+ * Returns true on success, false otherwise.
+ */
+ bool (*initdb)(void);
+
+ /*
+ * Undoes initdb().
+ */
+ void (*destroydb)(void);
+};
+
+#endif /* REEXPORT_BACKEND_H */
--
2.31.1

2023-04-18 09:44:01

by Richard Weinberger

Subject: [PATCH 1/8] Add reexport helper library

Add some helper functions which will be used by the reexport
mechanism to create and find fsidnums for re-exported NFS shares.

Signed-off-by: Richard Weinberger <[email protected]>
---
configure.ac | 1 +
support/Makefile.am | 2 +-
support/export/Makefile.am | 2 +
support/include/nfslib.h | 1 +
support/reexport/Makefile.am | 6 +
support/reexport/reexport.c | 326 +++++++++++++++++++++++++++++++++++
support/reexport/reexport.h | 18 ++
7 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 support/reexport/Makefile.am
create mode 100644 support/reexport/reexport.c
create mode 100644 support/reexport/reexport.h

diff --git a/configure.ac b/configure.ac
index 7672a760..9f43267c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -717,6 +717,7 @@ AC_CONFIG_FILES([
support/nsm/Makefile
support/nfsidmap/Makefile
support/nfsidmap/libnfsidmap.pc
+ support/reexport/Makefile
tools/Makefile
tools/locktest/Makefile
tools/nlmtest/Makefile
diff --git a/support/Makefile.am b/support/Makefile.am
index c962d4d4..07cfd87e 100644
--- a/support/Makefile.am
+++ b/support/Makefile.am
@@ -10,7 +10,7 @@ if CONFIG_JUNCTION
OPTDIRS += junction
endif

-SUBDIRS = export include misc nfs nsm $(OPTDIRS)
+SUBDIRS = export include misc nfs nsm reexport $(OPTDIRS)

MAINTAINERCLEANFILES = Makefile.in

diff --git a/support/export/Makefile.am b/support/export/Makefile.am
index eec737f6..7338e1c7 100644
--- a/support/export/Makefile.am
+++ b/support/export/Makefile.am
@@ -14,6 +14,8 @@ libexport_a_SOURCES = client.c export.c hostname.c \
xtab.c mount_clnt.c mount_xdr.c \
cache.c auth.c v4root.c fsloc.c \
v4clients.c
+libexport_a_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) -I$(top_srcdir)/support/reexport
+
BUILT_SOURCES = $(GENFILES)

noinst_HEADERS = mount.h
diff --git a/support/include/nfslib.h b/support/include/nfslib.h
index 61c19933..bdbde78d 100644
--- a/support/include/nfslib.h
+++ b/support/include/nfslib.h
@@ -98,6 +98,7 @@ struct exportent {
struct xprtsec_entry e_xprtsec[XPRTSECMODE_COUNT + 1];
unsigned int e_ttl;
char * e_realpath;
+ int e_reexport;
};

struct rmtabent {
diff --git a/support/reexport/Makefile.am b/support/reexport/Makefile.am
new file mode 100644
index 00000000..9d544a8f
--- /dev/null
+++ b/support/reexport/Makefile.am
@@ -0,0 +1,6 @@
+## Process this file with automake to produce Makefile.in
+
+noinst_LIBRARIES = libreexport.a
+libreexport_a_SOURCES = reexport.c
+
+MAINTAINERCLEANFILES = Makefile.in
diff --git a/support/reexport/reexport.c b/support/reexport/reexport.c
new file mode 100644
index 00000000..eddc9bf4
--- /dev/null
+++ b/support/reexport/reexport.c
@@ -0,0 +1,326 @@
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <dlfcn.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <sys/random.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/vfs.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "nfsd_path.h"
+#include "conffile.h"
+#include "nfslib.h"
+#include "reexport.h"
+#include "xcommon.h"
+#include "xlog.h"
+
+static int fsidd_srv = -1;
+
+static bool connect_fsid_service(void)
+{
+ struct sockaddr_un addr;
+ char *sock_file;
+ int ret;
+ int s;
+
+ if (fsidd_srv != -1)
+ return true;
+
+ sock_file = conf_get_str_with_def("reexport", "fsidd_socket", FSID_SOCKET_NAME);
+
+ memset(&addr, 0, sizeof(struct sockaddr_un));
+ addr.sun_family = AF_UNIX;
+ strncpy(addr.sun_path, sock_file, sizeof(addr.sun_path) - 1);
+
+ s = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+ if (s == -1) {
+ xlog(L_WARNING, "Unable to create AF_UNIX socket for %s: %m\n", sock_file);
+ return false;
+ }
+
+ ret = connect(s, (const struct sockaddr *)&addr, sizeof(struct sockaddr_un));
+ if (ret == -1) {
+ xlog(L_WARNING, "Unable to connect %s: %m, is fsidd running?\n", sock_file);
+ return false;
+ }
+
+ fsidd_srv = s;
+
+ return true;
+}
+
+int reexpdb_init(void)
+{
+ int try_count = 3;
+
+ while (try_count > 0 && !connect_fsid_service()) {
+ sleep(1);
+ try_count--;
+ }
+
+ return try_count > 0;
+}
+
+void reexpdb_destroy(void)
+{
+ close(fsidd_srv);
+ fsidd_srv = -1;
+}
+
+static bool parse_fsidd_reply(const char *cmd_info, char *buf, size_t len, char **result)
+{
+ if (len == 0) {
+ xlog(L_WARNING, "Unable to read %s result: server closed the connection", cmd_info);
+ return false;
+ } else if (len < 2) {
+ xlog(L_WARNING, "Unable to read %s result: server sent too few bytes", cmd_info);
+ return false;
+ }
+
+ if (buf[0] == '-') {
+ if (len > 2) {
+ char *reason = buf + 2;
+ xlog(L_WARNING, "Command %s failed, server said: %s", cmd_info, reason);
+ } else {
+ xlog(L_WARNING, "Command %s failed at server side", cmd_info);
+ }
+
+ return false;
+ }
+
+ if (buf[0] != '+') {
+ xlog(L_WARNING, "Unable to read %s result: server sent malformed answer", cmd_info);
+ return false;
+ }
+
+ if (len > 2) {
+ *result = strdup(buf + 2);
+ } else {
+ *result = NULL;
+ }
+
+ return true;
+}
+
+static bool do_fsidd_cmd(const char *cmd_info, char *msg, size_t len, char **result)
+{
+ char recvbuf[1024];
+ int n;
+
+ if (fsidd_srv == -1) {
+ xlog(L_NOTICE, "Reconnecting to fsid services");
+ if (reexpdb_init() == false)
+ return false;
+ }
+
+ xlog(D_GENERAL, "Request to fsidd: msg=\"%s\" len=%zd", msg, len);
+
+ if (write(fsidd_srv, msg, len) == -1) {
+ xlog(L_WARNING, "Unable to send %s command: %m", cmd_info);
+ goto out_close;
+ }
+
+ n = read(fsidd_srv, recvbuf, sizeof(recvbuf) - 1);
+ if (n <= -1) {
+ xlog(L_WARNING, "Unable to recv %s answer: %m", cmd_info);
+ goto out_close;
+ } else if (n == sizeof(recvbuf) - 1) {
+ //TODO: use better way to detect truncation
+ xlog(L_WARNING, "Unable to recv %s answer: answer truncated", cmd_info);
+ goto out_close;
+ }
+ recvbuf[n] = '\0';
+
+ xlog(D_GENERAL, "Answer from fsidd: msg=\"%s\" len=%i", recvbuf, n);
+
+ if (parse_fsidd_reply(cmd_info, recvbuf, n, result) == false) {
+ goto out_close;
+ }
+
+ return true;
+
+out_close:
+ close(fsidd_srv);
+ fsidd_srv = -1;
+ return false;
+}
+
+static bool fsidnum_get_by_path(char *path, uint32_t *fsidnum, bool may_create)
+{
+ char *msg, *result;
+ bool ret = false;
+ int len;
+
+ char *cmd = may_create ? "get_or_create_fsidnum" : "get_fsidnum";
+
+ len = asprintf(&msg, "%s %s", cmd, path);
+ if (len == -1) {
+ xlog(L_WARNING, "Unable to build %s command: %m", cmd);
+ goto out;
+ }
+
+ if (do_fsidd_cmd(cmd, msg, len, &result) == false) {
+ goto out;
+ }
+
+ if (result) {
+ bool bad_input = true;
+ char *endp;
+
+ errno = 0;
+ *fsidnum = strtoul(result, &endp, 10);
+ if (errno == 0 && *endp == '\0') {
+ bad_input = false;
+ }
+
+ free(result);
+
+ if (!bad_input) {
+ ret = true;
+ } else {
+ xlog(L_NOTICE, "Got malformed fsid for path %s", path);
+ }
+ } else {
+ xlog(L_NOTICE, "No fsid found for path %s", path);
+ }
+
+out:
+ free(msg);
+ return ret;
+}
+
+static bool path_by_fsidnum(uint32_t fsidnum, char **path)
+{
+ char *msg, *result;
+ bool ret = false;
+ int len;
+
+ len = asprintf(&msg, "get_path %u", (unsigned int)fsidnum);
+ if (len == -1) {
+ xlog(L_WARNING, "Unable to build get_path command: %m");
+ goto out;
+ }
+
+ if (do_fsidd_cmd("get_path", msg, len, &result) == false) {
+ goto out;
+ }
+
+ if (result) {
+ *path = result;
+ ret = true;
+ } else {
+ xlog(L_NOTICE, "No path found for fsid %u", (unsigned int)fsidnum);
+ }
+
+out:
+ free(msg);
+ return ret;
+}
+
+/*
+ * reexpdb_fsidnum_by_path - Lookup a fsid by path.
+ *
+ * @path: File system path used as lookup key
+ * @fsidnum: Pointer where found fsid is written to
+ * @may_create: If non-zero, allocate new fsid if lookup failed
+ *
+ */
+int reexpdb_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create)
+{
+ return fsidnum_get_by_path(path, fsidnum, may_create);
+}
+
+/*
+ * reexpdb_uncover_subvolume - Make sure a subvolume is present.
+ *
+ * @fsidnum: Numerical fsid number to look for
+ *
+ * Subvolumes (NFS cross mounts) get automatically mounted upon first
+ * access and can vanish after fs.nfs.nfs_mountpoint_timeout seconds.
+ * Also if the NFS server reboots, clients can still have valid file
+ * handles for such a subvolume.
+ *
+ * If kNFSd asks mountd for the path of a given fsidnum it can
+ * trigger an automount by calling statfs() on the given path.
+ */
+void reexpdb_uncover_subvolume(uint32_t fsidnum)
+{
+ struct statfs st;
+ char *path = NULL;
+ int ret;
+
+ if (path_by_fsidnum(fsidnum, &path)) {
+ ret = nfsd_path_statfs(path, &st);
+ if (ret == -1)
+ xlog(L_WARNING, "statfs() failed");
+ }
+
+ free(path);
+}
+
+/*
+ * reexpdb_apply_reexport_settings - Apply reexport specific settings to an exportent
+ *
+ * @ep: exportent to apply to
+ * @flname: Current export file, only useful for logging
+ * @flline: Current line, only useful for logging
+ *
+ * This is a helper function for applying reexport specific settings to an exportent.
+ * It searches for a suitable fsid and sets @ep->e_fsid.
+ */
+int reexpdb_apply_reexport_settings(struct exportent *ep, char *flname, int flline)
+{
+ uint32_t fsidnum;
+ bool found, is_v4root = ((ep->e_flags & NFSEXP_FSID) && !ep->e_fsid);
+ int ret = 0;
+
+ if (ep->e_reexport == REEXP_NONE)
+ goto out;
+
+ if (ep->e_uuid)
+ goto out;
+
+ if (is_v4root)
+ goto out;
+
+ found = reexpdb_fsidnum_by_path(ep->e_path, &fsidnum, 0);
+ if (!found) {
+ if (ep->e_reexport == REEXP_AUTO_FSIDNUM) {
+ found = reexpdb_fsidnum_by_path(ep->e_path, &fsidnum, 1);
+ if (!found) {
+ xlog(L_ERROR, "%s:%i: Unable to generate fsid for %s",
+ flname, flline, ep->e_path);
+ ret = -1;
+ goto out;
+ }
+ } else {
+ if (!ep->e_fsid) {
+ xlog(L_ERROR, "%s:%i: Selected 'reexport=' mode requires either a UUID 'fsid=' or a numerical 'fsid=' or a reexport db entry %d",
+ flname, flline, ep->e_fsid);
+ ret = -1;
+ }
+
+ goto out;
+ }
+ }
+
+ if (ep->e_fsid) {
+ if (ep->e_fsid != fsidnum) {
+ xlog(L_ERROR, "%s:%i: Selected 'reexport=' mode requires configured numerical 'fsid=' to agree with reexport db entry",
+ flname, flline);
+ ret = -1;
+ }
+ } else {
+ ep->e_fsid = fsidnum;
+ }
+
+out:
+ return ret;
+}
diff --git a/support/reexport/reexport.h b/support/reexport/reexport.h
new file mode 100644
index 00000000..3bed03a9
--- /dev/null
+++ b/support/reexport/reexport.h
@@ -0,0 +1,18 @@
+#ifndef REEXPORT_H
+#define REEXPORT_H
+
+enum {
+ REEXP_NONE = 0,
+ REEXP_AUTO_FSIDNUM,
+ REEXP_PREDEFINED_FSIDNUM,
+};
+
+int reexpdb_init(void);
+void reexpdb_destroy(void);
+int reexpdb_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create);
+int reexpdb_apply_reexport_settings(struct exportent *ep, char *flname, int flline);
+void reexpdb_uncover_subvolume(uint32_t fsidnum);
+
+#define FSID_SOCKET_NAME "fsid.sock"
+
+#endif /* REEXPORT_H */
--
2.31.1
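
The reply handling in parse_fsidd_reply() above implies a small text protocol: a reply starts with '+' on success or '-' on failure, and anything from the third byte on is a payload or a failure reason. A minimal standalone sketch of that decoding follows. The names decode_reply and reply_status are hypothetical and not part of the patch, and the sketch assumes the second byte is a separator, which the patch does not state explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Outcome of decoding one reply (illustrative names, not from the patch). */
enum reply_status { REPLY_OK, REPLY_FAILED, REPLY_MALFORMED, REPLY_NOMEM };

/*
 * Decode a reply of the form "+ <payload>" or "- <reason>".
 * Like the patch, anything shorter than two bytes is rejected.
 * On REPLY_OK, *payload is a malloc()ed copy of the payload, or NULL
 * if the reply carried none; the caller must free() it.
 */
static enum reply_status decode_reply(const char *buf, size_t len, char **payload)
{
	*payload = NULL;

	if (len < 2)
		return REPLY_MALFORMED;
	if (buf[0] == '-')
		return REPLY_FAILED;
	if (buf[0] != '+')
		return REPLY_MALFORMED;

	if (len > 2) {
		size_t plen = len - 2;
		char *copy = malloc(plen + 1);

		if (copy == NULL)
			return REPLY_NOMEM;
		memcpy(copy, buf + 2, plen);
		copy[plen] = '\0';
		*payload = copy;
	}

	return REPLY_OK;
}
```

Under these assumptions, a reply of "+ 42" yields REPLY_OK with the payload "42", which fsidnum_get_by_path() in the patch would then convert with strtoul().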

2023-04-18 09:44:04

by Richard Weinberger

Subject: [PATCH 5/8] exports.man: Document reexport= option

Signed-off-by: Richard Weinberger <[email protected]>
---
utils/exportfs/exports.man | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/utils/exportfs/exports.man b/utils/exportfs/exports.man
index 83dd6807..b7582776 100644
--- a/utils/exportfs/exports.man
+++ b/utils/exportfs/exports.man
@@ -468,6 +468,37 @@ will only work if all clients use a consistent security policy. Note
that early kernels did not support this export option, and instead
enabled security labels by default.

+.TP
+.IR reexport= auto-fsidnum|predefined-fsidnum
+This option helps when an NFS share is re-exported. Since the NFS server
+needs a unique identifier for each exported filesystem and an NFS share
+cannot provide one, a manually assigned fsid is usually needed.
+As soon as
+.IR crossmnt
+is used, manually assigning an fsid no longer works. This is where this
+option comes in handy. It automatically assigns a numerical fsid
+to exported NFS shares. The fsid and path relations are stored in a SQLite
+database. If
+.IR auto-fsidnum
+is selected, the fsid is also automatically allocated.
+.IR predefined-fsidnum
+assumes pre-allocated fsid numbers and will just look them up.
+This option also depends on the kernel; you will need at least kernel version
+5.19.
+Since
+.IR reexport=
+can automatically allocate and assign numerical fsids, it is no longer possible
+to have numerical fsids in other exports as soon as this option is used in at least
+one export entry.
+
+The association between fsid numbers and paths is stored in a SQLite database.
+Don't edit or remove the database unless you know exactly what you're doing.
+.IR predefined-fsidnum
+is useful when you have used
+.IR auto-fsidnum
+before and don't want further entries stored.
+
+
.SS User ID Mapping
.PP
.B nfsd
--
2.31.1
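
The difference between the two modes documented above comes down to whether a failed lookup may allocate a new number. The sketch below models that with an in-memory table standing in for the SQLite database. Everything in it (fsid_table, table_lookup, the sequential allocation starting from a caller-chosen number) is an assumption made for illustration; the real allocation policy lives in fsidd and its backend.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 16

/* One path <-> fsid association (hypothetical layout). */
struct fsid_entry {
	const char *path;
	uint32_t fsidnum;
};

/* Toy stand-in for the database that fsidd manages. */
struct fsid_table {
	struct fsid_entry entries[MAX_ENTRIES];
	size_t count;
	uint32_t next_fsidnum;
};

/*
 * Look up @path. With may_create == false this behaves like
 * predefined-fsidnum: unknown paths fail. With may_create == true it
 * behaves like auto-fsidnum: unknown paths get the next free number.
 * The table keeps the caller's pointer, so @path must outlive the table.
 */
static bool table_lookup(struct fsid_table *t, const char *path,
			 uint32_t *fsidnum, bool may_create)
{
	for (size_t i = 0; i < t->count; i++) {
		if (strcmp(t->entries[i].path, path) == 0) {
			*fsidnum = t->entries[i].fsidnum;
			return true;
		}
	}

	if (!may_create || t->count == MAX_ENTRIES)
		return false;

	t->entries[t->count].path = path;
	t->entries[t->count].fsidnum = t->next_fsidnum;
	*fsidnum = t->next_fsidnum++;
	t->count++;
	return true;
}
```

This also illustrates the last paragraph of the man page text: once a path has been recorded under auto-fsidnum, a later switch to predefined-fsidnum still finds it but stores no further entries.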

2023-04-18 09:44:23

by Richard Weinberger

Subject: [PATCH 3/8] export: Wireup reexport mechanism

Detect the case when a NFS share is re-exported and assign an
fsidnum to it.
The fsidnum is read (or created) from the reexport database.

Signed-off-by: Richard Weinberger <[email protected]>
---
support/export/cache.c | 68 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 62 insertions(+), 6 deletions(-)

diff --git a/support/export/cache.c b/support/export/cache.c
index 0a37703b..42a694d0 100644
--- a/support/export/cache.c
+++ b/support/export/cache.c
@@ -33,6 +33,7 @@
#include "export.h"
#include "pseudoflavors.h"
#include "xcommon.h"
+#include "reexport.h"

#ifdef HAVE_JUNCTION_SUPPORT
#include "fsloc.h"
@@ -235,6 +236,16 @@ static void auth_unix_gid(int f)
xlog(L_ERROR, "auth_unix_gid: error writing reply");
}

+static int match_crossmnt_fsidnum(uint32_t parsed_fsidnum, char *path)
+{
+ uint32_t fsidnum;
+
+ if (reexpdb_fsidnum_by_path(path, &fsidnum, 0) == 0)
+ return 0;
+
+ return fsidnum == parsed_fsidnum;
+}
+
#ifdef USE_BLKID
static const char *get_uuid_blkdev(char *path)
{
@@ -688,8 +699,13 @@ static int match_fsid(struct parsed_fsid *parsed, nfs_export *exp, char *path)
goto match;
case FSID_NUM:
if (((exp->m_export.e_flags & NFSEXP_FSID) == 0 ||
- exp->m_export.e_fsid != parsed->fsidnum))
+ exp->m_export.e_fsid != parsed->fsidnum)) {
+ if ((exp->m_export.e_flags & NFSEXP_CROSSMOUNT) && exp->m_export.e_reexport != REEXP_NONE &&
+ match_crossmnt_fsidnum(parsed->fsidnum, path))
+ goto match;
+
goto nomatch;
+ }
goto match;
case FSID_UUID4_INUM:
case FSID_UUID16_INUM:
@@ -937,7 +953,7 @@ static void write_fsloc(char **bp, int *blen, struct exportent *ep)
}
#endif

-static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_mask)
+static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_mask, int extra_flag)
{
struct sec_entry *p;

@@ -952,7 +968,7 @@ static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_m
qword_addint(bp, blen, p - ep->e_secinfo);
for (p = ep->e_secinfo; p->flav; p++) {
qword_addint(bp, blen, p->flav->fnum);
- qword_addint(bp, blen, p->flags & flag_mask);
+ qword_addint(bp, blen, (p->flags | extra_flag) & flag_mask);
}
}

@@ -970,6 +986,15 @@ static void write_xprtsec(char **bp, int *blen, struct exportent *ep)
qword_addint(bp, blen, p->info->number);
}

+static int can_reexport_via_fsidnum(struct exportent *exp, struct statfs *st)
+{
+ if (st->f_type != 0x6969 /* NFS_SUPER_MAGIC */)
+ return 0;
+
+ return exp->e_reexport == REEXP_PREDEFINED_FSIDNUM ||
+ exp->e_reexport == REEXP_AUTO_FSIDNUM;
+}
+
static int dump_to_cache(int f, char *buf, int blen, char *domain,
char *path, struct exportent *exp, int ttl)
{
@@ -986,17 +1011,48 @@ static int dump_to_cache(int f, char *buf, int blen, char *domain,
if (exp) {
int different_fs = strcmp(path, exp->e_path) != 0;
int flag_mask = different_fs ? ~NFSEXP_FSID : ~0;
+ int rc, do_fsidnum = 0;
+ uint32_t fsidnum = exp->e_fsid;
+
+ if (different_fs) {
+ struct statfs st;
+
+ rc = nfsd_path_statfs(path, &st);
+ if (rc) {
+ xlog(L_WARNING, "unable to statfs %s", path);
+ errno = EINVAL;
+ return -1;
+ }
+
+ if (can_reexport_via_fsidnum(exp, &st)) {
+ do_fsidnum = 1;
+ flag_mask = ~0;
+ }
+ }

qword_adduint(&bp, &blen, now + exp->e_ttl);
- qword_addint(&bp, &blen, exp->e_flags & flag_mask);
+
+ if (do_fsidnum) {
+ uint32_t search_fsidnum = 0;
+ if (exp->e_reexport != REEXP_NONE && reexpdb_fsidnum_by_path(path, &search_fsidnum,
+ exp->e_reexport == REEXP_AUTO_FSIDNUM) == 0) {
+ errno = EINVAL;
+ return -1;
+ }
+ fsidnum = search_fsidnum;
+ qword_addint(&bp, &blen, exp->e_flags | NFSEXP_FSID);
+ } else {
+ qword_addint(&bp, &blen, exp->e_flags & flag_mask);
+ }
+
qword_addint(&bp, &blen, exp->e_anonuid);
qword_addint(&bp, &blen, exp->e_anongid);
- qword_addint(&bp, &blen, exp->e_fsid);
+ qword_addint(&bp, &blen, fsidnum);

#ifdef HAVE_JUNCTION_SUPPORT
write_fsloc(&bp, &blen, exp);
#endif
- write_secinfo(&bp, &blen, exp, flag_mask);
+ write_secinfo(&bp, &blen, exp, flag_mask, do_fsidnum ? NFSEXP_FSID : 0);
if (exp->e_uuid == NULL || different_fs) {
char u[16];
if ((exp->e_flags & flag_mask & NFSEXP_FSID) == 0 &&
--
2.31.1
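
The flag handling that this patch adds to dump_to_cache() can be read as a small decision: NFSEXP_FSID is normally masked out for a path on a different filesystem than the export, but it is forced on when that filesystem is NFS and a reexport= mode is set. The sketch below isolates that decision as a pure function. The function and enum names are hypothetical, and the constant for the FSID flag is a placeholder defined locally for the sketch rather than taken from the nfs-utils headers; 0x6969 is the NFS magic number used in the patch.

```c
#include <stdbool.h>

#define SKETCH_NFSEXP_FSID	0x2000	/* placeholder value for this sketch */
#define SKETCH_NFS_SUPER_MAGIC	0x6969	/* value used in the patch */

/* Mirrors REEXP_NONE / REEXP_AUTO_FSIDNUM / REEXP_PREDEFINED_FSIDNUM. */
enum sketch_reexport { SK_NONE = 0, SK_AUTO, SK_PREDEFINED };

/*
 * Compute the export flags written to the kernel cache.
 * @e_flags:      flags of the matching export entry
 * @different_fs: true if the path differs from the export's own path
 * @f_type:       filesystem magic reported by statfs() for the path
 * @mode:         reexport= mode of the export entry
 */
static int cache_flags(int e_flags, bool different_fs, long f_type,
		       enum sketch_reexport mode)
{
	bool do_fsidnum = different_fs &&
			  f_type == SKETCH_NFS_SUPER_MAGIC &&
			  mode != SK_NONE;

	if (do_fsidnum)
		return e_flags | SKETCH_NFSEXP_FSID;
	if (different_fs)
		return e_flags & ~SKETCH_NFSEXP_FSID;
	return e_flags;
}
```

In the patch itself the forced flag is accompanied by an fsid looked up (or, for auto-fsidnum, created) through reexpdb_fsidnum_by_path(); the sketch leaves that lookup out.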

2023-04-18 09:44:24

by Richard Weinberger

Subject: [PATCH 8/8] Add fsid systemd service file

Co-developed-by: Chris Chilvers <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
---
systemd/Makefile.am | 3 ++-
systemd/fsidd.service | 10 ++++++++++
2 files changed, 12 insertions(+), 1 deletion(-)
create mode 100644 systemd/fsidd.service

diff --git a/systemd/Makefile.am b/systemd/Makefile.am
index 2e250dca..b4483222 100644
--- a/systemd/Makefile.am
+++ b/systemd/Makefile.am
@@ -15,7 +15,8 @@ unit_files = \
rpc-statd-notify.service \
rpc-statd.service \
\
- proc-fs-nfsd.mount
+ proc-fs-nfsd.mount \
+ fsidd.service

rpc_pipefs_mount_file = \
var-lib-nfs-rpc_pipefs.mount
diff --git a/systemd/fsidd.service b/systemd/fsidd.service
new file mode 100644
index 00000000..9cb480e3
--- /dev/null
+++ b/systemd/fsidd.service
@@ -0,0 +1,10 @@
+[Unit]
+Description=NFS FSID Daemon
+After=local-fs.target
+Before=nfs-mountd.service nfs-server.service
+
+[Service]
+ExecStart=/usr/sbin/fsidd
+
+[Install]
+RequiredBy=nfs-mountd.service nfs-server.service
--
2.31.1
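
The unit file orders fsidd before nfs-mountd and nfs-server, yet a client can still race with the daemon creating its socket, which is presumably why reexpdb_init() in patch 1/8 retries the connection three times. Below is a minimal sketch of such a client-side connect with retries. The function names and the configurable delay are assumptions for illustration. Unlike the helper as posted, this sketch also closes the socket when connect() fails and rejects paths that do not fit into sun_path.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Connect to a SOCK_SEQPACKET unix domain socket at @sock_path.
 * Returns the connected descriptor, or -1 on any failure.
 */
static int connect_seqpacket(const char *sock_path)
{
	struct sockaddr_un addr;
	int s;

	/* Refuse paths that would be truncated. */
	if (strlen(sock_path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, sock_path);

	s = socket(AF_UNIX, SOCK_SEQPACKET, 0);
	if (s == -1)
		return -1;

	if (connect(s, (const struct sockaddr *)&addr, sizeof(addr)) == -1) {
		close(s);	/* do not leak the descriptor */
		return -1;
	}

	return s;
}

/*
 * Try to connect up to @tries times, sleeping @delay_sec seconds
 * between attempts. Returns the descriptor or -1.
 */
static int connect_with_retry(const char *sock_path, int tries,
			      unsigned int delay_sec)
{
	while (tries-- > 0) {
		int s = connect_seqpacket(sock_path);

		if (s != -1)
			return s;
		if (tries > 0)
			sleep(delay_sec);
	}

	return -1;
}
```

Calling connect_with_retry(path, 3, 1) roughly corresponds to the behaviour of reexpdb_init(), where path would be the configured fsidd_socket value.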

2023-04-19 15:04:13

by Steve Dickson

Subject: Re: [PATCH 0/8 v3] nfs-utils: Improving NFS re-export wrt. crossmnt



On 4/18/23 5:33 AM, Richard Weinberger wrote:
> After a longer hiatus I'm sending the next iteration of my re-export
> improvement patch series. While the kernel side is upstream since v6.2,
> the nfs-utils parts are still missing.
> This patch series aims to solve this.
>
> The core idea is adding new export option, reeport=
> Using reexport= it is possible to mark an export entry in the exports
> file explicitly as NFS re-export and select a strategy on how unique
> identifiers should be provided. This makes the crossmnt feature work
> in the re-export case.
> Currently two strategies are supported, "auto-fsidnum" and
> "predefined-fsidnum".
>
> In my earlier series a sqlite database was mandatory to keep track of
> generated fsids.
> This series follows a different approach, instead of directly using
> sqlite in all nfs-utils components (linking libsqlite), a new deamon
> manages the database, fsidd.
> fsidd offers a simple (but stupid?) text based interface over a unix domain
> socket which can be queried by mountd, exportfs, etc. for fsidnums.
> The main idea behind fsidd is allowing users to implement their own
> fsidd which keeps global state across load balancers.
> I'm still not happy with fsidd, there is room for improvement but first
> I'd like to know whether you like or hate this approach.
>
> A typical export entry on a re-exporting server looks like:
> /nfs *(rw,no_root_squash,no_subtree_check,crossmnt,reexport=auto-fsidnum)
> reexport=auto-fsidnum will automatically assign an fsid= to /nfs and all
> uncovered subvolumes.
>
> Changes since v2, https://lore.kernel.org/linux-nfs/[email protected]/
> - Split patch series
> - Add improved fsidd system unit file
> - Rebased to nfs-utils master as of today
> - Dropped init code from exportd
>
> Changes since v1, https://lore.kernel.org/linux-nfs/[email protected]/
> - Factor out Sqlite and put it into a daemon
> - Add fsidd
> - Basically re-implemented the patch series
> - Lot's of fixes (e.g. nfs v4 root export)
>
>
> Richard Weinberger (8):
> Add reexport helper library
> Implement reexport= export option
> export: Wireup reexport mechanism
> export: Uncover NFS subvolume after reboot
> exports.man: Document reexport= option
> reexport: Add sqlite backend
> export: Add fsidd
> Add fsid systemd service file
>
> configure.ac | 1 +
> support/Makefile.am | 2 +-
> support/export/Makefile.am | 2 +
> support/export/cache.c | 74 ++++++-
> support/export/export.c | 20 ++
> support/include/nfslib.h | 1 +
> support/nfs/Makefile.am | 1 +
> support/nfs/exports.c | 62 ++++++
> support/reexport/Makefile.am | 18 ++
> support/reexport/backend_sqlite.c | 267 +++++++++++++++++++++++
> support/reexport/fsidd.c | 198 +++++++++++++++++
> support/reexport/reexport.c | 326 ++++++++++++++++++++++++++++
> support/reexport/reexport.h | 18 ++
> support/reexport/reexport_backend.h | 47 ++++
> systemd/Makefile.am | 5 +-
> systemd/fsidd.service | 10 +
> utils/exportd/Makefile.am | 4 +-
> utils/exportfs/Makefile.am | 3 +
> utils/exportfs/exportfs.c | 11 +
> utils/exportfs/exports.man | 31 +++
> utils/mount/Makefile.am | 3 +-
> utils/mountd/Makefile.am | 2 +
> 22 files changed, 1096 insertions(+), 10 deletions(-)
> create mode 100644 support/reexport/Makefile.am
> create mode 100644 support/reexport/backend_sqlite.c
> create mode 100644 support/reexport/fsidd.c
> create mode 100644 support/reexport/reexport.c
> create mode 100644 support/reexport/reexport.h
> create mode 100644 support/reexport/reexport_backend.h
> create mode 100644 systemd/fsidd.service
>
Committed... (tag: nfs-utils-2-6-3-rc9)

Thank you!

steved.