As write_filehandle can't accept zero-bytes writting,
remove read permission of /proc/fs/nfsd/filehandle
Signed-off-by: Chen Hanxiao <[email protected]>
---
fs/nfsd/nfsctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index ecd18bffeebc..da1387f1e30a 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1346,7 +1346,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
&transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_FO_UnlockFS] = {"unlock_filesystem",
&transaction_ops, S_IWUSR|S_IRUSR},
- [NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
+ [NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR},
[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},
--
2.39.1
On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> As write_filehandle can't accept zero-bytes writting,
> remove read permission of /proc/fs/nfsd/filehandle
>
> Signed-off-by: Chen Hanxiao <[email protected]>
> ---
> fs/nfsd/nfsctl.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index ecd18bffeebc..da1387f1e30a 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1346,7 +1346,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
> &transaction_ops, S_IWUSR|S_IRUSR},
> [NFSD_FO_UnlockFS] = {"unlock_filesystem",
> &transaction_ops, S_IWUSR|S_IRUSR},
> - [NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
> + [NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR},
> [NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
> [NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
> [NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},
I don't think we can do this. mountd does this to get the rootfh for an
exported fs and it'll probably start failing if it can't open that file
for read:
f = open("/proc/fs/nfsd/filehandle", O_RDWR);
That write_filehandle interface sure is weird though. Could we get rid
of it by teaching cache_get_filehandle how to use name_to_handle_at
instead? That would be a lot cleaner and would get rid of yet another
dependency on /proc/fs/nfsd.
--
Jeff Layton <[email protected]>
> -----?ʼ?ԭ??-----
> ??????: Jeff Layton <[email protected]>
> ????ʱ??: 2024??3??19?? 21:09
> ?ռ???: Chen, [email protected]>; Chuck Lever
> <[email protected]>; Neil Brown <[email protected]>; Olga Kornievskaia
> <[email protected]>; Dai Ngo <[email protected]>; Tom Talpey
> <[email protected]>
> ????: [email protected]
> ????: Re: [PATCH] NFSD: nfsctl: remove read permission of filehandle
>
> On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> > As write_filehandle can't accept zero-bytes writting,
> > remove read permission of /proc/fs/nfsd/filehandle
> >
> > Signed-off-by: Chen Hanxiao <[email protected]>
...
>
>
> I don't think we can do this. mountd does this to get the rootfh for an
> exported fs and it'll probably start failing if it can't open that file
> for read:
>
> f = open("/proc/fs/nfsd/filehandle", O_RDWR);
>
> That write_filehandle interface sure is weird though. Could we get rid
> of it by teaching cache_get_filehandle how to use name_to_handle_at
> instead? That would be a lot cleaner and would get rid of yet another
> dependency on /proc/fs/nfsd.
name_to_handle_at return struct file_handle
cache_get_filehandle return hexed struct knfsd_fh
Could you please give some hints on how to convert from file_handle. f_handle to knfsd_fh.fh_raw?
Regards,
- Chen
> --
> Jeff Layton <[email protected]>
On Wed, 2024-03-20 at 15:49 +0000, Hanxiao Chen (Fujitsu) wrote:
>
> > -----邮件原件-----
> > 发件人: Jeff Layton <[email protected]>
> > 发送时间: 2024年3月19日 21:09
> > 收件人: Chen, [email protected]>; Chuck Lever
> > <[email protected]>; Neil Brown <[email protected]>; Olga Kornievskaia
> > <[email protected]>; Dai Ngo <[email protected]>; Tom Talpey
> > <[email protected]>
> > 抄送: [email protected]
> > 主题: Re: [PATCH] NFSD: nfsctl: remove read permission of filehandle
> >
> > On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> > > As write_filehandle can't accept zero-bytes writting,
> > > remove read permission of /proc/fs/nfsd/filehandle
> > >
> > > Signed-off-by: Chen Hanxiao <[email protected]>
> ...
> >
> >
> > I don't think we can do this. mountd does this to get the rootfh for an
> > exported fs and it'll probably start failing if it can't open that file
> > for read:
> >
> > f = open("/proc/fs/nfsd/filehandle", O_RDWR);
> >
> > That write_filehandle interface sure is weird though. Could we get rid
> > of it by teaching cache_get_filehandle how to use name_to_handle_at
> > instead? That would be a lot cleaner and would get rid of yet another
> > dependency on /proc/fs/nfsd.
>
> name_to_handle_at return struct file_handle
> cache_get_filehandle return hexed struct knfsd_fh
>
> Could you please give some hints on how to convert from file_handle. f_handle to knfsd_fh.fh_raw?
>
I started looking at this yesterday. knfsd_fh is mostly constructed in
fh_compose. The exportfs_encode_fh call is in _fh_update. If we look at
the comments over knfsd_fh, it explains pretty well what we need to fill
out:
fh_version : should always be 1
fh_auth_type: always 0
fh_fsid_type: this is the tricky part (see below)
fh_fileid_type: returned by name_to_handle_at
fh_fsid: fsid concatenated with the fileid for the inode
name_to_handle_at should be able to fill out the fileid portion of the
knfsd_fh. The harder part is: how do we get the fh_fsid_type and its
content?
The types are shown here:
* The fsid_type identifies how the filesystem (or export point) is
* encoded.
* Current values:
* 0 - 4 byte device id (ms-2-bytes major, ls-2-bytes minor), 4byte inode number
* NOTE: we cannot use the kdev_t device id value, because kdev_t.h
* says we mustn't. We must break it up and reassemble.
* 1 - 4 byte user specified identifier
* 2 - 4 byte major, 4 byte minor, 4 byte inode number - DEPRECATED
* 3 - 4 byte device id, encoded for user-space, 4 byte inode number
* 4 - 4 byte inode number and 4 byte uuid
* 5 - 8 byte uuid
* 6 - 16 byte uuid
* 7 - 8 byte inode number and 16 byte uuid
The kernel decides this in set_version_and_fsid_type, so I suppose we
could try to replicate that logic in userland.
There is the problem though of what to do about exports with the fsid=
option set. That's tougher to determine from userland, but maybe we can
somehow get that from the export record?
That's about as far as I got with it yesterday...
--
Jeff Layton <[email protected]>
Hi, Jeff
I wrote a POC patch, use name_to_handle_at to get nfs root filehandle
instead of /proc/fs/nfsd/filehandle.
I did some simple tests and it works.
Pls check the POC patch below:
---
support/export/cache.c | 95 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 93 insertions(+), 2 deletions(-)
diff --git a/support/export/cache.c b/support/export/cache.c
index 6c0a44a3..adecac2e 100644
--- a/support/export/cache.c
+++ b/support/export/cache.c
@@ -43,6 +43,25 @@
#include "blkid/blkid.h"
#endif
+#define NFS4_FHSIZE 128
+
+struct knfsd_fh {
+ unsigned int fh_size; /*
+ * Points to the current size while
+ * building a new file handle.
+ */
+ union {
+ char fh_raw[NFS4_FHSIZE];
+ struct {
+ uint8_t fh_version; /* == 1 */
+ uint8_t fh_auth_type; /* deprecated */
+ uint8_t fh_fsid_type;
+ uint8_t fh_fileid_type;
+ uint32_t fh_fsid[]; /* flexible-array member */
+ };
+ };
+};
+
enum nfsd_fsid {
FSID_DEV = 0,
FSID_NUM,
@@ -1827,8 +1846,8 @@ int cache_export(nfs_export *exp, char *path)
* read filehandle <&0
* } <> /proc/fs/nfsd/filehandle
*/
-struct nfs_fh_len *
-cache_get_filehandle(nfs_export *exp, int len, char *p)
+static struct nfs_fh_len *
+cache_get_filehandle_by_proc(nfs_export *exp, int len, char *p)
{
static struct nfs_fh_len fh;
char buf[RPC_CHAN_BUF_SIZE], *bp;
@@ -1862,6 +1881,78 @@ cache_get_filehandle(nfs_export *exp, int len, char *p)
return &fh;
}
+static struct nfs_fh_len *
+cache_get_filehandle_by_name(nfs_export *exp, char *name)
+{
+ static struct nfs_fh_len fh;
+ struct {
+ struct file_handle fh;
+ unsigned char handle[128];
+ } file_fh;
+ char buf[RPC_CHAN_BUF_SIZE] = {0};
+ char *mesg = buf;
+ int len, mnt_id;
+ unsigned int e_fsid;
+ struct knfsd_fh kfh;
+ char u[16];
+
+ memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
+
+ file_fh.fh.handle_bytes = 128;
+ if (name_to_handle_at(AT_FDCWD, name, &file_fh.fh, &mnt_id, 0) < 0)
+ return NULL;
+
+ memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
+ memset(&kfh, 0, sizeof(struct knfsd_fh));
+ kfh.fh_version = 1;
+ kfh.fh_auth_type = 0;
+ e_fsid = exp->m_export.e_fsid;
+
+ if (e_fsid > 0) {
+ len = 12;
+ fh.fh_size = 8;
+ kfh.fh_size = 12;
+ kfh.fh_fsid_type = 1;
+ kfh.fh_fsid[0] = e_fsid;
+ } else {
+ len = file_fh.fh.handle_bytes + 8;
+ fh.fh_size = file_fh.fh.handle_bytes;
+ kfh.fh_size = file_fh.fh.handle_bytes + sizeof(kfh.fh_size);
+ kfh.fh_fsid_type = FSID_UUID16_INUM;
+ if (file_fh.fh.handle_bytes <= 12) {
+ kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
+ kfh.fh_fsid[1] = 0;
+ } else {
+ kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
+ kfh.fh_fsid[1] = *((uint32_t *)file_fh.fh.f_handle + 1);
+ }
+ }
+ kfh.fh_fileid_type = 0; // FILEID_ROOT
+
+ qword_addhex(&mesg, &len, kfh.fh_raw, kfh.fh_size);
+ mesg = buf;
+ len = qword_get(&mesg, (char *)fh.fh_handle, NFS3_FHSIZE);
+ if (e_fsid == 0) {
+ len = 16;
+ uuid_by_path(name, 0, 16, u);
+ memcpy((char *)fh.fh_handle+12, u, 16);
+ fh.fh_size += 16;
+ }
+
+ return &fh;
+}
+
+struct nfs_fh_len *
+cache_get_filehandle(nfs_export *exp, int len, char *p)
+{
+ struct nfs_fh_len *fh;
+ fh = cache_get_filehandle_by_name(exp, p);
+ if (!fh)
+ fh = cache_get_filehandle_by_proc(exp, len, p);
+
+ return fh;
+}
+
/* Wait for all worker child processes to exit and reap them */
void
cache_wait_for_workers(char *prog)
--
2.39.1
On Wed, 2024-03-27 at 14:40 +0800, Chen Hanxiao wrote:
> Hi, Jeff
>
> I wrote a POC patch, use name_to_handle_at to get nfs root filehandle
> instead of /proc/fs/nfsd/filehandle.
>
> I did some simple tests and it works.
>
> Pls check the POC patch below:
> ---
> support/export/cache.c | 95 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 93 insertions(+), 2 deletions(-)
>
Very cool!
I do think this is the basic direction we want to go with revising how
mountd gets its filehandles.
> diff --git a/support/export/cache.c b/support/export/cache.c
> index 6c0a44a3..adecac2e 100644
> --- a/support/export/cache.c
> +++ b/support/export/cache.c
> @@ -43,6 +43,25 @@
> #include "blkid/blkid.h"
> #endif
>
> +#define NFS4_FHSIZE 128
> +
> +struct knfsd_fh {
> + unsigned int fh_size; /*
> + * Points to the current size while
> + * building a new file handle.
> + */
> + union {
> + char fh_raw[NFS4_FHSIZE];
> + struct {
> + uint8_t fh_version; /* == 1 */
> + uint8_t fh_auth_type; /* deprecated */
> + uint8_t fh_fsid_type;
> + uint8_t fh_fileid_type;
> + uint32_t fh_fsid[]; /* flexible-array member */
> + };
> + };
> +};
> +
This is a large departure from how this has worked in the past. Before
this, userland could always treat a knfsd_fh as opaque. With this change
though, knfsd_fh will need to be part of the kernel's ABI, and we'll
need to be very careful about future changes to the FH format (not that
we're planning any).
I think we probably ought to start by moving the definitions of knfsd_fh
and its component parts into a UAPI header that nfs-utils could then
use. For now, we can add duplicate definitions like above for when the
case where they aren't defined in UAPI headers.
> enum nfsd_fsid {
> FSID_DEV = 0,
> FSID_NUM,
Side note: I wonder if some of the nfsd_fsid values can (effectively) be
deprecated these days? I don't think we really use FSID_DEV anymore at
all, do we?
> @@ -1827,8 +1846,8 @@ int cache_export(nfs_export *exp, char *path)
> * read filehandle <&0
> * } <> /proc/fs/nfsd/filehandle
> */
> -struct nfs_fh_len *
> -cache_get_filehandle(nfs_export *exp, int len, char *p)
> +static struct nfs_fh_len *
> +cache_get_filehandle_by_proc(nfs_export *exp, int len, char *p)
> {
> static struct nfs_fh_len fh;
> char buf[RPC_CHAN_BUF_SIZE], *bp;
> @@ -1862,6 +1881,78 @@ cache_get_filehandle(nfs_export *exp, int len, char *p)
> return &fh;
> }
>
> +static struct nfs_fh_len *
> +cache_get_filehandle_by_name(nfs_export *exp, char *name)
> +{
> + static struct nfs_fh_len fh;
> + struct {
> + struct file_handle fh;
> + unsigned char handle[128];
> + } file_fh;
> + char buf[RPC_CHAN_BUF_SIZE] = {0};
> + char *mesg = buf;
> + int len, mnt_id;
> + unsigned int e_fsid;
> + struct knfsd_fh kfh;
> + char u[16];
> +
> + memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
> +
> + file_fh.fh.handle_bytes = 128;
> + if (name_to_handle_at(AT_FDCWD, name, &file_fh.fh, &mnt_id, 0) < 0)
> + return NULL;
> +
> + memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
> + memset(&kfh, 0, sizeof(struct knfsd_fh));
> + kfh.fh_version = 1;
> + kfh.fh_auth_type = 0;
> + e_fsid = exp->m_export.e_fsid;
> +
> + if (e_fsid > 0) {
> + len = 12;
> + fh.fh_size = 8;
> + kfh.fh_size = 12;
> + kfh.fh_fsid_type = 1;
> + kfh.fh_fsid[0] = e_fsid;
> + } else {
> + len = file_fh.fh.handle_bytes + 8;
> + fh.fh_size = file_fh.fh.handle_bytes;
> + kfh.fh_size = file_fh.fh.handle_bytes + sizeof(kfh.fh_size);
> + kfh.fh_fsid_type = FSID_UUID16_INUM;
Note that we will need to deal with NFSv2 here too. It has 32-byte
filehandles, and so we likely can't use FSID_UUID16_INUM for those.
> + if (file_fh.fh.handle_bytes <= 12) {
> + kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
> + kfh.fh_fsid[1] = 0;
> + } else {
> + kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
> + kfh.fh_fsid[1] = *((uint32_t *)file_fh.fh.f_handle + 1);
> + }
> + }
> + kfh.fh_fileid_type = 0; // FILEID_ROOT
> +
> + qword_addhex(&mesg, &len, kfh.fh_raw, kfh.fh_size);
> + mesg = buf;
> + len = qword_get(&mesg, (char *)fh.fh_handle, NFS3_FHSIZE);
We may need to pass to this function what version of NFS MNT request was
sent so that we know what sort of FH is needed here.
> + if (e_fsid == 0) {
> + len = 16;
> + uuid_by_path(name, 0, 16, u);
> + memcpy((char *)fh.fh_handle+12, u, 16);
> + fh.fh_size += 16;
> + }
> +
> + return &fh;
> +}
> +
> +struct nfs_fh_len *
> +cache_get_filehandle(nfs_export *exp, int len, char *p)
> +{
> + struct nfs_fh_len *fh;
> + fh = cache_get_filehandle_by_name(exp, p);
> + if (!fh)
> + fh = cache_get_filehandle_by_proc(exp, len, p);
> +
> + return fh;
> +}
> +
> /* Wait for all worker child processes to exit and reap them */
> void
> cache_wait_for_workers(char *prog)
This looks like a great start -- nice work! There are a number of more
obscure corner-cases that we'll need to deal with, but this does look
like a promising direction.
--
Jeff Layton <[email protected]>