2017-12-11 06:04:19

by NeilBrown

[permalink] [raw]
Subject: [PATCH 0/4] VFS: fix assorted issues with name_to_handle conversions.

This series of patches is the result of discussion following the
previous patch which increases MAX_HANDLE_SZ.

Thanks,
NeilBrown

---

NeilBrown (4):
fs/notify: fdinfo can report unsupported file handles.
fs/notify: don't put file handle buffer on stack.
NFS: allow name_to_handle_at() to work for Amazon EFS.
fhandle: Improve error responses in name_to_handle_at()


fs/exportfs/expfs.c | 4 +++-
fs/fhandle.c | 16 +++++++++-----
fs/nfs/export.c | 2 ++
fs/notify/fdinfo.c | 51 +++++++++++++++++++++++-----------------------
include/linux/exportfs.h | 10 +++++++--
5 files changed, 49 insertions(+), 34 deletions(-)

--
Signature


2017-12-11 06:04:28

by NeilBrown

[permalink] [raw]
Subject: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

If a filesystem does not set sb->s_export_op, then it
does not support filehandles and export_fs_encode_fh()
and exportfs_encode_inode_fh() should not be called.
They will use export_encode_fh() is which is a default
that uses inode number generation number, but in general
they may not be stable.

So change exportfs_encode_inode_fh() to return FILEID_INVALID
if called on an unsupported Filesystem. Currently only
notify/fdinfo can do that.

Also remove the WARNing from fdinfo when exportfs_encode_inode_fh()
returns FILEID_INVALID, as that is no an erroneous condition.

Signed-off-by: NeilBrown <[email protected]>
---
fs/exportfs/expfs.c | 4 +++-
fs/notify/fdinfo.c | 4 +---
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 329a5d103846..f5b27dd843a1 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -385,7 +385,9 @@ int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
{
const struct export_operations *nop = inode->i_sb->s_export_op;

- if (nop && nop->encode_fh)
+ if (nop)
+ return FILEID_INVALID;
+ if (nop->encode_fh)
return nop->encode_fh(inode, fid->raw, max_len, parent);

return export_encode_fh(inode, fid, max_len, parent);
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index d478629c728b..d1135ed61229 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -50,10 +50,8 @@ static void show_mark_fhandle(struct seq_file *m, struct inode *inode)
size = f.handle.handle_bytes >> 2;

ret = exportfs_encode_inode_fh(inode, (struct fid *)f.handle.f_handle, &size, 0);
- if ((ret == FILEID_INVALID) || (ret < 0)) {
- WARN_ONCE(1, "Can't encode file handler for inotify: %d\n", ret);
+ if ((ret == FILEID_INVALID) || (ret < 0))
return;
- }

f.handle.handle_type = ret;
f.handle.handle_bytes = size * sizeof(u32);


2017-12-11 06:04:38

by NeilBrown

[permalink] [raw]
Subject: [PATCH 2/4] fs/notify: don't put file handle buffer on stack.

A file handle buffer is not tiny, and could need to be larger in future,
so it isn't safe to allocate one on the stack. Instead, we need to
kmalloc().

There is no way to return an error status from a ->show_fdinfo()
function, so if the kmalloc fails, we silently exclude the filehandle
from the output. As it is at the end of line, this shouldn't
upset parsing too much. In any case, it can only fail when the
process is being killed by the OOM killer, so the file will never
be parsed anyway.

Signed-off-by: NeilBrown <[email protected]>
---
fs/notify/fdinfo.c | 47 +++++++++++++++++++++++++----------------------
1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index d1135ed61229..7347f295bc0f 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -23,54 +23,56 @@

static void show_fdinfo(struct seq_file *m, struct file *f,
void (*show)(struct seq_file *m,
- struct fsnotify_mark *mark))
+ struct fsnotify_mark *mark,
+ struct fid *fh))
{
struct fsnotify_group *group = f->private_data;
struct fsnotify_mark *mark;
+ struct fid *fh = kmalloc(MAX_HANDLE_SZ, GFP_KERNEL);

mutex_lock(&group->mark_mutex);
list_for_each_entry(mark, &group->marks_list, g_list) {
- show(m, mark);
+ show(m, mark, fh);
if (seq_has_overflowed(m))
break;
}
mutex_unlock(&group->mark_mutex);
+ kfree(fh);
}

#if defined(CONFIG_EXPORTFS)
-static void show_mark_fhandle(struct seq_file *m, struct inode *inode)
+static void show_mark_fhandle(struct seq_file *m, struct inode *inode,
+ struct fid *fhbuf)
{
- struct {
- struct file_handle handle;
- u8 pad[MAX_HANDLE_SZ];
- } f;
int size, ret, i;
+ unsigned char *bytes;

- f.handle.handle_bytes = sizeof(f.pad);
- size = f.handle.handle_bytes >> 2;
+ if (!fhbuf)
+ return;
+ size = MAX_HANDLE_SZ >> 2;

- ret = exportfs_encode_inode_fh(inode, (struct fid *)f.handle.f_handle, &size, 0);
+ ret = exportfs_encode_inode_fh(inode, fhbuf, &size, 0);
if ((ret == FILEID_INVALID) || (ret < 0))
return;

- f.handle.handle_type = ret;
- f.handle.handle_bytes = size * sizeof(u32);
-
- seq_printf(m, "fhandle-bytes:%x fhandle-type:%x f_handle:",
- f.handle.handle_bytes, f.handle.handle_type);
+ seq_printf(m, "fhandle-bytes:%zx fhandle-type:%x f_handle:",
+ size * sizeof(u32), ret);

- for (i = 0; i < f.handle.handle_bytes; i++)
- seq_printf(m, "%02x", (int)f.handle.f_handle[i]);
+ bytes = (unsigned char *)(fhbuf->raw);
+ for (i = 0; i < size * sizeof(u32); i++)
+ seq_printf(m, "%02x", bytes[i]);
}
#else
-static void show_mark_fhandle(struct seq_file *m, struct inode *inode)
+static void show_mark_fhandle(struct seq_file *m, struct inode *inode,
+ struct fid *fhbuf)
{
}
#endif

#ifdef CONFIG_INOTIFY_USER

-static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
+static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark,
+ struct fid *fhbuf)
{
struct inotify_inode_mark *inode_mark;
struct inode *inode;
@@ -91,7 +93,7 @@ static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
seq_printf(m, "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:%x ",
inode_mark->wd, inode->i_ino, inode->i_sb->s_dev,
mask, mark->ignored_mask);
- show_mark_fhandle(m, inode);
+ show_mark_fhandle(m, inode, fhbuf);
seq_putc(m, '\n');
iput(inode);
}
@@ -106,7 +108,8 @@ void inotify_show_fdinfo(struct seq_file *m, struct file *f)

#ifdef CONFIG_FANOTIFY

-static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
+static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark,
+ struct fid *fhbuf)
{
unsigned int mflags = 0;
struct inode *inode;
@@ -121,7 +124,7 @@ static void fanotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
seq_printf(m, "fanotify ino:%lx sdev:%x mflags:%x mask:%x ignored_mask:%x ",
inode->i_ino, inode->i_sb->s_dev,
mflags, mark->mask, mark->ignored_mask);
- show_mark_fhandle(m, inode);
+ show_mark_fhandle(m, inode, fhbuf);
seq_putc(m, '\n');
iput(inode);
} else if (mark->connector->flags & FSNOTIFY_OBJ_TYPE_VFSMOUNT) {


2017-12-11 06:04:45

by NeilBrown

[permalink] [raw]
Subject: [PATCH 3/4] NFS: allow name_to_handle_at() to work for Amazon EFS.

Amazon EFS provides an NFSv4.1 filesystem which appears to use
(close to) full length (128 byte) file handles.
This causes the handle reported by name_to_handle_at() to exceed
MAX_HANDLE_SZ, resulting in
EOVERFLOW if 128 bytes were requested, or
EINVAL if the size reported by the previous call was requested.

To fix this, increase MAX_HANDLE_SIZE a little, and add a BUILD_BUG
so that this sort of inconsistent error reporting won't happen again.

Link: https://github.com/systemd/systemd/issues/7082#issuecomment-347380436
Signed-off-by: NeilBrown <[email protected]>
---
fs/nfs/export.c | 2 ++
include/linux/exportfs.h | 10 ++++++++--
2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/export.c b/fs/nfs/export.c
index 83fd09fc8f77..23b2fc3ab2bb 100644
--- a/fs/nfs/export.c
+++ b/fs/nfs/export.c
@@ -39,6 +39,8 @@ nfs_encode_fh(struct inode *inode, __u32 *p, int *max_len, struct inode *parent)
size_t fh_size = offsetof(struct nfs_fh, data) + server_fh->size;
int len = EMBED_FH_OFF + XDR_QUADLEN(fh_size);

+ BUILD_BUG_ON_MSG(EMBED_FH_OFF + NFS_MAXFHSIZE > MAX_HANDLE_SZ,
+ "MAX_HANDLE_SZ is too small");
dprintk("%s: max fh len %d inode %p parent %p",
__func__, *max_len, inode, parent);

diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 0d3037419bc7..71eb9c2cc2fb 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -11,8 +11,14 @@ struct iomap;
struct super_block;
struct vfsmount;

-/* limit the handle size to NFSv4 handle size now */
-#define MAX_HANDLE_SZ 128
+/* Must be larger than NFSv4 file handle.
+ * overlayfs doesn't want this too close to 255.
+ * NOTE: This value MUST NOT be exported to user-space.
+ * Applications must only ever see MAX_HANDLE_SZ == 128.
+ * If they try a larger number on older kernels, they
+ * will get -EINVAL which will be confusing.
+ */
+#define MAX_HANDLE_SZ 200

/*
* The fileid_type identifies how the file within the filesystem is encoded.


2017-12-11 06:04:52

by NeilBrown

[permalink] [raw]
Subject: [PATCH 4/4] fhandle: Improve error responses in name_to_handle_at()

1/ Always return the mnt_id, even if some other error occurs.
It can be useful without the file handle.
An application can initialise the memory to, e.g. -1
and if there is some other value after name_to_handle_at()
returns, then it is a valid mnt_id.
If the value is unchanged, then the kernel does not
have this patch.

2/ Don't return -EINVAL if the requested handle_bytes is
larger than MAX_HANDLE_SZ. There is no need for an
error and it causes unnecessary behavior change
in the kernel ever needs to increase MAX_HANDLE_SZ.
Simple limit handle_bytes to MAX_HANDLE_SZ silently.

Signed-off-by: NeilBrown <[email protected]>
---
fs/fhandle.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/fhandle.c b/fs/fhandle.c
index 0ace128f5d23..04afffaeb742 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -23,9 +23,16 @@ static long do_sys_name_to_handle(struct path *path,
int handle_dwords, handle_bytes;
struct file_handle *handle = NULL;

+ /*
+ * Always return the mnt_id, it might be useful even
+ * without the file handle
+ */
+ if (copy_to_user(mnt_id, &real_mount(path->mnt)->mnt_id,
+ sizeof(*mnt_id)))
+ return -EFAULT;
/*
* We need to make sure whether the file system
- * support decoding of the file handle
+ * supports decoding of the file handle.
*/
if (!path->dentry->d_sb->s_export_op ||
!path->dentry->d_sb->s_export_op->fh_to_dentry)
@@ -35,7 +42,7 @@ static long do_sys_name_to_handle(struct path *path,
return -EFAULT;

if (f_handle.handle_bytes > MAX_HANDLE_SZ)
- return -EINVAL;
+ f_handle.handle_bytes = MAX_HANDLE_SZ;

handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_bytes,
GFP_KERNEL);
@@ -68,10 +75,7 @@ static long do_sys_name_to_handle(struct path *path,
retval = -EOVERFLOW;
} else
retval = 0;
- /* copy the mount id */
- if (copy_to_user(mnt_id, &real_mount(path->mnt)->mnt_id,
- sizeof(*mnt_id)) ||
- copy_to_user(ufh, handle,
+ if (copy_to_user(ufh, handle,
sizeof(struct file_handle) + handle_bytes))
retval = -EFAULT;
kfree(handle);


2017-12-11 06:29:51

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11, 2017 at 05:04:05PM +1100, NeilBrown wrote:
> @@ -385,7 +385,9 @@ int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
> {
> const struct export_operations *nop = inode->i_sb->s_export_op;
>
> - if (nop && nop->encode_fh)
> + if (nop)
> + return FILEID_INVALID;
> + if (nop->encode_fh)
> return nop->encode_fh(inode, fid->raw, max_len, parent);

This might as well have been

if (nop)
return FILEID_INVALID;
BUG();

Have you ever tested that?

2017-12-11 06:41:14

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
> If a filesystem does not set sb->s_export_op, then it
> does not support filehandles and export_fs_encode_fh()
> and exportfs_encode_inode_fh() should not be called.
> They will use export_encode_fh() is which is a default
> that uses inode number generation number, but in general
> they may not be stable.
>
> So change exportfs_encode_inode_fh() to return FILEID_INVALID
> if called on an unsupported Filesystem. Currently only
> notify/fdinfo can do that.
>

I wish you would leave this check to the caller, maybe add a helper
exportfs_can_decode_fh() for callers to use.

Although there are no current uses for it in-tree, there is value in
being able to encode a unique file handle even when it cannot be
decoded back to an open file.

I am using this property in my fanotify super block watch patches,
where the object identifier on the event is an encoded file handle
of the object, which delegates tracking filesystem objects to
userspace and prevents fanotify from keeping elevated refcounts
on inodes and dentries.

There are quite a few userspace tools out there that are checking
that st_ino hasn't changed on a file between non atomic operations.
Those tools (or others) could benefit from a unique file handle if
we ever decide to provide a relaxed version of name_to_handle_at().

Cheers,
Amir.

2017-12-11 06:47:47

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 2/4] fs/notify: don't put file handle buffer on stack.

On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
> A file handle buffer is not tiny, and could need to be larger in future,
> so it isn't safe to allocate one on the stack. Instead, we need to
> kmalloc().
>
> There is no way to return an error status from a ->show_fdinfo()
> function, so if the kmalloc fails, we silently exclude the filehandle
> from the output. As it is at the end of line, this shouldn't
> upset parsing too much.

I think that is a bold assumption to make about parsers ;)
Anyway, the second reasoning is stronger, so may as well drop this one.
There is probably a single userspace user for this which is CRIU,
for which this fdinfo file handle was added, so you could possibly say
that this change does not upset this user (?).

> In any case, it can only fail when the
> process is being killed by the OOM killer, so the file will never
> be parsed anyway.
>
> Signed-off-by: NeilBrown <[email protected]>

2017-12-11 07:05:53

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11, 2017 at 8:41 AM, Amir Goldstein <[email protected]> wrote:
> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>> If a filesystem does not set sb->s_export_op, then it
>> does not support filehandles and export_fs_encode_fh()
>> and exportfs_encode_inode_fh() should not be called.
>> They will use export_encode_fh() is which is a default
>> that uses inode number generation number, but in general
>> they may not be stable.
>>
>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>> if called on an unsupported Filesystem. Currently only
>> notify/fdinfo can do that.
>>
>
> I wish you would leave this check to the caller, maybe add a helper
> exportfs_can_decode_fh() for callers to use.
>
> Although there are no current uses for it in-tree, there is value in
> being able to encode a unique file handle even when it cannot be
> decoded back to an open file.
>
> I am using this property in my fanotify super block watch patches,
> where the object identifier on the event is an encoded file handle
> of the object, which delegates tracking filesystem objects to
> userspace and prevents fanotify from keeping elevated refcounts
> on inodes and dentries.
>
> There are quite a few userspace tools out there that are checking
> that st_ino hasn't changed on a file between non atomic operations.
> Those tools (or others) could benefit from a unique file handle if
> we ever decide to provide a relaxed version of name_to_handle_at().
>

And this change need a clause about not breaking userspace.

Pavel,

Will this break any version of criu in the wild?

Amir.

2017-12-11 14:01:53

by Pavel Emelianov

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On 12/11/2017 10:05 AM, Amir Goldstein wrote:
> On Mon, Dec 11, 2017 at 8:41 AM, Amir Goldstein <[email protected]> wrote:
>> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>>> If a filesystem does not set sb->s_export_op, then it
>>> does not support filehandles and export_fs_encode_fh()
>>> and exportfs_encode_inode_fh() should not be called.
>>> They will use export_encode_fh() is which is a default
>>> that uses inode number generation number, but in general
>>> they may not be stable.
>>>
>>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>>> if called on an unsupported Filesystem. Currently only
>>> notify/fdinfo can do that.
>>>
>>
>> I wish you would leave this check to the caller, maybe add a helper
>> exportfs_can_decode_fh() for callers to use.
>>
>> Although there are no current uses for it in-tree, there is value in
>> being able to encode a unique file handle even when it cannot be
>> decoded back to an open file.
>>
>> I am using this property in my fanotify super block watch patches,
>> where the object identifier on the event is an encoded file handle
>> of the object, which delegates tracking filesystem objects to
>> userspace and prevents fanotify from keeping elevated refcounts
>> on inodes and dentries.
>>
>> There are quite a few userspace tools out there that are checking
>> that st_ino hasn't changed on a file between non atomic operations.
>> Those tools (or others) could benefit from a unique file handle if
>> we ever decide to provide a relaxed version of name_to_handle_at().
>>
>
> And this change need a clause about not breaking userspace.
>
> Pavel,
>
> Will this break any version of criu in the wild?

If there's no fliehandle in the output, it will make dump fail, but we're
already prepared for the fact, that there's no handle at hands. In the
worst case criu will exit with error.

I also agree that it should only happen when current is OOM killed, and in
case of CRIU this means killing criu process itself.

-- Pavel

2017-12-11 14:08:55

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11, 2017 at 3:46 PM, Pavel Emelyanov <[email protected]> wrote:
> On 12/11/2017 10:05 AM, Amir Goldstein wrote:
>> On Mon, Dec 11, 2017 at 8:41 AM, Amir Goldstein <[email protected]> wrote:
>>> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>>>> If a filesystem does not set sb->s_export_op, then it
>>>> does not support filehandles and export_fs_encode_fh()
>>>> and exportfs_encode_inode_fh() should not be called.
>>>> They will use export_encode_fh() is which is a default
>>>> that uses inode number generation number, but in general
>>>> they may not be stable.
>>>>
>>>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>>>> if called on an unsupported Filesystem. Currently only
>>>> notify/fdinfo can do that.
>>>>
>>>
>>> I wish you would leave this check to the caller, maybe add a helper
>>> exportfs_can_decode_fh() for callers to use.
>>>
>>> Although there are no current uses for it in-tree, there is value in
>>> being able to encode a unique file handle even when it cannot be
>>> decoded back to an open file.
>>>
>>> I am using this property in my fanotify super block watch patches,
>>> where the object identifier on the event is an encoded file handle
>>> of the object, which delegates tracking filesystem objects to
>>> userspace and prevents fanotify from keeping elevated refcounts
>>> on inodes and dentries.
>>>
>>> There are quite a few userspace tools out there that are checking
>>> that st_ino hasn't changed on a file between non atomic operations.
>>> Those tools (or others) could benefit from a unique file handle if
>>> we ever decide to provide a relaxed version of name_to_handle_at().
>>>
>>
>> And this change need a clause about not breaking userspace.
>>
>> Pavel,
>>
>> Will this break any version of criu in the wild?
>
> If there's no fliehandle in the output, it will make dump fail, but we're
> already prepared for the fact, that there's no handle at hands. In the
> worst case criu will exit with error.
>
> I also agree that it should only happen when current is OOM killed, and in
> case of CRIU this means killing criu process itself.
>

But this patch [1/4] changes behavior so you cannot dump fsnotify
state if watched file system does not support *decoding* file handles.
This means that criu anyway won't be able to restore the fsnotify state.
Is it OK that criu dump state will fail in that case?

Amir.

2017-12-11 15:21:40

by Pavel Emelianov

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On 12/11/2017 05:08 PM, Amir Goldstein wrote:
> On Mon, Dec 11, 2017 at 3:46 PM, Pavel Emelyanov <[email protected]> wrote:
>> On 12/11/2017 10:05 AM, Amir Goldstein wrote:
>>> On Mon, Dec 11, 2017 at 8:41 AM, Amir Goldstein <[email protected]> wrote:
>>>> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>>>>> If a filesystem does not set sb->s_export_op, then it
>>>>> does not support filehandles and export_fs_encode_fh()
>>>>> and exportfs_encode_inode_fh() should not be called.
>>>>> They will use export_encode_fh() is which is a default
>>>>> that uses inode number generation number, but in general
>>>>> they may not be stable.
>>>>>
>>>>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>>>>> if called on an unsupported Filesystem. Currently only
>>>>> notify/fdinfo can do that.
>>>>>
>>>>
>>>> I wish you would leave this check to the caller, maybe add a helper
>>>> exportfs_can_decode_fh() for callers to use.
>>>>
>>>> Although there are no current uses for it in-tree, there is value in
>>>> being able to encode a unique file handle even when it cannot be
>>>> decoded back to an open file.
>>>>
>>>> I am using this property in my fanotify super block watch patches,
>>>> where the object identifier on the event is an encoded file handle
>>>> of the object, which delegates tracking filesystem objects to
>>>> userspace and prevents fanotify from keeping elevated refcounts
>>>> on inodes and dentries.
>>>>
>>>> There are quite a few userspace tools out there that are checking
>>>> that st_ino hasn't changed on a file between non atomic operations.
>>>> Those tools (or others) could benefit from a unique file handle if
>>>> we ever decide to provide a relaxed version of name_to_handle_at().
>>>>
>>>
>>> And this change need a clause about not breaking userspace.
>>>
>>> Pavel,
>>>
>>> Will this break any version of criu in the wild?
>>
>> If there's no fliehandle in the output, it will make dump fail, but we're
>> already prepared for the fact, that there's no handle at hands. In the
>> worst case criu will exit with error.
>>
>> I also agree that it should only happen when current is OOM killed, and in
>> case of CRIU this means killing criu process itself.
>>
>
> But this patch [1/4] changes behavior so you cannot dump fsnotify
> state if watched file system does not support *decoding* file handles.

That's OK :) After we get a filehandle we check it's accessible, so for
FSs that couldn't decode one, we'd fail the dump anyway.

> This means that criu anyway won't be able to restore the fsnotify state.
> Is it OK that criu dump state will fail in that case?
>
> Amir.
> .
>

2017-12-11 16:08:29

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 4/4] fhandle: Improve error responses in name_to_handle_at()

On Mon, Dec 11, 2017 at 05:04:05PM +1100, NeilBrown wrote:
> 1/ Always return the mnt_id, even if some other error occurs.
> It can be useful without the file handle.
> An application can initialise the memory to, e.g. -1
> and if there is some other value after name_to_handle_at()
> returns, then it is a valid mnt_id.
> If the value is unchanged, then the kernel does not
> have this patch.

That's interesting--not the interface we might have chosen if we were
starting from scratch, but I can't see why it wouldn't work. OK!
ACK.--b.

>
> 2/ Don't return -EINVAL if the requested handle_bytes is
> larger than MAX_HANDLE_SZ. There is no need for an
> error and it causes unnecessary behavior change
> in the kernel ever needs to increase MAX_HANDLE_SZ.
> Simple limit handle_bytes to MAX_HANDLE_SZ silently.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/fhandle.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 0ace128f5d23..04afffaeb742 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -23,9 +23,16 @@ static long do_sys_name_to_handle(struct path *path,
> int handle_dwords, handle_bytes;
> struct file_handle *handle = NULL;
>
> + /*
> + * Always return the mnt_id, it might be useful even
> + * without the file handle
> + */
> + if (copy_to_user(mnt_id, &real_mount(path->mnt)->mnt_id,
> + sizeof(*mnt_id)))
> + return -EFAULT;
> /*
> * We need to make sure whether the file system
> - * support decoding of the file handle
> + * supports decoding of the file handle.
> */
> if (!path->dentry->d_sb->s_export_op ||
> !path->dentry->d_sb->s_export_op->fh_to_dentry)
> @@ -35,7 +42,7 @@ static long do_sys_name_to_handle(struct path *path,
> return -EFAULT;
>
> if (f_handle.handle_bytes > MAX_HANDLE_SZ)
> - return -EINVAL;
> + f_handle.handle_bytes = MAX_HANDLE_SZ;
>
> handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_bytes,
> GFP_KERNEL);
> @@ -68,10 +75,7 @@ static long do_sys_name_to_handle(struct path *path,
> retval = -EOVERFLOW;
> } else
> retval = 0;
> - /* copy the mount id */
> - if (copy_to_user(mnt_id, &real_mount(path->mnt)->mnt_id,
> - sizeof(*mnt_id)) ||
> - copy_to_user(ufh, handle,
> + if (copy_to_user(ufh, handle,
> sizeof(struct file_handle) + handle_bytes))
> retval = -EFAULT;
> kfree(handle);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-12-11 21:53:12

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11 2017, Amir Goldstein wrote:

> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>> If a filesystem does not set sb->s_export_op, then it
>> does not support filehandles and export_fs_encode_fh()
>> and exportfs_encode_inode_fh() should not be called.
>> They will use export_encode_fh() is which is a default
>> that uses inode number generation number, but in general
>> they may not be stable.
>>
>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>> if called on an unsupported Filesystem. Currently only
>> notify/fdinfo can do that.
>>
>
> I wish you would leave this check to the caller, maybe add a helper
> exportfs_can_decode_fh() for callers to use.
>
> Although there are no current uses for it in-tree, there is value in
> being able to encode a unique file handle even when it cannot be
> decoded back to an open file.
>
> I am using this property in my fanotify super block watch patches,
> where the object identifier on the event is an encoded file handle
> of the object, which delegates tracking filesystem objects to
> userspace and prevents fanotify from keeping elevated refcounts
> on inodes and dentries.
>
> There are quite a few userspace tools out there that are checking
> that st_ino hasn't changed on a file between non atomic operations.
> Those tools (or others) could benefit from a unique file handle if
> we ever decide to provide a relaxed version of name_to_handle_at().

If the filesystem doesn't define ->s_export_op, then you really cannot
trust anything beyond the inode number (and maybe not even that), and
the inode number is already easily available.
What actual value do you think you get from this pretend-file-handle
on filesystems that don't support file handles?

If there is a demonstrated need for some sort of identifier that is
stronger than an inode number, but not strong enough for
open_by_handle_at(), then you should explain that need and propose a
well defined interface. You shouldn't use a back-door and hope no-one
notices.

Thanks,
NeilBrown


Attachments:
signature.asc (832.00 B)

2017-12-11 22:12:47

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11 2017, Al Viro wrote:

> On Mon, Dec 11, 2017 at 05:04:05PM +1100, NeilBrown wrote:
>> @@ -385,7 +385,9 @@ int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
>> {
>> const struct export_operations *nop = inode->i_sb->s_export_op;
>>
>> - if (nop && nop->encode_fh)
>> + if (nop)
>> + return FILEID_INVALID;
>> + if (nop->encode_fh)
>> return nop->encode_fh(inode, fid->raw, max_len, parent);
>
> This might as well have been
>
> if (nop)
> return FILEID_INVALID;
> BUG();
>
> Have you ever tested that?

I have now - except for testing against Amazon EFS.
Only problem is the one you found - thanks.

NeilBrown

-----------8<------------>8----------------
From: NeilBrown <[email protected]>
Date: Mon, 11 Dec 2017 16:41:01 +1100
Subject: [PATCH] fs/notify: fdinfo can report unsupported file handles.

If a filesystem does not set sb->s_export_op, then it
does not support filehandles and export_fs_encode_fh()
and exportfs_encode_inode_fh() should not be called.
They will use export_encode_fh() is which is a default
that uses inode number generation number, but in general
they may not be stable.

So change exportfs_encode_inode_fh() to return FILEID_INVALID
if called on an unsupported Filesystem. Currently only
notify/fdinfo can do that.

Also remove the WARNing from fdinfo when exportfs_encode_inode_fh()
returns FILEID_INVALID, as that is no an erroneous condition.

Signed-off-by: NeilBrown <[email protected]>
---
fs/exportfs/expfs.c | 4 +++-
fs/notify/fdinfo.c | 4 +---
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 329a5d103846..2fff333b750c 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -385,7 +385,9 @@ int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
{
const struct export_operations *nop = inode->i_sb->s_export_op;

- if (nop && nop->encode_fh)
+ if (!nop)
+ return FILEID_INVALID;
+ if (nop->encode_fh)
return nop->encode_fh(inode, fid->raw, max_len, parent);

return export_encode_fh(inode, fid, max_len, parent);
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index d478629c728b..d1135ed61229 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -50,10 +50,8 @@ static void show_mark_fhandle(struct seq_file *m, struct inode *inode)
size = f.handle.handle_bytes >> 2;

ret = exportfs_encode_inode_fh(inode, (struct fid *)f.handle.f_handle, &size, 0);
- if ((ret == FILEID_INVALID) || (ret < 0)) {
- WARN_ONCE(1, "Can't encode file handler for inotify: %d\n", ret);
+ if ((ret == FILEID_INVALID) || (ret < 0))
return;
- }

f.handle.handle_type = ret;
f.handle.handle_bytes = size * sizeof(u32);
--
2.14.0.rc0.dirty


Attachments:
signature.asc (832.00 B)

2017-12-12 06:40:01

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Mon, Dec 11, 2017 at 11:52 PM, NeilBrown <[email protected]> wrote:
> On Mon, Dec 11 2017, Amir Goldstein wrote:
>
>> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>>> If a filesystem does not set sb->s_export_op, then it
>>> does not support filehandles and export_fs_encode_fh()
>>> and exportfs_encode_inode_fh() should not be called.
>>> They will use export_encode_fh() is which is a default
>>> that uses inode number generation number, but in general
>>> they may not be stable.
>>>
>>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>>> if called on an unsupported Filesystem. Currently only
>>> notify/fdinfo can do that.
>>>
>>
>> I wish you would leave this check to the caller, maybe add a helper
>> exportfs_can_decode_fh() for callers to use.
>>
>> Although there are no current uses for it in-tree, there is value in
>> being able to encode a unique file handle even when it cannot be
>> decoded back to an open file.
>>
>> I am using this property in my fanotify super block watch patches,
>> where the object identifier on the event is an encoded file handle
>> of the object, which delegates tracking filesystem objects to
>> userspace and prevents fanotify from keeping elevated refcounts
>> on inodes and dentries.
>>
>> There are quite a few userspace tools out there that are checking
>> that st_ino hasn't changed on a file between non atomic operations.
>> Those tools (or others) could benefit from a unique file handle if
>> we ever decide to provide a relaxed version of name_to_handle_at().
>
> If the filesystem doesn't define ->s_export_op, then you really cannot
> trust anything beyond the inode number (and maybe not even that), and
> the inode number is already easily available.
> What actual value do you think you get from this pretend-file-handle
> on filesystems that don't support file handles?
>

Sorry, I misread your patch. In my mind I thought you wanted to
eliminate the default export_encode_fh if there was no fh_to_dentry
operation like do_sys_name_to_handle() does. Just in my head...

FWIW, according to Pavel, if fdinfo would not export file handle
in case !fh_to_dentry op would probably be desirable, because
criu has no need for file handles that cannot be decoded.

Amir.

2017-12-13 02:20:39

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/4] fs/notify: fdinfo can report unsupported file handles.

On Tue, Dec 12 2017, Amir Goldstein wrote:

> On Mon, Dec 11, 2017 at 11:52 PM, NeilBrown <[email protected]> wrote:
>> On Mon, Dec 11 2017, Amir Goldstein wrote:
>>
>>> On Mon, Dec 11, 2017 at 8:04 AM, NeilBrown <[email protected]> wrote:
>>>> If a filesystem does not set sb->s_export_op, then it
>>>> does not support filehandles and export_fs_encode_fh()
>>>> and exportfs_encode_inode_fh() should not be called.
>>>> They will use export_encode_fh() is which is a default
>>>> that uses inode number generation number, but in general
>>>> they may not be stable.
>>>>
>>>> So change exportfs_encode_inode_fh() to return FILEID_INVALID
>>>> if called on an unsupported Filesystem. Currently only
>>>> notify/fdinfo can do that.
>>>>
>>>
>>> I wish you would leave this check to the caller, maybe add a helper
>>> exportfs_can_decode_fh() for callers to use.
>>>
>>> Although there are no current uses for it in-tree, there is value in
>>> being able to encode a unique file handle even when it cannot be
>>> decoded back to an open file.
>>>
>>> I am using this property in my fanotify super block watch patches,
>>> where the object identifier on the event is an encoded file handle
>>> of the object, which delegates tracking filesystem objects to
>>> userspace and prevents fanotify from keeping elevated refcounts
>>> on inodes and dentries.
>>>
>>> There are quite a few userspace tools out there that are checking
>>> that st_ino hasn't changed on a file between non atomic operations.
>>> Those tools (or others) could benefit from a unique file handle if
>>> we ever decide to provide a relaxed version of name_to_handle_at().
>>
>> If the filesystem doesn't define ->s_export_op, then you really cannot
>> trust anything beyond the inode number (and maybe not even that), and
>> the inode number is already easily available.
>> What actual value do you think you get from this pretend-file-handle
>> on filesystems that don't support file handles?
>>
>
> Sorry, I misread your patch. In my mind I thought you wanted to
> eliminate the default export_encode_fh if there was no fh_to_dentry
> operation like do_sys_name_to_handle() does. Just in my head...

I see... I would have said that fh_to_dentry was mandatory if
s_export_op was set, and Documentation/filesystems/nfs/Exporting agrees
with me. But I do see that sys_name_to_handle() and nfsd explicitly
test for it as well as for s_export_op.

It appears that cifs sets s_export_op, but doesn't provide fh_to_dentry,
and it is unique in this.
But the CIFS_NFSD_EXPORT config option is marked as BROKEN, so
that probably doesn't matter.

So for all current filesystems, filehandles are only reported if they
are usable for dentry lookup...

>
> FWIW, according to Pavel, if fdinfo would not export file handle
> in case !fh_to_dentry op would probably be desirable, because
> criu has no need for file handles that cannot be decoded.

Yes, it was good to have that confirmed - thanks for getting that
checked.

NeilBrown


Attachments:
signature.asc (832.00 B)