2015-04-22 03:50:43

by Boqun Feng

[permalink] [raw]
Subject: [PATCH 0/2] staging: lustre: Replace ll_getname with vfs' getname

As Al Viro pointed out:

https://lkml.org/lkml/2015/4/11/243

There are bugs in ll_getname() because of wrong assumptions of returning
values from strncpy_from_user(). Moreover, what ll_getname want to do is
just to try copy the file name from userland. Since we already have
getname() for the same purpose, it's better to replace ll_getname() with
getname().

To do that, we need to:
1) export the symbols of getname() and putname() to be used by modules.
2) actually replace ll_getname()/ll_putname() with getname()/putname().

One more thing is that as ll_getname() and getname() both treat a zero-length
file name as an error(-ENOENT), and if ll_getname() or getname() has an error,
ll_dir_ioctl() will return the error immediately, so checking whether these names
are zero-length is unnecessary and -ENIVAL shall not return from that code path,
no matter using ll_getname() or getname(). So remove the checking code.

This patchset is based on v4.0, and I only did build tests, because I found a
little difficult to set up a lustre environment. I'll try to do the testing once
I'm able to set up an environment, in the meanwhile, comments,
inputs from lustre's point of view and voluntary tests are welcome. Thank you. ;-)

Regards,
Boqun Feng


Boqun Feng (2):
vfs: Export symbol 'getname' and 'putname'
staging: lustre: replace ll_{get,put}name() with {get,put}name()

drivers/staging/lustre/lustre/llite/dir.c | 60 ++++++----------------
.../staging/lustre/lustre/llite/llite_internal.h | 2 +-
drivers/staging/lustre/lustre/llite/namei.c | 2 +-
fs/namei.c | 18 +++++++
4 files changed, 36 insertions(+), 46 deletions(-)

--
2.3.5


2015-04-22 03:50:58

by Boqun Feng

[permalink] [raw]
Subject: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

getname/putname in fs/namei.c is a well-implemented way to copy a file
name from userland, however other ways, such as directly calling
__getname() and strncpy_from_user(), may lack features(e.g. auditing and
reusing), introduce errors or at least reinvent wheels. Therefore for
places need a kernel file name copy from userland, it's better to use
getname and putname if possible.

To be able to use these functions all over the kernel, symbols 'getname'
and 'putname' are exported and comments of their behaviors and
constraints are added.

Suggested-by: Al Viro <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>
---
fs/namei.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index c83145a..472911c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -199,11 +199,23 @@ error:
return err;
}

+/**
+ * getname() - Get a file name copy from userland
+ * @filename: userland pointer to the file name
+ *
+ * If successful, return a 'struct filename' pointer and ->name is the pointer
+ * to the kernel copy of the file name, otherwise an ERR_PTR.
+ *
+ * getname() should only be called in a system call context, and for each
+ * getname() that returns a successful value, callers must ensure exactly one
+ * corresponding putname() is called before returning to userland.
+ */
struct filename *
getname(const char __user * filename)
{
return getname_flags(filename, 0, NULL);
}
+EXPORT_SYMBOL(getname);

struct filename *
getname_kernel(const char * filename)
@@ -242,6 +254,11 @@ getname_kernel(const char * filename)
return result;
}

+/* putname() - Release a 'struct filename' structure
+ * @name: the 'struct filename' structure to be release
+ *
+ * See more at getname()
+ */
void putname(struct filename *name)
{
BUG_ON(name->refcnt <= 0);
@@ -255,6 +272,7 @@ void putname(struct filename *name)
} else
__putname(name);
}
+EXPORT_SYMBOL(putname);

static int check_acl(struct inode *inode, int mask)
{
--
2.3.5

2015-04-22 03:51:05

by Boqun Feng

[permalink] [raw]
Subject: [PATCH 2/2] staging: lustre: replace ll_{get,put}name() with {get,put}name()

As pointed by Al Viro:

https://lkml.org/lkml/2015/4/11/243

There are bugs in ll_getname() because of wrong assumptions of returning
values from strncpy_from_user(). Moreover, what ll_getname want to do is
just to try copy the file name from userland. Since we already have
getname() for the same purpose, it's better to replace ll_getname() with
getname(), so is ll_putname().

Besides, remove unused code for checking whether namelen is 0 or not in
case LL_IOC_REMOVE_ENTRY, because zero-length file name is already
handled by getname() in the same way as ll_getname().

Suggested-by: Al Viro <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>
---
drivers/staging/lustre/lustre/llite/dir.c | 60 ++++++----------------
.../staging/lustre/lustre/llite/llite_internal.h | 2 +-
drivers/staging/lustre/lustre/llite/namei.c | 2 +-
3 files changed, 18 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index a182019..c75fc38 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1216,30 +1216,6 @@ out:
return rc;
}

-static char *
-ll_getname(const char __user *filename)
-{
- int ret = 0, len;
- char *tmp = __getname();
-
- if (!tmp)
- return ERR_PTR(-ENOMEM);
-
- len = strncpy_from_user(tmp, filename, PATH_MAX);
- if (len == 0)
- ret = -ENOENT;
- else if (len > PATH_MAX)
- ret = -ENAMETOOLONG;
-
- if (ret) {
- __putname(tmp);
- tmp = ERR_PTR(ret);
- }
- return tmp;
-}
-
-#define ll_putname(filename) __putname(filename)
-
static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
struct inode *inode = file_inode(file);
@@ -1441,7 +1417,7 @@ free_lmv:
return rc;
}
case LL_IOC_REMOVE_ENTRY: {
- char *filename = NULL;
+ struct filename *name = NULL;
int namelen = 0;
int rc;

@@ -1453,20 +1429,16 @@ free_lmv:
if (!(exp_connect_flags(sbi->ll_md_exp) & OBD_CONNECT_LVB_TYPE))
return -ENOTSUPP;

- filename = ll_getname((const char *)arg);
- if (IS_ERR(filename))
- return PTR_ERR(filename);
+ name = getname((const char *)arg);
+ if (IS_ERR(name))
+ return PTR_ERR(name);

- namelen = strlen(filename);
- if (namelen < 1) {
- rc = -EINVAL;
- goto out_rmdir;
- }
+ namelen = strlen(name->name);
+
+ rc = ll_rmdir_entry(inode, name->name, namelen);

- rc = ll_rmdir_entry(inode, filename, namelen);
-out_rmdir:
- if (filename)
- ll_putname(filename);
+ if (name)
+ putname(name);
return rc;
}
case LL_IOC_LOV_SWAP_LAYOUTS:
@@ -1481,16 +1453,16 @@ out_rmdir:
struct lov_user_md *lump;
struct lov_mds_md *lmm = NULL;
struct mdt_body *body;
- char *filename = NULL;
+ struct filename *name = NULL;
int lmmsize;

if (cmd == IOC_MDC_GETFILEINFO ||
cmd == IOC_MDC_GETFILESTRIPE) {
- filename = ll_getname((const char *)arg);
- if (IS_ERR(filename))
- return PTR_ERR(filename);
+ name = getname((const char *)arg);
+ if (IS_ERR(name))
+ return PTR_ERR(name);

- rc = ll_lov_getstripe_ea_info(inode, filename, &lmm,
+ rc = ll_lov_getstripe_ea_info(inode, name->name, &lmm,
&lmmsize, &request);
} else {
rc = ll_dir_getstripe(inode, &lmm, &lmmsize, &request);
@@ -1556,8 +1528,8 @@ skip_lmm:

out_req:
ptlrpc_req_finished(request);
- if (filename)
- ll_putname(filename);
+ if (name)
+ putname(name);
return rc;
}
case IOC_LOV_GETINFO: {
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 2af1d72..0950565 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -714,7 +714,7 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash,
int ll_md_blocking_ast(struct ldlm_lock *, struct ldlm_lock_desc *,
void *data, int flag);
struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de);
-int ll_rmdir_entry(struct inode *dir, char *name, int namelen);
+int ll_rmdir_entry(struct inode *dir, const char *name, int namelen);

/* llite/rw.c */
int ll_prepare_write(struct file *, struct page *, unsigned from, unsigned to);
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 890ac19..ec48d8d 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -867,7 +867,7 @@ static inline void ll_get_child_fid(struct dentry *child, struct lu_fid *fid)
/**
* Remove dir entry
**/
-int ll_rmdir_entry(struct inode *dir, char *name, int namelen)
+int ll_rmdir_entry(struct inode *dir, const char *name, int namelen)
{
struct ptlrpc_request *request = NULL;
struct md_op_data *op_data;
--
2.3.5

2015-04-22 05:46:02

by Dilger, Andreas

[permalink] [raw]
Subject: Re: [PATCH 2/2] staging: lustre: replace ll_{get,put}name() with {get,put}name()

On 2015/04/21, 9:50 PM, "Boqun Feng" <[email protected]> wrote:

>As pointed by Al Viro:
>
>https://lkml.org/lkml/2015/4/11/243
>
>There are bugs in ll_getname() because of wrong assumptions of returning
>values from strncpy_from_user(). Moreover, what ll_getname want to do is
>just to try copy the file name from userland. Since we already have
>getname() for the same purpose, it's better to replace ll_getname() with
>getname(), so is ll_putname().
>
>Besides, remove unused code for checking whether namelen is 0 or not in
>case LL_IOC_REMOVE_ENTRY, because zero-length file name is already
>handled by getname() in the same way as ll_getname().
>
>Suggested-by: Al Viro <[email protected]>
>Signed-off-by: Boqun Feng <[email protected]>

Looks good, you can add my:
Reviewed-by: Andreas Dilger <[email protected]>

>---
> drivers/staging/lustre/lustre/llite/dir.c | 60
>++++++----------------
> .../staging/lustre/lustre/llite/llite_internal.h | 2 +-
> drivers/staging/lustre/lustre/llite/namei.c | 2 +-
> 3 files changed, 18 insertions(+), 46 deletions(-)
>
>diff --git a/drivers/staging/lustre/lustre/llite/dir.c
>b/drivers/staging/lustre/lustre/llite/dir.c
>index a182019..c75fc38 100644
>--- a/drivers/staging/lustre/lustre/llite/dir.c
>+++ b/drivers/staging/lustre/lustre/llite/dir.c
>@@ -1216,30 +1216,6 @@ out:
> return rc;
> }
>
>-static char *
>-ll_getname(const char __user *filename)
>-{
>- int ret = 0, len;
>- char *tmp = __getname();
>-
>- if (!tmp)
>- return ERR_PTR(-ENOMEM);
>-
>- len = strncpy_from_user(tmp, filename, PATH_MAX);
>- if (len == 0)
>- ret = -ENOENT;
>- else if (len > PATH_MAX)
>- ret = -ENAMETOOLONG;
>-
>- if (ret) {
>- __putname(tmp);
>- tmp = ERR_PTR(ret);
>- }
>- return tmp;
>-}
>-
>-#define ll_putname(filename) __putname(filename)
>-
> static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned
>long arg)
> {
> struct inode *inode = file_inode(file);
>@@ -1441,7 +1417,7 @@ free_lmv:
> return rc;
> }
> case LL_IOC_REMOVE_ENTRY: {
>- char *filename = NULL;
>+ struct filename *name = NULL;
> int namelen = 0;
> int rc;
>
>@@ -1453,20 +1429,16 @@ free_lmv:
> if (!(exp_connect_flags(sbi->ll_md_exp) & OBD_CONNECT_LVB_TYPE))
> return -ENOTSUPP;
>
>- filename = ll_getname((const char *)arg);
>- if (IS_ERR(filename))
>- return PTR_ERR(filename);
>+ name = getname((const char *)arg);
>+ if (IS_ERR(name))
>+ return PTR_ERR(name);
>
>- namelen = strlen(filename);
>- if (namelen < 1) {
>- rc = -EINVAL;
>- goto out_rmdir;
>- }
>+ namelen = strlen(name->name);
>+
>+ rc = ll_rmdir_entry(inode, name->name, namelen);
>
>- rc = ll_rmdir_entry(inode, filename, namelen);
>-out_rmdir:
>- if (filename)
>- ll_putname(filename);
>+ if (name)
>+ putname(name);
> return rc;
> }
> case LL_IOC_LOV_SWAP_LAYOUTS:
>@@ -1481,16 +1453,16 @@ out_rmdir:
> struct lov_user_md *lump;
> struct lov_mds_md *lmm = NULL;
> struct mdt_body *body;
>- char *filename = NULL;
>+ struct filename *name = NULL;
> int lmmsize;
>
> if (cmd == IOC_MDC_GETFILEINFO ||
> cmd == IOC_MDC_GETFILESTRIPE) {
>- filename = ll_getname((const char *)arg);
>- if (IS_ERR(filename))
>- return PTR_ERR(filename);
>+ name = getname((const char *)arg);
>+ if (IS_ERR(name))
>+ return PTR_ERR(name);
>
>- rc = ll_lov_getstripe_ea_info(inode, filename, &lmm,
>+ rc = ll_lov_getstripe_ea_info(inode, name->name, &lmm,
> &lmmsize, &request);
> } else {
> rc = ll_dir_getstripe(inode, &lmm, &lmmsize, &request);
>@@ -1556,8 +1528,8 @@ skip_lmm:
>
> out_req:
> ptlrpc_req_finished(request);
>- if (filename)
>- ll_putname(filename);
>+ if (name)
>+ putname(name);
> return rc;
> }
> case IOC_LOV_GETINFO: {
>diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h
>b/drivers/staging/lustre/lustre/llite/llite_internal.h
>index 2af1d72..0950565 100644
>--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
>+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
>@@ -714,7 +714,7 @@ struct inode *ll_iget(struct super_block *sb, ino_t
>hash,
> int ll_md_blocking_ast(struct ldlm_lock *, struct ldlm_lock_desc *,
> void *data, int flag);
> struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de);
>-int ll_rmdir_entry(struct inode *dir, char *name, int namelen);
>+int ll_rmdir_entry(struct inode *dir, const char *name, int namelen);
>
> /* llite/rw.c */
> int ll_prepare_write(struct file *, struct page *, unsigned from,
>unsigned to);
>diff --git a/drivers/staging/lustre/lustre/llite/namei.c
>b/drivers/staging/lustre/lustre/llite/namei.c
>index 890ac19..ec48d8d 100644
>--- a/drivers/staging/lustre/lustre/llite/namei.c
>+++ b/drivers/staging/lustre/lustre/llite/namei.c
>@@ -867,7 +867,7 @@ static inline void ll_get_child_fid(struct dentry
>*child, struct lu_fid *fid)
> /**
> * Remove dir entry
> **/
>-int ll_rmdir_entry(struct inode *dir, char *name, int namelen)
>+int ll_rmdir_entry(struct inode *dir, const char *name, int namelen)
> {
> struct ptlrpc_request *request = NULL;
> struct md_op_data *op_data;
>--
>2.3.5
>
>


Cheers, Andreas
--
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

2015-04-22 05:53:14

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

Nak on exporting symbols for broken staging code. Please get rid of
the ioctls looking up path names in horrible ways in the lustre code.

2015-04-22 06:27:22

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

On Apr 22, 2015, at 1:53 AM, Christoph Hellwig wrote:

> Nak on exporting symbols for broken staging code. Please get rid of
> the ioctls looking up path names in horrible ways in the lustre code.

For a reference, is there a good example of a non-horrible way to look up a pathname?

Thanks.

Bye,
Oleg-

2015-04-22 06:31:13

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

On Wed, Apr 22, 2015 at 06:27:11AM +0000, Drokin, Oleg wrote:
> > Nak on exporting symbols for broken staging code. Please get rid of
> > the ioctls looking up path names in horrible ways in the lustre code.
>
> For a reference, is there a good example of a non-horrible way to look up a pathname?

Just dont do it from an ioctl, it's got an fd parameter for a reason.

2015-04-22 06:31:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

On Tue, Apr 21, 2015 at 10:53:11PM -0700, Christoph Hellwig wrote:
> Nak on exporting symbols for broken staging code. Please get rid of
> the ioctls looking up path names in horrible ways in the lustre code.

I agree with Christoph, we shouldn't be doing this. Let's fix lustre
"properly", which should be possible, right?

thanks,

greg k-h

2015-04-22 06:40:54

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'


On Apr 22, 2015, at 2:31 AM, Greg Kroah-Hartman wrote:

> On Tue, Apr 21, 2015 at 10:53:11PM -0700, Christoph Hellwig wrote:
>> Nak on exporting symbols for broken staging code. Please get rid of
>> the ioctls looking up path names in horrible ways in the lustre code.
>
> I agree with Christoph, we shouldn't be doing this. Let's fix lustre
> "properly", which should be possible, right?

Well, sort of.

For the first ioctl we clearly can do an fd pass and
get info from there (need to also teach the tools about the new ioctl,
so I cannot submit a patch right away, but I'll get on it).

In the second case the usecase is a bit more involved.
It deals with the problem of having directory entry pointing to a non-existing
directory (so basically a name only, but we do know it's supposed to be
a directory, just on another server), so I doubt we can even open it to get
the fd.
Now, perhaps we can just allow regular rmdir to ignore the error and kill
the bogus entry anyway, but there was this thought that we might want to alert
the user there's something more fundamentally broken going on here,
and so they would need to call this certain ioctl called if they just would like
to kill the entry anyway as opposed to calling say fsck to make sure there's
nothing else broken.
I am not entirely sure of this idea myself, but it probably made sense
at some point.
I see there's quite a dislike for this approach, so we can remove it if there's
no better option.

Bye,
Oleg-

2015-04-22 06:49:13

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'


On Apr 22, 2015, at 2:31 AM, Christoph Hellwig wrote:

> On Wed, Apr 22, 2015 at 06:27:11AM +0000, Drokin, Oleg wrote:
>>> Nak on exporting symbols for broken staging code. Please get rid of
>>> the ioctls looking up path names in horrible ways in the lustre code.
>>
>> For a reference, is there a good example of a non-horrible way to look up a pathname?
>
> Just dont do it from an ioctl, it's got an fd parameter for a reason.

I know this is not going to be a popular opinion with you, but sometimes opening a file
is just too expensive. 1 RPC roudntrip to open a file and then another one to close it.

Also some files could not be opened (fs corruption).

Anyway, I got your point and there will be a solution. I was just hoping there was a way
to do it because what if e.g. I need to create something new, not do something with already
existing stuff, certainly there's no way to get an fd from a not yet existing fs object.

Bye,
Oleg-

2015-04-22 07:34:35

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

On Wed, Apr 22, 2015 at 06:49:08AM +0000, Drokin, Oleg wrote:
> I know this is not going to be a popular opinion with you, but sometimes opening a file
> is just too expensive. 1 RPC roudntrip to open a file and then another one to close it.

Use O_PATH to avoid this.

2015-04-24 02:38:20

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 1/2] vfs: export symbol 'getname' and 'putname'

On Apr 22, 2015, at 3:34 AM, Christoph Hellwig wrote:

> On Wed, Apr 22, 2015 at 06:49:08AM +0000, Drokin, Oleg wrote:
>> I know this is not going to be a popular opinion with you, but sometimes opening a file
>> is just too expensive. 1 RPC roudntrip to open a file and then another one to close it.
> Use O_PATH to avoid this.

Hm, I guess I can open with O_PATH, but?
if (unlikely(f->f_flags & O_PATH)) {
f->f_mode = FMODE_PATH;
f->f_op = &empty_fops;
return 0;
}

so with such an fd I am never getting into my ioctl handler, you see?

Let's suppose I overcome this somehow, still that does not completely solve my problem
that has more dimensions.

So, imagine I have this huge file tree (common with really big filesystems), and I want to do a one-off find
on it for whatever reason.
I do not want to pollute my dentry and inode cache with all the extra entries because (I think) I know I will
never need them again.
So with our broken ioctl from the past that was somewhat easy - I just open a dir, I do getdents, I get
a bunch of names and I proceed to call my ioctl on this bunch of names and get all the info I need
(one rpc per entry, which is not all that great, but oh well) and my dentry cache is only getting
directories, but not the files.
Now, if I convert to O_PATH (or to some other single call thing that does not need it, like say getxattr
that might work for some subset of intended usage), I get pretty much the same thing, but I also get dcache pollution
and in order to guard my dcache, I am getting a bunch of lustre locks (the expensive kind of lustre locks
issued by server so that the cache stays coherent cluster wide), even if I somehow do uncached dentries so I
can avoid the lock, there would still be that pesky LOOKUP RPC (that I would need to somehow teach to
not just do lookup, but to bring me other interesting things, kind of like with open intents).
This looks like it's getting out of hand rather fast.

Now, I probably can create some sort of an RPC that is "extended getdents with attributes" and so my extended_getdents_rpc
would return me the name and a bunch of other data like file striping, stat information and the like. This also saves me some
more RPCs, but I imagine if I try to expose that over an ioctl, you would not be very happy with it either and I don't think
we have this sort of an extended getdents interface at the kernel too, do we (though I think internally nfs is capable
of such a thing)?

Do you think any of this makes sense, or do you think I should just convert this ioctl from our broken getname version to
something like user_path_at() (an exported symbol) to do the lookup+fetch whatever info I need and immediately unhash the
resultant dentry/inode and be done with it (at least then I do not need any tools changes).

Do you think there's something else I might be doing, but not yet realizing this?

Thanks.

Bye,
Oleg-