2019-08-29 05:03:47

by Sergey Senozhatsky

Subject: build_path_from_dentry_optional_prefix() may schedule from invalid context

Hello,

Looking at commit "cifs: create a helper to find a writeable handle
by path name":

->open_file_lock scope is atomic context, while build_path_from_dentry()
can schedule, since it calls kmalloc(GFP_KERNEL).

	spin_lock(&tcon->open_file_lock);
	list_for_each(tmp, &tcon->openFileList) {
		cfile = list_entry(tmp, struct cifsFileInfo,
				   tlist);
		full_path = build_path_from_dentry(cfile->dentry);
		if (full_path == NULL) {
			spin_unlock(&tcon->open_file_lock);
			return -ENOMEM;
		}
		if (strcmp(full_path, name)) {
			kfree(full_path);
			continue;
		}
		kfree(full_path);

		cinode = CIFS_I(d_inode(cfile->dentry));
		spin_unlock(&tcon->open_file_lock);
		return cifs_get_writable_file(cinode, 0, ret_file);
	}

	spin_unlock(&tcon->open_file_lock);

Additionally, kfree() can (and should) be done outside of
->open_file_lock scope.

-ss
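
One way to picture the report above is a userspace sketch in which the scratch buffer is allocated before the lock is taken and freed after it is dropped, so nothing inside the critical section touches the allocator. Everything here is a stand-in and an assumption, not the CIFS code: `struct open_file`, `format_path()` and `find_by_path()` are hypothetical names, and a C11 `atomic_flag` spin loop models `->open_file_lock`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_LEN 4096	/* stands in for PATH_MAX */

/* Hypothetical stand-in for an entry on tcon->openFileList. */
struct open_file {
	const char *dir;
	const char *name;
	struct open_file *next;
};

/* Minimal spinlock modelling ->open_file_lock. */
static atomic_flag open_file_lock = ATOMIC_FLAG_INIT;

static void lock_list(void)
{
	while (atomic_flag_test_and_set_explicit(&open_file_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void unlock_list(void)
{
	atomic_flag_clear_explicit(&open_file_lock, memory_order_release);
}

/* Format into a caller-supplied buffer: no allocation, nothing that blocks. */
static int format_path(const struct open_file *f, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/%s", f->dir, f->name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return the first entry whose path equals name, or NULL if none matches. */
static struct open_file *find_by_path(struct open_file *head, const char *name)
{
	struct open_file *f, *found = NULL;
	char *buf;

	assert(name != NULL);
	buf = malloc(SCRATCH_LEN);	/* allocated outside the lock */
	if (!buf)
		return NULL;

	lock_list();
	for (f = head; f; f = f->next) {
		if (format_path(f, buf, SCRATCH_LEN) == 0 &&
		    strcmp(buf, name) == 0) {
			found = f;
			break;
		}
	}
	unlock_list();

	free(buf);			/* freed outside the lock */
	return found;
}
```

In this sketch the entry is returned after the lock is dropped; a kernel version would presumably need to pin the entry (or its inode) before unlocking, which is left out here.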


2019-09-20 21:28:20

by Pavel Shilovsky

Subject: Re: build_path_from_dentry_optional_prefix() may schedule from invalid context

On Wed, Aug 28, 2019 at 22:02, Sergey Senozhatsky
<[email protected]> wrote:
>
> Hello,
>
> Looking at commit "cifs: create a helper to find a writeable handle
> by path name":
>
> ->open_file_lock scope is atomic context, while build_path_from_dentry()
> can schedule - kmalloc(GFP_KERNEL)
>
> spin_lock(&tcon->open_file_lock);
> list_for_each(tmp, &tcon->openFileList) {
> cfile = list_entry(tmp, struct cifsFileInfo,
> tlist);
> full_path = build_path_from_dentry(cfile->dentry);
> if (full_path == NULL) {
> spin_unlock(&tcon->open_file_lock);
> return -ENOMEM;
> }
> if (strcmp(full_path, name)) {
> kfree(full_path);
> continue;
> }
> kfree(full_path);
>
> cinode = CIFS_I(d_inode(cfile->dentry));
> spin_unlock(&tcon->open_file_lock);
> return cifs_get_writable_file(cinode, 0, ret_file);
> }
>
> spin_unlock(&tcon->open_file_lock);
>
> Additionally, kfree() can (and should) be done outside of
> ->open_file_lock scope.
>
> -ss

Good catch. I think we should have another version of
build_path_from_dentry() which takes pre-allocated (probably on stack)
full_path as an argument. This would allow us to avoid allocations
under the spin lock.
--
Best regards,
Pavel Shilovsky

2019-09-23 19:01:02

by Al Viro

Subject: Re: build_path_from_dentry_optional_prefix() may schedule from invalid context

On Thu, Sep 19, 2019 at 05:11:54PM -0700, Pavel Shilovsky wrote:

> Good catch. I think we should have another version of
> build_path_from_dentry() which takes pre-allocated (probably on stack)
> full_path as an argument. This would allow us to avoid allocations
> under the spin lock.

On _stack_? For relative pathname? Er... You do realize that
kernel stack is small, right? And said relative pathname can
bloody well be up to 4Kb (i.e. the half of said stack already,
on top of whatever the call chain has already eaten up)...

BTW, looking at build_path_from_dentry()... WTF is this?
	temp = temp->d_parent;
	if (temp == NULL) {
		cifs_dbg(VFS, "corrupt dentry\n");
		rcu_read_unlock();
		return NULL;
	}
Why not check for any number of other forms of memory corruption?
Like, say it, if (temp == (void *)0xf0adf0adf0adf0ad)?

IOW, kindly lose that nonsense. More importantly, why bother
with that kmalloc()? Just __getname() in the very beginning
and __putname() on failure (and for freeing the result afterwards).

What's more, you are open-coding dentry_path_raw(), badly.
The only differences are
* use of dirsep instead of '/' and
* a prefix slapped in the beginning.

I'm fairly sure that
	char *buf = __getname();
	char *s;

	*to_free = NULL;
	if (unlikely(!buf))
		return NULL;

	s = dentry_path_raw(dentry, buf, PATH_MAX);
	if (IS_ERR(s) || s < buf + prefix_len) {
		__putname(buf);
		return NULL; // assuming that you don't care about details
	}

	if (dirsep != '/') {
		char *p = s;
		while ((p = strchr(p, '/')) != NULL)
			*p++ = dirsep;
	}

	s -= prefix_len;
	memcpy(s, prefix, prefix_len);

	*to_free = buf;
	return s;

would end up being faster, not to mention much easier to understand.
With the caller expected to pass &to_free among the arguments and
__putname() it once it's done.

Or just do __getname() in the caller and pass it to the function -
in that case freeing (in all cases) would be up to the caller.
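
The suggestion above can be tried out in userspace with a rough model. The sketch below is an assumption-laden stand-in rather than kernel code: `struct node` is a hypothetical simplified dentry, `path_at_end()` imitates only the one property of dentry_path_raw() that matters here (the string is built right-to-left and ends at the end of the buffer), and malloc()/free() take the place of __getname()/__putname().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 4096	/* stands in for PATH_MAX */

/* Hypothetical simplified dentry: a name and a parent (NULL at the root). */
struct node {
	const char *name;
	const struct node *parent;
};

/*
 * Model of dentry_path_raw(): write "/a/b/c" right-to-left so that the
 * terminating NUL sits in the last byte of buf. Returns the start of the
 * string, or NULL if it does not fit.
 */
static char *path_at_end(const struct node *n, char *buf, size_t len)
{
	char *p = buf + len;

	if (len < 2)
		return NULL;
	*--p = '\0';
	if (!n->parent) {		/* the root itself is "/" */
		*--p = '/';
		return p;
	}
	for (; n->parent; n = n->parent) {
		size_t l = strlen(n->name);

		if ((size_t)(p - buf) < l + 1)
			return NULL;
		p -= l;
		memcpy(p, n->name, l);
		*--p = '/';
	}
	return p;
}

/*
 * Build prefix + path with dirsep as the separator. On success the result
 * points into *to_free, which the caller releases with free().
 */
static char *build_path(const struct node *n, const char *prefix,
			char dirsep, char **to_free)
{
	size_t prefix_len;
	char *buf, *s, *p;

	assert(prefix != NULL && to_free != NULL);
	prefix_len = strlen(prefix);
	*to_free = NULL;

	buf = malloc(BUF_SIZE);		/* __getname() in the kernel */
	if (!buf)
		return NULL;

	s = path_at_end(n, buf, BUF_SIZE);
	if (!s || (size_t)(s - buf) < prefix_len) {
		free(buf);		/* __putname() in the kernel */
		return NULL;
	}

	if (dirsep != '/')
		for (p = s; (p = strchr(p, '/')) != NULL; )
			*p++ = dirsep;

	/* The unused space in front of s is where the prefix goes. */
	s -= prefix_len;
	memcpy(s, prefix, prefix_len);

	*to_free = buf;
	return s;
}
```

Because the path is written at the tail of the buffer, the gap between `buf` and `s` is free headroom, which is why the prefix can be copied in front without moving the path.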

2019-09-30 21:40:10

by Pavel Shilovsky

Subject: Re: build_path_from_dentry_optional_prefix() may schedule from invalid context

On Sat, Sep 21, 2019 at 15:38, Al Viro <[email protected]> wrote:
>
> On Thu, Sep 19, 2019 at 05:11:54PM -0700, Pavel Shilovsky wrote:
>
> > Good catch. I think we should have another version of
> > build_path_from_dentry() which takes pre-allocated (probably on stack)
> > full_path as an argument. This would allow us to avoid allocations
> > under the spin lock.
>
> On _stack_? For relative pathname? Er... You do realize that
> kernel stack is small, right? And said relative pathname can
> bloody well be up to 4Kb (i.e. the half of said stack already,
> on top of whatever the call chain has already eaten up)...

My idea was to use a small stack-allocated array that covers most
cases (say 100-200 bytes) and fall back to a dynamic heap allocation
for longer path names.
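
The idea described here might look roughly like the following userspace sketch. It is only an illustration under assumptions: `struct path_buf`, `path_buf_set()` and `path_buf_release()` are hypothetical names, the 128-byte inline size is picked from the 100-200 byte range mentioned above, and malloc() stands in for a kernel allocation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_LEN 128	/* assumed size, within the 100-200 bytes suggested */

struct path_buf {
	char inline_buf[INLINE_LEN];	/* lives with the struct, e.g. on the stack */
	char *heap;			/* non-NULL only when the fallback was used */
	char *path;			/* points at inline_buf or at heap */
};

/* Build "dir/name", using the heap only when the inline buffer is too small. */
static int path_buf_set(struct path_buf *pb, const char *dir, const char *name)
{
	int n;

	assert(pb != NULL && dir != NULL && name != NULL);
	pb->heap = NULL;
	pb->path = pb->inline_buf;

	n = snprintf(pb->inline_buf, sizeof(pb->inline_buf), "%s/%s", dir, name);
	if (n < 0)
		return -1;
	if ((size_t)n < sizeof(pb->inline_buf))
		return 0;		/* common case: it fit, no allocation */

	pb->heap = malloc((size_t)n + 1);	/* fallback for long paths */
	if (!pb->heap)
		return -1;
	snprintf(pb->heap, (size_t)n + 1, "%s/%s", dir, name);
	pb->path = pb->heap;
	return 0;
}

static void path_buf_release(struct path_buf *pb)
{
	free(pb->heap);			/* free(NULL) is a no-op */
	pb->heap = NULL;
	pb->path = NULL;
}
```

Note that the fallback branch still allocates, so under a spinlock it would run into the same constraint that started this thread; the sketch only removes the allocation from the short-path case.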

>
> BTW, looking at build_path_from_dentry()... WTF is this?
> temp = temp->d_parent;
> if (temp == NULL) {
> cifs_dbg(VFS, "corrupt dentry\n");
> rcu_read_unlock();
> return NULL;
> }
> Why not check for any number of other forms of memory corruption?
> Like, say it, if (temp == (void *)0xf0adf0adf0adf0ad)?
>
> IOW, kindly lose that nonsense. More importantly, why bother
> with that kmalloc()? Just __getname() in the very beginning
> and __putname() on failure (and for freeing the result afterwards).
>
> What's more, you are open-coding dentry_path_raw(), badly.
> The only differences are
> * use of dirsep instead of '/' and
> * a prefix slapped in the beginning.
>
> I'm fairly sure that
> char *buf = __getname();
> char *s;
>
> *to_free = NULL;
> if (unlikely(!buf))
> return NULL;
>
> s = dentry_path_raw(dentry, buf, PATH_MAX);
> if (IS_ERR(s) || s < buf + prefix_len) {
> __putname(buf);
> return NULL; // assuming that you don't care about details
> }
>
> if (dirsep != '/') {
> char *p = s;
> while ((p = strchr(p, '/')) != NULL)
> *p++ = dirsep;
> }
>
> s -= prefix_len;
> memcpy(s, prefix, prefix_len);
>
> *to_free = buf;
> return s;
>
> would end up being faster, not to mention much easier to understand.
> With the caller expected to pass &to_free among the arguments and
> __putname() it once it's done.
>
> Or just do __getname() in the caller and pass it to the function -
> in that case freeing (in all cases) would be up to the caller.

Thanks for pointing this out. Someone should look at this closely and
clean it up.

--
Best regards,
Pavel Shilovsky

2019-12-09 00:35:36

by Al Viro

Subject: Re: build_path_from_dentry_optional_prefix() may schedule from invalid context

On Mon, Sep 30, 2019 at 10:32:16AM -0700, Pavel Shilovsky wrote:
> On Sat, Sep 21, 2019 at 15:38, Al Viro <[email protected]> wrote:

> > IOW, kindly lose that nonsense. More importantly, why bother
> > with that kmalloc()? Just __getname() in the very beginning
> > and __putname() on failure (and for freeing the result afterwards).
> >
> > What's more, you are open-coding dentry_path_raw(), badly.
> > The only differences are
> > * use of dirsep instead of '/' and
> > * a prefix slapped in the beginning.
> >
> > I'm fairly sure that
> > char *buf = __getname();
> > char *s;
> >
> > *to_free = NULL;
> > if (unlikely(!buf))
> > return NULL;
> >
> > s = dentry_path_raw(dentry, buf, PATH_MAX);
> > if (IS_ERR(s) || s < buf + prefix_len) {
> > __putname(buf);
> > return NULL; // assuming that you don't care about details
> > }
> >
> > if (dirsep != '/') {
> > char *p = s;
> > while ((p = strchr(p, '/')) != NULL)
> > *p++ = dirsep;
> > }
> >
> > s -= prefix_len;
> > memcpy(s, prefix, prefix_len);
> >
> > *to_free = buf;
> > return s;
> >
> > would end up being faster, not to mention much easier to understand.
> > With the caller expected to pass &to_free among the arguments and
> > __putname() it once it's done.
> >
> > Or just do __getname() in the caller and pass it to the function -
> > in that case freeing (in all cases) would be up to the caller.
>
> Thanks for pointing this out. Someone should look at this closely and
> clean it up.

Could you take a look through vfs.git#misc.cifs?

2019-12-10 19:16:05

by Pavel Shilovsky

Subject: Re: build_path_from_dentry_optional_prefix() may schedule from invalid context

On Sun, Dec 8, 2019 at 16:34, Al Viro <[email protected]> wrote:
>
> On Mon, Sep 30, 2019 at 10:32:16AM -0700, Pavel Shilovsky wrote:
> > On Sat, Sep 21, 2019 at 15:38, Al Viro <[email protected]> wrote:
>
> > > IOW, kindly lose that nonsense. More importantly, why bother
> > > with that kmalloc()? Just __getname() in the very beginning
> > > and __putname() on failure (and for freeing the result afterwards).
> > >
> > > What's more, you are open-coding dentry_path_raw(), badly.
> > > The only differences are
> > > * use of dirsep instead of '/' and
> > > * a prefix slapped in the beginning.
> > >
> > > I'm fairly sure that
> > > char *buf = __getname();
> > > char *s;
> > >
> > > *to_free = NULL;
> > > if (unlikely(!buf))
> > > return NULL;
> > >
> > > s = dentry_path_raw(dentry, buf, PATH_MAX);
> > > if (IS_ERR(s) || s < buf + prefix_len) {
> > > __putname(buf);
> > > return NULL; // assuming that you don't care about details
> > > }
> > >
> > > if (dirsep != '/') {
> > > char *p = s;
> > > while ((p = strchr(p, '/')) != NULL)
> > > *p++ = dirsep;
> > > }
> > >
> > > s -= prefix_len;
> > > memcpy(s, prefix, prefix_len);
> > >
> > > *to_free = buf;
> > > return s;
> > >
> > > would end up being faster, not to mention much easier to understand.
> > > With the caller expected to pass &to_free among the arguments and
> > > __putname() it once it's done.
> > >
> > > Or just do __getname() in the caller and pass it to the function -
> > > in that case freeing (in all cases) would be up to the caller.
> >
> > Thanks for pointing this out. Someone should look at this closely and
> > clean it up.
>
> Could you take a look through vfs.git#misc.cifs?

Looks good. I would only add the same or a similar comment as
fs/hostfs/hostfs_kern.c has when calling dentry_path_raw():

	/*
	 * This function relies on the fact that dentry_path_raw() will place
	 * the path name at the end of the provided buffer.
	 */

Otherwise it is not obvious at first glance how the code works.

Acked-by: Pavel Shilovsky <[email protected]>

--
Best regards,
Pavel Shilovsky