2012-02-25 02:33:00

by Eric Wong

Subject: [PATCH] fadvise: avoid EINVAL if user input is valid

The kernel is not required to act on fadvise, so fail silently
and ignore advice as long as it has a valid descriptor and
parameters.

Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Signed-off-by: Eric Wong <[email protected]>
---

Of course I wouldn't knowingly call posix_fadvise() on a file in
tmpfs, but a userspace app often doesn't know (nor should it
care) what type of filesystem it's on.

I encountered EINVAL while running the Ruby 1.9.3 test suite on a
stock Debian wheezy installation. Wheezy uses tmpfs for "/tmp" by
default and the test suite creates a temporary file to test the
Ruby wrapper for posix_fadvise() on.

mm/fadvise.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/fadvise.c b/mm/fadvise.c
index 469491e0..f9e48dd 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
 		goto out;
 	}

-	mapping = file->f_mapping;
-	if (!mapping || len < 0) {
+	if (len < 0) {
 		ret = -EINVAL;
 		goto out;
 	}

-	if (mapping->a_ops->get_xip_mem) {
+	mapping = file->f_mapping;
+	if (!mapping || mapping->a_ops->get_xip_mem) {
 		switch (advice) {
 		case POSIX_FADV_NORMAL:
 		case POSIX_FADV_RANDOM:
@@ -93,10 +93,9 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
 		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_WILLNEED:
-		if (!mapping->a_ops->readpage) {
-			ret = -EINVAL;
+		/* ignore the advice if readahead isn't possible (tmpfs) */
+		if (!mapping->a_ops->readpage)
 			break;
-		}

 		/* First and last PARTIAL page! */
 		start_index = offset >> PAGE_CACHE_SHIFT;
@@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
 		nrpages = end_index - start_index + 1;
 		if (!nrpages)
 			nrpages = ~0UL;
-
-		ret = force_page_cache_readahead(mapping, file,
-				start_index,
-				nrpages);
-		if (ret > 0)
-			ret = 0;
+
+		force_page_cache_readahead(mapping, file, start_index, nrpages);
 		break;
 	case POSIX_FADV_NOREUSE:
 		break;
--
Eric Wong


2012-02-25 22:56:25

by Pádraig Brady

Subject: Re: [PATCH] fadvise: avoid EINVAL if user input is valid

On 02/25/2012 02:27 AM, Eric Wong wrote:
> The kernel is not required to act on fadvise, so fail silently
> and ignore advice as long as it has a valid descriptor and
> parameters.
>

> @@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
>  		nrpages = end_index - start_index + 1;
>  		if (!nrpages)
>  			nrpages = ~0UL;
> -
> -		ret = force_page_cache_readahead(mapping, file,
> -				start_index,
> -				nrpages);
> -		if (ret > 0)
> -			ret = 0;
> +
> +		force_page_cache_readahead(mapping, file, start_index, nrpages);
>  		break;

This whole patch makes sense to me.
The above chunk might cause confusion in future,
if people wonder for a moment why the return is ignored.
Should you use cast with (void) like this to be explicit?

(void) force_page_cache_readahead(...);

cheers,
Pádraig.

2012-02-25 23:10:27

by Eric Wong

Subject: Re: [PATCH] fadvise: avoid EINVAL if user input is valid

Pádraig Brady <[email protected]> wrote:
> On 02/25/2012 02:27 AM, Eric Wong wrote:
> > +		force_page_cache_readahead(mapping, file, start_index, nrpages);
> >  		break;
>
> This whole patch makes sense to me.
> The above chunk might cause confusion in future,
> if people wonder for a moment why the return is ignored.
> Should you use cast with (void) like this to be explicit?
>
> (void) force_page_cache_readahead(...);

I considered this, too[1]. However, I checked the existing callers of
force_page_cache_readahead() and noticed they just ignore the return
value, as I did in my patch, so I followed the existing convention for
this function. I didn't find any suggestion in Documentation/CodingStyle
for this.

Thanks for looking at this.

[1] - it's what I normally do in my own projects.

2012-02-26 05:52:37

by Hillf Danton

Subject: Re: [PATCH] fadvise: avoid EINVAL if user input is valid

On Sat, Feb 25, 2012 at 10:27 AM, Eric Wong <[email protected]> wrote:
> The kernel is not required to act on fadvise, so fail silently
> and ignore advice as long as it has a valid descriptor and
> parameters.
>
> Cc: [email protected]
> Cc: Andrew Morton <[email protected]>
> Signed-off-by: Eric Wong <[email protected]>
> ---
>
>  Of course I wouldn't knowingly call posix_fadvise() on a file in
>  tmpfs, but a userspace app often doesn't know (nor should it
>  care) what type of filesystem it's on.
>
>  I encountered EINVAL while running the Ruby 1.9.3 test suite on a
>  stock Debian wheezy installation.  Wheezy uses tmpfs for "/tmp" by
>  default and the test suite creates a temporary file to test the
>  Ruby wrapper for posix_fadvise() on.
>
>  mm/fadvise.c |   19 +++++++------------
>  1 file changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 469491e0..f9e48dd 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
>                goto out;
>        }
>
> -       mapping = file->f_mapping;
> -       if (!mapping || len < 0) {
> +       if (len < 0) {

Current code makes sure mapping is valid after the above check,

>                ret = -EINVAL;
>                goto out;
>        }
>
> -       if (mapping->a_ops->get_xip_mem) {
> +       mapping = file->f_mapping;
> +       if (!mapping || mapping->a_ops->get_xip_mem) {
>                switch (advice) {
>                case POSIX_FADV_NORMAL:
>                case POSIX_FADV_RANDOM:

but backing devices info is no longer evaluated with that
guarantee in your change.

-hd

75: bdi = mapping->backing_dev_info;

> @@ -93,10 +93,9 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
>                spin_unlock(&file->f_lock);
>                break;
>        case POSIX_FADV_WILLNEED:
> -               if (!mapping->a_ops->readpage) {
> -                       ret = -EINVAL;
> +               /* ignore the advice if readahead isn't possible (tmpfs) */
> +               if (!mapping->a_ops->readpage)
>                        break;
> -               }
>
>                /* First and last PARTIAL page! */
>                start_index = offset >> PAGE_CACHE_SHIFT;
> @@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
>                nrpages = end_index - start_index + 1;
>                if (!nrpages)
>                        nrpages = ~0UL;
> -
> -               ret = force_page_cache_readahead(mapping, file,
> -                               start_index,
> -                               nrpages);
> -               if (ret > 0)
> -                       ret = 0;
> +
> +               force_page_cache_readahead(mapping, file, start_index, nrpages);
>                break;
>        case POSIX_FADV_NOREUSE:
>                break;
> --
> Eric Wong

2012-02-26 08:44:06

by Eric Wong

Subject: Re: [PATCH] fadvise: avoid EINVAL if user input is valid

Hillf Danton <[email protected]> wrote:
> On Sat, Feb 25, 2012 at 10:27 AM, Eric Wong <[email protected]> wrote:
> > index 469491e0..f9e48dd 100644
> > --- a/mm/fadvise.c
> > +++ b/mm/fadvise.c
> > @@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> >                goto out;
> >        }
> >
> > -       mapping = file->f_mapping;
> > -       if (!mapping || len < 0) {
> > +       if (len < 0) {
>
> Current code makes sure mapping is valid after the above check,

Right. I moved the !mapping check down a few lines.

> >                ret = -EINVAL;
> >                goto out;
> >        }

Now a NULL mapping hits the same "goto out" that the get_xip_mem check hits:

> > -       if (mapping->a_ops->get_xip_mem) {
> > +       mapping = file->f_mapping;
> > +       if (!mapping || mapping->a_ops->get_xip_mem) {
> >                switch (advice) {
> >                case POSIX_FADV_NORMAL:
> >                case POSIX_FADV_RANDOM:

		case POSIX_FADV_SEQUENTIAL:
		case POSIX_FADV_WILLNEED:
		case POSIX_FADV_NOREUSE:
		case POSIX_FADV_DONTNEED:
			/* no bad return value, but ignore advice */
			break;
		default:
			ret = -EINVAL;
		}
		goto out;	<------ we hit this if (mapping == NULL)
	}

> but backing devices info is no longer evaluated with that
> guarantee in your change.
>
> -hd
>
> 75: bdi = mapping->backing_dev_info;

The above line still isn't evaluated because of the goto.

out:
	fput(file);
	return ret;
}