The kernel is not required to act on fadvise, so fail silently
and ignore advice as long as it has a valid descriptor and
parameters.
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Signed-off-by: Eric Wong <[email protected]>
---
Of course I wouldn't knowingly call posix_fadvise() on a file in
tmpfs, but a userspace app often doesn't know (nor should it
care) what type of filesystem it's on.
I encountered EINVAL while running the Ruby 1.9.3 test suite on a
stock Debian wheezy installation. Wheezy uses tmpfs for "/tmp" by
default and the test suite creates a temporary file to test the
Ruby wrapper for posix_fadvise() on.
mm/fadvise.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/mm/fadvise.c b/mm/fadvise.c
index 469491e0..f9e48dd 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
goto out;
}
- mapping = file->f_mapping;
- if (!mapping || len < 0) {
+ if (len < 0) {
ret = -EINVAL;
goto out;
}
- if (mapping->a_ops->get_xip_mem) {
+ mapping = file->f_mapping;
+ if (!mapping || mapping->a_ops->get_xip_mem) {
switch (advice) {
case POSIX_FADV_NORMAL:
case POSIX_FADV_RANDOM:
@@ -93,10 +93,9 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
spin_unlock(&file->f_lock);
break;
case POSIX_FADV_WILLNEED:
- if (!mapping->a_ops->readpage) {
- ret = -EINVAL;
+ /* ignore the advice if readahead isn't possible (tmpfs) */
+ if (!mapping->a_ops->readpage)
break;
- }
/* First and last PARTIAL page! */
start_index = offset >> PAGE_CACHE_SHIFT;
@@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
nrpages = end_index - start_index + 1;
if (!nrpages)
nrpages = ~0UL;
-
- ret = force_page_cache_readahead(mapping, file,
- start_index,
- nrpages);
- if (ret > 0)
- ret = 0;
+
+ force_page_cache_readahead(mapping, file, start_index, nrpages);
break;
case POSIX_FADV_NOREUSE:
break;
--
Eric Wong
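For illustration only, here is a minimal userspace sketch of the situation described in the notes above: a caller that hints WILLNEED without knowing the filesystem type and treats EINVAL as "advice not supported" rather than as a failure. The helper name advise_willneed() is hypothetical and not part of the patch. The only real API used is posix_fadvise(), which returns an error number directly instead of setting errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* expose posix_fadvise() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * advise_willneed() is a hypothetical helper, not part of the patch.
 * It hints that the whole file will be read soon and refuses to treat
 * "advice not supported" as an error the caller has to handle.
 */
static int advise_willneed(int fd)
{
	/* offset 0 with len 0 means "from the start to the end of the file" */
	int err = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

	if (err == EINVAL) {
		/* e.g. an unpatched kernel with the file on tmpfs: not fatal */
		fprintf(stderr, "posix_fadvise: advice not supported here\n");
		return 0;
	}
	return err;	/* 0 on success, or e.g. EBADF for a bad descriptor */
}
```

With the patch applied, the EINVAL branch should no longer be reached for a file on tmpfs, but an application that must also run on older kernels would presumably keep a check like it.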
On 02/25/2012 02:27 AM, Eric Wong wrote:
> The kernel is not required to act on fadvise, so fail silently
> and ignore advice as long as it has a valid descriptor and
> parameters.
>
> @@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> nrpages = end_index - start_index + 1;
> if (!nrpages)
> nrpages = ~0UL;
> -
> - ret = force_page_cache_readahead(mapping, file,
> - start_index,
> - nrpages);
> - if (ret > 0)
> - ret = 0;
> +
> + force_page_cache_readahead(mapping, file, start_index, nrpages);
> break;
This whole patch makes sense to me.
The above chunk might cause confusion in future,
if people wonder for a moment why the return value is ignored.
Should you use a (void) cast like this to be explicit?
(void) force_page_cache_readahead(...);
cheers,
Pádraig.
Pádraig Brady <[email protected]> wrote:
> On 02/25/2012 02:27 AM, Eric Wong wrote:
> > + force_page_cache_readahead(mapping, file, start_index, nrpages);
> > break;
>
> This whole patch makes sense to me.
> The above chunk might cause confusion in future,
> if people wonder for a moment why the return value is ignored.
> Should you use a (void) cast like this to be explicit?
>
> (void) force_page_cache_readahead(...);
I considered this, too[1]. However, I checked the existing callers of
force_page_cache_readahead() and noticed they just ignore the return
value, as I did in my patch, so I followed the existing convention for
this function. I didn't find any guidance in Documentation/CodingStyle
on this.
Thanks for looking at this.
[1] - it's what I normally do in my own projects.
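To make the two styles under discussion concrete, here is a small standalone sketch. start_readahead() is a hypothetical stand-in for a best-effort helper such as force_page_cache_readahead(), and willneed_bare() and willneed_cast() are made-up names. Nothing here is kernel code.

```c
/*
 * Hypothetical stand-in for a best-effort helper: it reports how many
 * pages it "submitted", but callers giving advice have no use for that.
 */
static int start_readahead(unsigned long start, unsigned long nrpages)
{
	(void) start;	/* unused in this sketch */
	return nrpages > 32 ? 32 : (int) nrpages;	/* pretend the window is capped */
}

/* Style used in the patch: a bare call, the result is silently dropped. */
static int willneed_bare(unsigned long start, unsigned long nrpages)
{
	start_readahead(start, nrpages);
	return 0;
}

/* Style suggested in review: the cast documents that the drop is deliberate. */
static int willneed_cast(unsigned long start, unsigned long nrpages)
{
	(void) start_readahead(start, nrpages);
	return 0;
}
```

Both wrappers behave identically. The difference is only in what a later reader can infer about intent.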
On Sat, Feb 25, 2012 at 10:27 AM, Eric Wong <[email protected]> wrote:
> The kernel is not required to act on fadvise, so fail silently
> and ignore advice as long as it has a valid descriptor and
> parameters.
>
> Cc: [email protected]
> Cc: Andrew Morton <[email protected]>
> Signed-off-by: Eric Wong <[email protected]>
> ---
>
> Of course I wouldn't knowingly call posix_fadvise() on a file in
> tmpfs, but a userspace app often doesn't know (nor should it
> care) what type of filesystem it's on.
>
> I encountered EINVAL while running the Ruby 1.9.3 test suite on a
> stock Debian wheezy installation. Wheezy uses tmpfs for "/tmp" by
> default and the test suite creates a temporary file to test the
> Ruby wrapper for posix_fadvise() on.
>
> mm/fadvise.c | 19 +++++++------------
> 1 file changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 469491e0..f9e48dd 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> goto out;
> }
>
> - mapping = file->f_mapping;
> - if (!mapping || len < 0) {
> + if (len < 0) {
Current code makes sure mapping is valid after the above check,
> ret = -EINVAL;
> goto out;
> }
>
> - if (mapping->a_ops->get_xip_mem) {
> + mapping = file->f_mapping;
> + if (!mapping || mapping->a_ops->get_xip_mem) {
> switch (advice) {
> case POSIX_FADV_NORMAL:
> case POSIX_FADV_RANDOM:
but backing devices info is no longer evaluated with that
guarantee in your change.
-hd
75: bdi = mapping->backing_dev_info;
> @@ -93,10 +93,9 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> spin_unlock(&file->f_lock);
> break;
> case POSIX_FADV_WILLNEED:
> - if (!mapping->a_ops->readpage) {
> - ret = -EINVAL;
> + /* ignore the advice if readahead isn't possible (tmpfs) */
> + if (!mapping->a_ops->readpage)
> break;
> - }
>
> /* First and last PARTIAL page! */
> start_index = offset >> PAGE_CACHE_SHIFT;
> @@ -106,12 +105,8 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> nrpages = end_index - start_index + 1;
> if (!nrpages)
> nrpages = ~0UL;
> -
> - ret = force_page_cache_readahead(mapping, file,
> - start_index,
> - nrpages);
> - if (ret > 0)
> - ret = 0;
> +
> + force_page_cache_readahead(mapping, file, start_index, nrpages);
> break;
> case POSIX_FADV_NOREUSE:
> break;
> --
> Eric Wong
Hillf Danton <[email protected]> wrote:
> On Sat, Feb 25, 2012 at 10:27 AM, Eric Wong <[email protected]> wrote:
> > index 469491e0..f9e48dd 100644
> > --- a/mm/fadvise.c
> > +++ b/mm/fadvise.c
> > @@ -43,13 +43,13 @@ SYSCALL_DEFINE(fadvise64_64)(int fd, loff_t offset, loff_t len, int advice)
> > goto out;
> > }
> >
> > - mapping = file->f_mapping;
> > - if (!mapping || len < 0) {
> > + if (len < 0) {
>
> Current code makes sure mapping is valid after the above check,
Right. I moved the !mapping check down a few lines.
> > ret = -EINVAL;
> > goto out;
> > }
Now a NULL mapping reaches the same "goto out" that the get_xip_mem check reaches:
> > - if (mapping->a_ops->get_xip_mem) {
> > + mapping = file->f_mapping;
> > + if (!mapping || mapping->a_ops->get_xip_mem) {
> > switch (advice) {
> > case POSIX_FADV_NORMAL:
> > case POSIX_FADV_RANDOM:
case POSIX_FADV_SEQUENTIAL:
case POSIX_FADV_WILLNEED:
case POSIX_FADV_NOREUSE:
case POSIX_FADV_DONTNEED:
/* no bad return value, but ignore advice */
break;
default:
ret = -EINVAL;
}
goto out; <------ we hit this if (mapping == NULL)
}
> but backing devices info is no longer evaluated with that
> guarantee in your change.
>
> -hd
>
> 75: bdi = mapping->backing_dev_info;
The above line still isn't evaluated because of the goto.
out:
fput(file);
return ret;
}
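The control flow argued over in this thread can be checked with a small userspace model. This is only a sketch under stated assumptions: struct model_mapping, its fields, and model_fadvise() are hypothetical simplifications, and the bdi_touched flag merely stands in for the dereference at "bdi = mapping->backing_dev_info".

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* expose the POSIX_FADV_* constants */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of the state the syscall inspects. */
struct model_mapping {
	int has_xip;		/* models mapping->a_ops->get_xip_mem != NULL */
	int bdi_touched;	/* set where the real code would read the bdi */
};

static int model_fadvise(struct model_mapping *mapping, long long len, int advice)
{
	int ret = 0;

	if (len < 0) {
		ret = -EINVAL;
		goto out;
	}

	/* A NULL mapping now takes the same early exit as the XIP case. */
	if (!mapping || mapping->has_xip) {
		switch (advice) {
		case POSIX_FADV_NORMAL:
		case POSIX_FADV_RANDOM:
		case POSIX_FADV_SEQUENTIAL:
		case POSIX_FADV_WILLNEED:
		case POSIX_FADV_NOREUSE:
		case POSIX_FADV_DONTNEED:
			/* no bad return value, but ignore advice */
			break;
		default:
			ret = -EINVAL;
		}
		goto out;
	}

	/* Only reached with a non-NULL mapping, as with the bdi lookup. */
	mapping->bdi_touched = 1;
out:
	return ret;
}
```

Calling model_fadvise(NULL, 0, POSIX_FADV_WILLNEED) returns 0 without touching the mapping, which mirrors the claim that the bdi line is skipped when mapping is NULL.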