2016-03-24 11:09:24

by Nicolai Stange

[permalink] [raw]
Subject: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally

If
- generic_file_read_iter() gets called with a zero read length,
- the read offset is at a page boundary,
- IOCB_DIRECT is not set
- and the page in question hasn't made it into the page cache yet,
then do_generic_file_read() will trigger a readahead with a req_size hint
of zero.

Since roundup_pow_of_two(0) is undefined, UBSAN reports

UBSAN: Undefined behaviour in include/linux/log2.h:63:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
[...]
Call Trace:
[...]
[<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
[<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
[<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
[<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
[<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
[<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
[...]
[<ffffffff81510b06>] __vfs_read+0x256/0x3d0
[...]

when get_init_ra_size() gets called from ondemand_readahead().

The net effect is that the initial readahead size is arch dependent for
requested read lengths of zero: for example, since

1UL << (sizeof(unsigned long) * 8)

evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
size becomes 4 on the former and 0 on the latter.

What's more, whether or not the file access timestamp is updated for zero
length reads is decided differently for the two cases of IOCB_DIRECT
being set or cleared: in the first case, generic_file_read_iter()
explicitly skips updating that timestamp while in the latter case, it is
always updated through the call to do_generic_file_read().

According to POSIX, zero length reads "do not modify the last data access
timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.

Let generic_file_read_iter() unconditionally check the requested read
length at its entry and return immediately with success if it is zero.

Signed-off-by: Nicolai Stange <[email protected]>
---
Applicable to linux-next-20160324

mm/filemap.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 7c00f10..a8c69c8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
ssize_t retval = 0;
loff_t *ppos = &iocb->ki_pos;
loff_t pos = *ppos;
+ size_t count = iov_iter_count(iter);
+
+ if (!count)
+ goto out; /* skip atime */

if (iocb->ki_flags & IOCB_DIRECT) {
struct address_space *mapping = file->f_mapping;
struct inode *inode = mapping->host;
- size_t count = iov_iter_count(iter);
loff_t size;

- if (!count)
- goto out; /* skip atime */
size = i_size_read(inode);
retval = filemap_write_and_wait_range(mapping, pos,
pos + count - 1);
--
2.7.4


2016-03-24 11:45:08

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally

On Thu 24-03-16 12:08:58, Nicolai Stange wrote:
> If
> - generic_file_read_iter() gets called with a zero read length,
> - the read offset is at a page boundary,
> - IOCB_DIRECT is not set
> - and the page in question hasn't made it into the page cache yet,
> then do_generic_file_read() will trigger a readahead with a req_size hint
> of zero.
>
> Since roundup_pow_of_two(0) is undefined, UBSAN reports
>
> UBSAN: Undefined behaviour in include/linux/log2.h:63:13
> shift exponent 64 is too large for 64-bit type 'long unsigned int'
> CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
> [...]
> Call Trace:
> [...]
> [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
> [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
> [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
> [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
> [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
> [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
> [...]
> [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
> [...]
>
> when get_init_ra_size() gets called from ondemand_readahead().
>
> The net effect is that the initial readahead size is arch dependent for
> requested read lengths of zero: for example, since
>
> 1UL << (sizeof(unsigned long) * 8)
>
> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
> size becomes 4 on the former and 0 on the latter.
>
> What's more, whether or not the file access timestamp is updated for zero
> length reads is decided differently for the two cases of IOCB_DIRECT
> being set or cleared: in the first case, generic_file_read_iter()
> explicitly skips updating that timestamp while in the latter case, it is
> always updated through the call to do_generic_file_read().
>
> According to POSIX, zero length reads "do not modify the last data access
> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
>
> Let generic_file_read_iter() unconditionally check the requested read
> length at its entry and return immediately with success if it is zero.
>
> Signed-off-by: Nicolai Stange <[email protected]>

Makes sense to me. You can add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7c00f10..a8c69c8 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> ssize_t retval = 0;
> loff_t *ppos = &iocb->ki_pos;
> loff_t pos = *ppos;
> + size_t count = iov_iter_count(iter);
> +
> + if (!count)
> + goto out; /* skip atime */
>
> if (iocb->ki_flags & IOCB_DIRECT) {
> struct address_space *mapping = file->f_mapping;
> struct inode *inode = mapping->host;
> - size_t count = iov_iter_count(iter);
> loff_t size;
>
> - if (!count)
> - goto out; /* skip atime */
> size = i_size_read(inode);
> retval = filemap_write_and_wait_range(mapping, pos,
> pos + count - 1);
> --
> 2.7.4
>
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2016-03-25 07:50:19

by Nicolai Stange

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally

Jan Kara <[email protected]> writes:

> On Thu 24-03-16 12:08:58, Nicolai Stange wrote:
>> If
>> - generic_file_read_iter() gets called with a zero read length,
>> - the read offset is at a page boundary,
>> - IOCB_DIRECT is not set
>> - and the page in question hasn't made it into the page cache yet,
>> then do_generic_file_read() will trigger a readahead with a req_size hint
>> of zero.
>>
>> Since roundup_pow_of_two(0) is undefined, UBSAN reports
>>
>> UBSAN: Undefined behaviour in include/linux/log2.h:63:13
>> shift exponent 64 is too large for 64-bit type 'long unsigned int'
>> CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
>> [...]
>> Call Trace:
>> [...]
>> [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
>> [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
>> [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
>> [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
>> [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
>> [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
>> [...]
>> [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
>> [...]
>>
>> when get_init_ra_size() gets called from ondemand_readahead().
>>
>> The net effect is that the initial readahead size is arch dependent for
>> requested read lengths of zero: for example, since
>>
>> 1UL << (sizeof(unsigned long) * 8)
>>
>> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
>> size becomes 4 on the former and 0 on the latter.
>>
>> What's more, whether or not the file access timestamp is updated for zero
>> length reads is decided differently for the two cases of IOCB_DIRECT
>> being set or cleared: in the first case, generic_file_read_iter()
>> explicitly skips updating that timestamp while in the latter case, it is
>> always updated through the call to do_generic_file_read().
>>
>> According to POSIX, zero length reads "do not modify the last data access
>> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
>>
>> Let generic_file_read_iter() unconditionally check the requested read
>> length at its entry and return immediately with success if it is zero.
>>
>> Signed-off-by: Nicolai Stange <[email protected]>
>
> Makes sense to me. You can add:
>
> Reviewed-by: Jan Kara <[email protected]>

Thank you very much for reviewing this!

Nicolai


>
> Honza
>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 7c00f10..a8c69c8 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>> ssize_t retval = 0;
>> loff_t *ppos = &iocb->ki_pos;
>> loff_t pos = *ppos;
>> + size_t count = iov_iter_count(iter);
>> +
>> + if (!count)
>> + goto out; /* skip atime */
>>
>> if (iocb->ki_flags & IOCB_DIRECT) {
>> struct address_space *mapping = file->f_mapping;
>> struct inode *inode = mapping->host;
>> - size_t count = iov_iter_count(iter);
>> loff_t size;
>>
>> - if (!count)
>> - goto out; /* skip atime */
>> size = i_size_read(inode);
>> retval = filemap_write_and_wait_range(mapping, pos,
>> pos + count - 1);
>> --
>> 2.7.4
>>
>>

2016-03-29 08:46:26

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally

On Thu, Mar 24, 2016 at 12:08:58PM +0100, Nicolai Stange wrote:
> If
> - generic_file_read_iter() gets called with a zero read length,
> - the read offset is at a page boundary,
> - IOCB_DIRECT is not set
> - and the page in question hasn't made it into the page cache yet,
> then do_generic_file_read() will trigger a readahead with a req_size hint
> of zero.
>
> Since roundup_pow_of_two(0) is undefined, UBSAN reports
>
> UBSAN: Undefined behaviour in include/linux/log2.h:63:13
> shift exponent 64 is too large for 64-bit type 'long unsigned int'
> CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
> [...]
> Call Trace:
> [...]
> [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
> [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
> [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
> [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
> [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
> [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
> [...]
> [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
> [...]
>
> when get_init_ra_size() gets called from ondemand_readahead().
>
> The net effect is that the initial readahead size is arch dependent for
> requested read lengths of zero: for example, since
>
> 1UL << (sizeof(unsigned long) * 8)
>
> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
> size becomes 4 on the former and 0 on the latter.
>
> What's more, whether or not the file access timestamp is updated for zero
> length reads is decided differently for the two cases of IOCB_DIRECT
> being set or cleared: in the first case, generic_file_read_iter()
> explicitly skips updating that timestamp while in the latter case, it is
> always updated through the call to do_generic_file_read().
>
> According to POSIX, zero length reads "do not modify the last data access
> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
>
> Let generic_file_read_iter() unconditionally check the requested read
> length at its entry and return immediately with success if it is zero.
>
> Signed-off-by: Nicolai Stange <[email protected]>

Acked-by: Mel Gorman <[email protected]>

--
Mel Gorman
SUSE Labs