From: Nicolai Stange
To: Jan Kara
Cc: Nicolai Stange, Andrew Morton, Al Viro, Jan Kara, Johannes Weiner,
    Michal Hocko, Ross Zwisler, Mel Gorman, Junichi Nomura, Hugh Dickins,
    Matthew Wilcox, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
References: <1458817738-2753-1-git-send-email-nicstange@gmail.com> <20160324114529.GC4025@quack.suse.cz>
Date: Fri, 25 Mar 2016 08:50:10 +0100
In-Reply-To: <20160324114529.GC4025@quack.suse.cz> (Jan Kara's message of "Thu, 24 Mar 2016 12:45:29 +0100")
Message-ID: <87mvpngfnx.fsf@gmail.com>

Jan Kara writes:

> On Thu 24-03-16 12:08:58, Nicolai Stange wrote:
>> If
>> - generic_file_read_iter() gets called with a zero read length,
>> - the read offset is at a page boundary,
>> - IOCB_DIRECT is not set
>> - and the page in question hasn't made it into the page cache yet,
>> then do_generic_file_read() will trigger a readahead with a req_size hint
>> of zero.
>>
>> Since roundup_pow_of_two(0) is undefined, UBSAN reports
>>
>>   UBSAN: Undefined behaviour in include/linux/log2.h:63:13
>>   shift exponent 64 is too large for 64-bit type 'long unsigned int'
>>   CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
>>   [...]
>>   Call Trace:
>>    [...]
>>    [] ondemand_readahead+0x3aa/0x3d0
>>    [] ? ondemand_readahead+0x3aa/0x3d0
>>    [] ? find_get_entry+0x2d/0x210
>>    [] page_cache_sync_readahead+0x63/0xa0
>>    [] do_generic_file_read+0x80d/0xf90
>>    [] generic_file_read_iter+0x185/0x420
>>    [...]
>>    [] __vfs_read+0x256/0x3d0
>>    [...]
>>
>> when get_init_ra_size() gets called from ondemand_readahead().
>>
>> The net effect is that the initial readahead size is arch dependent for
>> requested read lengths of zero: for example, since
>>
>>   1UL << (sizeof(unsigned long) * 8)
>>
>> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
>> size becomes 4 on the former and 0 on the latter.
>>
>> What's more, whether or not the file access timestamp is updated for zero
>> length reads is decided differently for the two cases of IOCB_DIRECT
>> being set or cleared: in the first case, generic_file_read_iter()
>> explicitly skips updating that timestamp while in the latter case, it is
>> always updated through the call to do_generic_file_read().
>>
>> According to POSIX, zero length reads "do not modify the last data access
>> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
>>
>> Let generic_file_read_iter() unconditionally check the requested read
>> length at its entry and return immediately with success if it is zero.
>>
>> Signed-off-by: Nicolai Stange
>
> Makes sense to me. You can add:
>
> Reviewed-by: Jan Kara

Thank you very much for reviewing this!

Nicolai

>
>                                                         Honza
>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 7c00f10..a8c69c8 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>>  	ssize_t retval = 0;
>>  	loff_t *ppos = &iocb->ki_pos;
>>  	loff_t pos = *ppos;
>> +	size_t count = iov_iter_count(iter);
>> +
>> +	if (!count)
>> +		goto out; /* skip atime */
>>
>>  	if (iocb->ki_flags & IOCB_DIRECT) {
>>  		struct address_space *mapping = file->f_mapping;
>>  		struct inode *inode = mapping->host;
>> -		size_t count = iov_iter_count(iter);
>>  		loff_t size;
>>
>> -		if (!count)
>> -			goto out; /* skip atime */
>>  		size = i_size_read(inode);
>>  		retval = filemap_write_and_wait_range(mapping, pos,
>>  					pos + count - 1);
>> --
>> 2.7.4
>>
>>
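
For reference, here is a small userspace sketch of the arithmetic behind the
UBSAN report quoted above. It assumes that the non-constant path of
roundup_pow_of_two() boils down to "1UL << fls_long(n - 1)", which is my
reading of include/linux/log2.h at the time. The names fls_long_sketch() and
roundup_pow_of_two_checked() are made up for this illustration and are not
kernel API. The sketch only computes the shift exponent for n == 0 and never
performs the undefined shift itself.

```c
#include <limits.h>

/*
 * Userspace stand-in for fls_long(): 1-based index of the highest set
 * bit, or 0 if no bit is set. Illustration only, not the kernel helper.
 */
unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Shift exponent a "1UL << fls_long(n - 1)" style roundup would use.
 * For n == 0, n - 1 wraps to ULONG_MAX (well defined for unsigned
 * types), so the exponent equals the full width of unsigned long.
 */
unsigned int roundup_shift_exponent(unsigned long n)
{
	return fls_long_sketch(n - 1);
}

/*
 * Hypothetical guarded variant: returns 0 for n == 0 and for results
 * that would not fit, instead of shifting by the full word width.
 */
unsigned long roundup_pow_of_two_checked(unsigned long n)
{
	unsigned int shift;

	if (n == 0)
		return 0;

	shift = roundup_shift_exponent(n);
	if (shift >= sizeof(unsigned long) * CHAR_BIT)
		return 0;

	return 1UL << shift;
}
```

On a 64-bit build, roundup_shift_exponent(0) yields 64, which matches the
"shift exponent 64" in the report. The patch itself does not touch the
roundup at all; it avoids the problem earlier by returning before any
readahead is triggered for a zero length read.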