Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933371AbdCLCRH (ORCPT ); Sat, 11 Mar 2017 21:17:07 -0500 Received: from mail-pf0-f194.google.com ([209.85.192.194]:33691 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753782AbdCLCRA (ORCPT ); Sat, 11 Mar 2017 21:17:00 -0500 Date: Sat, 11 Mar 2017 18:16:55 -0800 From: Eric Biggers To: Al Viro Cc: linux-fsdevel@vger.kernel.org, David Howells , linux-kernel@vger.kernel.org, Eric Biggers Subject: Re: [PATCH v2] statx: optimize copy of struct statx to userspace Message-ID: <20170312021655.GA593@zzz> References: <20170311214555.941-1-ebiggers3@gmail.com> <20170312012411.GN29622@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170312012411.GN29622@ZenIV.linux.org.uk> User-Agent: Mutt/1.8.0 (2017-02-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2605 Lines: 56 Hi Al, On Sun, Mar 12, 2017 at 01:24:15AM +0000, Al Viro wrote: > On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote: > > From: Eric Biggers > > > > I found that statx() was significantly slower than stat(). As a > > microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs > > file to the same with statx() passed a NULL path: > > Umm... > Well, it's a silly benchmark, but stat performance is important, and usually things are cached already so most of the time is just overhead --- which this measures. And since nothing actually uses statx() yet, you can't do a benchmark just by running some command like 'git status' or whatever. > > + tmp.stx_dev_major = MAJOR(stat->dev); > > + tmp.stx_dev_minor = MINOR(stat->dev); > > + memset(tmp.__spare2, 0, sizeof(tmp.__spare2)); > > + > > + return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0; > > That relies upon there being no padding in the damn structure. > It's true and probably will be true on any target, but > a) it's bloody well worth stating explicitly > and > b) struct statx tmp = {.stx_mask = stat->result_mask}; > will get rid of those memset() you've got there by implicit > zeroing of fields missing in partial structure initializer. > Padding is *not* included into that, but you are relying upon > having no padding anyway... If padding is a concern at all (AFAICS it's not actually an issue now with struct statx, but people tend to have different opinions on how careful they want to be with padding), then I think we'll just have to start by memsetting the whole struct to 0. stat() actually does that, except that it's overridden on some architectures, like x86, to only memset the actual reserved fields. I had kind of assumed that as long as we're defining a new struct that has the same layout on all architectures, with no padding, that we'd want to take the opportunity to only memset the actual reserved fields, to make it a bit faster. And yes, a designated initializer would make this look a bit nicer, though it might have to be changed later if any future statx() extensions involve fields that are unioned or conditionally written. Note that another approach would be to copy the fields individually, but with unsafe_put_user() instead of __put_user(). That would avoid the overhead of user_access_begin()/user_access_end() which was the main performance problem. However, the infrastructure around this doesn't seem to be ready quite yet: there aren't good implementations of unsafe_put_user() yet, nor is there an unsafe_clear_user() yet. - Eric