Subject: Re: [RFC][CFT][PATCHSET v1] uaccess unification
To: Al Viro
References: <20170329055706.GH29622@ZenIV.linux.org.uk>
 <3399faa9-795e-39db-42f5-7d1e10bbff9c@synopsys.com>
 <20170329202939.GI29622@ZenIV.linux.org.uk>
CC: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
 Linus Torvalds, Richard Henderson, Russell King, Will Deacon,
 Haavard Skinnemoen, Steven Miao, Jesper Nilsson, Mark Salter,
 Yoshinori Sato, Richard Kuo, Tony Luck, Geert Uytterhoeven,
 James Hogan, Michal Simek, David Howells, Ley Foon Tan, Jonas Bonn
Newsgroups: gmane.linux.kernel,gmane.linux.kernel.cross-arch
From: Vineet Gupta
Message-ID: <32129bc4-0e0a-c21d-0e94-67f73a09ac6e@synopsys.com>
Date: Wed, 29 Mar 2017 14:14:22 -0700
In-Reply-To: <20170329202939.GI29622@ZenIV.linux.org.uk>

On 03/29/2017 01:29 PM, Al Viro wrote:
> On Wed, Mar 29, 2017 at 01:08:12PM -0700, Vineet Gupta wrote:
>
>> Hi Al,
>>
>> Thx for taking this up. It seems ARC was missing INLINE_COPY* switch likely due to
>> existing 2 variants (inline/out-of-line) we already have.
>> I've added a patch for that (attached too) - boot tested the series on ARC.
>
> BTW, I wonder if inlining all of the copy_{to,from}_user() is actually a win.

Just to be clear, your series was doing this for everyone.

> It's probably arch-dependent and it would be nice if somebody compared
> performance with and without inlining those... ARC, in particular, has
> __arc_copy_{to,from}_user() inlining a whole lot, even in case of non-constant
> size and your patch, AFAICS, will inline all of it in *all* cases.

Yes, we do inline all of it. The non-constant case is actually the simpler
one: it is a plain byte loop.

    "	mov.f   lp_count, %0		\n"
    "	lpnz 3f				\n"
    "	ldb.ab  %1, [%3, 1]		\n"
    "1:	stb.ab  %1, [%2, 1]		\n"
    "	sub     %0, %0, 1		\n"

Doing it out of line (3 args) would be 4 instructions anyway.

For constant size, there's a laddered copy: blocks of 16 bytes plus 1-15
straggler bytes. We do "manual" constant propagation there so that the
straggler part is optimized away at compile time. But yes, all of this is
emitted inline.

> It might
> end up being a win, but that's not apriori obvious... Do you have any
> profiling results in that area?

Unfortunately not at the moment. The reason for adding the out-of-line variant
was not so much performance as improving the footprint for the -Os case (for
some customer, I think).
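
For readers who don't read ARC assembly, here is a rough, portable C sketch of
the two copy strategies described above. It is not the ARC code: the
sketch_copy*() names are made up for illustration, the fault handling that a
real copy_{to,from}_user() needs is left out entirely, and
__builtin_constant_p() (a GCC/Clang builtin) stands in for the "manual"
constant propagation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: plain byte loop, the portable analogue of the
 * non-constant-size path. One load and one store per byte. */
static inline void sketch_copy_bytes(unsigned char *dst,
				     const unsigned char *src, size_t n)
{
	while (n--)
		*dst++ = *src++;
}

/* Hypothetical helper: "laddered" copy. Whole 16-byte blocks first, then
 * the 1-15 straggler bytes in steps of 8, 4, 2 and 1. When n is a
 * compile-time constant and this is inlined, the compiler can drop the
 * straggler steps that can never run. */
static inline void sketch_copy_laddered(void *to, const void *from, size_t n)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (n >= 16) {
		memcpy(d, s, 16);
		d += 16; s += 16; n -= 16;
	}
	if (n & 8) { memcpy(d, s, 8); d += 8; s += 8; }
	if (n & 4) { memcpy(d, s, 4); d += 4; s += 4; }
	if (n & 2) { memcpy(d, s, 2); d += 2; s += 2; }
	if (n & 1)
		*d = *s;
}

/* Hypothetical dispatcher: laddered path for sizes known at compile time,
 * byte loop otherwise. Both paths copy exactly n bytes, so the result is
 * the same whichever one the compiler picks. */
static inline void sketch_copy(void *to, const void *from, size_t n)
{
	if (__builtin_constant_p(n))
		sketch_copy_laddered(to, from, n);
	else
		sketch_copy_bytes(to, from, n);
}
```

With a constant size such as sketch_copy(dst, src, 31), an optimizing build
can reduce the call to one 16-byte block plus the 8/4/2/1 steps, all inline;
with a runtime size it falls back to the byte loop. Whether that is a win for
size or speed is exactly the open question in this thread.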