Date: Thu, 17 Aug 2017 16:42:31 -0700
From: Ram Pai
To: Michael Ellerman
Cc: Thiago Jung Bauermann, linux-arch@vger.kernel.org, corbet@lwn.net,
	arnd@arndb.de, linux-doc@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, mhocko@kernel.org, linux-mm@kvack.org,
	dave.hansen@intel.com, mingo@redhat.com, paulus@samba.org,
	aneesh.kumar@linux.vnet.ibm.com, linux-kselftest@vger.kernel.org,
	akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	khandual@linux.vnet.ibm.com
Subject: Re: [RFC v6 21/62] powerpc: introduce execute-only pkey
Reply-To: Ram Pai
References: <1500177424-13695-1-git-send-email-linuxram@us.ibm.com>
	<1500177424-13695-22-git-send-email-linuxram@us.ibm.com>
	<87shhgdx5i.fsf@linux.vnet.ibm.com>
	<87d18fu6o1.fsf@concordia.ellerman.id.au>
	<87d18fw9it.fsf@linux.vnet.ibm.com>
	<871sous3xd.fsf@concordia.ellerman.id.au>
	<20170817233555.GC5427@ram.oc3035372033.ibm.com>
In-Reply-To: <20170817233555.GC5427@ram.oc3035372033.ibm.com>
Message-Id:
	<20170817234231.GA5445@ram.oc3035372033.ibm.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 17, 2017 at 04:35:55PM -0700, Ram Pai wrote:
> On Wed, Aug 02, 2017 at 07:40:46PM +1000, Michael Ellerman wrote:
> > Thiago Jung Bauermann writes:
> >
> > > Michael Ellerman writes:
> > >
> > >> Thiago Jung Bauermann writes:
> > >>> Ram Pai writes:
> > >> ...
> > >>>> +
> > >>>> +	/* We got one, store it and use it from here on out */
> > >>>> +	if (need_to_set_mm_pkey)
> > >>>> +		mm->context.execute_only_pkey = execute_only_pkey;
> > >>>> +	return execute_only_pkey;
> > >>>> +}
> > >>>
> > >>> If you follow the code flow in __execute_only_pkey, the AMR and UAMOR
> > >>> are read 3 times in total, and AMR is written twice. IAMR is read and
> > >>> written twice. Since they are SPRs and access to them is slow (or isn't
> > >>> it?),
> > >>
> > >> SPRs read/writes are slow, but they're not *that* slow in comparison to
> > >> a system call (which I think is where this code is being called?).
> > >
> > > Yes, this code runs on mprotect and mmap syscalls if the memory is
> > > requested to have execute but not read nor write permissions.
> >
> > Yep. That's not in the fast path for key usage, ie. the fast path is
> > userspace changing the AMR itself, and the overhead of a syscall is
> > already hundreds of cycles.
> >
> > >> So we should try to avoid too many SPR read/writes, but at the same time
> > >> we can accept more than the minimum if it makes the code much easier to
> > >> follow.
> > >
> > > Ok. Ram had asked me to suggest a way to optimize the SPR reads and
> > > writes and I came up with the patch below. Do you think it's worth it?
> >
> > At a glance no I don't think it is. Sorry you spent that much time on it.
> >
> > I think we can probably reduce the number of SPR accesses without
> > needing to go to that level of complexity.
> >
> > But don't throw the patch away, I may eat my words once I have the full
> > series applied and am looking at it hard - at the moment I'm just
> > reviewing the patches piecemeal as I get time.

> Thiago's patch does save some cycles. I don't feel like throwing his work away. I agree, it should be considered after applying all the patches.

RP

-- 
Ram Pai