Date: Tue, 23 Mar 2021 15:03:59 +0000
From: Catalin Marinas
To: Will Deacon
Cc: Robin Murphy, Yang Yingliang, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, guohanjun@huawei.com
Subject: Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

On Tue, Mar 23, 2021 at 01:32:18PM +0000, Will Deacon wrote:
> On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote:
> > On 2021-03-23 07:34, Yang Yingliang wrote:
> > > When copy over 128 bytes, src/dst is added after
> > > each ldp/stp instruction, it will cost more time.
> > > To improve this, we only add src/dst after load
> > > or store 64 bytes.
> >
> > This breaks the required behaviour for copy_*_user(), since the fault
> > handler expects the base address to be up-to-date at all times. Say you're
> > copying 128 bytes and fault on the 4th store, it should return 80 bytes not
> > copied; the code below would return 128 bytes not copied, even though 48
> > bytes have actually been written to the destination.
> >
> > We've had a couple of tries at updating this code (because the whole
> > template is frankly a bit terrible, and a long way from the well-optimised
> > code it was derived from), but getting the fault-handling behaviour right
> > without making the handler itself ludicrously complex has proven tricky. And
> > then it got bumped down the priority list while the uaccess behaviour in
> > general was in flux - now that the dust has largely settled on that I should
> > probably try to find time to pick this up again...
>
> I think the v5 from Oli was pretty close, but it didn't get any review:
>
> https://lore.kernel.org/r/20200914151800.2270-1-oli.swede@arm.com

These are still unread in my inbox as I was planning to look at them
again. However, I think we discussed a few options on how to proceed but
made no concrete plans:

1. Merge Oli's patches as they are, with some potential complexity
   issues as fixing the user copy accuracy was non-trivial. I think the
   latest version uses a two-stage approach: when taking a fault, it
   falls back to byte-by-byte copying with the expectation that it
   faults again and we can then report the correct fault address.

2. Only use Cortex Strings for in-kernel memcpy() while the uaccess
   routines are some simple loops that align the uaccess part only
   (unlike Cortex Strings which usually tries to align the source).

3. Similar to 2 but with Cortex Strings imported automatically with some
   script to make it easier to keep the routines up to date.

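To make the accuracy problem and the two-stage idea in option 1 concrete,
here is a rough user-space model in C. It is only a sketch under
simplifying assumptions, not Oli's code or the arm64 template: try_store(),
copy_stale_base(), copy_two_stage() and the fault_at parameter are all
made up for the example, and a "fault" is simulated as any store that
would write past offset fault_at.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { STORE = 16, BLOCK = 64 };  /* one stp pair = 16 bytes; base updated per 64 */

/* Hypothetical faulting store: fails (-1) if it would write past fault_at. */
static int try_store(unsigned char *dst, const unsigned char *src,
                     size_t off, size_t len, size_t fault_at)
{
    if (off + len > fault_at)
        return -1;
    memcpy(dst + off, src + off, len);
    return 0;
}

/*
 * Bulk copy where the base offset only advances once per 64-byte block.
 * On a fault the "handler" can only see the stale base, so it over-reports
 * the number of bytes not copied.
 */
size_t copy_stale_base(unsigned char *dst, const unsigned char *src,
                       size_t n, size_t fault_at)
{
    assert(n % BLOCK == 0);  /* keep the model simple: whole blocks only */
    for (size_t base = 0; base < n; base += BLOCK)
        for (size_t off = 0; off < BLOCK; off += STORE)
            if (try_store(dst, src, base + off, STORE, fault_at))
                return n - base;
    return 0;
}

/*
 * Two-stage variant: if the bulk copy faults, redo the tail one byte at a
 * time from the last known-good base and expect to fault again, this time
 * at the exact offset.
 */
size_t copy_two_stage(unsigned char *dst, const unsigned char *src,
                      size_t n, size_t fault_at)
{
    size_t left = copy_stale_base(dst, src, n, fault_at);
    if (left == 0)
        return 0;
    for (size_t i = n - left; i < n; i++)
        if (try_store(dst, src, i, 1, fault_at))
            return n - i;
    return 0;
}
```

With n = 128 and fault_at = 48 this reproduces Robin's example:
copy_stale_base() reports 128 bytes not copied although 48 were written,
while copy_two_stage() reports 80.
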
If having non-optimal (but good enough) uaccess routines is acceptable,
I'd go for (2) with a plan to move to (3) at the next Cortex Strings
update.

I also need to look again at option (1) to see how complex it is but
given the time one spends on importing a new Cortex Strings library, I
don't think (1) scales well in the long term. We could, however, go for
(1) now and look at (3) with the next update.

-- 
Catalin