Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp6254652ybx; Mon, 11 Nov 2019 06:24:40 -0800 (PST) X-Google-Smtp-Source: APXvYqyhUJDLzbCM4RVcUVe8ok5RUairN0YQ/TeYzL1DLdgvZWi7/CkTgku3+EnDFM0Gn6ZItKA9 X-Received: by 2002:a17:906:c45:: with SMTP id t5mr22327699ejf.296.1573482280849; Mon, 11 Nov 2019 06:24:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573482280; cv=none; d=google.com; s=arc-20160816; b=Fq2npkXm/NZheJY3ptfK4WW62KJiUa06q4P4QElDCiEtEnONprvqmsvX99a3kYaVJt URzrSOUt/xZp8e79K5zsTLyfzPln+qLuMFrjC+3IYJYTtQbT8RdJPBfG539PrDMMNAS0 Mdh1k3xqBWOSp2gOw7+UJUn0TL2/REoqIQ1whsiQAFx1q1m8JtRjlvNKrNK+rBMR5gDu +l3Xh1fZh8XFJsAtVjD9fnFHIjEu7F4+MAYR0X/uPowAvffESYHTecpLSFZOM3bglv+g A/UDIa57qwCKgj9vQGpu2uW7aCXyQY1jJrKbHC2QdX49rnS0wtpEfvcjJizl4Zy2RG8c gdWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:in-reply-to :mime-version:user-agent:date:message-id:from:cc:references:to :subject; bh=pIYP2VRjhw/akiehN3v/iIX7DPkHmLUly+LVZ83Zb5I=; b=lQV5Kd6bPpYNZmxlOdLk1gSH9mDKC0PgonuSpNefeuRb50nuaIgKa6VzD0uEbHd1qR jXq/HYfacW/fC1r+6LJ36NW/0mGaiVwvPvEj7kp00vxtBAx27d4nwsCLL0P3EuwPGito GGxJclr++YeXxyUxv8nUmc0ojR8hsziwhaBRq/8kYsU6Z5J5LQCbXaKgJNxALI0c/1cK 8/HDgHYjBww3QabeLM8GCtOBDpDklYY6jXVSrIXb7AhYayHIzRLzxOSVWVa14bCp7k24 Pt30aSIe6b9EdSZsoj2/8e5oSYULCMcPI185sQDlS2ooNJizHgTXyd6rIdgOSDLVfPzs GPQw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a10si12577181edj.424.2019.11.11.06.24.16; Mon, 11 Nov 2019 06:24:40 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726985AbfKKOXf (ORCPT + 99 others); Mon, 11 Nov 2019 09:23:35 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:41012 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726832AbfKKOXf (ORCPT ); Mon, 11 Nov 2019 09:23:35 -0500 Received: from DGGEMS404-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id 15A924AF6F555173F9E9; Mon, 11 Nov 2019 22:23:24 +0800 (CST) Received: from [127.0.0.1] (10.173.223.212) by DGGEMS404-HUB.china.huawei.com (10.3.19.204) with Microsoft SMTP Server id 14.3.439.0; Mon, 11 Nov 2019 22:23:16 +0800 Subject: Re: [RFC PATCH v2] arm64: cpufeatures: add support for tlbi range instructions To: Marc Zyngier References: <5DC960EB.9050503@huawei.com> <20191111132716.GA9394@willie-the-truck> <5DC96660.8040505@huawei.com> CC: Will Deacon , , , , , , , From: Zhenyu Ye Message-ID: <5DC96ED3.90205@huawei.com> Date: Mon, 11 Nov 2019 22:23:15 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.173.223.212] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2019/11/11 22:04, Marc Zyngier wrote: > On 2019-11-11 14:56, Zhenyu Ye wrote: >> On 2019/11/11 21:27, Will Deacon wrote: >>> On Mon, Nov 11, 2019 at 09:23:55PM +0800, Zhenyu Ye wrote: >>>> ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a >>>> range of input addresses. This patch adds support for this feature. >>>> This is the second version of the patch. >>>> >>>> I traced the __flush_tlb_range() for a minute and get some statistical >>>> data as below: >>>> >>>> PAGENUM COUNT >>>> 1 34944 >>>> 2 5683 >>>> 3 1343 >>>> 4 7857 >>>> 5 838 >>>> 9 339 >>>> 16 933 >>>> 19 427 >>>> 20 5821 >>>> 23 279 >>>> 41 338 >>>> 141 279 >>>> 512 428 >>>> 1668 120 >>>> 2038 100 >>>> >>>> Those data are based on kernel-5.4.0, where PAGENUM = end - start, COUNT >>>> shows number of calls to the __flush_tlb_range() in a minute. There only >>>> shows the data which COUNT >= 100. The kernel is started normally, and >>>> transparent hugepage is opened. As we can see, though most user TLBI >>>> ranges were 1 pages long, the num of long-range can not be ignored. >>>> >>>> The new feature of TLB range can improve lots of performance compared to >>>> the current implementation. As an example, flush 512 ranges needs only 1 >>>> instruction as opposed to 512 instructions using current implementation. >>>> >>>> And for a new hardware feature, support is better than not. >>>> >>>> Signed-off-by: Zhenyu Ye >>>> --- >>>> ChangeLog v1 -> v2: >>>> - Change the main implementation of this feature. >>>> - Add some comments. >>> >>> How does this address my concerns here: >>> >>> >>> https://lore.kernel.org/linux-arm-kernel/20191031131649.GB27196@willie-the-truck/ >>> >>> ? >>> >>> Will >>> >>> . >>> >> >> I think your concern is more about the hardware level, and we can do >> nothing about >> this at all. The interconnect/DVM implementation is not exposed to >> software layer >> (and no need), and may should be constrained at hardware level. > > You're missing the point here: the instruction may be implemented > and perfectly working at the CPU level, and yet not carried over > the interconnect. In this situation, other CPUs may not observe > the DVM messages instructing them of such invalidation, and you'll end > up with memory corruption. > > So, in the absence of an architectural guarantee that range invalidation > is supported and observed by all the DVM agents in the system, there must > be a firmware description for it on which the kernel can rely. > > M. OK, sounds like an architectural defect. Some hardware-level improvements may should be done here, which is unfamiliar to me. However, We can discuss other aspects of the patch. Anyway, this should be support by the kernel in the future, isn't it?