From: Linus Walleij
Date: Thu, 23 Nov 2023 08:59:48 +0100
Subject: Re: [PATCH 0/4] arm64: an optimization for AmpereOne
To: Huang Shijie
Cc: catalin.marinas@arm.com, will@kernel.org, mark.rutland@arm.com, suzuki.poulose@arm.com, broonie@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, anshuman.khandual@arm.com, robh@kernel.org, oliver.upton@linux.dev, maz@kernel.org, patches@amperecomputing.com, Kohei Tarumizu
In-Reply-To: <20231122092855.4440-1-shijie@os.amperecomputing.com>

On Wed, Nov 22, 2023 at 10:29 AM Huang Shijie wrote:

> 0) Background:
>    We found that AmpereOne benefits from aggressive prefetches when
>    using 4K page size.
>
> 1) This patch:
>     1.1) adds new WORKAROUND_AMPERE_AC03_PREFETCH capability.
>     1.2) uses MIDR_AMPERE1 to filter the processor.
>     1.3) uses alternative_if to alternative the code
>          for AmpereOne.
>     1.4) adds software prefetches for the specific loop.
>          Also add a macro add_prefetch.
>
> 2) Test result:
>     In hugetlb or tmpfs, We can get big seqential read performance improvement
>     up to 1.3x ~ 1.4x.

In June 2022 Fujitsu tried to add a similar feature for A64FX. Here is
the essence of my feedback from back then; it applies here as well:
https://lore.kernel.org/linux-arm-kernel/CACRpkdbPLFOoPdX4L6ABV8GKpC8cQGP3s2aN2AvRHEK49U9VMg@mail.gmail.com/#t

TL;DR: this is a hack. If you want to accelerate the memory hierarchy,
work with the MM developers to figure out how to do that in a
structured and scientific way that will work with any prefetching
hardware on any CPU.

Yours,
Linus Walleij