Date: Thu, 29 Jun 2023 17:31:36 +0100
From: Catalin Marinas
To: Yicong Yang
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
    linux-arm-kernel@lists.infradead.org, x86@kernel.org,
    mark.rutland@arm.com, ryan.roberts@arm.com, will@kernel.org,
    anshuman.khandual@arm.com, linux-doc@vger.kernel.org, corbet@lwn.net,
    peterz@infradead.org, arnd@arndb.de, punit.agrawal@bytedance.com,
    linux-kernel@vger.kernel.org, darren@os.amperecomputing.com,
    yangyicong@hisilicon.com, huzhanyuan@oppo.com, lipeifeng@oppo.com,
    zhangshiming@oppo.com, guojian@oppo.com, realmz6@gmail.com,
    linux-mips@vger.kernel.org, openrisc@lists.librecores.org,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org, Barry Song <21cnbao@gmail.com>,
    wangkefeng.wang@huawei.com, xhao@linux.alibaba.com,
    prime.zeng@hisilicon.com, Jonathan.Cameron@huawei.com, Barry Song,
    Nadav Amit, Mel Gorman
Subject: Re: [RESEND PATCH v9 2/2] arm64: support batched/deferred tlb shootdown during page reclamation/migration
References: <20230518065934.12877-1-yangyicong@huawei.com> <20230518065934.12877-3-yangyicong@huawei.com>
In-Reply-To: <20230518065934.12877-3-yangyicong@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 18, 2023 at 02:59:34PM +0800, Yicong Yang wrote:
> From: Barry Song
>
> on x86, batched and deferred tlb shootdown has lead to 90%
> performance increase on tlb shootdown. on arm64, HW can do
> tlb shootdown without software IPI. But sync tlbi is still
> quite expensive.
[...]
>  .../features/vm/TLB/arch-support.txt |  2 +-
>  arch/arm64/Kconfig                   |  1 +
>  arch/arm64/include/asm/tlbbatch.h    | 12 ++++
>  arch/arm64/include/asm/tlbflush.h    | 33 ++++++++-
>  arch/arm64/mm/flush.c                | 69 +++++++++++++++++++
>  arch/x86/include/asm/tlbflush.h      |  5 +-
>  include/linux/mm_types_task.h        |  4 +-
>  mm/rmap.c                            | 12 ++--

First of all, this patch needs to be split into some preparatory patches
introducing/renaming functions with no functional change for x86. Once
done, you can add the arm64-only changes.

Now, on the implementation, I had some comments on v7 but we didn't get
to a conclusion and the thread eventually died:

https://lore.kernel.org/linux-mm/Y7cToj5mWd1ZbMyQ@arm.com/

I know I said a command line argument is better than Kconfig or some
random number of CPUs heuristics but it would be even better if we don't
bother with any, just make this always on. Barry had some comments
around mprotect() being racy and that's why we have
flush_tlb_batched_pending() but I don't think it's needed (or, for
arm64, it can be a DSB since this patch issues the TLBIs but without the
DVM Sync). So we need to clarify this (see Barry's last email on the
above thread) before attempting new versions of this patchset. With
flush_tlb_batched_pending() removed (or DSB), I have a suspicion such
an implementation would be faster on any SoC irrespective of the number
of CPUs.

-- 
Catalin
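
As a rough illustration of the "TLBIs but without the DVM Sync" point
above, here is a toy userspace model in C. It is not kernel code and not
taken from the patch: struct tlb_cost, flush_sync() and flush_batched()
are names invented for this sketch, and the counters only stand in for
TLBI and DSB instructions. The assumption is that the barrier is the
expensive part, so the model just counts how many of each are issued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for the two hardware operations. */
struct tlb_cost {
	unsigned long tlbi;	/* per-page invalidates broadcast */
	unsigned long dsb;	/* barriers waiting for completion (DVM Sync) */
};

/* Synchronous scheme: every unmapped page pays for its own barrier. */
static struct tlb_cost flush_sync(size_t npages)
{
	struct tlb_cost c = { 0, 0 };

	for (size_t i = 0; i < npages; i++) {
		c.tlbi++;	/* models the TLBI for page i */
		c.dsb++;	/* models the DSB that waits for it */
	}
	return c;
}

/*
 * Batched scheme: the TLBI is still issued as each page is unmapped,
 * but the wait is deferred to a single barrier at the end of the batch.
 */
static struct tlb_cost flush_batched(size_t npages)
{
	struct tlb_cost c = { 0, 0 };

	for (size_t i = 0; i < npages; i++)
		c.tlbi++;	/* invalidate issued, nobody waits yet */

	if (npages > 0)
		c.dsb++;	/* one barrier covers the whole batch */
	return c;
}
```

For a batch of 32 pages the model counts 32 invalidates in both schemes,
with 32 barriers in the synchronous one and a single barrier in the
batched one. Real cost depends on the interconnect and the SoC, which
this sketch does not attempt to capture.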