From: Linus Torvalds
Date: Mon, 6 Jun 2022 14:53:09 -0700
Subject: Re: [PATCH v3 00/21] huge page clearing optimizations
To: Ankur Arora
Cc: Linux Kernel Mailing List, Linux-MM, "the arch/x86 maintainers",
    Andrew Morton, Mike Kravetz, Ingo Molnar, Andrew Lutomirski,
    Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andi Kleen,
    Arnd Bergmann, Jason Gunthorpe, jon.grimm@amd.com, Boris Ostrovsky,
    Konrad Rzeszutek Wilk, joao.martins@oracle.com
In-Reply-To: <20220606202109.1306034-1-ankur.a.arora@oracle.com>
References: <20220606202109.1306034-1-ankur.a.arora@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 6, 2022 at 1:22 PM Ankur Arora wrote:
>
> This series introduces two optimizations in the huge page clearing path:
>
>  1. extends the clear_page() machinery to also handle extents larger
>     than a single page.
>  2. support non-cached page clearing for huge and gigantic pages.
>
> The first optimization is useful for hugepage fault handling, the
> second for prefaulting, or for gigantic pages.

Please just split these two issues up into entirely different patch series.

That said, I have a few complaints about the individual patches even
in this form, to the point where I think the whole series is nasty:

 - get rid of 3/21 entirely. It's wrong in every possible way:

    (a) That shouldn't be an inline function in a header file at all.
        If you're clearing several pages of data, that just shouldn't
        be an inline function.

    (b) Get rid of __HAVE_ARCH_CLEAR_USER_PAGES. I hate how people make
        up those idiotic pointless names.

        If you have to use a #ifdef, just use the name of the function
        that the architecture overrides, not some other new name.

        But you don't need it at all, because

    (c) Just make a __weak function called clear_user_highpages() in
        mm/highmem.c, and allow architectures to just create their own
        non-weak ones.

 - patch 4/21 and 5/21: can we instead just get rid of that silly
   "process_huge_page()" thing entirely? It's disgusting, and it's a big
   part of why 'rep movs/stos' cannot work efficiently. It also makes NO
   SENSE if you then use non-temporal accesses.

   So instead of doubling down on the craziness of that function, just
   get rid of it entirely.

   There are two users, and they want to clear a hugepage and copy it
   respectively. Don't make it harder than it is.
   *Maybe* the code wants to do a "prefetch" afterwards. Who knows. But I
   really think you should do the crapectomy first, make the code
   simpler and more straightforward, and just allow architectures to
   override the *simple* "copy or clear a large page" rather than keep
   feeding this butt-ugly monstrosity.

 - 13/21: see 3/21.

 - 14-17/21: see 4/21 and 5/21. Once you do the crapectomy and get rid
   of the crazy process_huge_page() abstraction, and just let
   architectures do their own clear/copy huge pages, *all* this
   craziness goes away. The "when to use which type of clear/copy"
   decision becomes a *local* question, no silly
   arch_clear_page_non_caching_threshold() garbage.

So I really don't like this series. A *lot* of it comes from that
horrible process_huge_page() model, and the whole model is just wrong
and pointless. You're literally trying to fix the mess that that
function is, but you're keeping the fundamental problem around.

The whole *point* of your patch-set is to use non-temporal stores,
which makes all the process_huge_page() things entirely pointless, and
only complicates things.

And even if we don't use non-temporal stores, that process_huge_page()
thing makes for trouble for any "rep stos/movs" implementation that
might actually do a better job if it was just chunked up in bigger
chunks.

Yes, yes, you probably still want to chunk that up somewhat due to
latency reasons, but even then architectures might as well just make
their own decisions, rather than have the core mm code make one
clearly bad decision for them. Maybe chunking it up in bigger chunks
than one page.

Maybe an architecture could do even more radical things like "let's
just 'rep stos' for the whole area, but set a special thread flag that
causes the interrupt return to break it up on return to kernel space".
IOW, the "latency fix" might not even be about chunking it up, it
might look more like our exception handling thing.
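
To make the "weak generic default, chunked for latency" idea a bit more
concrete, here is a minimal user-space sketch in C. It is not kernel
code and not what the series or mm/highmem.c actually contains:
clear_region(), latency_point() and CLEAR_CHUNK_BYTES are hypothetical
names invented for illustration, and latency_point() only stands in for
whatever rescheduling check real code would do between chunks. The
weak attribute is the GCC/Clang one, assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk size: bytes cleared between two latency check points. */
#define CLEAR_CHUNK_BYTES (64u * 1024u)

/* Counts how many check points were hit; illustration only. */
static unsigned long yield_points;

/* Stand-in for a rescheduling check between chunks (assumption). */
static void latency_point(void)
{
	yield_points++;
}

/*
 * Generic default. Marked weak so that another object file can provide
 * a strong definition with the same signature and pick its own chunk
 * size or clearing primitive.
 */
__attribute__((weak)) void clear_region(void *addr, size_t len)
{
	unsigned char *p = addr;

	assert(addr != NULL || len == 0);

	while (len > 0) {
		size_t n = len < CLEAR_CHUNK_BYTES ? len : CLEAR_CHUNK_BYTES;

		memset(p, 0, n);	/* one chunk, larger than a single page */
		p += n;
		len -= n;
		if (len > 0)		/* only pause if more work remains */
			latency_point();
	}
}
```

The point of the shape is that the chunk size and the clearing
primitive live entirely inside the one function an architecture may
replace, so no threshold has to be exported to the generic caller.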

So I really think that crapectomy should be the first thing you do,
and that should be that first part of "extends the clear_page()
machinery to also handle extents larger than a single page"

                Linus