From: Ryan Roberts
Date: Wed, 9 Aug 2023 17:08:32 +0100
Subject: Re: [PATCH v4 2/5] mm: LARGE_ANON_FOLIO for improved performance
To: Yu Zhao
Cc: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
References: <20230726095146.2826796-1-ryan.roberts@arm.com>
    <20230726095146.2826796-3-ryan.roberts@arm.com>
    <433fb8de-f5c0-d150-ac7b-5d73e9958e02@arm.com>
    <20469f02-d62d-d925-3536-d6a1f1099fda@arm.com>

[...]

>>>> Let me reiterate [1]:
>>>> My impression is we only agreed on one thing: at the current stage, we
>>>> should respect things we absolutely have to. We didn't agree on what
>>>> "never" means ("never 2MB" or "never >4KB"), and we didn't touch on
>>>> how "always" should behave at all.
>>>>
>>>> And [2]:
>>>> (Thanks to David, now I agree that) we have to interpret MADV_NOHUGEPAGE
>>>> as nothing >4KB.
>>>>
>>>> My final take [3]:
>>>> I agree these points require more discussion. But I don't think we
>>>> need to conclude them now, unless they cause correctness issues like
>>>> ignoring MADV_NOHUGEPAGE would.
>>>
>>> Thanks, I've read all of these comments previously, and appreciate the time you
>>> have put into the feedback. I'm not sure I fully agree with your point that we
>>> don't need to conclude on a policy now; I certainly don't think we need the
>>> whole thing in place on day 1, but I do think that whatever we put in should
>>> strive to be a strict subset of where we think we are going. For example, if we
>>> put something in with one policy (i.e. "never" only means "never 2MB") then find
>>> a problem and have to change that to be more conservative, are we risking perf
>>> regressions for any LAF users that started using it on day 1?
>>
>> It's not that I don't want to -- I just don't think we have enough
>> information before we have a wider deployment [1] and gain a better
>> understanding of real-world scenarios.
>>
>> Of course we could force a conclusion, a mostly opinion-based one. But
>> it would still involve prolonged discussions and delay this series, or
>> rush into decisions we might regret later.
>>
>> [1] Our fleets (servers, laptops and phones) support large-scale
>> experiments and I plan to run them on both client and server devices.

This all sounds great and I'm looking forward to seeing results! But I guess I
had been assuming that this sort of testing would be preferable to do before we
merge; that allows us to get confidence in the approach and reduces the chances
of having to change it later. I guess you have policies that prevent you from
testing this series at the scale you want until it is merged?

I'm not convinced this testing will help us answer the "what does never mean?"
question; if nothing breaks in your testing, it doesn't mean there aren't
systems out there that would break - it's hard to prove a negative. I think it's
mostly embedded systems that use thp=never to reduce memory footprint to the
absolute minimum?

>>
>>>> But I should have been clear about the parameters to
>>>> hugepage_vma_check(): enforce_sysfs=false.
>>>
>>> So hugepage_vma_check(..., smaps=false, in_pf=true, enforce_sysfs=false) would
>>> give us:
>>>
>>>                 | prctl/fw  | sysfs     | sysfs     | sysfs
>>>                 | disable   | never     | madvise   | always
>>> ----------------|-----------|-----------|-----------|-----------
>>> no hint         | S         | LAF>S     | LAF>S     | THP>LAF>S
>>> MADV_HUGEPAGE   | S         | LAF>S     | THP>LAF>S | THP>LAF>S
>>> MADV_NOHUGEPAGE | S         | S         | S         | S
>>>
>>> Where "prctl/fw disable" trumps the sysfs setting.
>>>
>>> I can certainly see the benefit of this approach; it gives us a way to enable
>>> LAF while disabling THP (thp=never). It doesn't give us a way to enable THP
>>> without enabling LAF though (unless you recompile with LAF disabled). Does
>>> anyone see a problem with this?
>>
>> I do myself :)
>>
>> This is just something temporary to get this series landed. We are
>> hiding behind a Kconfig, not making any ABI changes, and not exposing
>> this policy to userspace (i.e., not updating Documentation/, man
>> pages, etc.)
>>
>> Meanwhile, we can keep discussing all the open questions in parallel.

You're right - I don't want to slow down the testing, so I'm going to post a v5
tomorrow with the policy in the table above. We're still waiting for the
prerequisites to land before we can kick off testing in anger though.

>
> And the stat ABI changes should be discussed before or at the same
> time. If we came up with a policy but there was *zero* observability
> of how well that policy works...

Yep agreed. I have a series at [1] which I hoped would kickstart that discussion.

[1] https://lore.kernel.org/linux-mm/20230613160950.3554675-1-ryan.roberts@arm.com/

Thanks,
Ryan
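
For anyone who wants to poke at the policy table quoted above without building a kernel, here is a rough user-space model of it. This is only a sketch and not kernel code: the names `Sysfs`, `Hint` and `anon_fault_preference()` are made up for illustration, and it assumes "S" means a single base page, "LAF" a large anon folio and "THP" a PMD-sized huge page. It does not call or reproduce hugepage_vma_check(); it just encodes the twelve table cells as three rules.

```python
from enum import Enum


class Sysfs(Enum):
    """Hypothetical stand-in for the sysfs THP 'enabled' setting."""
    NEVER = "never"
    MADVISE = "madvise"
    ALWAYS = "always"


class Hint(Enum):
    """Hypothetical stand-in for the per-VMA madvise hint."""
    NONE = "no hint"
    HUGEPAGE = "MADV_HUGEPAGE"
    NOHUGEPAGE = "MADV_NOHUGEPAGE"


def anon_fault_preference(sysfs, hint, process_disabled=False):
    """Return the sizes to try on an anon fault, most preferred first."""
    # "prctl/fw disable" trumps sysfs, and MADV_NOHUGEPAGE means nothing >4KB.
    if process_disabled or hint is Hint.NOHUGEPAGE:
        return ["S"]
    # THP appears only for sysfs=always, or sysfs=madvise plus MADV_HUGEPAGE.
    thp_ok = sysfs is Sysfs.ALWAYS or (
        sysfs is Sysfs.MADVISE and hint is Hint.HUGEPAGE
    )
    # LAF is tried in every remaining cell, including sysfs=never.
    return (["THP"] if thp_ok else []) + ["LAF", "S"]


# Print the three sysfs columns of the table, one row per hint.
for hint in Hint:
    row = [">".join(anon_fault_preference(mode, hint)) for mode in Sysfs]
    print(f"{hint.value:<16}", *row)
```

Written this way, the concern raised in the thread is easy to see: no combination of inputs yields THP without LAF also being in the list, which matches the "no way to enable THP without enabling LAF" observation.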