Message-ID: <34979a4c-0bab-fbb9-f8dd-ab3da816de52@arm.com>
Date: Mon, 24 Jul 2023 16:41:54 +0100
Subject: Re: [PATCH v3 0/4] variable-order, large folios for anonymous memory
To: Zi Yan
Cc: Andrew Morton, Matthew Wilcox, "Kirill A.
Shutemov", Yin Fengwei, David Hildenbrand, Yu Zhao, Catalin Marinas,
 Will Deacon, Anshuman Khandual, Yang Shi, "Huang, Ying", Luis Chamberlain,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20230714160407.4142030-1-ryan.roberts@arm.com>
 <83bb1b99-81d3-0f32-4bf2-032cb512a1a1@arm.com>
 <2FCD9E8A-D38A-40C4-9825-AE7ECEEFC715@nvidia.com>
From: Ryan Roberts
In-Reply-To: <2FCD9E8A-D38A-40C4-9825-AE7ECEEFC715@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 24/07/2023 15:58, Zi Yan wrote:
> On 24 Jul 2023, at 7:59, Ryan Roberts wrote:
>
>> On 14/07/2023 17:04, Ryan Roberts wrote:
>>> Hi All,
>>>
>>> This is v3 of a series to implement variable order, large folios for anonymous
>>> memory. (currently called "FLEXIBLE_THP") The objective of this is to improve
>>> performance by allocating larger chunks of memory during anonymous page faults.
>>> See [1] and [2] for background.
>>
>> A question for anyone that can help; I'm preparing v4 and as part of that am
>> running the mm selftests, now that I've fixed them up to run reliably for
>> arm64. This is showing 2 regressions vs the v6.5-rc3 baseline:
>>
>> 1) khugepaged test fails here:
>> # Run test: collapse_max_ptes_none (khugepaged:anon)
>> # Maybe collapse with max_ptes_none exceeded.... Fail
>> # Unexpected huge page
>>
>> 2) split_huge_page_test fails with:
>> # Still AnonHugePages not split
>>
>> I *think* (but haven't yet verified) that (1) is due to khugepaged ignoring
>> non-order-0 folios when looking for candidates to collapse.
>> Now that we have large anon folios, the memory allocated by the test is in
>> large folios and therefore does not get collapsed. We understand this issue,
>> and I believe DavidH's new scheme for determining exclusive vs shared should
>> give us the tools to solve this.
>>
>> But (2) is weird. If I run this test on its own immediately after booting, it
>> passes. If I then run the khugepaged test, then re-run this test, it fails.
>>
>> The test is allocating 4 hugepages, then requesting they are split using the
>> debugfs interface. Then the test looks at /proc/self/smaps to check that
>> AnonHugePages is back to 0.
>>
>> In both the passing and failing cases, the kernel thinks that it has
>> successfully split the pages; the debug logs in split_huge_pages_pid() confirm
>> this. In the failing case, I wonder if somehow khugepaged could be immediately
>> re-collapsing the pages before user space can observe the split? Perhaps the
>> failed khugepaged test has left khugepaged in an "awake" state and it
>> immediately pounces?
>
> This is more likely to be a stats issue. Have you checked smap to see if
> AnonHugePages is 0 KB by placing a getchar() before the exit(EXIT_FAILURE)?

Yes - it's still 8192K. But looking at the code, that value is derived from the
presence of a PMD block mapping. And the split definitely succeeded, so
something must have re-collapsed it.

Looking into the khugepaged test suite, it saves the thp and khugepaged settings
out of sysfs, modifies them for the tests, then restores them when finished. But
it doesn't restore them if it exits early (due to failure). For example, it
changes alloc_sleep_millisecs and scan_sleep_millisecs from a large number of
seconds to 10 ms, so after a failed run khugepaged keeps scanning at that
interval. I'm pretty sure this is the culprit.

> Since split_huge_page_test checks that stats to make sure the split indeed
> happened.
>
> --
> Best Regards,
> Yan, Zi
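
To illustrate the save/modify/restore problem described above, here is a minimal sketch of a restore that also runs on the failure path. It is written in Python for brevity and is not the selftest's actual C code; the helper names saved_settings() and demo() are hypothetical, the starting values are placeholders, and a temporary directory stands in for the khugepaged sysfs directory so the example can run unprivileged.

```python
import contextlib
import pathlib
import tempfile


@contextlib.contextmanager
def saved_settings(base_dir, names):
    """Snapshot the named files under base_dir and restore them on exit.

    The finally clause runs both on normal completion and when the body
    raises, so an early failure cannot leave modified values behind.
    """
    base = pathlib.Path(base_dir)
    snapshot = {name: (base / name).read_text() for name in names}
    try:
        yield snapshot
    finally:
        for name, value in snapshot.items():
            (base / name).write_text(value)


def demo():
    """Simulate a failing test against a stand-in settings directory."""
    names = ["alloc_sleep_millisecs", "scan_sleep_millisecs"]
    with tempfile.TemporaryDirectory() as tmp:
        base = pathlib.Path(tmp)
        # Placeholder starting values, chosen only for this demonstration.
        (base / names[0]).write_text("60000\n")
        (base / names[1]).write_text("10000\n")
        try:
            with saved_settings(base, names):
                # The test shortens both intervals to 10 ms ...
                for name in names:
                    (base / name).write_text("10\n")
                # ... and then bails out early.
                raise RuntimeError("simulated test failure")
        except RuntimeError:
            pass
        # Read the values back after the failed run.
        return {name: (base / name).read_text().strip() for name in names}
```

Calling demo() returns the values read back after the simulated failure, which match the starting values because the finally clause ran. The sketch only covers failures that unwind the stack; an abrupt exit such as os._exit() or a fatal signal would still skip the restore.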