Message-ID: <888e20e4-6073-426c-9159-e359c758d78a@arm.com>
Date: Tue, 5 Dec 2023 11:13:18 +0000
Subject: Re: [PATCH v8 00/10] Multi-size THP for anonymous memory
To: John Hubbard, Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", David Rientjes, Vlastimil Babka, Hugh Dickins, Kefeng Wang, Barry Song <21cnbao@gmail.com>, Alistair Popple
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
References: <20231204102027.57185-1-ryan.roberts@arm.com> <2be046e1-ef95-4244-ae23-e56071ae1218@nvidia.com>
From: Ryan Roberts
In-Reply-To: <2be046e1-ef95-4244-ae23-e56071ae1218@nvidia.com>

On 05/12/2023 03:37, John Hubbard wrote:
> On 12/4/23 02:20, Ryan Roberts wrote:
>> Hi All,
>>
>> A new week, a new version, a new name... This is v8 of a series to implement
>> multi-size THP (mTHP) for anonymous memory (previously called "small-sized THP"
>> and "large anonymous folios"). Matthew objected to "small huge" so hopefully
>> this fares better.
>>
>> The objective of this is to improve performance by allocating larger chunks of
>> memory during anonymous page faults:
>>
>> 1) Since SW (the kernel) is dealing with larger chunks of memory than base
>>    pages, there are efficiency savings to be had; fewer page faults, batched PTE
>>    and RMAP manipulation, reduced lru list, etc. In short, we reduce kernel
>>    overhead. This should benefit all architectures.
>> 2) Since we are now mapping physically contiguous chunks of memory, we can take
>>    advantage of HW TLB compression techniques. A reduction in TLB pressure
>>    speeds up kernel and user space. arm64 systems have 2 mechanisms to coalesce
>>    TLB entries; "the contiguous bit" (architectural) and HPA (uarch).
>>
>> This version changes the name and tidies up some of the kernel code and test
>> code, based on feedback against v7 (see change log for details).
>
> Using a couple of Armv8 systems, I've tested this patchset. I applied it
> to top of tree (Linux 6.7-rc4), on top of your latest contig pte series
> [1].
>
> With those two patchsets applied, the mm selftests look OK--or at least
> as OK as they normally do. I compared test runs between THP/mTHP set to
> "always", vs "never", to verify that there were no new test failures.
> Details: specifically, I set one particular page size (2 MB) to
> "inherit", and then toggled /sys/kernel/mm/transparent_hugepage/enabled
> between "always" and "never".

Excellent - I'm guessing this was for 64K base pages?

>
> I also re-ran my usual compute/AI benchmark, and I'm still seeing the
> same 10x performance improvement that I reported for the v6 patchset.
>
> So for this patchset and for [1] as well, please feel free to add:
>
> Tested-by: John Hubbard

Thanks!

>
>
> [1] https://lore.kernel.org/all/20231204105440.61448-1-ryan.roberts@arm.com/
>
>
> thanks,
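
The toggle procedure quoted above (one mTHP size set to "inherit", then the top-level control switched between "always" and "never") could be sketched roughly as follows. The only path named in the thread is /sys/kernel/mm/transparent_hugepage/enabled; the per-size file hugepages-2048kB/enabled is an assumption about the mTHP sysfs layout, and the helper names and the sysfs_root parameter are hypothetical, added so the sketch can be pointed at a scratch directory.

```python
from pathlib import Path

# Relative location of the THP controls under sysfs.
THP_DIR = "kernel/mm/transparent_hugepage"


def thp_control_paths(sysfs_root="/sys", size_kb=2048):
    """Return (top-level control, per-size control) as Path objects.

    The per-size file name is an assumption; it is not spelled out
    in the thread.
    """
    base = Path(sysfs_root) / THP_DIR
    return base / "enabled", base / f"hugepages-{size_kb}kB" / "enabled"


def set_thp_mode(mode, sysfs_root="/sys", size_kb=2048):
    """Set one mTHP size to "inherit", then set the top-level control.

    With the per-size control on "inherit", that size follows whatever
    the top-level file says, so only the top-level file needs toggling.
    """
    if mode not in ("always", "never"):
        raise ValueError(f"unsupported mode: {mode!r}")
    top, per_size = thp_control_paths(sysfs_root, size_kb)
    per_size.write_text("inherit")
    top.write_text(mode)
    return top, per_size
```

On a real system this needs root: call set_thp_mode("always"), run the mm selftests, then call set_thp_mode("never"), run them again, and compare the two sets of results.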