Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
To: David Rientjes, Will Deacon
CC: Catalin Marinas, Matthew Wilcox, Andrew Morton, Yu Zhao, Yosry Ahmed,
 Sourav Panda
References: <20240113094436.2506396-1-sunnanyong@huawei.com>
 <20240207111252.GA22167@willie-the-truck>
 <44075bc2-ac5f-ffcd-0d2f-4093351a6151@huawei.com>
 <20240208131734.GA23428@willie-the-truck>
From: Nanyong Sun
Message-ID: <22c14513-af78-0f1d-5647-384ff9cb5993@huawei.com>
Date: Mon, 25 Mar 2024 23:24:34 +0800

On 2024/3/14 7:32, David Rientjes wrote:
> On Thu, 8 Feb 2024, Will Deacon wrote:
>
>>> How about take a new lock with irq disabled during BBM, like:
>>>
>>> +void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
>>> +{
>>> +    spin_lock_irq(NEW_LOCK);
>>> +    pte_clear(&init_mm, addr, ptep);
>>> +    flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>>> +    set_pte_at(&init_mm, addr, ptep, pte);
>>> +    spin_unlock_irq(NEW_LOCK);
>>> +}
>> I really think the only maintainable way to achieve this is to avoid the
>> possibility of a fault altogether.
>>
>> Will
>>
>>
> Nanyong, are you still actively working on making HVO possible on arm64?
>
> This would yield a substantial memory savings on hosts that are largely
> configured with hugetlbfs. In our case, the size of this hugetlbfs pool
> is actually never changed after boot, but it sounds from the thread that
> there was an idea to make HVO conditional on FEAT_BBM. Is this being
> pursued?
>
> If so, any testing help needed?

I'm afraid FEAT_BBM may not solve the problem here. From my reading of
the Arm ARM, FEAT_BBM only covers changing the block size. For HVO, that
means it can help in the PMD split stage, i.e. break-before-make can be
avoided in vmemmap_split_pmd, but the subsequent vmemmap_remap_pte still
has to change the output address of the PTE. I don't think FEAT_BBM
covers that stage.
Perhaps my understanding of FEAT_BBM is wrong; if so, I hope someone can
correct me.

The first solution I considered was actually stop_machine(), but we have
products that rely on /proc/sys/vm/nr_overcommit_hugepages to allocate
huge pages dynamically, so I have to take the performance cost into
account. If your product does not change the number of huge pages after
boot, stop_machine() may be a feasible approach.

So far, I still haven't come up with a good solution.
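
To make the break-before-make window in the remap stage concrete, here
is a small userspace toy model. It is only a sketch and not kernel code:
struct toy_pte, struct trace, observe() and toy_update_bbm() are
hypothetical names invented for illustration, and the TLB flush is
represented by a comment only. It shows that when the output address
changes, there is one step at which the entry is invalid, so an access
at that point would fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one PTE: a valid bit plus an output address (hypothetical). */
struct toy_pte {
	bool valid;
	uint64_t out_addr;
};

/* What a concurrent reader would observe at each step of the update. */
struct trace {
	int steps;          /* number of observation points */
	int faults;         /* observation points where the entry was invalid */
	uint64_t last_addr; /* last output address seen through a valid entry */
};

/* Model a reader touching the mapping at this instant. */
static void observe(const struct toy_pte *pte, struct trace *t)
{
	t->steps++;
	if (!pte->valid)
		t->faults++;
	else
		t->last_addr = pte->out_addr;
}

/* Break-before-make: invalidate the entry, then install the new one. */
static void toy_update_bbm(struct toy_pte *pte, uint64_t new_addr,
			   struct trace *t)
{
	observe(pte, t);          /* before: old mapping is visible */

	pte->valid = false;       /* "break", models pte_clear() */
	/* a real kernel would flush the TLB for this address here */
	observe(pte, t);          /* window: an access here would fault */

	pte->out_addr = new_addr; /* "make", models set_pte_at() */
	pte->valid = true;
	observe(pte, t);          /* after: new mapping is visible */
}
```

Calling toy_update_bbm() on a valid entry records three observation
points, exactly one of which sees an invalid entry. In this model a lock
inside the update function does not remove that point for a reader that
does not take the same lock.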