Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
To: Catalin Marinas
References: <20240113094436.2506396-1-sunnanyong@huawei.com>
From: Nanyong Sun
Date: Sat, 27 Jan 2024 13:04:15 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2024/1/26 2:06, Catalin Marinas wrote:
> On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
>> HVO was previously disabled on arm64 [1] due to the lack of necessary
>> BBM(break-before-make) logic when changing page tables.
>> This set of patches fix this by adding necessary BBM sequence when
>> changing page table, and supporting vmemmap page fault handling to
>> fixup kernel address translation fault if vmemmap is concurrently accessed.
> I'm not keen on this approach. I'm not even sure it's safe. In the
> second patch, you take the init_mm.page_table_lock on the fault path but
> are we sure this is unlocked when the fault was taken?

I think this situation is impossible. In the implementation of the
second patch, while the page table is in the broken state (the time
window in which a page fault may occur), vmemmap_update_pte() already
holds init_mm.page_table_lock and does not release it until the page
table update is done. Another thread cannot hold
init_mm.page_table_lock and also trigger a page fault at the same time.
If I have missed anything in my reasoning, please correct me. Thank you.

> Basically you can
> get a fault anywhere something accesses a struct page.
>
> How often is this code path called? I wonder whether a stop_machine()
> approach would be simpler.

It is called whenever hugetlb pages are allocated or released. We cannot
restrict users to allocating or releasing hugetlb pages only at boot, or
only when no workload is running on any other CPU. So if we used
stop_machine(), it would be triggered 8 times for every 2M huge page and
4096 times for every 1G huge page, which is probably too expensive.
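
For reference, the 8 and 4096 figures above appear to match the number of vmemmap pages that back one huge page. The sketch below reproduces that arithmetic. It assumes 4 KiB base pages and a 64-byte struct page; neither value is stated in this mail, and the function name is purely illustrative.

```python
# Assumptions (not stated in the mail): 4 KiB base pages, 64-byte struct page.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64

MIB = 1 << 20
GIB = 1 << 30


def vmemmap_pages_per_hugepage(hugepage_size: int) -> int:
    """Return how many base pages of vmemmap describe one huge page."""
    # One struct page exists for every base page inside the huge page.
    nr_base_pages = hugepage_size // PAGE_SIZE
    # Total bytes of struct page metadata for this huge page.
    vmemmap_bytes = nr_base_pages * STRUCT_PAGE_SIZE
    # That metadata itself occupies this many base pages.
    return vmemmap_bytes // PAGE_SIZE


if __name__ == "__main__":
    print("2M huge page:", vmemmap_pages_per_hugepage(2 * MIB))
    print("1G huge page:", vmemmap_pages_per_hugepage(1 * GIB))
```

Under these assumptions a 2M huge page has 512 struct pages (32 KiB, or 8 vmemmap pages) and a 1G huge page has 262144 struct pages (16 MiB, or 4096 vmemmap pages), so one stop_machine() call per vmemmap page would give the counts quoted above.
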
I saw that on x86, optimizations such as batched TLB flushing have been
done to improve performance, which suggests that some users care about
the performance of hugetlb allocation:
https://lwn.net/ml/linux-kernel/20230905214412.89152-1-mike.kravetz@oracle.com/

> Andrew, I'd suggest we drop these patches from the mm tree for the time
> being. They haven't received much review from the arm64 folk. Thanks.
>