Date: Wed, 27 Mar 2024 10:43:14 +0000
Subject: Re: [PATCH v1 0/3] Speed up boot with faster linear map creation
From: Ryan Roberts
To: Ard Biesheuvel
Cc: Catalin Marinas, Will Deacon, Mark Rutland, David Hildenbrand,
 Donald Dutile, Eric Chanudet, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org
References: <20240326101448.3453626-1-ryan.roberts@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 27/03/2024 10:09, Ard Biesheuvel wrote:
> Hi Ryan,
>
> On Tue, 26 Mar 2024 at 12:15, Ryan Roberts wrote:
>>
>> Hi All,
>>
>> It turns out that creating the linear map can take a significant proportion of
>> the total boot time, especially when rodata=full. And a large portion of the
>> time it takes to create the linear map is issuing TLBIs. This series reworks the
>> kernel pgtable generation code to significantly reduce the number of TLBIs. See
>> each patch for details.
>>
>> The below shows the execution time of map_mem() across a couple of different
>> systems with different RAM configurations. We measure after applying each patch
>> and show the improvement relative to base (v6.9-rc1):
>>
>>                | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
>>                | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
>> ---------------|-------------|-------------|-------------|-------------
>>                |   ms    (%) |   ms    (%) |   ms    (%) |    ms    (%)
>> ---------------|-------------|-------------|-------------|-------------
>> base           |  151   (0%) | 2191   (0%) | 8990   (0%) | 17443   (0%)
>> no-cont-remap  |   77 (-49%) |  429 (-80%) | 1753 (-80%) |  3796 (-78%)
>> no-alloc-remap |   77 (-49%) |  375 (-83%) | 1532 (-83%) |  3366 (-81%)
>> lazy-unmap     |   63 (-58%) |  330 (-85%) | 1312 (-85%) |  2929 (-83%)
>>
>> This series applies on top of v6.9-rc1. All mm selftests pass. I haven't yet
>> tested all VA size configs (although I don't anticipate any issues); I'll do
>> this as part of followup.
>>
>
> These are very nice results!
>
> Before digging into the details: do we still have a strong case for
> supporting contiguous PTEs and PMDs in these routines?

We are currently using contptes and pmds for the linear map when rodata=[on|off]
IIRC? I don't see a need to remove the capability personally.

Also I was talking with Mark R yesterday and he suggested that an even better
solution might be to create a temp pgtable that maps the linear map with pmds,
switch to it, then create the real pgtable that maps the linear map with ptes,
then switch to that. The benefit being that we can avoid the fixmap entirely
when creating the second pgtable - we think this would likely be significantly
faster still.

My second patch adds the infrastructure to make this possible. But your changes
for LPA2 make it significantly more effort; since that change we are now using
the swapper pgtable when we populate the linear map into it - the kernel is
already mapped and that isn't done in paging_init() anymore. So I'm not quite
sure how we can easily make that work at the moment.

Thanks,
Ryan