Date: Sun, 17 Sep 2023 07:54:47 -0000
From: "tip-bot2 for Kirill A. Shutemov"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/urgent] x86/boot/compressed: Reserve more memory for page tables
Cc: Aaron Lu , "Kirill A.
 Shutemov" , Ingo Molnar , x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230915070221.10266-1-kirill.shutemov@linux.intel.com>
References: <20230915070221.10266-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Message-ID: <169493728726.27769.6657332180593925239.tip-bot2@tip-bot2>
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
 DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
 SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4
 (morse.vger.email [0.0.0.0]); Sun, 17 Sep 2023 00:55:37 -0700 (PDT)

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     f530ee95b72e77b09c141c4b1a4b94d1199ffbd9
Gitweb:        https://git.kernel.org/tip/f530ee95b72e77b09c141c4b1a4b94d1199ffbd9
Author:        Kirill A. Shutemov
AuthorDate:    Fri, 15 Sep 2023 10:02:21 +03:00
Committer:     Ingo Molnar
CommitterDate: Sun, 17 Sep 2023 09:48:57 +02:00

x86/boot/compressed: Reserve more memory for page tables

The decompressor has a hard limit on the number of page tables it can
allocate. This limit is defined at compile-time and will cause boot
failure if it is reached.

The kernel is very strict and calculates the limit precisely for the
worst-case scenario based on the current configuration. However, it is
easy to forget to adjust the limit when a new use-case arises. The
worst-case scenario is rarely encountered during sanity checks.

In the case of enabling 5-level paging, a use-case was overlooked. The
limit needs to be increased by one to accommodate the additional level.
This oversight went unnoticed until Aaron attempted to run the kernel
via kexec with 5-level paging and unaccepted memory enabled.

Update worst-case calculations to include 5-level paging.

To address this issue, let's allocate some extra space for page tables.
128K should be sufficient for any use-case. The logic can be simplified
by using a single value for all kernel configurations.

[ Also add a warning, should this memory run low - by Dave Hansen. ]

Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage")
Reported-by: Aaron Lu
Signed-off-by: Kirill A. Shutemov
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com
---
 arch/x86/boot/compressed/ident_map_64.c |  8 ++++-
 arch/x86/include/asm/boot.h             | 45 ++++++++++++++++--------
 2 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index bcc956c..08f93b0 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -59,6 +59,14 @@ static void *alloc_pgt_page(void *context)
 		return NULL;
 	}
 
+	/* Consumed more tables than expected? */
+	if (pages->pgt_buf_offset == BOOT_PGT_SIZE_WARN) {
+		debug_putstr("pgt_buf running low in " __FILE__ "\n");
+		debug_putstr("Need to raise BOOT_PGT_SIZE?\n");
+		debug_putaddr(pages->pgt_buf_offset);
+		debug_putaddr(pages->pgt_buf_size);
+	}
+
 	entry = pages->pgt_buf + pages->pgt_buf_offset;
 	pages->pgt_buf_offset += PAGE_SIZE;
 
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
index 4ae1433..b3a7cfb 100644
--- a/arch/x86/include/asm/boot.h
+++ b/arch/x86/include/asm/boot.h
@@ -40,23 +40,40 @@
 #ifdef CONFIG_X86_64
 # define BOOT_STACK_SIZE	0x4000
 
+/*
+ * Used by decompressor's startup_32() to allocate page tables for identity
+ * mapping of the 4G of RAM in 4-level paging mode:
+ * - 1 level4 table;
+ * - 1 level3 table;
+ * - 4 level2 table that maps everything with 2M pages;
+ *
+ * The additional level5 table needed for 5-level paging is allocated from
+ * trampoline_32bit memory.
+ */
 # define BOOT_INIT_PGT_SIZE	(6*4096)
-# ifdef CONFIG_RANDOMIZE_BASE
+
 /*
- * Assuming all cross the 512GB boundary:
- * 1 page for level4
- * (2+2)*4 pages for kernel, param, cmd_line, and randomized kernel
- * 2 pages for first 2M (video RAM: CONFIG_X86_VERBOSE_BOOTUP).
- * Total is 19 pages.
+ * Total number of page tables kernel_add_identity_map() can allocate,
+ * including page tables consumed by startup_32().
+ *
+ * Worst-case scenario:
+ * - 5-level paging needs 1 level5 table;
+ * - KASLR needs to map kernel, boot_params, cmdline and randomized kernel,
+ *   assuming all of them cross 256T boundary:
+ *   + 4*2 level4 table;
+ *   + 4*2 level3 table;
+ *   + 4*2 level2 table;
+ * - X86_VERBOSE_BOOTUP needs to map the first 2M (video RAM):
+ *   + 1 level4 table;
+ *   + 1 level3 table;
+ *   + 1 level2 table;
+ * Total: 28 tables
+ *
+ * Add 4 spare table in case decompressor touches anything beyond what is
+ * accounted above. Warn if it happens.
  */
-# ifdef CONFIG_X86_VERBOSE_BOOTUP
-# define BOOT_PGT_SIZE	(19*4096)
-# else /* !CONFIG_X86_VERBOSE_BOOTUP */
-# define BOOT_PGT_SIZE	(17*4096)
-# endif
-# else /* !CONFIG_RANDOMIZE_BASE */
-# define BOOT_PGT_SIZE	BOOT_INIT_PGT_SIZE
-# endif
+# define BOOT_PGT_SIZE_WARN	(28*4096)
+# define BOOT_PGT_SIZE		(32*4096)
 
 #else /* !CONFIG_X86_64 */
 # define BOOT_STACK_SIZE	0x1000
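
For readers checking the numbers in the new boot.h comment, the worst-case count can be re-derived as below. This is only an illustrative sketch, not part of the patch: the enum names and the helper functions are made up here, and the patch itself simply hard-codes 28 and 32. The one behavioural assumption, taken from the ident_map_64.c hunk, is that the warning fires when pgt_buf_offset is exactly equal to BOOT_PGT_SIZE_WARN, that is, once 28 tables have already been handed out.

```c
#include <stdbool.h>

/* Hypothetical names, chosen for this sketch only. */
enum {
	PGT_PAGE_BYTES    = 4096, /* one page table occupies one 4K page */
	LEVEL5_TABLES     = 1,    /* 5-level paging: 1 level5 table */
	KASLR_REGIONS     = 4,    /* kernel, boot_params, cmdline, randomized kernel */
	TABLES_PER_REGION = 2,    /* a region crossing the boundary needs 2 tables per level */
	LOWER_LEVELS      = 3,    /* level4, level3, level2 */
	VERBOSE_TABLES    = 3,    /* first 2M (video RAM): 1 table at each lower level */
	SPARE_TABLES      = 4     /* slack added on top of the worst case */
};

/* 1 + 4*2*3 + 3 = 28 tables, matching "Total: 28 tables" in the comment. */
unsigned int worst_case_tables(void)
{
	return LEVEL5_TABLES
	     + KASLR_REGIONS * TABLES_PER_REGION * LOWER_LEVELS
	     + VERBOSE_TABLES;
}

/* Byte offset at which the warning triggers (BOOT_PGT_SIZE_WARN in the patch). */
unsigned long boot_pgt_size_warn(void)
{
	return (unsigned long)worst_case_tables() * PGT_PAGE_BYTES;
}

/* Full buffer size (BOOT_PGT_SIZE in the patch): 32 tables, i.e. 128K. */
unsigned long boot_pgt_size(void)
{
	return (unsigned long)(worst_case_tables() + SPARE_TABLES) * PGT_PAGE_BYTES;
}

/* Mirrors the equality check added to alloc_pgt_page(): true exactly once,
 * when the allocation offset reaches the worst-case mark. */
bool pgt_running_low(unsigned long pgt_buf_offset)
{
	return pgt_buf_offset == boot_pgt_size_warn();
}
```

With these definitions, worst_case_tables() evaluates to 28 and boot_pgt_size() to 131072 bytes, which is the 128K mentioned in the commit message. Because the check is an equality rather than ">=", the message is printed at most once per boot and not on every allocation past the mark.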