Subject: Re: [PATCH v6 RESEND 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high
To: Baoquan He
References: <20230515060259.830662-1-bhe@redhat.com> <20230515060259.830662-2-bhe@redhat.com>
From: "Leizhen (ThunderTown)"
Message-ID: <52743932-b071-6dd4-d891-b463946a3017@huawei.com>
Date: Tue, 6 Jun 2023 19:53:47 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101
 Thunderbird/60.7.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/5/15 17:54, Baoquan He wrote:
> On arm64, reservation for 'crashkernel=xM,high' is taken by searching for
> suitable memory region top down. If the 'xM' of crashkernel high memory
> is reserved from high memory successfully, it will try to reserve
> crashkernel low memory later accordingly. Otherwise, it will try to search
> low memory area for the 'xM' suitable region. Please see the details in
> Documentation/admin-guide/kernel-parameters.txt.
>
> While we observed an unexpected case where a reserved region crosses the
> high and low memory boundary. E.g. on a system with 4G as low memory end,
> user added the kernel parameters like: 'crashkernel=512M,high', it could
> finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel.
> The crashkernel high region crossing low and high memory boundary will bring
> issues:
>
> 1) For crashkernel=x,high, if getting crashkernel high region across
> low and high memory boundary, then user will see two memory regions in
> low memory, and one memory region in high memory. The two crashkernel
> low memory regions are confusing as shown in above example.
>
> 2) If people explicitly specify "crashkernel=x,high crashkernel=y,low"
> and y <= 128M, when crashkernel high region crosses low and high memory
> boundary and the part of crashkernel high reservation below boundary is
> bigger than y, the expected crashkernel low reservation will be skipped.
> But the expected crashkernel high reservation is shrunk and could not
> satisfy user space requirement.
>
> 3) The crossing boundary behaviour of crashkernel high reservation is
> different than x86 arch. On x86_64, the low memory end is 4G fixedly,
> and the memory near 4G is reserved by system, e.g. for mapping firmware,
> pci mapping, so the crashkernel reservation crossing boundary never happens.
> From distros point of view, this brings inconsistency and confusion. Users
> need to dig into x86 and arm64 system details to find out why.
>
> For kernel itself, the impact of issue 3) could be slight. While issue
> 1) and 2) cause actual impact because it brings obscure semantics and
> behaviour to crashkernel=,high reservation.
>
> Here, for crashkernel=xM,high, search the high memory for the suitable
> region only in high memory. If failed, try reserving the suitable
> region only in low memory. Like this, the crashkernel high region will
> only exist in high memory, and crashkernel low region only exists in low
> memory. The reservation behaviour for crashkernel=,high is clearer and
> simpler.
>
> Note: RPi4 has different zone ranges than normal memory. Its DMA zone is
> 0~1G, and DMA32 zone is 1G~4G if CONFIG_ZONE_DMA|DMA32 are enabled by
> default. The low memory end is 1G in order to validate all devices, high
> memory starts at 1G memory. However, for being consistent with normla

normla --> normal

> arm64 system, its low memory end is still 1G, while reserving crashkernel
> high memory from 4G if crashkernel=size,high specified. This will remove
> confusion.
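
For anyone following the thread, the search order after this patch can be sketched roughly as below. This is only an illustration, not kernel code: crash_search_order() and struct window are made-up names, and low_max/high_max are stand-ins for CRASH_ADDR_LOW_MAX and CRASH_ADDR_HIGH_MAX. It covers only the case without a fixed base (no @offset).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)
#define SZ_4G (4 * SZ_1G)

/* One candidate search window, [base, max). */
struct window {
	uint64_t base;
	uint64_t max;
};

/*
 * Hypothetical helper (does not exist in the kernel): fill out[] with the
 * windows in the order the patched reserve_crashkernel() would try them
 * when the user gives no fixed base. Returns the number of windows.
 */
static int crash_search_order(bool high, uint64_t low_max, uint64_t high_max,
			      struct window out[2])
{
	assert(low_max <= high_max);

	if (high) {
		/* crashkernel=size,high: above 4G first, then low memory only. */
		out[0] = (struct window){ SZ_4G, high_max };
		out[1] = (struct window){ 0, low_max };
	} else {
		/* crashkernel=size: low memory first, then everything above it. */
		out[0] = (struct window){ 0, low_max };
		out[1] = (struct window){ low_max, high_max };
	}
	return 2;
}
```

With low_max = 1G (the RPi4 case) and high = true, this gives [4G, top) followed by [0, 1G), so neither window can straddle the low/high boundary.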

Reviewed-by: Zhen Lei

>
> With above change applied, summary of arm64 crashkernel reservation range:
> 1)
> RPi4(zone DMA:0~1G; DMA32:1G~4G):
>  crashkernel=size
>   0~1G: low memory | 1G~top: high memory
>
>  crashkernel=size,high
>   0~1G: low memory | 4G~top: high memory
>
> 2)
> Other normal system:
>  crashkernel=size
>  crashkernel=size,high
>   0~4G: low memory | 4G~top: high memory
>
> 3)
> Systems w/o zone DMA|DMA32
>  crashkernel=size
>  crashkernel=size,high
>   0~top: low memory
>
> Signed-off-by: Baoquan He
> Reviewed-by: Catalin Marinas
> ---
> v6-RESEND:
>  - Remove the relic of local patch merging at the end of patch log.
>  - Add Catalin's Reviewed-by tag.
>
>  arch/arm64/mm/init.c | 44 ++++++++++++++++++++++++++++++++++----------
>  1 file changed, 34 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 66e70ca47680..c28c2c8483cc 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -69,6 +69,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
>
>  #define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
>  #define CRASH_ADDR_HIGH_MAX		(PHYS_MASK + 1)
> +#define CRASH_HIGH_SEARCH_BASE		SZ_4G
>
>  #define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
>
> @@ -101,12 +102,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>   */
>  static void __init reserve_crashkernel(void)
>  {
> -	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_low_size = 0;
> +	unsigned long long crash_low_size = 0, search_base = 0;
>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> +	unsigned long long crash_base, crash_size;
>  	char *cmdline = boot_command_line;
> -	int ret;
>  	bool fixed_base = false;
> +	bool high = false;
> +	int ret;
>
>  	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
>  		return;
> @@ -129,7 +131,9 @@ static void __init reserve_crashkernel(void)
>  		else if (ret)
>  			return;
>
> +		search_base = CRASH_HIGH_SEARCH_BASE;
>  		crash_max = CRASH_ADDR_HIGH_MAX;
> +		high = true;
>  	} else if (ret || !crash_size) {
>  		/* The specified value is invalid */
>  		return;
> @@ -140,31 +144,51 @@ static void __init reserve_crashkernel(void)
>  	/* User specifies base address explicitly. */
>  	if (crash_base) {
>  		fixed_base = true;
> +		search_base = crash_base;
>  		crash_max = crash_base + crash_size;
>  	}
>
>  retry:
>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> -					       crash_base, crash_max);
> +					       search_base, crash_max);
>  	if (!crash_base) {
>  		/*
> -		 * If the first attempt was for low memory, fall back to
> -		 * high memory, the minimum required low memory will be
> -		 * reserved later.
> +		 * For crashkernel=size[KMG]@offset[KMG], print out failure
> +		 * message if can't reserve the specified region.
>  		 */
> -		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +		if (fixed_base) {
> +			pr_warn("crashkernel reservation failed - memory is in use.\n");
> +			return;
> +		}
> +
> +		/*
> +		 * For crashkernel=size[KMG], if the first attempt was for
> +		 * low memory, fall back to high memory, the minimum required
> +		 * low memory will be reserved later.
> +		 */
> +		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
>  			crash_max = CRASH_ADDR_HIGH_MAX;
> +			search_base = CRASH_ADDR_LOW_MAX;
>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>  			goto retry;
>  		}
>
> +		/*
> +		 * For crashkernel=size[KMG],high, if the first attempt was
> +		 * for high memory, fall back to low memory.
> +		 */
> +		if (high && crash_max == CRASH_ADDR_HIGH_MAX) {
> +			crash_max = CRASH_ADDR_LOW_MAX;
> +			search_base = 0;
> +			goto retry;
> +		}
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>
> -	if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
> -	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
> +	     reserve_crashkernel_low(crash_low_size)) {
>  		memblock_phys_free(crash_base, crash_size);
>  		return;
>  	}
>

-- 
Regards,
  Zhen Lei