Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp608798pxb; Thu, 19 Nov 2020 09:13:33 -0800 (PST) X-Google-Smtp-Source: ABdhPJyFnk1pgh81odS3QDjyOI5DH7slNENBqXV4tR5Dz71dwFsC9xUDJirHP/+dLRFLtTvIk9rD X-Received: by 2002:a17:906:824a:: with SMTP id f10mr30594354ejx.167.1605806012977; Thu, 19 Nov 2020 09:13:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605806012; cv=none; d=google.com; s=arc-20160816; b=wHbZxkXyTrC8u9hSgsz6KZbcMfUvPp3NhcDGsyCfwCVmNIcn2/DsmG35LRlMAJPfRx e/7iJPHyjElYM9cFIkkfPdQJDuM/ZmYMdRzEfqywhzZAVrpzZR7XhvFU/r+fykVBGuFb kuyR4FjWfDTa/3hS0ldeN/h1wgDnfyEK5QiWMVg2cT9PUpKO0XYa4KbF8WSqq61fHGG3 9X4AccpgSQ/r/F4pwLCnP5qT4AwRQUpr/ZCANXIaFof1WQWfLK/Qhan34SWzZZxqcfvd NwBgSVlbF5TNauIGiXLN8Hfg4u1SXE8ezreYS/7yFI/vjbwIZgAXJsmPuJAetgFIbVpj HKJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=Y+uVCUggwHhJMnq6TyCbc6Dpk3CWa5hXE9kahRv+DbQ=; b=bb6pgOZJ3fqh1xZjWc/qqmqEBYQ+iKjVFMgOq96I6VGfZG5ZHDTkUeVKIlsnT0Regy pj/0vC6kGUCI5nddQ6ECX518JN8rZDxziEMgaWp11oBTlDNQTQH7wAYkDffwKihaL5XX C9Xk8epZqeFaJWEnEy4h8cNIZirhXd3TUhwIGFFLWWS0FrepPTeTpKijFU2oGArf5S4Q jrVd05lP2siBx4T/qq88eWOGyKZ6JTdiB7rLA+H7lStckYiRKsL5RN89PQkI5o6YKTc+ OVKKvbHIopPgsWFRvFzlHvloTTv3jL7ZlUEPJ/Ahuw37sEnAzgbsGWCTaE1+h3GD+ne2 xqgA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id qu6si168172ejb.528.2020.11.19.09.13.08; Thu, 19 Nov 2020 09:13:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728698AbgKSRK4 (ORCPT + 99 others); Thu, 19 Nov 2020 12:10:56 -0500 Received: from mail.kernel.org ([198.145.29.99]:37096 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728511AbgKSRK4 (ORCPT ); Thu, 19 Nov 2020 12:10:56 -0500 Received: from gaia (unknown [2.26.170.190]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 3F167241A6; Thu, 19 Nov 2020 17:10:52 +0000 (UTC) Date: Thu, 19 Nov 2020 17:10:49 +0000 From: Catalin Marinas To: Nicolas Saenz Julienne Cc: James Morse , robh+dt@kernel.org, hch@lst.de, ardb@kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, lorenzo.pieralisi@arm.com, will@kernel.org, jeremy.linton@arm.com, iommu@lists.linux-foundation.org, linux-rpi-kernel@lists.infradead.org, guohanjun@huawei.com, robin.murphy@arm.com, linux-arm-kernel@lists.infradead.org, Chen Zhou Subject: Re: [PATCH v6 1/7] arm64: mm: Move reserve_crashkernel() into mem_init() Message-ID: <20201119171048.GD4376@gaia> References: <20201103173159.27570-1-nsaenzjulienne@suse.de> <20201103173159.27570-2-nsaenzjulienne@suse.de> <88c69ac0c9d7e144c80cebc7e9f82b000828e7f5.camel@suse.de> <20201113112901.GA3212@gaia> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 19, 2020 at 03:09:58PM +0100, Nicolas Saenz Julienne wrote: > On Fri, 2020-11-13 at 11:29 +0000, Catalin Marinas wrote: > [...] > > > > > Let me stress that knowing the DMA constraints in the system before reserving > > > > > crashkernel's regions is necessary if we ever want it to work seamlessly on all > > > > > platforms. Be it small stuff like the Raspberry Pi or huge servers with TB of > > > > > memory. > > > > > > > > Indeed. So we have 3 options (so far): > > > > > > > > 1. Allow the crashkernel reservation to go into the linear map but set > > > > it to invalid once allocated. > > > > > > > > 2. Parse the flattened DT (not sure what we do with ACPI) before > > > > creating the linear map. We may have to rely on some SoC ID here > > > > instead of actual DMA ranges. > > > > > > > > 3. Assume the smallest ZONE_DMA possible on arm64 (1GB) for crashkernel > > > > reservations and not rely on arm64_dma_phys_limit in > > > > reserve_crashkernel(). > > > > > > > > I think (2) we tried hard to avoid. Option (3) brings us back to the > > > > issues we had on large crashkernel reservations regressing on some > > > > platforms (though it's been a while since, they mostly went quiet ;)). > > > > However, with Chen's crashkernel patches we end up with two > > > > reservations, one in the low DMA zone and one higher, potentially above > > > > 4GB. Having a fixed 1GB limit wouldn't be any worse for crashkernel > > > > reservations than what we have now. > > > > > > > > If (1) works, I'd go for it (James knows this part better than me), > > > > otherwise we can go for (3). > > > > > > Overall, I'd prefer (1) as well, and I'd be happy to have a got at it. If not > > > I'll append (3) in this series. > > > > I think for 1 we could also remove the additional KEXEC_CORE checks, > > something like below, untested: > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > index 3e5a6913acc8..27ab609c1c0c 100644 > > --- a/arch/arm64/mm/mmu.c > > +++ b/arch/arm64/mm/mmu.c > > @@ -477,7 +477,8 @@ static void __init map_mem(pgd_t *pgdp) > > int flags = 0; > > u64 i; > > > > - if (rodata_full || debug_pagealloc_enabled()) > > + if (rodata_full || debug_pagealloc_enabled() || > > + IS_ENABLED(CONFIG_KEXEC_CORE)) > > flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; > > > > /* > > @@ -487,11 +488,6 @@ static void __init map_mem(pgd_t *pgdp) > > * the following for-loop > > */ > > memblock_mark_nomap(kernel_start, kernel_end - kernel_start); > > -#ifdef CONFIG_KEXEC_CORE > > - if (crashk_res.end) > > - memblock_mark_nomap(crashk_res.start, > > - resource_size(&crashk_res)); > > -#endif > > > > /* map all the memory banks */ > > for_each_mem_range(i, &start, &end) { > > @@ -518,21 +514,6 @@ static void __init map_mem(pgd_t *pgdp) > > __map_memblock(pgdp, kernel_start, kernel_end, > > PAGE_KERNEL, NO_CONT_MAPPINGS); > > memblock_clear_nomap(kernel_start, kernel_end - kernel_start); > > - > > -#ifdef CONFIG_KEXEC_CORE > > - /* > > - * Use page-level mappings here so that we can shrink the region > > - * in page granularity and put back unused memory to buddy system > > - * through /sys/kernel/kexec_crash_size interface. > > - */ > > - if (crashk_res.end) { > > - __map_memblock(pgdp, crashk_res.start, crashk_res.end + 1, > > - PAGE_KERNEL, > > - NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS); > > - memblock_clear_nomap(crashk_res.start, > > - resource_size(&crashk_res)); > > - } > > -#endif > > } > > > > void mark_rodata_ro(void) > > So as far as I'm concerned this is good enough for me. I took the time to > properly test crashkernel on RPi4 using the series, this patch, and another > small fix to properly update /proc/iomem. > > I'll send v7 soon, but before, James (or anyone for that matter) any obvious > push-back to Catalin's solution? I talked to James earlier and he was suggesting that we check the command line for any crashkernel reservations and only disable block mappings in that case, see the diff below on top of the one I already sent (still testing it). If you don't have any other changes for v7, I'm happy to pick v6 up on top of the no-block-mapping fix. diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index ed71b1c305d7..acdec0c67d3b 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -469,6 +469,21 @@ void __init mark_linear_text_alias_ro(void) PAGE_KERNEL_RO); } +static bool crash_mem_map __initdata; + +static int __init enable_crash_mem_map(char *arg) +{ + /* + * Proper parameter parsing is done by reserve_crashkernel(). We only + * need to know if the linear map has to avoid block mappings so that + * the crashkernel reservations can be unmapped later. + */ + crash_mem_map = false; + + return 0; +} +early_param("crashkernel", enable_crash_mem_map); + static void __init map_mem(pgd_t *pgdp) { phys_addr_t kernel_start = __pa_symbol(_stext); @@ -477,8 +492,7 @@ static void __init map_mem(pgd_t *pgdp) int flags = 0; u64 i; - if (rodata_full || debug_pagealloc_enabled() || - IS_ENABLED(CONFIG_KEXEC_CORE)) + if (rodata_full || debug_pagealloc_enabled() || crash_mem_map) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; /*