Subject: Re: [PATCH 2/4] arm64: kdump: support reserving crashkernel above 4G
To: Chen Zhou
Cc: catalin.marinas@arm.com, will.deacon@arm.com, akpm@linux-foundation.org,
 ard.biesheuvel@linaro.org, rppt@linux.ibm.com, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, ebiederm@xmission.com, horms@verge.net.au,
 takahiro.akashi@linaro.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-mm@kvack.org,
 wangkefeng.wang@huawei.com
References: <20190507035058.63992-1-chenzhou10@huawei.com>
 <20190507035058.63992-3-chenzhou10@huawei.com>
From: James Morse
Date: Thu, 13 Jun 2019 13:44:28 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Chen Zhou,

On 13/06/2019 12:27, Chen Zhou wrote:
> On 2019/6/6 0:29, James Morse wrote:
>> On 07/05/2019 04:50, Chen Zhou wrote:
>>> When crashkernel is reserved above 4G in memory, kernel should
>>> reserve some amount of low memory for swiotlb and some DMA buffers.
>>
>>> Meanwhile, support crashkernel=X,[high,low] in arm64. When using the
>>> crashkernel=X parameter, try low memory first and fall back to high
>>> memory unless "crashkernel=X,high" is specified.
>>
>> What is the 'unless crashkernel=...,high' for? I think it would be simpler to relax the
>> ARCH_LOW_ADDRESS_LIMIT if reserve_crashkernel_low() allocated something.
>>
>> This way "crashkernel=1G" tries to allocate 1G below 4G, but fails if there isn't enough
>> memory. "crashkernel=1G crashkernel=16M,low" allocates 16M below 4G, which is more likely
>> to succeed; if it does, it can then place the 1G block anywhere.
>>
> Yeah, this is much simpler.

>>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>>> index 413d566..82cd9a0 100644
>>> --- a/arch/arm64/kernel/setup.c
>>> +++ b/arch/arm64/kernel/setup.c
>>> @@ -243,6 +243,9 @@ static void __init request_standard_resources(void)
>>>  			request_resource(res, &kernel_data);
>>>  #ifdef CONFIG_KEXEC_CORE
>>>  		/* Userspace will find "Crash kernel" region in /proc/iomem. */
>>> +		if (crashk_low_res.end && crashk_low_res.start >= res->start &&
>>> +		    crashk_low_res.end <= res->end)
>>> +			request_resource(res, &crashk_low_res);
>>>  		if (crashk_res.end && crashk_res.start >= res->start &&
>>>  		    crashk_res.end <= res->end)
>>>  			request_resource(res, &crashk_res);
>>
>> With both crashk_low_res and crashk_res, we end up with two entries in /proc/iomem called
>> "Crash kernel". Because it's sorted by address, and kexec-tools stops searching when it
>> finds "Crash kernel", you are always going to get the kernel placed in the lower portion.
>>
>> I suspect this isn't what you want, can we rename crashk_low_res for arm64 so that
>> existing kexec-tools doesn't use it?

> In my patchset, in addition to the kernel patches, I also modify the kexec-tools.
>   arm64: support more than one crash kernel regions (http://lists.infradead.org/pipermail/kexec/2019-April/022792.html).
> In the kexec-tools patch, we read all the "Crash kernel" entries and load the crash kernel high.

But we can't rely on people updating user-space when they update the kernel!

[...]

>> I'm afraid you've missed the ugly bit of the crashkernel reservation...
>>
>> arch/arm64/mm/mmu.c::map_mem() marks the crashkernel as 'nomap' during the first pass of
>> page-table generation. This means it isn't mapped in the linear map. It then maps it with
>> page-size mappings, and removes the nomap flag.
>>
>> This is done so that arch_kexec_protect_crashkres() and
>> arch_kexec_unprotect_crashkres() can remove the valid bits of the crashkernel mapping.
>> This way the old-kernel can't accidentally overwrite the crashkernel. It also saves us if
>> the old-kernel and the crashkernel use different memory attributes for the mapping.
>>
>> As your low-memory reservation is intended to be used for devices, having it mapped by the
>> old-kernel as cacheable memory is going to cause problems if those CPUs aren't taken
>> offline and go corrupting this memory. (we did crash for a reason after all)
>>
>>
>> I think the simplest thing to do is mark the low region as 'nomap' in
>> reserve_crashkernel() and always leave it unmapped. We can then describe it via a
>> different string in /proc/iomem, something like "Crash kernel (low)". Older kexec-tools
>> shouldn't use it, (I assume it's not using strncmp() in a way that would do this by
>> accident), and newer kexec-tools can know to describe it in the DT, but it can't write to it.

> I did miss the bit of the crashkernel reservation.
> I will fix this in next version.

I think all that is needed is to make the low-region nomap, and describe it via /proc/iomem
with a name where nothing will try and use it by accident.


Thanks,

James
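
The naming concern above can be sketched in C. This is only an illustration under stated assumptions, not the actual kexec-tools parser: struct iomem_entry, first_match() and the two matcher helpers are hypothetical names, and the addresses in the usage note are made up. It shows why a first-match scan over address-sorted entries picks the low region when the name is compared by prefix, and skips it when the name is compared exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CRASH_KERNEL_NAME "Crash kernel"

/* Hypothetical, simplified view of one /proc/iomem entry. */
struct iomem_entry {
	unsigned long long start;
	unsigned long long end;
	const char *name;
};

/* Prefix comparison: this also accepts "Crash kernel (low)". */
static bool matches_by_prefix(const char *name)
{
	assert(name != NULL);
	return strncmp(name, CRASH_KERNEL_NAME, strlen(CRASH_KERNEL_NAME)) == 0;
}

/* Exact comparison: this accepts only "Crash kernel". */
static bool matches_exactly(const char *name)
{
	assert(name != NULL);
	return strcmp(name, CRASH_KERNEL_NAME) == 0;
}

/*
 * Return the index of the first entry whose name satisfies 'match',
 * or -1 if there is none. Entries are assumed to be sorted by address,
 * so the first hit is always the lowest matching region.
 */
static int first_match(const struct iomem_entry *entries, size_t count,
		       bool (*match)(const char *))
{
	for (size_t i = 0; i < count; i++)
		if (match(entries[i].name))
			return (int)i;
	return -1;
}
```

With a two-entry table holding a "Crash kernel (low)" region at a made-up low address followed by a "Crash kernel" region above 4G, first_match() with matches_by_prefix() selects the low entry, while matches_exactly() selects the high one. A tool that compares by prefix would therefore still need updating even after the rename.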