Date: Thu, 17 May 2018 10:15:47 +0800
From: Baoquan He
To: AKASHI Takahiro, James Morse, catalin.marinas@arm.com, will.deacon@arm.com, dhowells@redhat.com, vgoyal@redhat.com, herbert@gondor.apana.org.au, davem@davemloft.net, dyoung@redhat.com, arnd@arndb.de, ard.biesheuvel@linaro.org, bhsharma@redhat.com, kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 04/11] arm64: kexec_file: allocate memory walking through memblock list
Message-ID: <20180517021547.GJ24627@MiWiFi-R3L-srv>
References: <20180425062629.29404-1-takahiro.akashi@linaro.org>
 <20180425062629.29404-5-takahiro.akashi@linaro.org>
 <648656ef-1f1e-b0ac-581c-aba1e62f4eee@arm.com>
 <20180507055906.GE11326@linaro.org>
 <20180517021024.GI24627@MiWiFi-R3L-srv>
In-Reply-To: <20180517021024.GI24627@MiWiFi-R3L-srv>

On 05/17/18 at 10:10am, Baoquan He wrote:
> On 05/07/18 at 02:59pm, AKASHI Takahiro wrote:
> > James,
> >
> > On Tue, May 01, 2018 at 06:46:09PM +0100, James Morse wrote:
> > > Hi Akashi,
> > >
> > > On 25/04/18 07:26, AKASHI Takahiro wrote:
> > > > We need to prevent firmware-reserved memory regions, particularly EFI
> > > > memory map as well as ACPI tables, from being corrupted by loading
> > > > kernel/initrd (or other kexec buffers). We also want to support memory
> > > > allocation in top-down manner in addition to default bottom-up.
> > > > So let's have arm64 specific arch_kexec_walk_mem() which will search
> > > > for available memory ranges in usable memblock list,
> > > > i.e. !NOMAP & !reserved,
> > >
> > > > instead of system resource tree.
> > >
> > > Didn't we try to fix the system-resource-tree in order to fix regular-kexec to
> > > be safe in the EFI-memory-map/ACPI-tables case?
> > >
> > > It would be good to avoid having two ways of doing this, and I would like to
> > > avoid having extra arch code...
> >
> > I know what you mean.
> > /proc/iomem or system resource is, in my opinion, not the best place to
> > describe memory usage of kernel but rather to describe *physical* hardware
> > layout. As we are still discussing about "reserved" memory, I don't want
> > to depend on it.
> > Along with memblock list, we will have more accurate control over memory
> > usage.
>
> In kexec-tools, we see any usable memory as candidate which can be used

I said 'any' here, which is not accurate. Memory that must be passed to
the 2nd kernel for its use needs to be excluded, just as we do in
kexec-tools.

> to load kexec kernel image/initrd etc. However kexec loading is a
> preparation work, it just books those position for later kexec kernel
> jumping after "kexec -e", that is why we need kexec_buf to remember
> them and do the real content copy of kernel/initrd. Here you use
> memblock to search available memory, isn't it deviating too far away
> from the original design in kexec-tools. Assume kexec loading and
> kexec_file loading should be consistent on loading even though they are
> done in different space, kernel space and user space.
>
> I didn't follow the earlier post, may miss something.
>
> Thanks
> Baoquan
>
> >
> > >
> > > > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > > > new file mode 100644
> > > > index 000000000000..f9ebf54ca247
> > > > --- /dev/null
> > > > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > > > @@ -0,0 +1,57 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * kexec_file for arm64
> > > > + *
> > > > + * Copyright (C) 2018 Linaro Limited
> > > > + * Author: AKASHI Takahiro
> > > > + *
> > >
> > > > + * Most code is derived from arm64 port of kexec-tools
> > >
> > > How does kexec-tools walk memblock?
> >
> > Will remove this comment from this patch.
> > Obviously, this comment is for the rest of the code which will be
> > added to succeeding patches (patch #5 and #7).
> >
> >
> > >
> > > > + */
> > > > +
> > > > +#define pr_fmt(fmt) "kexec_file: " fmt
> > > > +
> > > > +#include
> > > > +#include
> > > > +#include
> > > > +#include
> > > > +
> > > > +int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > > > +				int (*func)(struct resource *, void *))
> > > > +{
> > > > +	phys_addr_t start, end;
> > > > +	struct resource res;
> > > > +	u64 i;
> > > > +	int ret = 0;
> > > > +
> > > > +	if (kbuf->image->type == KEXEC_TYPE_CRASH)
> > > > +		return func(&crashk_res, kbuf);
> > > > +
> > > > +	if (kbuf->top_down)
> > > > +		for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved,
> > > > +				NUMA_NO_NODE, MEMBLOCK_NONE,
> > > > +				&start, &end, NULL) {
> > >
> > > for_each_free_mem_range_reverse() is a more readable version of this helper.
> >
> > OK. I used to use my own limited list of reserved memory instead of
> > memblock.reserved here to exclude verbose ranges.
> >
> >
> > > > +			if (!memblock_is_map_memory(start))
> > > > +				continue;
> > >
> > > Passing MEMBLOCK_NONE means this walk will never find MEMBLOCK_NOMAP memory.
> >
> > Sure, I confirmed it.
> >
> > >
> > > > +			res.start = start;
> > > > +			res.end = end;
> > > > +			ret = func(&res, kbuf);
> > > > +			if (ret)
> > > > +				break;
> > > > +		}
> > > > +	else
> > > > +		for_each_mem_range(i, &memblock.memory, &memblock.reserved,
> > > > +				NUMA_NO_NODE, MEMBLOCK_NONE,
> > > > +				&start, &end, NULL) {
> > >
> > > for_each_free_mem_range()?
> >
> > OK.
> >
> > > > +			if (!memblock_is_map_memory(start))
> > > > +				continue;
> > > > +
> > > > +			res.start = start;
> > > > +			res.end = end;
> > > > +			ret = func(&res, kbuf);
> > > > +			if (ret)
> > > > +				break;
> > > > +		}
> > > > +
> > > > +	return ret;
> > > > +}
> > > >
> > >
> > > With these changes, what we have is almost:
> > > arch/powerpc/kernel/machine_kexec_file_64.c::arch_kexec_walk_mem() !
> > > (the difference being powerpc doesn't yet support crash-kernels here)
> > >
> > > If the argument is walking memblock gives a better answer than the stringy
> > > walk_system_ram_res() thing, is there any mileage in moving this code into
> > > kexec_file.c, and using it if !IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK)?
> > >
> > > This would save arm64/powerpc having near-identical implementations.
> > > 32bit arm keeps memblock if it has kexec, so it may be useful there too if
> > > kexec_file_load() support is added.
> >
> > Thanks. I've forgot ppc.
> >
> > -Takahiro AKASHI
> >
> >
> > >
> > > Thanks,
> > >
> > > James
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
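
For readers following the thread, the walk under discussion (visit memory
that is not reserved, either bottom-up or top-down, and hand each range to
a callback until one accepts it) can be sketched in userspace C. This is
only an illustration under stated assumptions: struct range, free_ranges(),
walk_free(), first_fit() and demo_pick() are hypothetical names invented
here, the region lists are made-up example data, and none of it is the
kernel's memblock API or the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a memory region: [start, end). */
struct range { uint64_t start, end; };

/* Callback type: return nonzero to stop the walk (range accepted). */
typedef int (*walk_fn)(const struct range *free_range, void *arg);

/*
 * Subtract rsv[] from mem[] and store the free ranges in out[].
 * Assumes both inputs are sorted and non-overlapping. Ranges that do
 * not fit into out[] (more than max) are silently dropped.
 */
static size_t free_ranges(const struct range *mem, size_t nmem,
                          const struct range *rsv, size_t nrsv,
                          struct range *out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < nmem; i++) {
        uint64_t cur = mem[i].start;

        for (size_t j = 0; j < nrsv && cur < mem[i].end; j++) {
            if (rsv[j].end <= cur)
                continue;               /* reserved range lies below cur */
            if (rsv[j].start >= mem[i].end)
                break;                  /* reserved range lies above this region */
            if (rsv[j].start > cur && n < max)
                out[n++] = (struct range){ cur, rsv[j].start };
            cur = rsv[j].end;           /* skip past the reserved range */
        }
        if (cur < mem[i].end && n < max)
            out[n++] = (struct range){ cur, mem[i].end };
    }
    return n;
}

/* Visit free ranges lowest-first, or highest-first when top_down is set. */
static int walk_free(const struct range *fr, size_t n, int top_down,
                     walk_fn func, void *arg)
{
    assert(func != NULL);

    for (size_t k = 0; k < n; k++) {
        size_t idx = top_down ? n - 1 - k : k;
        int ret = func(&fr[idx], arg);

        if (ret)
            return ret;
    }
    return 0;
}

struct want { uint64_t size; uint64_t found; };

/* Accept the first visited range that is large enough; record its start. */
static int first_fit(const struct range *r, void *arg)
{
    struct want *w = arg;

    if (r->end - r->start < w->size)
        return 0;
    w->found = r->start;
    return 1;
}

/* Example data: one memory region with two reserved holes (made up). */
uint64_t demo_pick(int top_down, uint64_t size)
{
    const struct range mem[] = { { 0x1000, 0x9000 } };
    const struct range rsv[] = { { 0x2000, 0x3000 }, { 0x8000, 0x9000 } };
    struct range fr[8];
    struct want w = { size, UINT64_MAX };
    size_t n = free_ranges(mem, 1, rsv, 2, fr, 8);

    walk_free(fr, n, top_down, first_fit, &w);
    return w.found;                     /* UINT64_MAX if nothing fits */
}
```

With this example data the free ranges are [0x1000, 0x2000) and
[0x3000, 0x8000), so the walk direction changes which range a small
request lands in, which is the property the top_down flag in the patch
is meant to control.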