Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp3108148ybn; Fri, 27 Sep 2019 00:56:29 -0700 (PDT) X-Google-Smtp-Source: APXvYqzkSM9YkvimRcAx3uOEZ3MQG10O5g8qtnJ8DoodbOaY9Uk0CF/V24VEzoXQmh2FnozQ9RMy X-Received: by 2002:aa7:d313:: with SMTP id p19mr2958997edq.197.1569570989115; Fri, 27 Sep 2019 00:56:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569570989; cv=none; d=google.com; s=arc-20160816; b=Lo6LELRjlEQxQACNoHfXpBCItMXLV74Ya9y3iDQVVtOuE2QuNcDSAvAduQHF764dgE b+H0BYzXNb0jH7lqdnoq5Ci6O48DMezVBw6PYHGcWNwWotvk/tJ0RQqw8rZQ5C4cbCbt Sj9knqHMn76R5OuHj1/GWHo9++48JI6qIroe/3rhT4yAGHRbikBrvpVjvgth4NS7tcbt 2Cq5t6PzcjVEWVhkdOdSdekOvSgDcMM3EOn8V1yOb2bolOuDQoNQbCOUMwCKxfu81Ko/ s7HC+4oF/0V1c6ecsWZhy0dd3MbA/REekUNfH9DLdRhjGV5sz6ahKS89riDkqylpkgSE EPFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version; bh=+z967OGKFyzsipkAMvwKR90NL+oBP6ClqcqeupYnwJM=; b=I5v+UVszQf+ML+tjCU+DKL066UtEiaGHdSdemoqakXzKGQypFuXw9qR0sL1y/klRFv IeofUnxjcw5IuVnubE0cRG0nq31OqZKRIRPxE/tjQRYFKaFxOiC7u5wTnVeMP2CROfo2 9Tl499+cJhXSyy5LMaVHjh8ugjKWSQIXdNFSBDbFZuK6nZdct30hGfwNi+zhV5pBnQ/B zAAhnck5iUIKNtU7dlZ13bgaCoIaxa/u6npM0TjXzIcEbDPic9dxMD0scG1+w0maGHjT jUhSPkpK2RYeRrDWy/OSqBaPtBZBSuAIIZOGdtDk8U9tpr0y25XibEhMbYLS+dL75zGR s1DQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id gh2si2451776ejb.330.2019.09.27.00.56.04; Fri, 27 Sep 2019 00:56:29 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726628AbfI0HxM (ORCPT + 99 others); Fri, 27 Sep 2019 03:53:12 -0400 Received: from mx1.redhat.com ([209.132.183.28]:31305 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725812AbfI0HxL (ORCPT ); Fri, 27 Sep 2019 03:53:11 -0400 Received: from mail-io1-f71.google.com (mail-io1-f71.google.com [209.85.166.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E7969C05AA52 for ; Fri, 27 Sep 2019 07:53:10 +0000 (UTC) Received: by mail-io1-f71.google.com with SMTP id w8so10669022iod.21 for ; Fri, 27 Sep 2019 00:53:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=+z967OGKFyzsipkAMvwKR90NL+oBP6ClqcqeupYnwJM=; b=C6RwmpEjkaWECYOp1TJeL+CsFzrIr0zgH2PuhzEMTW77i+VoPUuFC8aA65Abhz4h5F CDTWIGUQMprzt7IZ2B1z4JFaBjvfnkDjzMsE4kmuJMzR3AE1x7rEfLL6A+NhtHEd2hP6 HM0Y194EUmcSVed2hhPCUoLDyXjbPcsCp/oidCm+gtfyPRj5ak6S+tXVqS+CPEEiLa8a 5puVcpEVhIUBmLBP3JfsrpJpGssbGQvuQ2h+aL+WumaidAQMylL881vdDRnVnbAv5y+E gIpF/pQzQuCL0inaFiw65/HlubaheI5ac+JLkh9GKFi8/sd8XqGNTuoLcPUsS7qYclKO oY9Q== X-Gm-Message-State: APjAAAUSDFi3SP2t8b71oKmplu6WsFB5Jin0oasoL9vE2NMn05iiLxVh 66fYD427hoCrc4EFepqNv3h2nILo+/AfERDwE3tH0zL3ZbuUwOssDZ+QnyjElbKNpydJL4Sh8VN omsjm5Iszq0wLr1DPbAIgyTVa4xmglZmYAeZeLBuc X-Received: by 2002:a92:5c13:: with SMTP id q19mr3296382ilb.249.1569570790131; Fri, 27 Sep 2019 00:53:10 -0700 (PDT) X-Received: by 2002:a92:5c13:: with SMTP id q19mr3296358ilb.249.1569570789810; Fri, 27 Sep 2019 00:53:09 -0700 (PDT) MIME-Version: 1.0 References: <20190910151341.14986-1-kasong@redhat.com> <20190910151341.14986-3-kasong@redhat.com> <20190911055618.GA104115@gmail.com> <20190927054208.GA13426@dhcp-128-65.nay.redhat.com> In-Reply-To: <20190927054208.GA13426@dhcp-128-65.nay.redhat.com> From: Kairui Song Date: Fri, 27 Sep 2019 15:52:58 +0800 Message-ID: Subject: Re: [PATCH v3 2/2] x86/kdump: Reserve extra memory when SME or SEV is active To: Dave Young Cc: Ingo Molnar , Linux Kernel Mailing List , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Thomas Lendacky , Baoquan He , Lianbo Jiang , "the arch/x86 maintainers" , "kexec@lists.infradead.org" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Sep 27, 2019 at 1:42 PM Dave Young wrote: > > On 09/25/19 at 06:36pm, Kairui Song wrote: > > On Wed, Sep 11, 2019 at 1:56 PM Ingo Molnar wrote: > > > * Kairui Song wrote: > > > > > > > Since commit c7753208a94c ("x86, swiotlb: Add memory encryption support"), > > > > SWIOTLB will be enabled even if there is less than 4G of memory when SME > > > > is active, to support DMA of devices that not support address with the > > > > encrypt bit. > > > > > > > > And commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is > > > > active") make the kernel keep SWIOTLB enabled even if there is an IOMMU. > > > > > > > > Then commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory > > > > encryption") will always force SWIOTLB to be enabled when SEV is active > > > > in all cases. > > > > > > > > Now, when either SME or SEV is active, SWIOTLB will be force enabled, > > > > and this is also true for kdump kernel. As a result kdump kernel will > > > > run out of already scarce pre-reserved memory easily. > > > > > > > > So when SME/SEV is active, reserve extra memory for SWIOTLB to ensure > > > > kdump kernel have enough memory, except when "crashkernel=size[KMG],high" > > > > is specified or any offset is used. As for the high reservation case, an > > > > extra low memory region will always be reserved and that is enough for > > > > SWIOTLB. Else if the offset format is used, user should be fully aware > > > > of any possible kdump kernel memory requirement and have to organize the > > > > memory usage carefully. > > > > > > > > Signed-off-by: Kairui Song > > > > --- > > > > arch/x86/kernel/setup.c | 20 +++++++++++++++++--- > > > > 1 file changed, 17 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > > > > index 71f20bb18cb0..ee6a2f1e2226 100644 > > > > --- a/arch/x86/kernel/setup.c > > > > +++ b/arch/x86/kernel/setup.c > > > > @@ -530,7 +530,7 @@ static int __init crashkernel_find_region(unsigned long long *crash_base, > > > > unsigned long long *crash_size, > > > > bool high) > > > > { > > > > - unsigned long long base, size; > > > > + unsigned long long base, size, mem_enc_req = 0; > > > > > > > > base = *crash_base; > > > > size = *crash_size; > > > > @@ -561,11 +561,25 @@ static int __init crashkernel_find_region(unsigned long long *crash_base, > > > > if (high) > > > > goto high_reserve; > > > > > > > > + /* > > > > + * When SME/SEV is active and not using high reserve, > > > > + * it will always required an extra SWIOTLB region. > > > > + */ > > > > + if (mem_encrypt_active()) > > > > + mem_enc_req = ALIGN(swiotlb_size_or_default(), SZ_1M); > > > > + > > > > base = memblock_find_in_range(CRASH_ALIGN, > > > > - CRASH_ADDR_LOW_MAX, size, > > > > + CRASH_ADDR_LOW_MAX, > > > > + size + mem_enc_req, > > > > CRASH_ALIGN); > > > > > > > Hi Ingo, > > > > I re-read my previous reply, it's long and tedious, let me try to make > > a more effective reply: > > > > > What sizes are we talking about here? > > > > The size here is how much memory will be reserved for kdump kernel, to > > ensure kdump kernel and userspace can run without OOM. > > > > > > > > - What is the possible size range of swiotlb_size_or_default() > > > > swiotlb_size_or_default() returns the swiotlb size, it's specified by > > user using swiotlb=, or default size (64MB) > > > > > > > > - What is the size of CRASH_ADDR_LOW_MAX (the old limit)? > > > > It's 4G. > > > > > > > > - Why do we replace one fixed limit with another fixed limit instead of > > > accurately sizing the area, with each required feature adding its own > > > requirement to the reservation size? > > > > It's quite hard to "accurately sizing the area". > > > > No way to tell the exact amount of memory kdump needs, we can only estimate. > > Kdump kernel use different cmdline, drivers and components will have > > special handling for kdump, and userspace is totally different. > > Agreed about your above, but specific this the problem in this patch > There should be other ways. > > First thought about doing generic handling in swiotlb part, and do > something like kdump_memory_reserve(size) Ingo suggested, but according > to you swiotlb init is late, so it can not increase the size, OTOH if > reserve another region for kdump in swiotlb will cause other issues. > > So let's think about other improvement, for example to see if you can > call kdump_memory_reserve(size) in AMD SME init path, for example in > mem_encrypt_init(), is it before crashkernel reservation? Yes, mem_encrypt_init is before the crashkernel reservation. I think this is a good idea. Will make a V4 patch based on this. > > If doable it will be at least cleaner than the code in this patch. > > Thanks > Dave -- Best Regards, Kairui Song