References: <20190910151341.14986-1-kasong@redhat.com> <20190910151341.14986-3-kasong@redhat.com> <20190911055618.GA104115@gmail.com>
In-Reply-To: <20190911055618.GA104115@gmail.com>
From: Kairui Song
Date: Wed, 25 Sep 2019 18:36:36 +0800
Subject: Re: [PATCH v3 2/2] x86/kdump: Reserve extra memory when SME or SEV is active
To: Ingo Molnar
Cc: Linux Kernel Mailing List, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Thomas Lendacky, Baoquan He, Lianbo Jiang, Dave Young, "the arch/x86 maintainers", "kexec@lists.infradead.org"
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 11, 2019 at 1:56 PM Ingo Molnar wrote:
> * Kairui Song wrote:
>
> > Since commit c7753208a94c ("x86, swiotlb: Add memory encryption support"),
> > SWIOTLB will be enabled even if there is less than 4G of memory when SME
> > is active, to support DMA of devices that not support address with the
> > encrypt bit.
> >
> > And commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is
> > active") make the kernel keep SWIOTLB enabled even if there is an IOMMU.
> >
> > Then commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory
> > encryption") will always force SWIOTLB to be enabled when SEV is active
> > in all cases.
> >
> > Now, when either SME or SEV is active, SWIOTLB will be force enabled,
> > and this is also true for kdump kernel. As a result kdump kernel will
> > run out of already scarce pre-reserved memory easily.
> >
> > So when SME/SEV is active, reserve extra memory for SWIOTLB to ensure
> > kdump kernel have enough memory, except when "crashkernel=size[KMG],high"
> > is specified or any offset is used. As for the high reservation case, an
> > extra low memory region will always be reserved and that is enough for
> > SWIOTLB. Else if the offset format is used, user should be fully aware
> > of any possible kdump kernel memory requirement and have to organize the
> > memory usage carefully.
> >
> > Signed-off-by: Kairui Song
> > ---
> >  arch/x86/kernel/setup.c | 20 +++++++++++++++++---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 71f20bb18cb0..ee6a2f1e2226 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -530,7 +530,7 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
> >  					   unsigned long long *crash_size,
> >  					   bool high)
> >  {
> > -	unsigned long long base, size;
> > +	unsigned long long base, size, mem_enc_req = 0;
> >
> >  	base = *crash_base;
> >  	size = *crash_size;
> > @@ -561,11 +561,25 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
> >  	if (high)
> >  		goto high_reserve;
> >
> > +	/*
> > +	 * When SME/SEV is active and not using high reserve,
> > +	 * it will always required an extra SWIOTLB region.
> > +	 */
> > +	if (mem_encrypt_active())
> > +		mem_enc_req = ALIGN(swiotlb_size_or_default(), SZ_1M);
> > +
> >  	base = memblock_find_in_range(CRASH_ALIGN,
> > -					      CRASH_ADDR_LOW_MAX, size,
> > +					      CRASH_ADDR_LOW_MAX,
> > +					      size + mem_enc_req,
> >  					      CRASH_ALIGN);
>

Hi Ingo,

I re-read my previous reply; it was long and tedious, so let me try to
give a more effective one:

> What sizes are we talking about here?

The size here is how much memory will be reserved for the kdump kernel,
to ensure that the kdump kernel and its userspace can run without OOM.
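
To make the sizing in the quoted patch concrete, here is a minimal userspace sketch of the arithmetic on the non-"high" path. It is not kernel code: align_up() stands in for the kernel's ALIGN() macro, and crash_low_request() and its parameters are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (1ULL << 20)

/* Round x up to a multiple of a; a must be a power of two (stand-in for ALIGN()). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper: the size that would be searched for in low memory
 * when the reservation is not a "high" one. swiotlb_bytes models the value
 * returned by swiotlb_size_or_default(); mem_encrypt models the result of
 * mem_encrypt_active().
 */
static uint64_t crash_low_request(uint64_t crash_size, uint64_t swiotlb_bytes,
				  int mem_encrypt)
{
	uint64_t mem_enc_req = 0;

	if (mem_encrypt)
		mem_enc_req = align_up(swiotlb_bytes, SZ_1M);

	return crash_size + mem_enc_req;
}
```

For example, under these assumptions a crashkernel=256M request with a 64MB SWIOTLB would search for 320MB when memory encryption is active, and for 256MB otherwise.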
>
> - What is the possible size range of swiotlb_size_or_default()

swiotlb_size_or_default() returns the SWIOTLB size: either the value the
user specified with swiotlb=, or the default size (64MB).

>
> - What is the size of CRASH_ADDR_LOW_MAX (the old limit)?

It's 4G.

>
> - Why do we replace one fixed limit with another fixed limit instead of
>   accurately sizing the area, with each required feature adding its own
>   requirement to the reservation size?

Accurately sizing the area is quite hard. There is no way to tell the
exact amount of memory kdump needs; we can only estimate. The kdump
kernel uses a different cmdline, drivers and components have special
handling for kdump, and the userspace is totally different.

>
> I.e. please engineer this into a proper solution instead of just
> modifying it around the edges.
>
> For example have you considered adding some sort of
> kdump_memory_reserve(size) facility, which increases the reservation size
> as something like SWIOTLB gets activated? That would avoid the ugly
> mem_encrypt_active() flag, it would just automagically work.

My first attempt was to increase the crashkernel memory when SWIOTLB is
activated, but there are problems.

First, SME/SEV is currently the only case in which both kernels require
SWIOTLB; in most other cases the extra reservation would just waste
memory.

Even if we don't care about the wasted memory, the kernel would have to
check/reserve/free crashkernel memory at three different points:

1. Early boot:
   - The crash kernel reserves a region as usual.
2. Right before memblock frees memory:
   - If SWIOTLB is activated, the crash kernel should reserve another
     region.
3. After initcalls:
   - SWIOTLB may get deactivated by initcalls, so a later check is needed
     to decide whether to release the region reserved in step 2.

That is more complex.

As for a "kdump_memory_reserve(size)" facility: as discussed above, it
is hard to know how much memory kdump needs, and for now it is also hard
to find any user of it.

Please let me know if I failed to make something clear or have
misunderstood anything.
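
The three checkpoints described above can be sketched as a small state model, to show why that approach needs more bookkeeping than a single early reservation. This is a userspace illustration under stated assumptions, not a proposed implementation: enum phase, struct crash_resv and crash_resv_step() are hypothetical names, and the real kernel would be reserving and freeing memblock regions rather than updating counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the three points at which a check would be needed. */
enum phase { EARLY_BOOT, BEFORE_MEMBLOCK_FREE, AFTER_INITCALLS };

struct crash_resv {
	uint64_t base_bytes;	/* region reserved at early boot */
	uint64_t extra_bytes;	/* extra region for SWIOTLB, 0 if none */
};

/* Apply the check for one phase, given whether SWIOTLB is active at that point. */
static void crash_resv_step(struct crash_resv *r, enum phase p,
			    bool swiotlb_active, uint64_t crash_size,
			    uint64_t swiotlb_bytes)
{
	assert(r != NULL);

	switch (p) {
	case EARLY_BOOT:
		/* 1. Reserve the crash kernel region as usual. */
		r->base_bytes = crash_size;
		r->extra_bytes = 0;
		break;
	case BEFORE_MEMBLOCK_FREE:
		/* 2. SWIOTLB got activated: reserve another region. */
		if (swiotlb_active)
			r->extra_bytes = swiotlb_bytes;
		break;
	case AFTER_INITCALLS:
		/* 3. SWIOTLB was deactivated by an initcall: release it again. */
		if (!swiotlb_active)
			r->extra_bytes = 0;
		break;
	}
}
```

Even in this simplified form, the extra region can be reserved in step 2 and dropped again in step 3, which is the reserve/free churn the single `size + mem_enc_req` request in the patch avoids.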
>
> Thanks,
>
> 	Ingo

--
Best Regards,
Kairui Song