From: Tao Liu
Date: Sat, 25 Nov 2023 09:51:54 +0800
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
To: Jiri Bohac
Cc: Baoquan He, Vivek Goyal, Dave Young, kexec@lists.infradead.org,
    linux-kernel@vger.kernel.org, mhocko@suse.cz

Hi Jiri,

On Sat, Nov 25, 2023 at 3:55 AM Jiri Bohac wrote:
>
> Hi,
>
> this series implements a new way to reserve additional crash kernel
> memory using CMA.
>
> Currently, all the memory for the crash kernel is not usable by
> the 1st (production) kernel. It is also unmapped so that it can't
> be corrupted by the fault that will eventually trigger the crash.
> This makes sense for the memory actually used by the kexec-loaded
> crash kernel image and initrd and the data prepared during the
> load (vmcoreinfo, ...). However, the reserved space needs to be
> much larger than that to provide enough run-time memory for the
> crash kernel and the kdump userspace. Estimating the amount of
> memory to reserve is difficult. Being too careful makes kdump
> likely to end in OOM, being too generous takes even more memory
> from the production system. Also, the reservation only allows
> reserving a single contiguous block (or two with the "low"
> suffix). I've seen systems where this fails because the physical
> memory is fragmented.
>
> By reserving additional crashkernel memory from CMA, the main
> crashkernel reservation can be just small enough to fit the
> kernel and initrd image, minimizing the memory taken away from
> the production system.
> Most of the run-time memory for the crash
> kernel will be memory previously available to userspace in the
> production system. As this memory is no longer wasted, the
> reservation can be done with a generous margin, making kdump more
> reliable. Kernel memory that we need to preserve for dumping is
> never allocated from CMA. User data is typically not dumped by
> makedumpfile. When dumping of user data is intended this new CMA
> reservation cannot be used.
>

Thanks for the idea of using CMA as part of the memory for the 2nd kernel.
However, I have a question:

What if there is ongoing DMA/RDMA access to the CMA range when the 1st
kernel crashes? Data could be corrupted if the 2nd kernel and the
DMA/RDMA transfer write to the same place. How would such an issue be
addressed?

Thanks,
Tao Liu

> There are four patches in this series:
>
> The first adds a new ",cma" suffix to the recently introduced generic
> crashkernel parsing code. parse_crashkernel() takes one more
> argument to store the cma reservation size.
>
> The second patch implements reserve_crashkernel_cma() which
> performs the reservation. If the requested size is not available
> in a single range, multiple smaller ranges will be reserved.
>
> The third patch enables the functionality for x86 as a proof of
> concept. There are just three things every arch needs to do:
> - call reserve_crashkernel_cma()
> - include the CMA-reserved ranges in the physical memory map
> - exclude the CMA-reserved ranges from the memory available
>   through /proc/vmcore by excluding them from the vmcoreinfo
>   PT_LOAD ranges.
> Adding other architectures is easy and I can do that as soon as
> this series is merged.
>
> The fourth patch just updates Documentation/
>
> Now, specifying
>     crashkernel=100M crashkernel=1G,cma
> on the command line will make a standard crashkernel reservation
> of 100M, where kexec will load the kernel and initrd.
>
> An additional 1G will be reserved from CMA, still usable by the
> production system.
> The crash kernel will have 1.1G memory
> available. The 100M can be reliably predicted based on the size
> of the kernel and initrd.
>
> When no crashkernel=size,cma is specified, everything works as
> before.
>
> --
> Jiri Bohac
> SUSE Labs, Prague, Czechia
>
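
For reference, the arithmetic behind the quoted command-line example can be
sketched as below. This is only a rough illustration: crashkernel_sizes() is
a hypothetical helper, not the kernel's parse_crashkernel(), and it assumes
just the plain size[KMG][,cma] form with binary units.

```python
import re

# Hypothetical helper for illustration only; this is NOT the kernel's
# parse_crashkernel(). It handles only "crashkernel=<size>[KMG][,cma]"
# and silently ignores other forms (ranges, offsets, ",high", ",low").
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
PATTERN = re.compile(r"(?:^|\s)crashkernel=(\d+)([KMG])(,cma)?(?=\s|$)")


def crashkernel_sizes(cmdline):
    """Return (standard_bytes, cma_bytes) found on a kernel command line."""
    standard = cma = 0
    for match in PATTERN.finditer(cmdline):
        size = int(match.group(1)) * UNITS[match.group(2)]
        if match.group(3):
            cma = size       # reservation taken from CMA
        else:
            standard = size  # conventional crashkernel reservation
    return standard, cma


standard, cma = crashkernel_sizes("crashkernel=100M crashkernel=1G,cma")
total_mib = (standard + cma) >> 20
print(standard >> 20, cma >> 20, total_mib)  # prints "100 1024 1124"
```

100 MiB plus 1024 MiB gives 1124 MiB, which is the roughly 1.1G quoted
above as available to the crash kernel.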