Date: Tue, 22 May 2018 09:43:48 +0800
From: Dave Young
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, Cong Wang,
 Neil Horman, Ingo Molnar, "Eric W. Biederman", Vivek Goyal, Tony Luck,
 Anton Vorontsov, Michael Ellerman, Benjamin Herrenschmidt,
 Martin Schwidefsky, Hari Bathini, dzickus@redhat.com, bhe@redhat.com
Subject: Re: [PATCH] kdump: add default crashkernel reserve kernel config options
Message-ID: <20180522014348.GA6827@dhcp-128-65.nay.redhat.com>
References: <20180521025337.GA4627@dhcp-128-65.nay.redhat.com>
 <20180521120215.117d963a7619eb0d1f54bced@linux-foundation.org>
In-Reply-To: <20180521120215.117d963a7619eb0d1f54bced@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/21/18 at 12:02pm, Andrew Morton wrote:
> On Mon, 21 May 2018 10:53:37 +0800 Dave Young wrote:
>
> > This is a rework of the crashkernel=auto patches back to 2009 although
> > I'm not sure if below is the last version of the old effort:
> > https://lkml.org/lkml/2009/8/12/61
> > https://lwn.net/Articles/345344/
> >
> > I changed the original design, instead of adding the auto reserve logic
> > in code, in this patch just introduce two kernel config options for
> > the default crashkernel value in MB and the threshold of system memory
> > in MB so that only reserve default when system memory is equal or
> > above the threshold.
> >
> > With the kernel configs distributions can easily change the default
> > values so that people do not need to manually set kernel cmdline
> > for common use cases and one can still overwrite the default value
> > with manual setup or disable it by using crashkernel=0
> >
> > Signed-off-by: Dave Young
> > ---
> > Another difference is with original design the crashkernel size scales
> > with system memory, according to test, large machine may need more
> > memory in kdump kernel because of several factors:
> > 1. cpu numbers, because of the percpu memory allocated for cpus.
> >    (kdump can use nr_cpus=1 to workaround this, but some
> >    arches do not support nr_cpus=X for example powerpc)
> > 2. IO devices, large system can have a lot of io devices, although we
> >    can try to only add those device drivers we needed, it is still a
> >    problem because of some built-in drivers, some stacked logical devices
> >    eg. device mapper devices, acpi etc.  Even if only considering the
> >    meta data for driver model it will still be a big number eg. sysfs
> >    files etc.
> > 3. The minimum memory requirement for some device drivers are big, even
> >    if some of them have implemented low memory profile.  It is usual to see
> >    10M memory use for a storage driver.
> > 4. user space initramfs size growing.  Busybox is not usable if we need
> >    to add udev support and some complicated storage support.  Use dracut
> >    with systemd, especially networking stuff need more memory.
> >
> > So probably add another kernel config option to scale the memory size
> > eg. CRASHKERNEL_DEFAULT_SCALE_RATIO is also good to have, in RHEL we
> > use base_value + system_mem >> (2^14) for x86.  I'm still hesitating
> > how to describe and add this option. Any suggestions will be appreciated.
> >
> > ...
> >
> > --- linux-x86.orig/arch/Kconfig
> > +++ linux-x86/arch/Kconfig
> > @@ -10,6 +10,22 @@ config KEXEC_CORE
> >  	select CRASH_CORE
> >  	bool
> >
> > +config CRASHKERNEL_DEFAULT_THRESHOLD_MB
> > +	int "System memory size threshold for kdump memory default reserving"
> > +	depends on CRASH_CORE
> > +	default 0
> > +	help
> > +	  CRASHKERNEL_DEFAULT_MB is used as default crashkernel value if
> > +	  the system memory size is equal or bigger than the threshold.
>
> "the threshold" is rather vague.  Can it be clarified?
>
> In fact I'm really struggling to understand the logic here....
>
> > +config CRASHKERNEL_DEFAULT_MB
> > +	int "Default crashkernel memory size reserved for kdump"
> > +	depends on CRASH_CORE
> > +	default 0
> > +	help
> > +	  This is used as the default kdump reserved memory size in MB.
> > +	  crashkernel=X kernel cmdline can overwrite this value.
> > +
> >  config HAVE_IMA_KEXEC
> >  	bool
> >
> > @@ -143,6 +144,24 @@ static int __init parse_crashkernel_simp
> >  	return 0;
> >  }
> >
> > +static int __init get_crashkernel_default(unsigned long long system_ram,
> > +					  unsigned long long *size)
> > +{
> > +	unsigned long long sz = CONFIG_CRASHKERNEL_DEFAULT_MB;
> > +	unsigned long long thres = CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB;
> > +
> > +	thres *= SZ_1M;
> > +	sz *= SZ_1M;
> > +
> > +	if (sz >= system_ram || system_ram < thres) {
> > +		pr_debug("crashkernel default size can not be used.\n");
> > +		return -EINVAL;
>
> In other words,
>
> 	if (system_ram <= CONFIG_CRASHKERNEL_DEFAULT_MB ||
> 	    system_ram < CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB)
> 		fail;
>
> yes?

The first comparison is a sanity check on the default reserved size: if
it is equal to or bigger than the system RAM size, it is clearly bad:

	if (CONFIG_CRASHKERNEL_DEFAULT_MB >= system_ram)
		fail;

The second comparison implements the threshold setting. It is there by
design:

	if (system_ram >= CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB)
		go ahead and use the default value of CONFIG_CRASHKERNEL_DEFAULT_MB

>
> How come?  What's happening here?  Perhaps a (good) explanatory comment
> is needed.  And clearer Kconfig text.
>
> All confused :(

Hmm, let me scratch my head~ I will think about how to describe it
better. If you have any suggestions, just let me know :)

Thanks
Dave
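
To make the two checks concrete, here is a minimal standalone sketch in
userspace C. It is only an illustration, not the code from the patch: the
function name crashkernel_default_bytes() and its parameters are
hypothetical, the two config options are passed in as plain arguments, and
the success path (storing the size and returning 0) is an assumption, since
the quoted hunk is cut off after the failure branch. It also keeps units
explicit: system_ram is in bytes, while both config values are in MB and
are converted before comparing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M (1024ULL * 1024ULL)

/*
 * Hypothetical userspace rendering of the two checks discussed above.
 * system_ram is in bytes; default_mb and threshold_mb stand in for
 * CONFIG_CRASHKERNEL_DEFAULT_MB and CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB.
 * Returns 0 and stores the reservation size in bytes, or -EINVAL if the
 * default cannot be used.
 */
static int crashkernel_default_bytes(uint64_t system_ram, uint64_t default_mb,
                                     uint64_t threshold_mb, uint64_t *size)
{
    uint64_t sz = default_mb * SZ_1M;      /* MB -> bytes */
    uint64_t thres = threshold_mb * SZ_1M; /* MB -> bytes */

    assert(size != NULL);

    /* Sanity check: the reservation must be smaller than system RAM. */
    if (sz >= system_ram)
        return -EINVAL;

    /* Policy check: only machines at or above the threshold get the default. */
    if (system_ram < thres)
        return -EINVAL;

    /* Assumed success path; not visible in the quoted hunk. */
    *size = sz;
    return 0;
}
```

With sample values of an 8 GiB machine, a 256 MB default and a 2048 MB
threshold, both checks pass and the result is 256 MiB. With the same
settings on a 1 GiB machine the threshold check fails, and with a 256 MB
default on a 128 MiB machine the sanity check fails regardless of the
threshold.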