Date: Fri, 25 May 2018 06:59:43 +0200
From: Petr Tesarik
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: Dave Young, dzickus@redhat.com, Neil Horman, Tony Luck,
 bhe@redhat.com, Michael Ellerman, kexec@lists.infradead.org,
 linux-kernel@vger.kernel.org, Martin Schwidefsky,
 Benjamin Herrenschmidt, Hari Bathini, Cong Wang, Andrew Morton,
 Ingo Molnar, Vivek Goyal
Subject: Re: [PATCH] kdump: add default crashkernel reserve kernel config options
Message-ID: <20180525065943.03bcb911@ezekiel.suse.cz>
In-Reply-To: <87k1rt3tdu.fsf@xmission.com>
References: <20180521025337.GA4627@dhcp-128-65.nay.redhat.com>
 <20180521120215.117d963a7619eb0d1f54bced@linux-foundation.org>
 <20180523070641.GA1689@dhcp-128-65.nay.redhat.com>
 <877enucqr0.fsf@xmission.com>
 <20180523222236.5a96732e@ezekiel.suse.cz>
 <20180524014905.GB2031@dhcp-128-65.nay.redhat.com>
 <20180524085708.31aa311d@ezekiel.suse.cz>
 <87k1rt3tdu.fsf@xmission.com>
Organization: SUSE Linux, s.r.o.
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 24 May 2018 11:34:05 -0500
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Petr Tesarik writes:
>
> > On Thu, 24 May 2018 09:49:05 +0800
> > Dave Young wrote:
> >
> >> Hi Petr,
> >>
> >> On 05/23/18 at 10:22pm, Petr Tesarik wrote:
> >>[...]
> >> > In short, if one size fits none, what good is it to hardcode that "one
> >> > size" into the kernel image?
> >>
> >> I agreed with all the things that we can not know the exact memory
> >> requirement for 100% use cases. But that does not means this is useless
> >> it is still useful for common use cases of no special and memory hog
> >> requirements as I mentioned in another reply it can simplify the kdump
> >> deployment for those people who do not need the special setup.
> >
> > I still tend to disagree. This "common-case" reservation depends on
> > things that are defined by user space.
> > It surely does not make it
> > easier to build a distribution kernel. Today, I get bug reports that
> > the number calculated and added to the boot loader configuration by the
> > installer is inaccurate. If I put a fixed number into a kernel config
> > option, I will start getting bugs that this number is incorrect (for
> > some systems).
> >
> >> For example, if this is a workstation I just want to break into a shell
> >> to collect some panic info, then I just need a very minimal initrd, then
> >> the Kconfig will work just fine.
> >
> > What is "a very minimal initrd"? Last time I had to make a significant
> > adjustment to the estimation for openSUSE, this was caused by growing
> > user-space requirements (systemd in this case, but I don't want to
> > start flamewars on that topic, please).
> >
> > Anyway, if you want to improve the "common case", then look how IBM
> > tries to solve it for firmware-assisted dump (fadump) on powerpc:
> >
> > https://patchwork.ozlabs.org/patch/905026/
> >
> > The main idea is:
> >
> >> Instead of setting aside a significant chunk of memory nobody can use,
> >> [...] reserve a significant chunk of memory that the kernel is prevented
> >> from using [...], but applications are free to use it.
> >
> > That works great, because user space pages are filtered out in the
> > common case, so they can be used freely by the panic kernel.
>
> They absolutely can not be used in the kdump case.
>
> The kdump requirement is that they are pages no-one initiates any I/O
> to. To avoid the problem of devices doing DMA as the new kernel starts
> and runs.

Good point. This means that memory reserved for this purpose would also
have to be excluded from allocations that may be eventually used for
DMA transfers.

> Secondarily to avoid problems with cpus that refused to halt.

Let's face it - if some CPUs refused to halt, all bets are off.
The code running on such a CPU can break many other things besides
memory; most importantly, it may meddle with the HW registers of
crucial devices in the system. To be less abstract, I have seen a
failure to stop a CPU in the crashed kernel a few times, and the panic
kernel could never successfully save anything; it always crashed at
boot or a little bit later.

Anyway, of course we would still have to keep the current method,
because user pages are not always filtered. For example, a major SUSE
account runs a database in user space and also inspects its data
structures in case of a system crash.

Petr T