Date: Tue, 13 Aug 2019 10:54:41 +0800
From: Dave Young
To: Michal Hocko
Cc: Paul Menzel, linux-pci@vger.kernel.org, Jörg Rödel, "x86@kernel.org", kexec@lists.infradead.org, Linux Kernel Mailing List, iommu@lists.linux-foundation.org, kasong@redhat.com, lijiang@redhat.com, Donald Buczek
Subject: Re: Crash kernel with 256 MB reserved memory runs into OOM condition
Message-ID: <20190813025441.GA2979@dhcp-128-65.nay.redhat.com>
References: <20190812095029.GE5117@dhcp22.suse.cz> <20190813024317.GA2862@dhcp-128-65.nay.redhat.com> <20190813024600.GA2944@dhcp-128-65.nay.redhat.com>
In-Reply-To: <20190813024600.GA2944@dhcp-128-65.nay.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/13/19 at 10:46am, Dave Young wrote:
> Add more cc.
> On 08/13/19 at 10:43am, Dave Young wrote:
> > Hi,
> >
> > On 08/12/19 at 11:50am, Michal Hocko wrote:
> > > On Mon 12-08-19 11:42:33, Paul Menzel wrote:
> > > > Dear Linux folks,
> > > >
> > > >
> > > > On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
> > > > 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
> > > >
> > > > Please find the messages of the normal and the crash kernel attached.
> > >
> > > You will need more memory to reserve for the crash kernel because ...
> > >
> > > > [ 4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > > > [ 4.573612] lowmem_reserve[]: 0 125 125 125
> > > > [ 4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
> > >
> > > ... the memory is really depleted and nothing to be reclaimed (no anon or
> > > file pages). Look how the free memory is below the min watermark (node zone DMA has
> > > lowmem protection for GFP_KERNEL allocation).
> >
> > We found a similar issue on our side while working on kdump on SME enabled
> > systems. Kairui is working on some patches.
> >
> > Actually on those SME/SEV enabled machines, swiotlb is enabled
> > automatically so at least we need extra 64M+ memory for kdump other
> > than the normal expectation.
> >
> > Can you check if this is also your case?
>
> The question is to Paul, also it would be always good to cc kexec mail
> list for kexec and kdump issues.

Looks like a hardware IOMMU is used; maybe you have not enabled SME? Also, replacing maxcpus=1 with nr_cpus=1 can save some memory, so it is worth a try.
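
For reference, the watermark comparison Michal describes can be checked directly from the quoted zone lines. Below is a minimal Python sketch, assuming only the `key:valuekB` layout visible in the log above. The helper names `parse_zone` and `below_min` are made up for illustration, and the log lines are shortened to the fields actually used.

```python
import re

# Zone lines taken from the crash-kernel log quoted above,
# shortened to the fields this sketch looks at.
ZONE_LINES = [
    "Node 0 DMA free:484kB min:4kB low:4kB high:4kB present:568kB managed:484kB",
    "Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB present:261560kB managed:133752kB",
]


def parse_zone(line):
    """Return (zone_name, {field: value_in_kB}) for one 'Node N ZONE ...' line."""
    zone = re.match(r"Node \d+ (\S+) ", line).group(1)
    fields = {key: int(val) for key, val in re.findall(r"(\w+):(\d+)kB", line)}
    return zone, fields


def below_min(line):
    """Return (zone_name, True if free memory is under the min watermark)."""
    zone, fields = parse_zone(line)
    return zone, fields["free"] < fields["min"]


if __name__ == "__main__":
    for line in ZONE_LINES:
        zone, starved = below_min(line)
        state = "below" if starved else "above"
        print(f"{zone}: free is {state} the min watermark")
```

With the numbers from the log, DMA32 has free:1404kB against min:1428kB, so it is the zone that falls under its min watermark, while the tiny DMA zone does not.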