Subject: Re: [PATCH v2] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active
To: Dave Young
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, x86@kernel.org, bhe@redhat.com, jgross@suse.com, dhowells@redhat.com, Thomas.Lendacky@amd.com, ebiederm@xmission.com, vgoyal@redhat.com, kexec@lists.infradead.org
References: <20191007070844.15935-1-lijiang@redhat.com> <20191007093338.GA4710@dhcp-128-65.nay.redhat.com>
From: lijiang
Date: Mon, 7 Oct 2019 19:53:57 +0800
In-Reply-To: <20191007093338.GA4710@dhcp-128-65.nay.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019-10-07 17:33, Dave Young wrote:
> Hi Lianbo,
> On 10/07/19 at 03:08pm, Lianbo Jiang wrote:
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> The kdump kernel reuses the first 640k region for several reasons; for
>> example, the trampoline and the conventional PC system BIOS region may
>> need memory allocated in this area. The kdump kernel therefore also
>> overwrites the first 640k region, so the kernel has to copy its
>> contents to a backup area, which is done in purgatory(), because the
>> vmcore may need the old memory. When the vmcore is dumped, the kdump
>> kernel reads the old memory from the backup area of the first 640k
>> region.
>>
>> The root cause is that the kernel does not correctly handle the first
>> 640k region when SME is active, so it does not properly copy the old
>> memory to the backup area in purgatory(). As a result, the kdump kernel
>> reads incorrect contents from the backup area when dumping the vmcore.
>> The symptom is as follows:
>>
>> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
>> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
>>
>>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>>         CPUS: 128
>>         DATE: Thu Sep 19 08:31:18 2019
>>       UPTIME: 00:01:21
>> LOAD AVERAGE: 0.16, 0.07, 0.02
>>        TASKS: 1343
>>     NODENAME: amd-ethanol
>>      RELEASE: 5.3.0-rc7+
>>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>>      MACHINE: x86_64  (2195 Mhz)
>>       MEMORY: 127.9 GB
>>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>          PID: 9789
>>      COMMAND: "bash"
>>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>>          CPU: 83
>>        STATE: TASK_RUNNING (PANIC)
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
>>
>> BTW: I also tried to fix the above problem in purgatory(), but there
>> are too many restrictions in the purgatory() context; for example, I
>> can't allocate new memory to create the identity-mapping page table
>> for the SME case.
>>
>> Currently, the first 640k area is needed in two places: one is
>> find_trampoline_placement(), the other is reserve_real_mode(), and
>> their content doesn't matter. To avoid the above error, let's reserve
>> the remaining memory of the first 640k region (except for the
>> trampoline and real mode) so that allocated memory does not fall into
>> the first 640k area when SME is active. Then we no longer need to
>> worry about whether the kernel can correctly copy the contents of the
>> first 640k area to a backup region in purgatory().
>>
>> Signed-off-by: Lianbo Jiang
>> ---
>> Changes since v1:
>> 1. Improve patch log
>> 2. Change the checking condition from sme_active() to sme_active()
>>    && strstr(boot_command_line, "crashkernel=")
>>
>>  arch/x86/kernel/setup.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 77ea96b794bd..bdb1a02a84fd 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -1148,6 +1148,9 @@ void __init setup_arch(char **cmdline_p)
>>
>>  	reserve_real_mode();
>>
>> +	if (sme_active() && strstr(boot_command_line, "crashkernel="))
>> +		memblock_reserve(0, 640*1024);
>> +
>
> It seems you missed the comment about "unconditionally do it"; checking
> only the crashkernel param looks better.
>

If so, it means that copying the first 640k to a backup region is no longer
needed, and I should post a patch series to remove copy_backup_region().
Any ideas?

> Also I noticed reserve_crashkernel is called after initmem_init; I'm not
> sure if memblock_reserve is good enough in early code before
> initmem_init.
>

The first zero page and real mode are also reserved before initmem_init(),
and they seem to have worked well so far.

Thanks.
Lianbo

>>  	trim_platform_memory_ranges();
>>  	trim_low_memory_range();
>>
>> --
>> 2.17.1
>>
>
> Thanks
> Dave
>
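
For reference, the two conditions under discussion (the v2 check, and the
"only check crashkernel param" variant) can be compared in a small userspace
sketch. This is only an illustrative model, not kernel code: the helper names,
`struct toy_region`, and `LOW_640K` are hypothetical, and the real patch calls
sme_active() and memblock_reserve() inside setup_arch().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Size of the low-memory region discussed in the thread (640*1024 bytes). */
#define LOW_640K (640UL * 1024UL)

/* Hypothetical stand-in for one reserved range; NOT the kernel's memblock. */
struct toy_region {
	unsigned long base;
	unsigned long size;
};

/*
 * Model of the v2 condition: reserve only when SME is active AND
 * "crashkernel=" appears on the command line. sme_is_active is passed in
 * as a plain flag here (assumption); the patch calls sme_active().
 */
static bool reserve_low_640k_v2(bool sme_is_active, const char *cmdline)
{
	assert(cmdline != NULL);
	return sme_is_active && strstr(cmdline, "crashkernel=") != NULL;
}

/* Model of the reviewer's suggestion: check only the crashkernel param. */
static bool reserve_low_640k_crash_only(const char *cmdline)
{
	assert(cmdline != NULL);
	return strstr(cmdline, "crashkernel=") != NULL;
}

/* True if [base, base + size) overlaps the reserved toy region. */
static bool toy_overlaps(const struct toy_region *r,
			 unsigned long base, unsigned long size)
{
	return base < r->base + r->size && r->base < base + size;
}
```

Note that strstr() is a plain substring match, so this model (like the check in
the patch) only tests whether the string "crashkernel=" occurs anywhere in the
command line; it does not parse the parameter's value.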