Subject: Re: crash: `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore
From: lijiang
To: "Lendacky, Thomas", Dave Young
Cc: "linux-kernel@vger.kernel.org", Dave Anderson, "kexec@lists.infradead.org", "vgoyal@redhat.com", "bhe@redhat.com", "ebiederm@xmission.com"
Message-ID: <9d2fa8a6-6b33-dd68-4dca-a84f7e68885f@redhat.com>
Date: Fri, 30 Aug 2019 19:27:10 +0800

On 2019-08-17 15:23, lijiang wrote:
> On 2019-08-11 10:29, lijiang wrote:
>> On 2019-08-09 06:37, Lendacky, Thomas wrote:
>>> On 8/1/19 8:05 PM, Dave Young wrote:
>>>> Add kexec cc list.
>>>> On 08/01/19 at 11:02pm, lijiang wrote:
>>>>> Hi, Tom
>>>>>
>>>>> Recently, I ran into a problem with SME and used the crash tool to check the vmcore as follows:
>>>>>
>>>>> crash> kmem -s | grep -i invalid
>>>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>>>> kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
>>>>>
>>>>> The crash tool reported the above error. Probably the main reason is that the kernel does not
>>>>> correctly handle the first 640k region when SME is enabled.
>>>>>
>>>>> When SME is enabled, the kernel and initramfs images are loaded into decrypted memory, and
>>>>> the backup area (for the first 640k) is also mapped as decrypted, but the first 640k of data is
>>>>> copied to the backup area in purgatory(). Please refer to this file: arch/x86/purgatory/purgatory.c
>>>>> ......
>>>>> static int copy_backup_region(void)
>>>>> {
>>>>>         if (purgatory_backup_dest) {
>>>>>                 memcpy((void *)purgatory_backup_dest,
>>>>>                        (void *)purgatory_backup_src, purgatory_backup_sz);
>>>>>         }
>>>>>         return 0;
>>>>> }
>>>>> ......
>>>>>
>>>>> arch/x86/kernel/machine_kexec_64.c
>>>>> ......
>>>>> machine_kexec_prepare()->
>>>>> arch_update_purgatory()->
>>>>> .....
>>>>>
>>>>> Actually, the first 640k area is encrypted in the first kernel when SME is enabled. Here the kernel
>>>>> copies the first 640k of data to the backup area in purgatory(). Because the backup area is mapped
>>>>> as decrypted, this copy saves the first 640k of data to the backup area decrypted (decoded), but
>>>>> the kernel is probably not aware of SME in purgatory(), which causes it to read out the first 640k
>>>>> incorrectly.
>>>>>
>>>>> In addition, I hacked the kernel code as follows:
>>>>>
>>>>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>>>>> index 7bcc92add72c..a51631d36a7a 100644
>>>>> --- a/fs/proc/vmcore.c
>>>>> +++ b/fs/proc/vmcore.c
>>>>> @@ -377,6 +378,16 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>>>>>                                             m->offset + m->size - *fpos,
>>>>>                                             buflen);
>>>>>                         start = m->paddr + *fpos - m->offset;
>>>>> +                       if (m->paddr == 0x73f60000) {//the backup area's start address:0x73f60000
>>>>> +                               tmp = read_from_oldmem(buffer, tsz, &start,
>>>>> +                                               userbuf, false);
>>>>> +                       } else
>>>>>                         tmp = read_from_oldmem(buffer, tsz, &start,
>>>>>                                         userbuf, mem_encrypt_active());
>>>>>                         if (tmp < 0)
>>>>>
>>>>> Here, I used the crash tool to check the vmcore, and I can see that the backup area is decrypted,
>>>>> except for the dma-kmalloc-512. So I suspect that the kernel did not correctly read out the first
>>>>> 640k of data to the backup area. Do you happen to know how to deal with the first 640k area in
>>>>> purgatory() when SME is enabled? Any ideas?
>>>
>>> I'm not all that familiar with kexec and purgatory, etc., but I think
>>> that you want to set up the page table that is active when purgatory runs
>>> so that the src and dest both have the SME encryption mask set in their
>>> respective page table entries. This way, when the copy is performed,
>>> everything is copied correctly.
>>
>> Exactly. That's just what I was thinking.
>>
>
> I tried to set up the 1:1 mapping in init_pgtable() with the memory encryption mask, but that still
> did not correctly access the encrypted memory in purgatory(). I'm not sure whether I missed anything
> else; I'm still digging into it.
>

As we know, the kdump kernel will reuse the first 640k region, so the old content of the first 640k
area is copied to a backup area, which is done in purgatory(). When dumping the vmcore, the kdump
kernel reads the old content of the first 640k area from the backup area.

According to the above description, when SME is enabled in the first kernel, the kernel has to set up
the identity mapping for the first 640k area with the encryption mask so that it can correctly access
the old memory, and also set up the identity mapping for the backup region with the encryption mask.
But the kdump kernel won't properly deal with the encrypted memory before SME is enabled, which causes
the kdump kernel to fail to boot. So I planned to set up a temporary page-table mapping with the
encryption mask for the first 640k area and the backup region in purgatory().

> I guess the 1:1 mapping should be made in the purgatory context instead of in init_pgtable(). Does
> anyone happen to know how to make the 1:1 mapping with the memory encryption mask in purgatory() context?
>

I have initiated the SME related code in purgatory(), and also got the value of sme_me_mask, but there
are too many restrictions in the purgatory() context. For example, I cannot allocate memory for creating
the temporary page-table mapping with the encryption mask, which blocks my attempt. Any ideas?

> In addition, there is another way: avoid encrypting the first 640k area. When SME is enabled, do not
> encrypt the first 640k area; skip it. Tom, do you happen to know how to do it? (BTW: I tried
> to do it; unfortunately, that failed.) But that also requires extra work when dumping the vmcore
> (the vmcore must be dumped according to whether the first 640k area is encrypted).
>

I rethought it; that could cause trouble for memory management.

In addition, I will also try to copy the old memory in the first 640k area to the backup area at the
early boot stage of the kdump kernel (after SME is enabled). But I need to ensure that the old memory
is not overwritten.

Any suggestions will be appreciated.

Thanks.
Lianbo

> Thanks.
> Lianbo
>
>>> Remember, encrypted data from one page
>>> cannot be directly copied as unencrypted data and decrypted properly in
>>> the new location (e.g. a page of zeroes encrypted at one address will not
>>> appear the same as a page of zeroes encrypted at a different address).
>>
>> Yes, that's right. Thank you, Tom.
>>
>> I'm considering how to solve it, and I guess that it probably needs to be dealt with properly
>> in purgatory().
>>
>> Thanks.
>> Lianbo
>>
>>>
>>> Thanks,
>>> Tom
>>>
>>>>>
>>>>> BTW: I'm curious why the address of dma-kmalloc-512 always falls into the first 640k
>>>>> region, and I did not see the same issue on another machine.
>>>>>
>>>>> Machine:
>>>>> Serial Number diesel-sys9079-0001
>>>>> Model AMD Diesel (A0C)
>>>>> CPU AMD EPYC 7601 32-Core Processor
>>>>>
>>>>>
>>>>> Background:
>>>>> On x86_64, the first 640k region is special for historical reasons. The kdump kernel will
>>>>> reuse the first 640k region, so the kernel backs up (copies) the first 640k region to a backup
>>>>> area in purgatory(), in order not to overwrite the old region (640k) in the kdump kernel, which
>>>>> makes sure that kdump can read out the old memory from the vmcore.
>>>>>
>>>>>
>>>>> Thanks.
>>>>> Lianbo
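
Tom's remark quoted above (a page of zeroes encrypted at one address does not look like a page of
zeroes encrypted at another) can be illustrated with a small toy model. This is only a sketch under
loud assumptions: the XOR "cipher" in toy_tweak() is a made-up stand-in, not the AES-based algorithm
SME really uses; enc_write()/enc_read() are hypothetical helpers that mimic accesses through a mapping
with the encryption mask set; SRC_PADDR is an arbitrary address inside the first 640k, and DST_PADDR
is the backup area address from the vmcore.c hack above.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE   64u            /* toy "page" size, far smaller than a real page */
#define SRC_PADDR  0x1000ULL      /* hypothetical address inside the first 640k */
#define DST_PADDR  0x73f60000ULL  /* backup area start, as in the vmcore.c hack */

/* Made-up per-address tweak; stands in for "the cipher depends on the physical address". */
static uint8_t toy_tweak(uint64_t paddr)
{
    uint64_t pfn = paddr >> 12;
    return (uint8_t)(pfn * 31u + paddr * 7u + 0x5Au);
}

/* Store plaintext through an "encrypted mapping": RAM ends up holding ciphertext. */
static void enc_write(uint8_t *ram, uint64_t paddr, const uint8_t *plain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ram[i] = plain[i] ^ toy_tweak(paddr + i);
}

/* Load through an "encrypted mapping": ciphertext in RAM is decrypted for paddr. */
static void enc_read(const uint8_t *ram, uint64_t paddr, uint8_t *plain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        plain[i] = ram[i] ^ toy_tweak(paddr + i);
}

/* Returns 1 if a page of zeroes encrypts to different bytes at the two addresses. */
static int ciphertext_differs(void)
{
    uint8_t zero[TOY_PAGE] = {0}, a[TOY_PAGE], b[TOY_PAGE];

    enc_write(a, SRC_PADDR, zero, TOY_PAGE);
    enc_write(b, DST_PADDR, zero, TOY_PAGE);
    return memcmp(a, b, TOY_PAGE) != 0;
}

/* Copy raw ciphertext (no encryption mask on either side), then read the
 * destination through an encrypted mapping. Returns 1 if the data survived. */
static int raw_copy_roundtrip_ok(void)
{
    uint8_t zero[TOY_PAGE] = {0}, src[TOY_PAGE], dst[TOY_PAGE], out[TOY_PAGE];

    enc_write(src, SRC_PADDR, zero, TOY_PAGE);
    memcpy(dst, src, TOY_PAGE);               /* ciphertext moved as-is */
    enc_read(dst, DST_PADDR, out, TOY_PAGE);  /* decrypted for the wrong address */
    return memcmp(out, zero, TOY_PAGE) == 0;
}

/* Copy with both source and destination accessed through encrypted mappings,
 * which is what Tom suggests. Returns 1 if the data survived. */
static int mapped_copy_roundtrip_ok(void)
{
    uint8_t zero[TOY_PAGE] = {0}, src[TOY_PAGE], dst[TOY_PAGE];
    uint8_t tmp[TOY_PAGE], out[TOY_PAGE];

    enc_write(src, SRC_PADDR, zero, TOY_PAGE);
    enc_read(src, SRC_PADDR, tmp, TOY_PAGE);   /* decrypt on load */
    enc_write(dst, DST_PADDR, tmp, TOY_PAGE);  /* re-encrypt on store */
    enc_read(dst, DST_PADDR, out, TOY_PAGE);
    return memcmp(out, zero, TOY_PAGE) == 0;
}
```

In this model ciphertext_differs() and mapped_copy_roundtrip_ok() return 1, while
raw_copy_roundtrip_ok() returns 0. It says nothing about how to build such mappings inside
purgatory(), which is the open question in this thread.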