MIME-Version: 1.0
References: <20190213082800.4400-1-kasong@redhat.com>
In-Reply-To:
 <20190213082800.4400-1-kasong@redhat.com>
From: Kairui Song
Date: Mon, 25 Feb 2019 15:42:15 +0800
Message-ID:
Subject: Re: [PATCH v3] x86/gart/kcore: Exclude GART aperture from kcore
To: Jiri Bohac
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
 "the arch/x86 maintainers", Alexey Dobriyan, Andrew Morton,
 Omar Sandoval, Linux Kernel Mailing List, Baoquan He, Dave Young
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 13, 2019 at 4:28 PM Kairui Song wrote:
>
> On machines where the GART aperture is mapped over physical RAM,
> /proc/kcore contains the GART aperture range and reading it may lead
> to a kernel panic.
>
> In 'commit 2a3e83c6f96c ("x86/gart: Exclude GART aperture from vmcore")',
> a workaround is applied for vmcore to let /proc/vmcore return zeroes
> when attempting to read the GART region, as vmcore has the same issue,
> and after 'commit 707d4eefbdb3 ("Revert "[PATCH] Insert GART region
> into resource map"")', userspace tools won't be able to detect the GART
> region, so the kernel has to prevent it from being read.
>
> This patch applies a similar workaround for kcore. Let /proc/kcore
> return zeroes for the GART aperture.
>
> Both vmcore and kcore maintain a memory mapping list. In the vmcore
> workaround we exclude the GART region by registering a hook that checks
> whether a PFN is valid before reading, because vmcore's memory mapping
> could be generated by userspace, which doesn't know about GART. But for
> kcore it is simpler to just alter the memory area list, since kcore's
> area list is always generated by the kernel on init.
>
> Kcore's memory area list is generated very late, so the overlapped area
> can't be excluded when GART is initialized. Instead, simply introduce a
> new area enum type KCORE_NORAM, register the GART aperture as KCORE_NORAM
> and let kcore return zeroes for all KCORE_NORAM areas.
> This fixes the problem well with minor code changes.
>
> ---
> Update from V2:
> Instead of repeating the same hook infrastructure for kcore, introduce
> a new kcore area type to avoid reading from, and let kcore always bypass
> this kind of area.
>
> Update from V1:
> Fix a compile error when CONFIG_PROC_KCORE is not set
>
>  arch/x86/kernel/aperture_64.c | 14 ++++++++++++++
>  fs/proc/kcore.c               | 13 +++++++++++++
>  include/linux/kcore.h         |  1 +
>  3 files changed, 28 insertions(+)
>
> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
> index 58176b56354e..5fb04bdd3221 100644
> --- a/arch/x86/kernel/aperture_64.c
> +++ b/arch/x86/kernel/aperture_64.c
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Using 512M as goal, in case kexec will load kernel_big
> @@ -84,6 +85,17 @@ static void exclude_from_vmcore(u64 aper_base, u32 aper_order)
>  }
>  #endif
>
> +#ifdef CONFIG_PROC_KCORE
> +static struct kcore_list kcore_gart;
> +
> +static void __init exclude_from_kcore(u64 aper_base, u32 aper_order) {
> +	u32 aper_size = (32 * 1024 * 1024) << aper_order;
> +	kclist_add(&kcore_gart, __va(aper_base), aper_size, KCORE_NORAM);
> +}
> +#else
> +static inline void __init exclude_from_kcore(u64 aper_base, u32 aper_order) { }
> +#endif
> +
>  /* This code runs before the PCI subsystem is initialized, so just
>     access the northbridge directly. */
>
> @@ -475,6 +487,7 @@ int __init gart_iommu_hole_init(void)
>  		 * and fixed up the northbridge
>  		 */
>  		exclude_from_vmcore(last_aper_base, last_aper_order);
> +		exclude_from_kcore(last_aper_base, last_aper_order);
>
>  		return 1;
>  	}
> @@ -521,6 +534,7 @@ int __init gart_iommu_hole_init(void)
>  	 * range through vmcore even though it should be part of the dump.
>  	 */
>  	exclude_from_vmcore(aper_alloc, aper_order);
> +	exclude_from_kcore(aper_alloc, aper_order);
>
>  	/* Fix up the north bridges */
>  	for (i = 0; i < amd_nb_bus_dev_ranges[i].dev_limit; i++) {
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index bbcc185062bb..15e0d74d7c56 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -75,6 +75,8 @@ static size_t get_kcore_size(int *nphdr, size_t *phdrs_len, size_t *notes_len,
>  	size = 0;
>
>  	list_for_each_entry(m, &kclist_head, list) {
> +		if (m->type == KCORE_NORAM)
> +			continue;
>  		try = kc_vaddr_to_offset((size_t)m->addr + m->size);
>  		if (try > size)
>  			size = try;
> @@ -256,6 +258,9 @@ static int kcore_update_ram(void)
>  	list_for_each_entry_safe(pos, tmp, &kclist_head, list) {
>  		if (pos->type == KCORE_RAM || pos->type == KCORE_VMEMMAP)
>  			list_move(&pos->list, &garbage);
> +		/* Move NORAM area to head of the new list */
> +		if (pos->type == KCORE_NORAM)
> +			list_move(&pos->list, &list);
>  	}
>  	list_splice_tail(&list, &kclist_head);
>
> @@ -356,6 +361,8 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
>
>  		phdr = &phdrs[1];
>  		list_for_each_entry(m, &kclist_head, list) {
> +			if (m->type == KCORE_NORAM)
> +				continue;
>  			phdr->p_type = PT_LOAD;
>  			phdr->p_flags = PF_R | PF_W | PF_X;
>  			phdr->p_offset = kc_vaddr_to_offset(m->addr) + data_offset;
> @@ -465,6 +472,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
>  				goto out;
>  			}
>  			m = NULL;	/* skip the list anchor */
> +		} else if (m->type == KCORE_NORAM) {
> +			/* for NORAM area just fill zero */
> +			if (clear_user(buffer, tsz)) {
> +				ret = -EFAULT;
> +				goto out;
> +			}
>  		} else if (m->type == KCORE_VMALLOC) {
>  			vread(buf, (char *)start, tsz);
>  			/* we have to zero-fill user buffer even if no read */
> diff --git a/include/linux/kcore.h b/include/linux/kcore.h
> index 8c3f8c14eeaa..372a4093f794 100644
> --- a/include/linux/kcore.h
> +++ b/include/linux/kcore.h
> @@ -13,6 +13,7 @@ enum kcore_type {
>  	KCORE_USER,
>  	KCORE_OTHER,
>  	KCORE_REMAP,
> +	KCORE_NORAM,
>  };
>
>  struct kcore_list {
> --
> 2.20.1
>

Hi Jiri,

Could you take a look at this fix too? I saw that you reviewed V1, and
this V3 changes the approach considerably.

Thanks!

--
Best Regards,
Kairui Song
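
The read-path behaviour described in the changelog above can be sketched in plain userspace C. This is only an illustrative model under stated assumptions, not kernel code: `struct area`, `AREA_RAM`, `AREA_NORAM`, `aperture_size()`, `find_area()` and `read_range()` are hypothetical names standing in for the kcore list, `KCORE_RAM`, `KCORE_NORAM` and the read loop. It shows two ideas from the patch: the aperture size is 32 MiB shifted left by the order, and a no-RAM entry placed ahead of an overlapping RAM entry makes reads of that range return zeroes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's kcore area types. */
enum area_type { AREA_RAM, AREA_NORAM };

struct area {
	uintptr_t start;              /* first address covered by the area */
	size_t size;                  /* length in bytes */
	enum area_type type;
	const unsigned char *backing; /* contents for AREA_RAM; unused for AREA_NORAM */
};

/*
 * Aperture size as computed in the patch: 32 MiB shifted left by the order.
 * Assumption: widened to 64 bits here, whereas the patch stores it in a u32.
 */
uint64_t aperture_size(uint32_t aper_order)
{
	return (uint64_t)(32 * 1024 * 1024) << aper_order;
}

/* Return the first area containing addr, so earlier entries win on overlap. */
const struct area *find_area(const struct area *areas, size_t n,
			     uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= areas[i].start &&
		    addr - areas[i].start < areas[i].size)
			return &areas[i];
	return NULL;
}

/* Fill dst with len bytes starting at addr; NORAM and unmapped bytes read as 0. */
void read_range(const struct area *areas, size_t n, uintptr_t addr,
		unsigned char *dst, size_t len)
{
	assert(dst != NULL);
	for (size_t i = 0; i < len; i++) {
		const struct area *a = find_area(areas, n, addr + i);

		if (a && a->type == AREA_RAM)
			dst[i] = a->backing[addr + i - a->start];
		else
			dst[i] = 0;
	}
}
```

Listing the no-RAM entry first mirrors the patch's choice to move KCORE_NORAM areas to the head of the rebuilt list, so the overlap is resolved by list order rather than by splitting the RAM ranges.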