Date: Mon, 11 Mar 2019 08:02:18 +0800
From: Baoquan He
To: Kairui Song
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H.
	Peter Anvin", x86@kernel.org, Alexey Dobriyan, Andrew Morton, Omar Sandoval, Jiri Bohac, Dave Young
Subject: Re: [PATCH v5] x86/gart/kcore: Exclude GART aperture from kcore
Message-ID: <20190311000218.GD21116@MiWiFi-R3L-srv>
References: <20190308030508.13548-1-kasong@redhat.com>
In-Reply-To: <20190308030508.13548-1-kasong@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/08/19 at 11:05am, Kairui Song wrote:
> On machines where the GART aperture is mapped over physical RAM,
> /proc/kcore contains the GART aperture range and reading it may lead
> to kernel panic.
>
> Vmcore used to have the same issue, until we fixed it in
> commit 2a3e83c6f96c ("x86/gart: Exclude GART aperture from vmcore"),
> leveraging existing hook infrastructure in vmcore to let /proc/vmcore
> return zeroes when attempting to read the aperture region, and so it
> won't read from the actual memory.
>
> We apply the same workaround for kcore. First implement the same hook
> infrastructure for kcore, then reuse the hook functions introduced in
> previous vmcore fix. Just with some minor adjustment, rename some
> functions for more general usage, and simplify the hook infrastructure
> a bit as there is no module usage yet.
>
> Suggested-by: Baoquan He
> Signed-off-by: Kairui Song
>
> ---
>

Looks good to me, thanks for the effort.

Acked-by: Baoquan He

Thanks
Baoquan

> Update from V4:
> - Remove the unregistering function and move functions never used after
>   init to .init
>
> Update from V3:
> - Reuse the approach in V2, as Jiri noticed V3 approach may fail in
>   some use cases.
>   It introduces overlapping regions in kcore, and can't
>   guarantee the read request will fall into the region we wanted.
> - Improve some function naming suggested by Baoquan in V2.
> - Simplify the hook registering and checking, we are not exporting the
>   hook register function for now, no need to make it that complex.
>
> Update from V2:
>   Instead of repeating the same hook infrastructure for kcore, introduce
>   a new kcore area type to avoid reading from, and let kcore always bypass
>   this kind of area.
>
> Update from V1:
>   Fix a compile error when CONFIG_PROC_KCORE is not set
>
>  arch/x86/kernel/aperture_64.c | 20 +++++++++++++-------
>  fs/proc/kcore.c               | 27 +++++++++++++++++++++++++++
>  include/linux/kcore.h         |  2 ++
>  3 files changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
> index 58176b56354e..294ed4392a0e 100644
> --- a/arch/x86/kernel/aperture_64.c
> +++ b/arch/x86/kernel/aperture_64.c
> @@ -14,6 +14,7 @@
>  #define pr_fmt(fmt) "AGP: " fmt
>
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -57,7 +58,7 @@ int fallback_aper_force __initdata;
>
>  int fix_aperture __initdata = 1;
>
> -#ifdef CONFIG_PROC_VMCORE
> +#if defined(CONFIG_PROC_VMCORE) || defined(CONFIG_PROC_KCORE)
>  /*
>   * If the first kernel maps the aperture over e820 RAM, the kdump kernel will
>   * use the same range because it will remain configured in the northbridge.
> @@ -66,20 +67,25 @@ int fix_aperture __initdata = 1;
>   */
>  static unsigned long aperture_pfn_start, aperture_page_count;
>
> -static int gart_oldmem_pfn_is_ram(unsigned long pfn)
> +static int gart_mem_pfn_is_ram(unsigned long pfn)
>  {
>  	return likely((pfn < aperture_pfn_start) ||
>  		      (pfn >= aperture_pfn_start + aperture_page_count));
>  }
>
> -static void exclude_from_vmcore(u64 aper_base, u32 aper_order)
> +static void __init exclude_from_core(u64 aper_base, u32 aper_order)
>  {
>  	aperture_pfn_start = aper_base >> PAGE_SHIFT;
>  	aperture_page_count = (32 * 1024 * 1024) << aper_order >> PAGE_SHIFT;
> -	WARN_ON(register_oldmem_pfn_is_ram(&gart_oldmem_pfn_is_ram));
> +#ifdef CONFIG_PROC_VMCORE
> +	WARN_ON(register_oldmem_pfn_is_ram(&gart_mem_pfn_is_ram));
> +#endif
> +#ifdef CONFIG_PROC_KCORE
> +	WARN_ON(register_mem_pfn_is_ram(&gart_mem_pfn_is_ram));
> +#endif
>  }
>  #else
> -static void exclude_from_vmcore(u64 aper_base, u32 aper_order)
> +static void exclude_from_core(u64 aper_base, u32 aper_order)
>  {
>  }
>  #endif
> @@ -474,7 +480,7 @@ int __init gart_iommu_hole_init(void)
>  			 * may have allocated the range over its e820 RAM
>  			 * and fixed up the northbridge
>  			 */
> -			exclude_from_vmcore(last_aper_base, last_aper_order);
> +			exclude_from_core(last_aper_base, last_aper_order);
>
>  			return 1;
>  		}
> @@ -520,7 +526,7 @@ int __init gart_iommu_hole_init(void)
>  		 * overlap with the first kernel's memory. We can't access the
>  		 * range through vmcore even though it should be part of the dump.
>  		 */
> -		exclude_from_vmcore(aper_alloc, aper_order);
> +		exclude_from_core(aper_alloc, aper_order);
>
>  		/* Fix up the north bridges */
>  		for (i = 0; i < amd_nb_bus_dev_ranges[i].dev_limit; i++) {
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index bbcc185062bb..d29d869abec1 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -54,6 +54,28 @@ static LIST_HEAD(kclist_head);
>  static DECLARE_RWSEM(kclist_lock);
>  static int kcore_need_update = 1;
>
> +/*
> + * Returns > 0 for RAM pages, 0 for non-RAM pages, < 0 on error
> + * Same as oldmem_pfn_is_ram in vmcore
> + */
> +static int (*mem_pfn_is_ram)(unsigned long pfn);
> +
> +int __init register_mem_pfn_is_ram(int (*fn)(unsigned long pfn))
> +{
> +	if (mem_pfn_is_ram)
> +		return -EBUSY;
> +	mem_pfn_is_ram = fn;
> +	return 0;
> +}
> +
> +static int pfn_is_ram(unsigned long pfn)
> +{
> +	if (mem_pfn_is_ram)
> +		return mem_pfn_is_ram(pfn);
> +	else
> +		return 1;
> +}
> +
>  /* This doesn't grab kclist_lock, so it should only be used at init time. */
>  void __init kclist_add(struct kcore_list *new, void *addr, size_t size,
>  		       int type)
> @@ -465,6 +487,11 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
>  				goto out;
>  			}
>  			m = NULL;	/* skip the list anchor */
> +		} else if (!pfn_is_ram(__pa(start) >> PAGE_SHIFT)) {
> +			if (clear_user(buffer, tsz)) {
> +				ret = -EFAULT;
> +				goto out;
> +			}
>  		} else if (m->type == KCORE_VMALLOC) {
>  			vread(buf, (char *)start, tsz);
>  			/* we have to zero-fill user buffer even if no read */
> diff --git a/include/linux/kcore.h b/include/linux/kcore.h
> index 8c3f8c14eeaa..c843f4a9c512 100644
> --- a/include/linux/kcore.h
> +++ b/include/linux/kcore.h
> @@ -44,6 +44,8 @@ void kclist_add_remap(struct kcore_list *m, void *addr, void *vaddr, size_t sz)
>  	m->vaddr = (unsigned long)vaddr;
>  	kclist_add(m, addr, sz, KCORE_REMAP);
>  }
> +
> +extern int __init register_mem_pfn_is_ram(int (*fn)(unsigned long pfn));
>  #else
>  static inline
>  void kclist_add(struct kcore_list *new, void *addr, size_t size, int type)
> --
> 2.20.1
>
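
For anyone following the thread, the hook logic described in the commit message can be tried outside the kernel. What follows is only a userspace sketch, not the kernel code. It borrows the function names from the patch, but SKETCH_PAGE_SHIFT (4 KiB pages), sketch_read_page() with its 0xAA fill pattern, and sketch_demo() are assumptions made for illustration. exclude_from_core() also returns the registration status here instead of wrapping it in WARN_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

unsigned long aperture_pfn_start, aperture_page_count;

/* Single-slot hook, modelled on mem_pfn_is_ram in the patch. */
static int (*mem_pfn_is_ram)(unsigned long pfn);

int register_mem_pfn_is_ram(int (*fn)(unsigned long pfn))
{
	if (mem_pfn_is_ram)
		return -EBUSY;	/* only one hook can be registered */
	mem_pfn_is_ram = fn;
	return 0;
}

/* With no hook registered, every pfn is treated as RAM. */
int pfn_is_ram(unsigned long pfn)
{
	return mem_pfn_is_ram ? mem_pfn_is_ram(pfn) : 1;
}

/* Returns 0 for pfns inside the aperture, 1 for everything else. */
int gart_mem_pfn_is_ram(unsigned long pfn)
{
	return pfn < aperture_pfn_start ||
	       pfn >= aperture_pfn_start + aperture_page_count;
}

/* The aperture size is 32 MiB << aper_order, converted to pages. */
int exclude_from_core(uint64_t aper_base, uint32_t aper_order)
{
	uint64_t aper_bytes = (uint64_t)(32 * 1024 * 1024) << aper_order;

	aperture_pfn_start = (unsigned long)(aper_base >> SKETCH_PAGE_SHIFT);
	aperture_page_count = (unsigned long)(aper_bytes >> SKETCH_PAGE_SHIFT);
	return register_mem_pfn_is_ram(gart_mem_pfn_is_ram);
}

/* Hypothetical stand-in for the read_kcore() branch added by the patch. */
void sketch_read_page(unsigned long pfn, unsigned char *buf, size_t len)
{
	if (!pfn_is_ram(pfn))
		memset(buf, 0, len);	/* plays the role of clear_user() */
	else
		memset(buf, 0xAA, len);	/* plays the role of a real read */
}

int sketch_demo(void)
{
	unsigned char page[16];

	/* Hypothetical aperture: base at 2 GiB, order 1, so 64 MiB. */
	if (exclude_from_core(0x80000000ULL, 1))
		return -1;

	sketch_read_page(aperture_pfn_start, page, sizeof(page));
	assert(page[0] == 0x00);	/* inside the aperture: zero-filled */

	sketch_read_page(aperture_pfn_start - 1, page, sizeof(page));
	assert(page[0] == 0xAA);	/* just below the aperture: normal read */

	/* The hook slot is already taken, so a second registration fails. */
	assert(register_mem_pfn_is_ram(gart_mem_pfn_is_ram) == -EBUSY);
	return 0;
}
```

Calling sketch_demo() from a small main() is enough to run it. The single-slot registration mirrors the simplification noted in the changelog: with no module users, one function pointer and an -EBUSY check is all the sketch needs.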