Date: Thu, 10 Oct 2019 09:35:26 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Naoya Horiguchi, Andrew Morton
Subject: Re: [PATCH v2 2/2] mm/memory-failure.c: Don't access uninitialized memmaps in memory_failure()
Message-ID: <20191010073526.GC18412@dhcp22.suse.cz>
References: <20191009142435.3975-1-david@redhat.com>
 <20191009142435.3975-3-david@redhat.com>
 <20191009144323.GH6681@dhcp22.suse.cz>
 <5a626821-77e9-e26b-c2ee-219670283bf0@redhat.com>
In-Reply-To: <5a626821-77e9-e26b-c2ee-219670283bf0@redhat.com>

On Thu 10-10-19 09:27:32, David Hildenbrand wrote:
> On 09.10.19 16:43,
> Michal Hocko wrote:
> > On Wed 09-10-19 16:24:35, David Hildenbrand wrote:
> >> We should check for pfn_to_online_page() to not access uninitialized
> >> memmaps. Reshuffle the code so we don't have to duplicate the error
> >> message.
> >>
> >> Cc: Naoya Horiguchi
> >> Cc: Andrew Morton
> >> Cc: Michal Hocko
> >> Signed-off-by: David Hildenbrand
> >> ---
> >>  mm/memory-failure.c | 14 ++++++++------
> >>  1 file changed, 8 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >> index 7ef849da8278..e866e6e5660b 100644
> >> --- a/mm/memory-failure.c
> >> +++ b/mm/memory-failure.c
> >> @@ -1253,17 +1253,19 @@ int memory_failure(unsigned long pfn, int flags)
> >>  	if (!sysctl_memory_failure_recovery)
> >>  		panic("Memory failure on page %lx", pfn);
> >>
> >> -	if (!pfn_valid(pfn)) {
> >> +	p = pfn_to_online_page(pfn);
> >> +	if (!p) {
> >> +		if (pfn_valid(pfn)) {
> >> +			pgmap = get_dev_pagemap(pfn, NULL);
> >> +			if (pgmap)
> >> +				return memory_failure_dev_pagemap(pfn, flags,
> >> +								  pgmap);
> >> +		}
> >>  		pr_err("Memory failure: %#lx: memory outside kernel control\n",
> >>  			pfn);
> >>  		return -ENXIO;
> >
> > Don't we need that earlier at hwpoison_inject level?
> >
>
> Theoretically yes, this is another instance. But pfn_to_online_page(pfn)
> alone would not be sufficient as discussed. We would, again, have to
> special-case ZONE_DEVICE via things like get_dev_pagemap() ...
>
> But mm/hwpoison-inject.c:hwpoison_inject() is a pure debug feature either way:
>
> 	/*
> 	 * Note that the below poison/unpoison interfaces do not involve
> 	 * hardware status change, hence do not require hardware support.
> 	 * They are mainly for testing hwpoison in software level.
> 	 */
>
> So it's not that bad compared to memory_failure() called from real HW or
> drivers/base/memory.c:soft_offline_page_store()/hard_offline_page_store()

Yes, this is just a toy.
And yes we need to handle zone device pages here because a) people
likely want to test MCE behavior even on these pages and b) HW can
really trigger MCEs there as well. I was just pointing out that the
patch is likely incomplete.
-- 
Michal Hocko
SUSE Labs
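
For readers following the thread, the lookup order that the quoted hunk
sets up (online page first, then the ZONE_DEVICE special case, otherwise
reject) can be modeled in userspace roughly as below. This is only a
sketch and not kernel code: the model_* functions are hypothetical
stand-ins named after the kernel helpers, the pfn ranges in classify()
are invented, and enum mf_path exists purely so the chosen branch can be
observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a pfn, for illustration only. */
enum pfn_kind {
	PFN_INVALID,		/* no memmap at all */
	PFN_ONLINE,		/* online system RAM, memmap initialized */
	PFN_OFFLINE,		/* memmap present, section offline */
	PFN_ZONE_DEVICE,	/* device memory described by a pagemap */
};

/* Invented toy layout (assumption): four consecutive pfn ranges. */
static enum pfn_kind classify(unsigned long pfn)
{
	if (pfn < 0x1000)
		return PFN_ONLINE;
	if (pfn < 0x2000)
		return PFN_OFFLINE;
	if (pfn < 0x3000)
		return PFN_ZONE_DEVICE;
	return PFN_INVALID;
}

/* Mock stand-ins; the real kernel helpers return pointers, not bools. */
static bool model_pfn_valid(unsigned long pfn)
{
	return classify(pfn) != PFN_INVALID;
}

static bool model_pfn_to_online_page(unsigned long pfn)
{
	return classify(pfn) == PFN_ONLINE;
}

static bool model_get_dev_pagemap(unsigned long pfn)
{
	return classify(pfn) == PFN_ZONE_DEVICE;
}

/* Which branch the model took (hypothetical, for observation only). */
enum mf_path { MF_ONLINE_PATH, MF_DEV_PAGEMAP_PATH, MF_REJECT };

/*
 * Mirrors the order of checks in the quoted hunk: try the online page
 * first, fall back to the device pagemap only for valid pfns, and
 * otherwise fail with -ENXIO ("memory outside kernel control").
 */
static int model_memory_failure(unsigned long pfn, enum mf_path *path)
{
	assert(path != NULL);

	if (model_pfn_to_online_page(pfn)) {
		*path = MF_ONLINE_PATH;
		return 0;
	}
	if (model_pfn_valid(pfn) && model_get_dev_pagemap(pfn)) {
		*path = MF_DEV_PAGEMAP_PATH;
		return 0;
	}
	*path = MF_REJECT;
	return -ENXIO;
}
```

In this toy layout, model_memory_failure(0x2800, &path) takes the device
pagemap branch and returns 0, while an offline pfn such as 0x1800 is
rejected with -ENXIO even though model_pfn_valid() is true for it. That
is the case a bare pfn_valid() check would let through.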