From: Naoya Horiguchi
To: Dan Williams
CC: "linux-nvdimm@lists.01.org", "hch@lst.de", "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "jack@suse.cz", "ross.zwisler@linux.intel.com"
Subject: Re: [PATCH v5 06/11] mm, memory_failure: Collect mapping size in collect_procs()
Date: Fri, 13 Jul 2018 06:49:16 +0000
Message-ID: <20180713064916.GB10034@hori1.linux.bs1.fc.nec.co.jp>
References: <153074042316.27838.17319837331947007626.stgit@dwillia2-desk3.amr.corp.intel.com> <153074045526.27838.11460088022513024933.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <153074045526.27838.11460088022513024933.stgit@dwillia2-desk3.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 04, 2018 at 02:40:55PM -0700, Dan Williams wrote:
> In preparation for supporting memory_failure() for dax mappings, teach
> collect_procs() to also determine the mapping size. Unlike typical
> mappings the dax mapping size is determined by walking page-table
> entries rather than using the compound-page accounting for THP pages.
>
> Cc: Naoya Horiguchi
> Signed-off-by: Dan Williams

Looks good to me.

Acked-by: Naoya Horiguchi

> ---
>  mm/memory-failure.c | 81 +++++++++++++++++++++++++--------------------------
>  1 file changed, 40 insertions(+), 41 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 9d142b9b86dc..4d70753af59c 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -174,22 +174,51 @@ int hwpoison_filter(struct page *p)
>  EXPORT_SYMBOL_GPL(hwpoison_filter);
>
>  /*
> + * Kill all processes that have a poisoned page mapped and then isolate
> + * the page.
> + *
> + * General strategy:
> + * Find all processes having the page mapped and kill them.
> + * But we keep a page reference around so that the page is not
> + * actually freed yet.
> + * Then stash the page away
> + *
> + * There's no convenient way to get back to mapped processes
> + * from the VMAs. So do a brute-force search over all
> + * running processes.
> + *
> + * Remember that machine checks are not common (or rather
> + * if they are common you have other problems), so this shouldn't
> + * be a performance issue.
> + *
> + * Also there are some races possible while we get from the
> + * error detection to actually handle it.
> + */
> +
> +struct to_kill {
> +	struct list_head nd;
> +	struct task_struct *tsk;
> +	unsigned long addr;
> +	short size_shift;
> +	char addr_valid;
> +};
> +
> +/*
>   * Send all the processes who have the page mapped a signal.
>   * ``action optional'' if they are not immediately affected by the error
>   * ``action required'' if error happened in current execution context
>   */
> -static int kill_proc(struct task_struct *t, unsigned long addr,
> -			unsigned long pfn, struct page *page, int flags)
> +static int kill_proc(struct to_kill *tk, unsigned long pfn, int flags)
>  {
> -	short addr_lsb;
> +	struct task_struct *t = tk->tsk;
> +	short addr_lsb = tk->size_shift;
>  	int ret;
>
>  	pr_err("Memory failure: %#lx: Killing %s:%d due to hardware memory corruption\n",
>  		pfn, t->comm, t->pid);
> -	addr_lsb = compound_order(compound_head(page)) + PAGE_SHIFT;
>
>  	if ((flags & MF_ACTION_REQUIRED) && t->mm == current->mm) {
> -		ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)addr,
> +		ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)tk->addr,
>  				       addr_lsb, current);
>  	} else {
>  		/*
> @@ -198,7 +227,7 @@ static int kill_proc(struct task_struct *t, unsigned long addr,
>  		 * This could cause a loop when the user sets SIGBUS
>  		 * to SIG_IGN, but hopefully no one will do that?
>  		 */
> -		ret = send_sig_mceerr(BUS_MCEERR_AO, (void __user *)addr,
> +		ret = send_sig_mceerr(BUS_MCEERR_AO, (void __user *)tk->addr,
>  				      addr_lsb, t); /* synchronous? */
>  	}
>  	if (ret < 0)
> @@ -235,35 +264,6 @@ void shake_page(struct page *p, int access)
>  EXPORT_SYMBOL_GPL(shake_page);
>
>  /*
> - * Kill all processes that have a poisoned page mapped and then isolate
> - * the page.
> - *
> - * General strategy:
> - * Find all processes having the page mapped and kill them.
> - * But we keep a page reference around so that the page is not
> - * actually freed yet.
> - * Then stash the page away
> - *
> - * There's no convenient way to get back to mapped processes
> - * from the VMAs. So do a brute-force search over all
> - * running processes.
> - *
> - * Remember that machine checks are not common (or rather
> - * if they are common you have other problems), so this shouldn't
> - * be a performance issue.
> - *
> - * Also there are some races possible while we get from the
> - * error detection to actually handle it.
> - */
> -
> -struct to_kill {
> -	struct list_head nd;
> -	struct task_struct *tsk;
> -	unsigned long addr;
> -	char addr_valid;
> -};
> -
> -/*
>   * Failure handling: if we can't find or can't kill a process there's
>   * not much we can do. We just print a message and ignore otherwise.
>   */
> @@ -292,6 +292,7 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
>  	}
>  	tk->addr = page_address_in_vma(p, vma);
>  	tk->addr_valid = 1;
> +	tk->size_shift = compound_order(compound_head(p)) + PAGE_SHIFT;
>
>  	/*
>  	 * In theory we don't have to kill when the page was
> @@ -317,9 +318,8 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
>   * Also when FAIL is set do a force kill because something went
>   * wrong earlier.
>   */
> -static void kill_procs(struct list_head *to_kill, int forcekill,
> -			  bool fail, struct page *page, unsigned long pfn,
> -			  int flags)
> +static void kill_procs(struct list_head *to_kill, int forcekill, bool fail,
> +		unsigned long pfn, int flags)
>  {
>  	struct to_kill *tk, *next;
>
> @@ -342,8 +342,7 @@ static void kill_procs(struct list_head *to_kill, int forcekill,
>  			 * check for that, but we need to tell the
>  			 * process anyways.
>  			 */
> -			else if (kill_proc(tk->tsk, tk->addr,
> -					      pfn, page, flags) < 0)
> +			else if (kill_proc(tk, pfn, flags) < 0)
>  				pr_err("Memory failure: %#lx: Cannot send advisory machine check signal to %s:%d\n",
>  				       pfn, tk->tsk->comm, tk->tsk->pid);
>  		}
> @@ -1012,7 +1011,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
>  	 * any accesses to the poisoned memory.
>  	 */
>  	forcekill = PageDirty(hpage) || (flags & MF_MUST_KILL);
> -	kill_procs(&tokill, forcekill, !unmap_success, p, pfn, flags);
> +	kill_procs(&tokill, forcekill, !unmap_success, pfn, flags);
>
>  	return unmap_success;
>  }
>
>
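
For readers following the size_shift change, here is a minimal user-space sketch of the arithmetic the patch moves into add_to_kill(), and of how a SIGBUS receiver might turn the reported address and addr_lsb into an affected range. It is not kernel code. The names SKETCH_PAGE_SHIFT, struct kill_range, size_shift_for_order() and range_from_lsb() are hypothetical, and the 4 KiB base page size is an assumption (the kernel's PAGE_SHIFT depends on the architecture).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages; the real PAGE_SHIFT is per-architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the address/size pair carried in struct to_kill. */
struct kill_range {
	unsigned long start;	/* address rounded down to the mapping size */
	size_t len;		/* mapping size in bytes */
};

/* Mirrors "compound_order(compound_head(p)) + PAGE_SHIFT" from the patch. */
static short size_shift_for_order(unsigned int order)
{
	/* The resulting shift must fit within the width of unsigned long. */
	assert(order + SKETCH_PAGE_SHIFT < sizeof(unsigned long) * 8);
	return (short)(order + SKETCH_PAGE_SHIFT);
}

/* Range a signal receiver could derive from the address and its LSB. */
static struct kill_range range_from_lsb(unsigned long addr, short addr_lsb)
{
	struct kill_range r;

	r.len = (size_t)1 << addr_lsb;
	r.start = addr & ~((unsigned long)r.len - 1);
	return r;
}
```

Under these assumptions an order-0 page gives a shift of 12 (4 KiB) and an order-9 compound page gives a shift of 21 (2 MiB). For dax mappings the patch description says the size would instead come from walking page-table entries, which this sketch does not attempt to model.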