From: Kees Cook
Date: Mon, 1 Oct 2018 13:02:56 -0700
Subject: Re: [PATCH] migration/mm: Add WARN_ON to try_offline_node
To: Michael Bringmann
Cc: PowerPC, LKML, Linux-MM, Michael Ellerman, Nathan Fontenot,
    Nicholas Piggin, Thiago Jung Bauermann, Russell Currey,
    Mauricio Faria de Oliveira, Christophe Leroy, Andrew Morton,
    Michal Hocko, Pavel Tatashin, Dan Williams, Oscar Salvador,
    YASUAKI ISHIMATSU, Mathieu Malaterre, Juliet Kim, Tyrel Datwyler,
    Thomas Falcon
In-Reply-To: <20181001185616.11427.35521.stgit@ltcalpine2-lp9.aus.stglabs.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 1, 2018 at 11:56 AM, Michael Bringmann wrote:
> In some LPAR migration scenarios, device-tree modifications are
> made to the affinity of the memory in the system.
> For instance, it may occur that memory is installed to nodes 0,3
> on a source system, and to nodes 0,2 on a target system. Node 2
> may not have been initialized/allocated on the target system.
>
> After migration, if an RTAS PRRN memory remove is made to a
> memory block that was in node 3 on the source system, then
> try_offline_node tries to remove it from node 2 on the target.
> The NODE_DATA(2) block would not be initialized on the target,
> and there is no validation check in the current code to prevent
> the use of a NULL pointer.
>
> A similar problem of moving memory to an uninitialized node has
> also been observed on systems where multiple PRRN events occur
> prior to a complete update of the device-tree.
>
> Call traces such as the following may be observed:
>
> pseries-hotplug-mem: Attempting to update LMB, drc index 80000002
> Offlined Pages 4096
> ...
> Oops: Kernel access of bad area, sig: 11 [#1]
> ...
> Workqueue: pseries hotplug workque pseries_hp_work_fn
> ...
> NIP [c0000000002bc088] try_offline_node+0x48/0x1e0
> LR [c0000000002e0b84] remove_memory+0xb4/0xf0
> Call Trace:
> [c0000002bbee7a30] [c0000002bbee7a70] 0xc0000002bbee7a70 (unreliable)
> [c0000002bbee7a70] [c0000000002e0b84] remove_memory+0xb4/0xf0
> [c0000002bbee7ab0] [c000000000097784] dlpar_remove_lmb+0xb4/0x160
> [c0000002bbee7af0] [c000000000097f38] dlpar_memory+0x328/0xcb0
> [c0000002bbee7ba0] [c0000000000906d0] handle_dlpar_errorlog+0xc0/0x130
> [c0000002bbee7c10] [c0000000000907d4] pseries_hp_work_fn+0x94/0xa0
> [c0000002bbee7c40] [c0000000000e1cd0] process_one_work+0x1a0/0x4e0
> [c0000002bbee7cd0] [c0000000000e21b0] worker_thread+0x1a0/0x610
> [c0000002bbee7d80] [c0000000000ea458] kthread+0x128/0x150
> [c0000002bbee7e30] [c00000000000982c] ret_from_kernel_thread+0x5c/0xb0
>
> This patch adds a check for an incorrectly initialized node to the
> beginning of try_offline_node, and exits the routine if the check
> fails.
>
> Another patch is being developed for powerpc to track the
> node ID to which an LMB belongs, so that we can remove the
> LMB from there instead of the nid as currently interpreted
> from the device tree.
>
> Signed-off-by: Michael Bringmann

Reviewed-by: Kees Cook

-Kees

> ---
>  mm/memory_hotplug.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 38d94b7..e48a4d0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1831,10 +1831,16 @@ static int check_and_unmap_cpu_on_node(pg_data_t *pgdat)
>  void try_offline_node(int nid)
>  {
>         pg_data_t *pgdat = NODE_DATA(nid);
> -       unsigned long start_pfn = pgdat->node_start_pfn;
> -       unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
> +       unsigned long start_pfn;
> +       unsigned long end_pfn;
>         unsigned long pfn;
>
> +       if (WARN_ON(pgdat == NULL))
> +               return;
> +
> +       start_pfn = pgdat->node_start_pfn;
> +       end_pfn = start_pfn + pgdat->node_spanned_pages;
> +
>         for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>                 unsigned long section_nr = pfn_to_section_nr(pfn);
>

-- 
Kees Cook
Pixel Security
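
For anyone who wants to poke at the guard outside the kernel, here is a
minimal user-space sketch of the pattern the patch introduces: look up
the per-node data, warn and bail out if it is NULL, and only then read
its fields. Everything in it is a hypothetical stand-in and not kernel
code: struct pgdat, node_data[], node_data_lookup(), warn_on(),
walk_node_sections() and MAX_NODES are illustrative names, and warn_on()
only approximates WARN_ON() by printing a message and returning the
condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 4 /* hypothetical table size, not the kernel's MAX_NUMNODES */

/* Simplified stand-in for the kernel's pg_data_t. */
struct pgdat {
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;
};

/* Per-node table; a NULL slot models a node that was never initialized. */
static struct pgdat *node_data[MAX_NODES];

/* Hypothetical stand-in for NODE_DATA(nid). */
static struct pgdat *node_data_lookup(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return node_data[nid];
}

/* Rough analogue of WARN_ON(): report the condition, then return it. */
static int warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Walk a node's pfn range in steps of pages_per_section (must be nonzero).
 * Returns the number of steps taken, or -1 if the node has no data.
 * The NULL check comes before any field of pgdat is read, which is the
 * reordering the patch performs in try_offline_node().
 */
static long walk_node_sections(int nid, unsigned long pages_per_section)
{
	struct pgdat *pgdat = node_data_lookup(nid);
	unsigned long start_pfn, end_pfn, pfn;
	long visited = 0;

	assert(pages_per_section > 0);

	if (warn_on(pgdat == NULL, "pgdat == NULL"))
		return -1;

	start_pfn = pgdat->node_start_pfn;
	end_pfn = start_pfn + pgdat->node_spanned_pages;

	for (pfn = start_pfn; pfn < end_pfn; pfn += pages_per_section)
		visited++;

	return visited;
}
```

With node 0 registered as a 4096-page range starting at pfn 0 and node 2
left NULL, walk_node_sections(0, 1024) returns 4, while
walk_node_sections(2, 1024) prints the warning and returns -1 instead of
dereferencing a NULL pointer.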