Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp32629imm; Thu, 20 Sep 2018 17:37:20 -0700 (PDT) X-Google-Smtp-Source: ANB0VdbYI6E3BHYD00/c5S5MHL11ywoNrzNs99v3txC7cq5d7ZyfpVTSu4wtdpyI/YM4YjNpdtjW X-Received: by 2002:a63:ad07:: with SMTP id g7-v6mr39033417pgf.19.1537490240721; Thu, 20 Sep 2018 17:37:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1537490240; cv=none; d=google.com; s=arc-20160816; b=TrVUxKqQOt1ms9FN6SgJUpXNULwn8ByLyH9w1N7KFs1a6wfFHTf7BWmLG5N23ha/J5 s1ItXWJOI/ySrfNJLcnQwR02vyynqJsHePK2Qz/+nMR7Jd8Aq56qCwSXWf0G81yXtkgn elmEnI85vq8ElMvyDVyqbGuuoXs496Vn+c9GCob6pdyd1UOCceaVDuZLAmsOvaDleSBp rC8ACp+ZPR8XGJZT3aSVCpGXaP7UOKOTSTvoq5e8Uv0eROo/TR0KBNbtZ01CRTvZ1xeW FSV+fwQqkpog8p6dYt+pi2l5f2ECotqHkUbzHYsQE/Ee33cKX4CMOGobQf0BFG6nK62A HwNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=JNeOw6wF+fyeeK8mgQxBi5JZr4E00CGQeJmbAvqXvdI=; b=cZ929NcOylkVmhSImz2MGCWxfYJvu2Ss+pWsQMTTBokKenIBcndn9cdz+L9Tf/DF8t a2iwUqKdNurpDFBMFgczJ/AWm0Y4r44O7N1nMgNMVI/tYX/0UyCKMfJgXN5J2PgSxvk8 7/6vqoRrpx1aimGXpAP1WvckpX1VxysShAoksZwsF4JmczSRbQwwdmnvpVdnd3z3SwuY TtGnENpPPE78iGd3ouLM5Yv702QR398z+SFcAJHZnd2kxiOOrVqCWzxS/ctWFHoe5m+/ VN2gY2bNgVCosKj+hbCDxAmCywgNiYyje2dQZEEcyvgbsI+EUivWia0PoD+R3rzNzCgq 4NSA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel-com.20150623.gappssmtp.com header.s=20150623 header.b=gDbks14w; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m23-v6si6396740pgn.603.2018.09.20.17.37.01; Thu, 20 Sep 2018 17:37:20 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@intel-com.20150623.gappssmtp.com header.s=20150623 header.b=gDbks14w; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726023AbeIUGXH (ORCPT + 99 others); Fri, 21 Sep 2018 02:23:07 -0400 Received: from mail-ot1-f67.google.com ([209.85.210.67]:43038 "EHLO mail-ot1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725749AbeIUGXH (ORCPT ); Fri, 21 Sep 2018 02:23:07 -0400 Received: by mail-ot1-f67.google.com with SMTP id u20-v6so11322576otk.10 for ; Thu, 20 Sep 2018 17:36:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=JNeOw6wF+fyeeK8mgQxBi5JZr4E00CGQeJmbAvqXvdI=; b=gDbks14wF2zD67ClPelGWpWhnSRxVK2Np0nTXdv5pXx6DgE7SwT3UfLLKB3TFdMgDE 5cFB5YTHCM/RwjLLbZWXhlahxvNhHpgn3auNA7GPCr2xpaDf3JsgcHZOTYGXNcMr6myD HmcSs9tnCJpmot5fMyHDtznC6Rhmv1tb/G0kUDhk7MPAnSsE63OXGMo1gUc8sEyfmi91 bkUfaftLMiTJRGogu0y/l5U2KzjXzyv5PZ0C9oBMlg9p4GUj8JAGG19ylkhXEhpw6vZf 0kkdNj3ZxHdwnWjaRxLsdrCe1Fg6Ycn9uhbXPhLkCHek3hjvRRhqtWa8n2u5AKX+x7XU U3Qw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=JNeOw6wF+fyeeK8mgQxBi5JZr4E00CGQeJmbAvqXvdI=; b=VmBoPGzzL/GPo5VUYJ+napRaxCh1Z2hG9EpznbKCxEKOFhEYUhC37QuCq/Nm8MnEiW 4QKIgvJdmDTospSS72AZ+un0rnzmoHs7KeF7cDFTDtGGh/WTcA+156bBlPRC+TAHDcHL hbQKmedyTiowK5lu+vAEWkuNdp/ihzu6UApOjWqWyBpYVJEJ3kiUwJW4LTTD1PmJjbxN kI8REY2X8vbsepbRSDD0/I1tPIVNM4aSTW026f8peJIP/PdNDg3XB2k0y4oulMHXFxgT ZLgmE+08qnUeIFLipLiYsmkGRafb9aEeSjhTnNouX65q+Tc42u76X0tw9xNGZfrDCa2u 3SMw== X-Gm-Message-State: APzg51Dx23hDqP0dvHdICzZaLjTFcWKglNwnSbvBZyDQShosIKQypEcy hbwVspUK/P0KgQwhsaEcqtkKnXl0O6PDXWdfmZdLcw== X-Received: by 2002:a9d:50ac:: with SMTP id b44-v6mr24736909oth.267.1537490216501; Thu, 20 Sep 2018 17:36:56 -0700 (PDT) MIME-Version: 1.0 References: <20180920215824.19464.8884.stgit@localhost.localdomain> <20180920222951.19464.39241.stgit@localhost.localdomain> In-Reply-To: From: Dan Williams Date: Thu, 20 Sep 2018 17:36:44 -0700 Message-ID: Subject: Re: [PATCH v4 5/5] nvdimm: Schedule device registration on node local to the device To: alexander.h.duyck@linux.intel.com Cc: Linux MM , Linux Kernel Mailing List , linux-nvdimm , Pasha Tatashin , Michal Hocko , Dave Jiang , Ingo Molnar , Dave Hansen , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Andrew Morton , Logan Gunthorpe , "Kirill A. Shutemov" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 20, 2018 at 5:26 PM Alexander Duyck wrote: > > On 9/20/2018 3:59 PM, Dan Williams wrote: > > On Thu, Sep 20, 2018 at 3:31 PM Alexander Duyck > > wrote: > >> > >> This patch is meant to force the device registration for nvdimm devices to > >> be closer to the actual device. This is achieved by using either the NUMA > >> node ID of the region, or of the parent. By doing this we can have > >> everything above the region based on the region, and everything below the > >> region based on the nvdimm bus. > >> > >> One additional change I made is that we hold onto a reference to the parent > >> while we are going through registration. By doing this we can guarantee we > >> can complete the registration before we have the parent device removed. > >> > >> By guaranteeing NUMA locality I see an improvement of as high as 25% for > >> per-node init of a system with 12TB of persistent memory. > >> > >> Signed-off-by: Alexander Duyck > >> --- > >> drivers/nvdimm/bus.c | 19 +++++++++++++++++-- > >> 1 file changed, 17 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c > >> index 8aae6dcc839f..ca935296d55e 100644 > >> --- a/drivers/nvdimm/bus.c > >> +++ b/drivers/nvdimm/bus.c > >> @@ -487,7 +487,9 @@ static void nd_async_device_register(void *d, async_cookie_t cookie) > >> dev_err(dev, "%s: failed\n", __func__); > >> put_device(dev); > >> } > >> + > >> put_device(dev); > >> + put_device(dev->parent); > > > > Good catch. The child does not pin the parent until registration, but > > we need to make sure the parent isn't gone while were waiting for the > > registration work to run. > > > > Let's break this reference count fix out into its own separate patch, > > because this looks to be covering a gap that may need to be > > recommended for -stable. > > Okay, I guess I can do that. > > > > >> > >> static void nd_async_device_unregister(void *d, async_cookie_t cookie) > >> @@ -504,12 +506,25 @@ static void nd_async_device_unregister(void *d, async_cookie_t cookie) > >> > >> void __nd_device_register(struct device *dev) > >> { > >> + int node; > >> + > >> if (!dev) > >> return; > >> + > >> dev->bus = &nvdimm_bus_type; > >> + get_device(dev->parent); > >> get_device(dev); > >> - async_schedule_domain(nd_async_device_register, dev, > >> - &nd_async_domain); > >> + > >> + /* > >> + * For a region we can break away from the parent node, > >> + * otherwise for all other devices we just inherit the node from > >> + * the parent. > >> + */ > >> + node = is_nd_region(dev) ? to_nd_region(dev)->numa_node : > >> + dev_to_node(dev->parent); > > > > Devices already automatically inherit the node of their parent, so I'm > > not understanding why this is needed? > > That doesn't happen until you call device_add, which you don't call > until nd_async_device_register. All that has been called on the device > up to now is device_initialize which leaves the node at NUMA_NO_NODE. Ooh, yeah, missed that. I think I'd prefer this policy to moved out to where we set the dev->parent before calling __nd_device_register, or at least a comment here about *why* we know region devices are special (i.e. because the nd_region_desc specified the node at region creation time).