From: Dan Williams
Date: Wed, 8 Jul 2020 08:50:03 -0700
Subject: Re: [PATCH v2 1/3] arm64/numa: export memory_add_physaddr_to_nid as EXPORT_SYMBOL_GPL
To: David Hildenbrand
Cc: Mike Rapoport, Mike Rapoport, Michal Hocko, Jia He, Catalin Marinas,
	Will Deacon, Vishal Verma, Dave Jiang, Andrew Morton, Baoquan He,
	Chuhong Yuan, Linux ARM, Linux Kernel Mailing List, Linux MM,
	linux-nvdimm, Kaly Xin
In-Reply-To: <98166184-3aaf-479e-bfb3-fc737f4ac98d@redhat.com>
References: <20200707180043.GA386073@linux.ibm.com>
	<20200708052626.GB386073@linux.ibm.com>
	<9a009cf6-6c30-91ca-a1a5-9aa090c66631@redhat.com>
	<999ea296-4695-1219-6a4d-a027718f61e5@redhat.com>
	<20200708083951.GH386073@linux.ibm.com>
	<20200708091520.GE128651@kernel.org>
	<20200708094549.GA781326@linux.ibm.com>
	<98166184-3aaf-479e-bfb3-fc737f4ac98d@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 8, 2020 at 3:04 AM David Hildenbrand wrote:
>
> On 08.07.20 11:45, Mike Rapoport wrote:
> > On Wed, Jul 08, 2020 at 11:25:36AM +0200, David Hildenbrand wrote:
> >> On 08.07.20 11:15, Mike Rapoport wrote:
> >>>>>>>>
> >>>>> But on a more theoretical/fundamental level, I think we lack a
> >>>>> generic abstraction, similar to e.g. x86's 'struct numa_meminfo',
> >>>>> that serves as a translation of firmware-supplied information
> >>>>> into data that can be used by the generic mm without the need to
> >>>>> reimplement it for each and every arch.
> >>>>
> >>>> Right. As I expressed, I am not a friend of using memblock for
> >>>> that, and the pgdat node span is tricky.
> >>>>
> >>>> Maybe abstracting that x86 concept is possible in some way (and
> >>>> we could restrict the information to boot-time properties, so we
> >>>> don't have to mess with memory hot(un)plug - just as done for
> >>>> numa_meminfo AFAIKS).
> >>>
> >>> I agree with the pgdat part and disagree about memblock. It
> >>> already has the non-init physmem; why don't we add memblock.memory
> >>> to the mix? ;-)
> >>
> >> Can we generalize and tweak physmem to contain node info? That's
> >> all we need, no? (the special mem= parameter handling should not
> >> matter for our use case, where "physmem" and "memory" would differ)
> >
> > TBH, I have only random vague thoughts at the moment. This might be
> > an option. But then we need to enable physmem on !s390, right?
>
> Yes, looks like it.
>
> >
> >>> Now, seriously, memblock already has all the necessary information
> >>> about the coldplug memory for several architectures, x86 being an
> >>> exception because for some reason the reserved memory is not
> >>> considered memory there. The infrastructure for querying and
> >>> iterating memory regions is already there. We just need to leave
> >>> out the irrelevant parts, like memblock.reserved and the
> >>> allocation functions.
> >>
> >> I *really* don't want to mess with memblocks on memory hot(un)plug
> >> on x86 and s390x (+ other architectures in the future). I also
> >> thought about no longer creating memblocks for hotplugged memory on
> >> arm64, by tweaking pfn_valid() to query memblocks only for early
> >> sections.
> >>
> >> If "physmem" is not an option, can we at least introduce something
> >> like ARCH_UPDATE_MEMBLOCK_ON_HOTPLUG to avoid doing that on x86 and
> >> s390x for now (and later maybe for others)?
> >
> > I have to do more memory hotplug homework to answer that ;-)
> >
> > My general point is that we don't have to reinvent the wheel to have
> > a coldplug memory representation; it's already there. We just need a
> > way to use it properly.
>
> Yes, I tend to agree. Details to be clarified :)

I'm not quite understanding the concern, or the requirement, about
"updating memblock" in the hotplug path. The routines
memory_add_physaddr_to_nid() and phys_to_target_node() are helpers to
interrogate platform-firmware NUMA info through a common abstraction.
They place no burden on the memory hotplug code; they're just used to
see if a hot-added range lies within an existing node span when
platform firmware otherwise fails to communicate a node. x86 can
continue to back those helpers with numa_meminfo, arm64 can use a
generic memblock implementation, and other archs can follow the arm64
example if they want better NUMA answers for drivers.
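
For the arm64 side, something like the following would suffice. To be
clear, this is an untested sketch, not the actual patch under review;
it assumes CONFIG_NUMA and that memblock data is retained after boot
(ARCH_KEEP_MEMBLOCK, which arm64 already selects for pfn_valid()):

#include <linux/memblock.h>
#include <linux/mm.h>

int memory_add_physaddr_to_nid(u64 addr)
{
	struct memblock_region *rgn;

	/* Walk the coldplug memory map that firmware handed us at boot */
	for_each_memblock(memory, rgn) {
		if (addr >= rgn->base && addr < rgn->base + rgn->size)
			return memblock_get_region_node(rgn);
	}
	/* Default to node 0 when firmware communicated nothing */
	return 0;
}
EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);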
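
A phys_to_target_node() backed the same way (again, just a sketch)
would do the identical walk, but return NUMA_NO_NODE on a miss so that
callers like device-dax can tell "firmware gave no answer" apart from
"node 0":

int phys_to_target_node(phys_addr_t addr)
{
	struct memblock_region *rgn;

	for_each_memblock(memory, rgn) {
		if (addr >= rgn->base && addr < rgn->base + rgn->size)
			return memblock_get_region_node(rgn);
	}
	/* Let the caller pick its own fallback node */
	return NUMA_NO_NODE;
}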