Date: Thu, 15 Jun 2023 06:36:48 +0000
From: "Yajun Deng"
Subject: Re: [PATCH] mm/mm_init.c: remove spinlock in early_pfn_to_nid()
To: "Mike Rapoport"
Cc: "Greg KH", rafael@kernel.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <20230615062021.GI52412@kernel.org>
References: <20230615062021.GI52412@kernel.org> <20230614115339.GX52412@kernel.org> <2023061431-litigate-upchuck-7ed1@gregkh> <20230614110324.3839354-1-yajun.deng@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org

June 15, 2023 2:20 PM, "Mike Rapoport" wrote:

> On Thu, Jun 15, 2023 at 03:02:58AM +0000, Yajun Deng wrote:
>
>> June 14, 2023 7:53 PM, "Mike Rapoport" wrote:
>>
>> Hi,
>>
>> On Wed, Jun 14, 2023 at 11:28:32AM +0000, Yajun Deng wrote:
>>
>> June 14, 2023 7:09 PM, "Greg KH" wrote:
>>
>> On Wed, Jun 14, 2023 at 07:03:24PM +0800, Yajun Deng wrote:
>>
>> When the system boots, only one cpu is enabled before smp_init().
>> So the spinlock is not needed in most cases, remove it.
>>
>> Add spinlock in get_nid_for_pfn() because it is after smp_init().
>>
>> So this is two different things at once in the same patch?
>>
>> Or are they the same problem and both need to go in to solve it?
>>
>> And if a spinlock is not needed at early boot, is it really causing any
>> problems?
>>
>> They are the same problem.
>> I added pr_info in early_pfn_to_nid(), found get_nid_for_pfn() is the only
>> case need to add spinlock.
>> This patch tested on my x86 system.
>>
>> Are you sure it'll work on !x86?
>>
>> I'm probably sure of that, although I don't have a !x86 machine.
>>
>> early_pfn_to_nid() is called in smp_init() and kasan_init() on
>> different architectures. If it works well on x86, it'll work on
>> !x86.
>
> This is often not true. Please verify that other architectures do not call
> early_pfn_to_nid() after smp_init(). The explanation why it is safe should
> be a part of the changelog.
>
>> Signed-off-by: Yajun Deng
>> ---
>> drivers/base/node.c | 11 +++++++++--
>> mm/mm_init.c | 18 +++---------------
>> 2 files changed, 12 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 9de524e56307..844102570ff2 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -748,8 +748,15 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
>> static int __ref get_nid_for_pfn(unsigned long pfn)
>> {
>> #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> - if (system_state < SYSTEM_RUNNING)
>> - return early_pfn_to_nid(pfn);
>> + static DEFINE_SPINLOCK(early_pfn_lock);
>> + int nid;
>> +
>> + if (system_state < SYSTEM_RUNNING) {
>> + spin_lock(&early_pfn_lock);
>> + nid = early_pfn_to_nid(pfn);
>> + spin_unlock(&early_pfn_lock);
>>
>> Adding an external lock for when you call a function is VERY dangerous
>> as you did not document this anywhere, and there's no way to enforce it
>> properly at all.
>>
>> I should add a comment before early_pfn_to_nid().
>>
>> Does your change actually result in any boot time changes? How was this
>> tested?
>>
>> Just a bit.
>>
>> Just a bit tested? Or just a bit of boot time changes?
>> For the latter, do you have numbers?
>>
>> For the latter, the most beneficial function is memmap_init_reserved_pages(),
>> the boot time changes depending on whether DEFERRED_STRUCT_PAGE_INIT
>> is defined or not.
>>
>> -->memmap_init_reserved_pages()
>> -->for_each_reserved_mem_range()
>> reserve_bootmem_region()
>> -->for()
>> init_reserved_page()
>> --> early_pfn_to_nid()
>
> A better solution would be to pass nid to reserve_bootmem_range() and drop
> the call to early_pfn_to_nid() in init_reserved_page().
>
> Then there won't be lock contention and no need for fragile changes in the
> locking.
>

Great, I will try it.

>> If define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>>
>> before:
>> memmap_init_reserved_pages() 1.87 seconds
>> after:
>> memmap_init_reserved_pages() 1.27 seconds
>>
>> 32% time reduction.
>
> These measurements should be part of the changelog.
>
>> If not define CONFIG_DEFERRED_STRUCT_PAGE_INIT:
>>
>> early_pfn_to_nid() is called by few,
>> boot time didn't change.
>>
>> By the way, this machine has 190GB RAM.
>>
>> --
>> Sincerely yours,
>> Mike.
>
> --
> Sincerely yours,
> Mike.
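
The idea suggested in the thread (hand the node id down from the caller instead of looking it up once per page) can be modelled in user space. The C sketch below is only an illustration under stated assumptions, not the eventual patch: struct mem_range, lookup_nid(), init_range_by_lookup() and init_range_with_nid() are hypothetical names, none of them kernel functions, and the linear search merely stands in for the per-pfn lookup that early_pfn_to_nid() performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct mem_range {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int nid;		/* node that owns the whole range */
};

/* Counts how many per-pfn lookups were performed. */
static unsigned long lookups;

/* Per-pfn search, standing in for the early_pfn_to_nid() call. */
static int lookup_nid(const struct mem_range *r, size_t n, unsigned long pfn)
{
	lookups++;
	for (size_t i = 0; i < n; i++)
		if (pfn >= r[i].start_pfn && pfn < r[i].end_pfn)
			return r[i].nid;
	return -1;	/* pfn not covered by any range */
}

/* Before: every page in range idx triggers its own lookup. */
static void init_range_by_lookup(const struct mem_range *r, size_t n,
				 size_t idx, int *page_nid)
{
	for (unsigned long pfn = r[idx].start_pfn; pfn < r[idx].end_pfn; pfn++)
		page_nid[pfn] = lookup_nid(r, n, pfn);
}

/* After: the caller already knows the node and passes it in. */
static void init_range_with_nid(unsigned long start_pfn, unsigned long end_pfn,
				int nid, int *page_nid)
{
	assert(nid >= 0);
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++)
		page_nid[pfn] = nid;
}
```

In this model the second variant never calls the lookup at all, so a lookup that needed a lock would no longer be reached from this path. Both variants fill in the same per-page node ids.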