Date: Thu, 30 Mar 2023 09:34:58 +0100
From: Jonathan Cameron
To: Yicong Yang
Subject: Re: [PATCH 1/4] hwtracing: hisi_ptt: Make cpumask only present online CPUs
Message-ID: <20230330093458.00002c50@Huawei.com>
In-Reply-To: <94e7d85a-d580-94c5-ae2c-fe6a77c21487@huawei.com>
References: <20230315094316.26772-1-yangyicong@huawei.com>
 <20230315094316.26772-2-yangyicong@huawei.com>
 <20230328172409.000021f5@Huawei.com>
 <94e7d85a-d580-94c5-ae2c-fe6a77c21487@huawei.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 30 Mar 2023 11:53:14 +0800
Yicong Yang wrote:

> On 2023/3/29 0:24, Jonathan Cameron wrote:
> > On Wed, 15 Mar 2023 17:43:13 +0800
> > Yicong Yang wrote:
> >
> >> From: Yicong Yang
> >>
> >> perf will try to start PTT trace on every CPU presented in cpumask sysfs
> >> attribute and it will fail to start on offline CPUs(see the comments in
> >> perf_event_open()). But the driver is using cpumask_of_node() to export
> >> the available cpumask which may include offline CPUs and may fail the
> >> perf unintendedly. Fix this by only export the online CPUs of the node.
> >
> > There isn't clear documentation that I can find for cpumask_of_node()
> > and chasing through on arm64 (which is what we care about for this driver)
> > it's maintained via numa_add_cpu() / numa_remove_cpu().
> > Those are called in arch/arm64/kernel/smp.c in locations that are closely coupled
> > with set_cpu_online(cpu, XXX);
> > https://elixir.bootlin.com/linux/v6.3-rc4/source/arch/arm64/kernel/smp.c#L246
> > https://elixir.bootlin.com/linux/v6.3-rc4/source/arch/arm64/kernel/smp.c#L303
> >
> > Now there are races when the two might not be in sync but in this case
> > we are just exposing the result to userspace, so the chances of a race
> > after this sysfs attribute has been read seem much higher to me and
> > I don't think we can do anything about that.
> >
> > Is there another path that I'm missing where online and node masks are out
> > of sync?
> >
>
> Maybe not. This patch may be incorrect and I need more investigation, so let me
> drop it from the series.
>
> I found this problem and referred to commit 064f0e9302af ("mm: only display online cpus of the numa node")
> which might be the same problem. But it seems unnecessary, given that cpumask_of_node()
> already includes online CPUs only.

Seems it was fixed up for arm64 in
7f954aa1a ("arm64: smp: remove cpu and numa topology information when hotplugging out CPU")

If we could audit all the other architectures it would be great to document
the properties of this cpumask and possibly simplify the code in the path you highlight
above (assuming no race conditions etc).

Jonathan

>
> Thanks.
>
> > Jonathan
> >
> >
> >>
> >> Fixes: ff0de066b463 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
> >> Signed-off-by: Yicong Yang
> >
> >> ---
> >>  drivers/hwtracing/ptt/hisi_ptt.c | 13 +++++++++++--
> >>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
> >> index 30f1525639b5..0a10c7ec46ad 100644
> >> --- a/drivers/hwtracing/ptt/hisi_ptt.c
> >> +++ b/drivers/hwtracing/ptt/hisi_ptt.c
> >> @@ -487,9 +487,18 @@ static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr,
> >>  					char *buf)
> >>  {
> >>  	struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
> >> -	const cpumask_t *cpumask = cpumask_of_node(dev_to_node(&hisi_ptt->pdev->dev));
> >> +	cpumask_var_t mask;
> >> +	ssize_t n;
> >>
> >> -	return cpumap_print_to_pagebuf(true, buf, cpumask);
> >> +	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> >> +		return 0;
> >> +
> >> +	cpumask_and(mask, cpumask_of_node(dev_to_node(&hisi_ptt->pdev->dev)),
> >> +		    cpu_online_mask);
> >> +	n = cpumap_print_to_pagebuf(true, buf, mask);
> >> +	free_cpumask_var(mask);
> >> +
> >> +	return n;
> >>  }
> >>  static DEVICE_ATTR_RO(cpumask);
> >>
> >
> > .
> >
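
For anyone following the thread without a kernel tree to hand, the masking
step in the quoted patch (cpumask_and() of the node mask with cpu_online_mask)
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions: a mask is a plain 64-bit word, and the toy_* names are
hypothetical helpers invented for illustration, not kernel APIs. If
cpumask_of_node() already tracks only online CPUs, as appears to be the case
on arm64, the intersection simply returns the node mask unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a kernel cpumask: bit N set means CPU N is in the mask.
 * Assumption for this sketch only: at most 64 CPUs. */
typedef uint64_t toy_cpumask;

/* Return a copy of 'mask' with 'cpu' added (hypothetical helper). */
static inline toy_cpumask toy_mask_set(toy_cpumask mask, unsigned int cpu)
{
	assert(cpu < 64);
	return mask | (UINT64_C(1) << cpu);
}

/* Return nonzero if 'cpu' is present in 'mask' (hypothetical helper). */
static inline int toy_mask_test(toy_cpumask mask, unsigned int cpu)
{
	assert(cpu < 64);
	return (mask >> cpu) & 1u;
}

/* Model of the patch's cpumask_and(mask, node_mask, cpu_online_mask):
 * keep only the CPUs that belong to the node AND are currently online. */
static inline toy_cpumask toy_online_cpus_of_node(toy_cpumask node_mask,
						  toy_cpumask online_mask)
{
	return node_mask & online_mask;
}
```

With a node owning CPUs 0-3 and CPU 2 offline, the result holds CPUs 0, 1
and 3. When every CPU in the node mask is also online, the result equals the
node mask, which is the case where the extra AND in the patch is redundant.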