Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752172AbdHQDbg (ORCPT ); Wed, 16 Aug 2017 23:31:36 -0400 Received: from szxga05-in.huawei.com ([45.249.212.191]:3579 "EHLO szxga05-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751969AbdHQDbe (ORCPT ); Wed, 16 Aug 2017 23:31:34 -0400 Subject: Re: [PATCH v4 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver To: Mark Rutland References: <1500984642-204676-1-git-send-email-zhangshaokun@hisilicon.com> <1500984642-204676-4-git-send-email-zhangshaokun@hisilicon.com> <20170815104143.GD6090@leverpostej> CC: , , , , From: Zhangshaokun Message-ID: Date: Thu, 17 Aug 2017 11:31:08 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1 MIME-Version: 1.0 In-Reply-To: <20170815104143.GD6090@leverpostej> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.74.221.148] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A090206.59950E0E.0027,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 407af1e06a12f0cb0246dcff7ae44b06 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6065 Lines: 223 Hi Mark, On 2017/8/15 18:41, Mark Rutland wrote: > On Tue, Jul 25, 2017 at 08:10:39PM +0800, Shaokun Zhang wrote: >> This patch adds support for L3C PMU driver in HiSilicon SoC chip, Each >> L3C has own control, counter and interrupt registers and is an separate >> PMU. For each L3C PMU, it has 8-programable counters and supports 0x60 >> events, event code is 8-bits and every counter is free-running. >> Interrupt is supported to handle counter (48-bits) overflow. > > [...] > >> +/* L3C register definition */ >> +#define L3C_PERF_CTRL 0x0408 >> +#define L3C_INT_MASK 0x0800 >> +#define L3C_INT_STATUS 0x0808 >> +#define L3C_INT_CLEAR 0x080c >> +#define L3C_EVENT_CTRL 0x1c00 >> +#define L3C_EVENT_TYPE0 0x1d00 >> +#define L3C_CNTR0_LOWER 0x1e00 > > Why does this have a _LOWER suffix? > > How big is this regsiter? is it part of a pair of registers? > Each counter is 48-bits, for counter0, the register offset of the lower 32-bits is 0x1e00 and the higher 16-bits is in 0x1e04 (while the upper 16-bits is reserved for 0x1e04, RAZ and WI), other counters are the same. >> + >> +/* L3C has 8-counters and supports 0x60 events */ >> +#define L3C_NR_COUNTERS 0x8 >> +#define L3C_NR_EVENTS 0x60 > > What exactly is meant by "supports 0x60 events"? > > e.g. does tha mean event IDs 0-0x5f are valid? > It is event IDs, my apologies to describe it vaguely. >> +static irqreturn_t hisi_l3c_pmu_isr(int irq, void *dev_id) >> +{ >> + struct hisi_pmu *l3c_pmu = dev_id; >> + struct perf_event *event; >> + unsigned long overflown; >> + u32 status; >> + int idx; >> + >> + /* Read L3C_INT_STATUS register */ >> + status = readl(l3c_pmu->base + L3C_INT_STATUS); >> + if (!status) >> + return IRQ_NONE; >> + overflown = status; > > You don't need the temporary u32 value here; you can assign directly to > an unsigned lnog and do all the manipulation on that. > Ok. > [...] > >> +/* Check if the CPU belongs to the SCCL and CCL of PMU */ >> +static bool hisi_l3c_is_cpu_in_ccl(struct hisi_pmu *l3c_pmu) >> +{ >> + u32 ccl_id, sccl_id; >> + >> + hisi_read_sccl_and_ccl_id(&sccl_id, &ccl_id); >> + >> + if (sccl_id != l3c_pmu->sccl_id) >> + return false; >> + >> + if (ccl_id != l3c_pmu->ccl_id) >> + return false; >> + >> + /* Return true if matched */ >> + return true; >> +} > > The conditionals would be simpler as: > > return (sccl_id == l3c_pmu->sccl_id && > ccl_id == l3c_pmu->ccl_id); > Ok. >> + >> +static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) >> +{ >> + struct hisi_pmu *l3c_pmu; >> + >> + l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node); >> + >> + /* Proceed only when CPU belongs to the same SCCL and CCL */ >> + if (!hisi_l3c_is_cpu_in_ccl(l3c_pmu)) >> + return 0; > > Surely you have a mask of CPUs that you can check instead? You'll need > that for the offline path. > Ok, Shall create the cpumask and update it during CPU hotplug callback. > [...] > >> +static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) >> +{ >> + struct hisi_pmu *l3c_pmu; >> + cpumask_t ccl_online_cpus; >> + unsigned int target; >> + >> + l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node); >> + >> + /* Proceed only when CPU belongs to the same SCCL and CCL */ >> + if (!hisi_l3c_is_cpu_in_ccl(l3c_pmu)) >> + return 0; > > Again, surely you can check a pre-computed mask? > Ok. >> + >> + /* Proceed if this CPU is used for event counting */ >> + if (!cpumask_test_cpu(cpu, &l3c_pmu->cpus)) >> + return 0; > > You need to set up the CPU state regardless of whether there are active > events currently. Otherwise the cpumask can be stale, pointing at an > offline CPU, leaving the PMU unusable. > Ok. Shall update the cpumask and also hisi_pmu::init cpu to hold the next online CPU >> + >> + /* Give up ownership of CCL */ >> + cpumask_test_and_clear_cpu(cpu, &l3c_pmu->cpus); >> + >> + /* Any other CPU for this CCL which is still online */ >> + cpumask_and(&ccl_online_cpus, cpu_coregroup_mask(cpu), >> + cpu_online_mask); > > What is cpu_coregroup_mask? I do not think you can rely on that > happening to align with the physical CCL mask. > The cpu_coregroup_mask return the CPU cores within the cluster. So we used this. > Instead, please: > > * Keep track of which CPU(s) this PMU can be used from, with a cpumask. > Either initialise that at probe time, or add CPUs to that in the > hotplug callback. > > * Use only that mask to determine which CPU to move the PMU context to. > > * Use an int to hold the current CPU; there's no need to use a mask to > hold what shoule be a single CPU ID. > Shall modify as suggested. > [...] > >> + /* Get the L3C SCCL ID */ >> + if (device_property_read_u32(dev, "hisilicon,scl-id", >> + &l3c_pmu->sccl_id)) { >> + dev_err(dev, "Can not read l3c sccl-id!\n"); >> + return -EINVAL; >> + } >> + >> + /* Get the L3C CPU cluster(CCL) ID */ >> + if (device_property_read_u32(dev, "hisilicon,ccl-id", >> + &l3c_pmu->ccl_id)) { >> + dev_err(dev, "Can not read l3c ccl-id!\n"); >> + return -EINVAL; >> + } > > Previously, you expect that these happen to match particular bits of the > MPIDR, which vary for multi-threaded cores. Please document this. > Ok. >> +static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev, >> + struct hisi_pmu *l3c_pmu) >> +{ >> + struct device *dev = &pdev->dev; >> + int ret; >> + >> + ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu); >> + if (ret) >> + return ret; >> + >> + ret = hisi_l3c_pmu_init_irq(l3c_pmu, pdev); >> + if (ret) >> + return ret; >> + >> + l3c_pmu->name = devm_kasprintf(dev, GFP_KERNEL, "hisi_l3c%u_%u", >> + l3c_pmu->l3c_tag_id, l3c_pmu->sccl_id); > > As mentioned on the documentation patch, it would be nicer for the name > to be hierarchical, i.e. placing the SCCL ID first. > Surely. Thanks. Shaokun > Thanks, > Mark. > _______________________________________________ > linuxarm mailing list > linuxarm@huawei.com > http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm > > . >