Subject: Re: [PATCH v6 2/2] ThunderX2: Add Cavium ThunderX2 SoC UNCORE PMU driver
From: Suzuki K Poulose
To: Ganapatrao Kulkarni
Cc: Ganapatrao Kulkarni, linux-doc@vger.kernel.org, LKML,
    linux-arm-kernel@lists.infradead.org, Will Deacon, Mark Rutland,
    jnair@caviumnetworks.com, Robert Richter, Vadim.Lomovtsev@cavium.com,
    Jan.Glauber@cavium.com
Date: Thu, 11 Oct 2018 10:13:21 +0100
Message-ID: <3e9ab774-71db-d81d-5f0a-15fe942dfdb0@arm.com>
References: <20180621063338.20093-1-ganapatrao.kulkarni@cavium.com>
 <20180621063338.20093-3-ganapatrao.kulkarni@cavium.com>

Hi Ganapatrao,

On 11/10/18 07:39, Ganapatrao Kulkarni wrote:
>>> +
>>> +/*
>>> + * We must NOT create groups containing events from multiple hardware PMUs,
>>> + * although mixing different software and hardware PMUs is allowed.
>>> + */
>>> +static bool thunderx2_uncore_validate_event_group(struct perf_event *event)
>>> +{
>>> +        struct pmu *pmu = event->pmu;
>>> +        struct perf_event *leader = event->group_leader;
>>> +        struct perf_event *sibling;
>>> +        int counters = 0;
>>> +
>>> +        if (leader->pmu != event->pmu && !is_software_event(leader))
>>> +                return false;
>>> +
>>> +        for_each_sibling_event(sibling, event->group_leader) {
>>> +                if (is_software_event(sibling))
>>> +                        continue;
>>> +                if (sibling->pmu != pmu)
>>> +                        return false;
>>> +                counters++;
>>> +        }
>>
>> The above loop will not account for:
>>  1) The leader event
>>  2) The event that we are about to add.
>>
>> So you need to allocate counters for both of the above to make sure we
>> can accommodate the events.
>
> Is leader also part of sibling list?

No.
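So the validation needs to count both of those explicitly. A sketch of
what I mean (UNCORE_MAX_COUNTERS here is only a stand-in for whatever
counter limit the driver really has):

static bool thunderx2_uncore_validate_event_group(struct perf_event *event)
{
        struct pmu *pmu = event->pmu;
        struct perf_event *leader = event->group_leader;
        struct perf_event *sibling;
        int counters = 1;       /* the event we are about to add */

        if (!is_software_event(leader)) {
                if (leader->pmu != pmu)
                        return false;
                /* The leader takes a counter too, unless it is this event. */
                if (leader != event)
                        counters++;
        }

        for_each_sibling_event(sibling, leader) {
                if (is_software_event(sibling))
                        continue;
                if (sibling->pmu != pmu)
                        return false;
                counters++;
        }

        /*
         * If the group needs more counters than the hardware provides,
         * it can never be scheduled.
         */
        return counters <= UNCORE_MAX_COUNTERS;
}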
>>> +static void thunderx2_uncore_start(struct perf_event *event, int flags)
>>> +{
>>> +        struct hw_perf_event *hwc = &event->hw;
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore;
>>> +        struct thunderx2_pmu_uncore_dev *uncore_dev;
>>> +        unsigned long irqflags;
>>> +
>>> +        hwc->state = 0;
>>> +        pmu_uncore = pmu_to_thunderx2_pmu_uncore(event->pmu);
>>> +        uncore_dev = pmu_uncore->uncore_dev;
>>> +
>>> +        raw_spin_lock_irqsave(&uncore_dev->lock, irqflags);
>>> +        uncore_dev->select_channel(event);
>>> +        uncore_dev->start_event(event, flags);
>>> +        raw_spin_unlock_irqrestore(&uncore_dev->lock, irqflags);
>>> +
>>> +        perf_event_update_userpage(event);
>>> +
>>> +        if (!find_last_bit(pmu_uncore->active_counters,
>>> +                pmu_uncore->uncore_dev->max_counters)) {
>>
>> Are we guaranteed that counter0 is always allocated? What if the
>> event is deleted and there are other events active? A better check
>> would be:
>>
>>         if (find_last_bit(...) != pmu_uncore->uncore_dev->max_counters)
>>
>
> IIUC, find_last_bit will return zero if none of the counters are
> active, and we want to start the timer for the first active counter.

Well, if that were the case, how would you know whether bit zero is the
only bit set? AFAICS, lib/find_bit.c has:

unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
{
        if (size) {
                unsigned long val = BITMAP_LAST_WORD_MASK(size);
                unsigned long idx = (size - 1) / BITS_PER_LONG;

                do {
                        val &= addr[idx];
                        if (val)
                                return idx * BITS_PER_LONG + __fls(val);

                        val = ~0ul;
                } while (idx--);
        }
        return size;
}

It returns "size" when it can't find a set bit.
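If the intent is "arm the hrtimer when the first counter becomes
active", bitmap_weight() would express that unambiguously. Something
like this (a sketch only; the hrtimer_interval field is an assumption,
the rest follows the quoted code):

        /* After this event's bit has been set in active_counters: */
        if (bitmap_weight(pmu_uncore->active_counters,
                          uncore_dev->max_counters) == 1)
                hrtimer_start(&pmu_uncore->hrtimer,
                              ns_to_ktime(uncore_dev->hrtimer_interval),
                              HRTIMER_MODE_REL_PINNED);

This is true only when exactly one counter is active, regardless of
which counter index was handed out first.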
>>> +                thunderx2_uncore_update(event);
>>> +                hwc->state |= PERF_HES_UPTODATE;
>>> +        }
>>> +        raw_spin_unlock_irqrestore(&uncore_dev->lock, irqflags);
>>> +}
>>> +
>>> +static int thunderx2_uncore_add(struct perf_event *event, int flags)
>>> +{
>>> +        struct hw_perf_event *hwc = &event->hw;
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore;
>>> +        struct thunderx2_pmu_uncore_dev *uncore_dev;
>>> +
>>> +        pmu_uncore = pmu_to_thunderx2_pmu_uncore(event->pmu);
>>> +        uncore_dev = pmu_uncore->uncore_dev;
>>> +
>>> +        /* Allocate a free counter */
>>> +        hwc->idx = alloc_counter(pmu_uncore);
>>> +        if (hwc->idx < 0)
>>> +                return -EAGAIN;
>>> +
>>> +        pmu_uncore->events[hwc->idx] = event;
>>> +        /* set counter control and data registers base address */
>>> +        uncore_dev->init_cntr_base(event, uncore_dev);
>>> +
>>> +        hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
>>> +        if (flags & PERF_EF_START)
>>> +                thunderx2_uncore_start(event, flags);
>>> +
>>> +        return 0;
>>> +}
>>> +
>>> +static void thunderx2_uncore_del(struct perf_event *event, int flags)
>>> +{
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore =
>>> +                        pmu_to_thunderx2_pmu_uncore(event->pmu);
>>> +        struct hw_perf_event *hwc = &event->hw;
>>> +
>>> +        thunderx2_uncore_stop(event, PERF_EF_UPDATE);
>>> +
>>> +        /* clear the assigned counter */
>>> +        free_counter(pmu_uncore, GET_COUNTERID(event));
>>> +
>>> +        perf_event_update_userpage(event);
>>> +        pmu_uncore->events[hwc->idx] = NULL;
>>> +        hwc->idx = -1;
>>> +}
>>> +
>>> +static void thunderx2_uncore_read(struct perf_event *event)
>>> +{
>>> +        unsigned long irqflags;
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore =
>>> +                        pmu_to_thunderx2_pmu_uncore(event->pmu);
>>> +
>>> +        raw_spin_lock_irqsave(&pmu_uncore->uncore_dev->lock, irqflags);
>>> +        thunderx2_uncore_update(event);
>>> +        raw_spin_unlock_irqrestore(&pmu_uncore->uncore_dev->lock, irqflags);
>>> +}
>>> +
>>> +static enum hrtimer_restart thunderx2_uncore_hrtimer_callback(
>>> +                struct hrtimer *hrt)
>>> +{
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore;
>>> +        unsigned long irqflags;
>>> +        int idx;
>>> +        bool restart_timer = false;
>>> +
>>> +        pmu_uncore = container_of(hrt, struct thunderx2_pmu_uncore_channel,
>>> +                        hrtimer);
>>> +
>>> +        raw_spin_lock_irqsave(&pmu_uncore->uncore_dev->lock, irqflags);
>>> +        for_each_set_bit(idx, pmu_uncore->active_counters,
>>> +                        pmu_uncore->uncore_dev->max_counters) {
>>> +                struct perf_event *event = pmu_uncore->events[idx];
>>
>> Do we need to check if the "event" is not NULL? Couldn't this race with
>> an event_del() operation?
>
> I don't think so; I have not come across any such race condition during
> my testing.

Thinking about it further, we may not hit it, as the PMU is "disable"d
before "add/del", which implies that the IRQ won't be triggered.

Btw, in general, not hitting a race condition doesn't prove there cannot
be a race ;)

>>> +static int thunderx2_pmu_uncore_register(
>>> +                struct thunderx2_pmu_uncore_channel *pmu_uncore)
>>> +{
>>> +        struct device *dev = pmu_uncore->uncore_dev->dev;
>>> +        char *name = pmu_uncore->uncore_dev->name;
>>> +        int channel = pmu_uncore->channel;
>>> +
>>> +        /* Perf event registration */
>>> +        pmu_uncore->pmu = (struct pmu) {
>>> +                .attr_groups    = pmu_uncore->uncore_dev->attr_groups,
>>> +                .task_ctx_nr    = perf_invalid_context,
>>> +                .event_init     = thunderx2_uncore_event_init,
>>> +                .add            = thunderx2_uncore_add,
>>> +                .del            = thunderx2_uncore_del,
>>> +                .start          = thunderx2_uncore_start,
>>> +                .stop           = thunderx2_uncore_stop,
>>> +                .read           = thunderx2_uncore_read,
>>
>> You must fill in the module field of the PMU to prevent this module
>> from being unloaded while it is in use.
>
> thanks, I will add it.
>
>>
>>> +        };
>>> +
>>> +        pmu_uncore->pmu.name = devm_kasprintf(dev, GFP_KERNEL,
>>> +                        "%s_%d", name, channel);
>>> +
>>> +        return perf_pmu_register(&pmu_uncore->pmu, pmu_uncore->pmu.name, -1);
>>> +}
>>> +
>>> +static int thunderx2_pmu_uncore_add(struct thunderx2_pmu_uncore_dev *uncore_dev,
>>> +                int channel)
>>> +{
>>> +        struct thunderx2_pmu_uncore_channel *pmu_uncore;
>>> +        int ret, cpu;
>>> +
>>> +        pmu_uncore = devm_kzalloc(uncore_dev->dev, sizeof(*pmu_uncore),
>>> +                        GFP_KERNEL);
>>> +        if (!pmu_uncore)
>>> +                return -ENOMEM;
>>> +
>>> +        cpu = cpumask_any_and(cpumask_of_node(uncore_dev->node),
>>> +                        cpu_online_mask);
>>> +        if (cpu >= nr_cpu_ids)
>>> +                return -EINVAL;
>>
>> Do we need to fail this here? We could always add this back when a CPU
>> comes online (e.g., maxcpus=n). We do have cpu-hotplug callbacks anyway.
>
> we want to fail, since we don't have any CPU to serve.

I understand that. We can bring up CPUs later from userspace. E.g., on a
system like ThunderX, to speed up booting, one could add "maxcpus=8",
which forces the kernel to bring up only the first 8 CPUs; the rest are
brought up by userspace. So, if one of the DMC/L3 units doesn't have a
supported CPU online during the driver probe, you will not bring its PMU
up, and hence won't be able to use it even after the CPUs are brought up
later.

You can take a look at the arm_dsu_pmu, where we handle a similar
situation via dsu_pmu_cpu_online().
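The shape of it, roughly (a sketch modelled on dsu_pmu_cpu_online(); the
cpuhp_node and cpu fields on the channel are assumptions):

static int thunderx2_uncore_pmu_online_cpu(unsigned int cpu,
                                           struct hlist_node *node)
{
        /* The channel is assumed to carry a cpuhp_node for the instance list. */
        struct thunderx2_pmu_uncore_channel *pmu_uncore =
                hlist_entry_safe(node, struct thunderx2_pmu_uncore_channel,
                                 cpuhp_node);

        /* Only a CPU on this PMU's node can manage it. */
        if (!cpumask_test_cpu(cpu,
                              cpumask_of_node(pmu_uncore->uncore_dev->node)))
                return 0;

        /*
         * ->cpu is the managing CPU, assumed to be initialised to
         * nr_cpu_ids while the PMU is unmanaged (e.g. probed while all
         * of its node's CPUs were offline).
         */
        if (pmu_uncore->cpu < nr_cpu_ids && cpu_online(pmu_uncore->cpu))
                return 0;

        pmu_uncore->cpu = cpu;
        return 0;
}

The callback would be registered once with cpuhp_setup_state_multi() and
each channel attached with cpuhp_state_add_instance(), so the PMU gets
adopted whenever a CPU of its node comes up, instead of failing the probe.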
>>> +static struct platform_driver thunderx2_uncore_driver = {
>>> +        .probe = thunderx2_uncore_probe,
>>> +        .driver = {
>>> +                .name = "thunderx2-uncore-pmu",
>>> +                .acpi_match_table = ACPI_PTR(thunderx2_uncore_acpi_match),
>>> +        },
>>> +};
>>
>> Don't we need to unregister the PMUs when we remove the module?
>
> it is a built-in module at present; no unload is supported.
> Anyway, I am adding it in the next version, since I am going for a
> loadable module.

Ok

Suzuki
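P.S. For the loadable-module version, that roughly means setting the
module owner in the struct pmu (.module = THIS_MODULE, as noted above)
and unregistering every channel PMU from a remove callback. A sketch
(the num_channels/channels fields are made up for illustration, and
probe is assumed to set the driver data):

static int thunderx2_uncore_remove(struct platform_device *pdev)
{
        struct thunderx2_pmu_uncore_dev *uncore_dev =
                platform_get_drvdata(pdev);
        int i;

        /* Tear down each per-channel PMU registered at probe time. */
        for (i = 0; i < uncore_dev->num_channels; i++)
                perf_pmu_unregister(&uncore_dev->channels[i].pmu);

        return 0;
}

static struct platform_driver thunderx2_uncore_driver = {
        .probe  = thunderx2_uncore_probe,
        .remove = thunderx2_uncore_remove,
        .driver = {
                .name             = "thunderx2-uncore-pmu",
                .acpi_match_table = ACPI_PTR(thunderx2_uncore_acpi_match),
        },
};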