Received: by 2002:a05:6a10:a852:0:0:0:0 with SMTP id d18csp101250pxy; Tue, 4 May 2021 19:52:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwo+3R2fQ4IqA2+jBjInWlI3RVEvCgqNI+9MxDqfs0UyUULE2x+s1xOuTSCjyHIrqmKDZt/ X-Received: by 2002:a05:6402:12cf:: with SMTP id k15mr29652672edx.130.1620183121119; Tue, 04 May 2021 19:52:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620183121; cv=none; d=google.com; s=arc-20160816; b=yiR5kB/eJpLGbomte8c8ICo0GxJBuMLKkRcVEXH8Vxg1wVUX42zParu1M9Jcfjrwff ihfp7sYoe5h++DA++3sOt0gNBIQk8Z28cW4ttGXNY88J4YMJxphrvrYtbLU3JGEzGuyC 15wJr/nYqU0W/BbhYGWoOzAi2aJ26B52c7Va5UNxLMj8d5rYoAeP0ECeY+WXHYA9WNgM XeZJ2ZfbX2lzn/07H/foGGRjvIHuEvlpCRSCras9oqZV8jFXze83ovJTne+lKK0g76vf mIFR2Hwq6/FLbAnXGY0pg/K8dKqA/s5SmtjXlfXy7hec4GvoAbJGF4386h6Aj5oa4525 isdw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=apmwOMIX42vLhRllCLH8i0RfwYVPmyfjMJJpS6kcMzI=; b=pMv/bZSmUlrKsMlJY9CWHRIk2otFhs+c+2kP6ZIwbftxIKzciw7A62yNoWE3n4yJqP JcGP1cGa3SrWSpSAAsvdXfowSIE+RL5DOblc4N0w818qYw7mRyYyqYklIp/av/m7WGy5 3VBeiV6bsBg173+NwsDyGXCff/SZwZcTuDRYxw3aVpgYH0hXT8yS6FIjF99+hNMdaTJ8 DUgWGkz5tJY52dnvm1Q1angY6/hjfN/q40qKxYTMC1hZtVhiCxxEBxyT8DRkkGkfMMNZ KESnpNT8HT/HfDd/o/j7ewQizBaQXbwNRpA8OOZmWXb78JzFMAJDzLA6DK/EfoiaPTvC AsAA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=RIk7szTj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dt20si1351362ejb.445.2021.05.04.19.50.39; Tue, 04 May 2021 19:52:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=RIk7szTj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230454AbhEECN3 (ORCPT + 99 others); Tue, 4 May 2021 22:13:29 -0400 Received: from mail.kernel.org ([198.145.29.99]:37690 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229799AbhEECN2 (ORCPT ); Tue, 4 May 2021 22:13:28 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 6CF5C613CC for ; Wed, 5 May 2021 02:12:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1620180752; bh=QcOcUQQZmj+dOibYCaDeGWcNaIV3C2QYxgfO6KET9gk=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=RIk7szTjVh8RsIYtvU1beuqct76T7eGNKsL8EwfQ3cUNP+tT9RJN7iTdsADYEi1Gk 38bJhdWGUXtb9pYMWWBii/I2F2HmUAIj1002EhhSCeUtxU682Nj4AX3EkXLAq2tSHO S1gpDfxEX2+8P7yw5PpdUG5t6mgqE3kFSEeJYRJTIYY5sJcyutfDt9bvEPeNw1u0Yf AuT2rI5nRajKzsTzhtZjAFYsGX6/BeVgRDGyv3g3sWESL1Z8eeZX+bIOLY62ViqpLy jtzN6/lysYu7VvRFdAgvLBfvZ9LOVY6LjHtr8HAGtdu/JqEJnyx8ZawztGsGcSG2+1 MY+M4KDY+qt3w== Received: by mail-ed1-f41.google.com with SMTP id c22so74327edn.7 for ; Tue, 04 May 2021 19:12:32 -0700 (PDT) X-Gm-Message-State: AOAM532HE2SBsMKfSJ5xcRplYoSxuhXxbaLGO5J9+2wNXPpT2ccMxkPq 4Qmm+UFa7tvJu0GuZteuzP7g9PEddDGeqh+Hug== X-Received: by 2002:aa7:cd83:: with SMTP id x3mr29087357edv.373.1620180750937; Tue, 04 May 2021 19:12:30 -0700 (PDT) MIME-Version: 1.0 References: <20210311000837.3630499-1-robh@kernel.org> <20210311000837.3630499-7-robh@kernel.org> In-Reply-To: From: Rob Herring Date: Tue, 4 May 2021 21:12:19 -0500 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v6 06/10] libperf: Add support for user space counter access To: Ian Rogers Cc: Will Deacon , Catalin Marinas , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Jiri Olsa , Mark Rutland , Alexander Shishkin , Honnappa Nagarahalli , Zachary.Leaf@arm.com, Raphael Gault , Jonathan Cameron , Namhyung Kim , Itaru Kitayama , Linux ARM , LKML Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 4, 2021 at 4:41 PM Ian Rogers wrote: > > On Wed, Mar 10, 2021 at 4:08 PM Rob Herring wrote: > > > > x86 and arm64 can both support direct access of event counters in > > userspace. The access sequence is less than trivial and currently exists > > in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in > > projects such as PAPI and libpfm4. > > > > In order to support usersapce access, an event must be mmapped first > > with perf_evsel__mmap(). Then subsequent calls to perf_evsel__read() > > will use the fast path (assuming the arch supports it). > > > > Signed-off-by: Rob Herring > > --- > > v6: > > - Adapt to mmap changes adding MMAP NULL check > > v5: > > - Make raw count s64 instead of u64 so that counter width shifting > > works > > - Adapt to mmap changes > > v4: > > - Update perf_evsel__mmap size to pages > > v3: > > - Split out perf_evsel__mmap() to separate patch > > --- > > tools/lib/perf/evsel.c | 4 ++ > > tools/lib/perf/include/internal/mmap.h | 3 + > > tools/lib/perf/mmap.c | 88 ++++++++++++++++++++++++++ > > tools/lib/perf/tests/test-evsel.c | 65 +++++++++++++++++++ > > 4 files changed, 160 insertions(+) > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > index 1057e9b15528..4d67343d36c9 100644 > > --- a/tools/lib/perf/evsel.c > > +++ b/tools/lib/perf/evsel.c > > @@ -242,6 +242,10 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > if (FD(evsel, cpu, thread) < 0) > > return -EINVAL; > > > > + if (MMAP(evsel, cpu, thread) && > > + !perf_mmap__read_self(MMAP(evsel, cpu, thread), count)) > > + return 0; > > + > > if (readn(FD(evsel, cpu, thread), count->values, size) <= 0) > > return -errno; > > > > diff --git a/tools/lib/perf/include/internal/mmap.h b/tools/lib/perf/include/internal/mmap.h > > index be7556e0a2b2..5e3422f40ed5 100644 > > --- a/tools/lib/perf/include/internal/mmap.h > > +++ b/tools/lib/perf/include/internal/mmap.h > > @@ -11,6 +11,7 @@ > > #define PERF_SAMPLE_MAX_SIZE (1 << 16) > > > > struct perf_mmap; > > +struct perf_counts_values; > > > > typedef void (*libperf_unmap_cb_t)(struct perf_mmap *map); > > > > @@ -52,4 +53,6 @@ void perf_mmap__put(struct perf_mmap *map); > > > > u64 perf_mmap__read_head(struct perf_mmap *map); > > > > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count); > > + > > #endif /* __LIBPERF_INTERNAL_MMAP_H */ > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > > index 79d5ed6c38cc..915469f00cf4 100644 > > --- a/tools/lib/perf/mmap.c > > +++ b/tools/lib/perf/mmap.c > > @@ -8,9 +8,11 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > +#include > > #include "internal.h" > > > > void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev, > > @@ -273,3 +275,89 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map) > > > > return event; > > } > > + > > +#if defined(__i386__) || defined(__x86_64__) > > +static u64 read_perf_counter(unsigned int counter) > > +{ > > + unsigned int low, high; > > + > > + asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter)); > > + > > + return low | ((u64)high) << 32; > > +} > > + > > +static u64 read_timestamp(void) > > +{ > > + unsigned int low, high; > > + > > + asm volatile("rdtsc" : "=a" (low), "=d" (high)); > > + > > + return low | ((u64)high) << 32; > > +} > > +#else > > +static u64 read_perf_counter(unsigned int counter) { return 0; } > > +static u64 read_timestamp(void) { return 0; } > > +#endif > > + > > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count) > > +{ > > + struct perf_event_mmap_page *pc = map->base; > > + u32 seq, idx, time_mult = 0, time_shift = 0; > > + u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL; > > + > > + if (!pc || !pc->cap_user_rdpmc) > > + return -1; > > + > > + do { > > + seq = READ_ONCE(pc->lock); > > + barrier(); > > + > > + count->ena = READ_ONCE(pc->time_enabled); > > + count->run = READ_ONCE(pc->time_running); > > + > > + if (pc->cap_user_time && count->ena != count->run) { > > + cyc = read_timestamp(); > > + time_mult = READ_ONCE(pc->time_mult); > > + time_shift = READ_ONCE(pc->time_shift); > > + time_offset = READ_ONCE(pc->time_offset); > > + > > + if (pc->cap_user_time_short) { > > + time_cycles = READ_ONCE(pc->time_cycles); > > + time_mask = READ_ONCE(pc->time_mask); > > + } > > Nit, this is now out of sync with the comment code in perf_event.h. IMO, we should just delete that version. One less slightly wrong version... > > + } > > + > > + idx = READ_ONCE(pc->index); > > + cnt = READ_ONCE(pc->offset); > > + if (pc->cap_user_rdpmc && idx) { > > + s64 evcnt = read_perf_counter(idx - 1); > > + u16 width = READ_ONCE(pc->pmc_width); > > + > > + evcnt <<= 64 - width; > > + evcnt >>= 64 - width; > > + cnt += evcnt; > > + } else > > + return -1; > > + > > + barrier(); > > + } while (READ_ONCE(pc->lock) != seq); > > + > > + if (count->ena != count->run) { > > + u64 delta; > > + > > + /* Adjust for cap_usr_time_short, a nop if not */ > > + cyc = time_cycles + ((cyc - time_cycles) & time_mask); > > + > > + delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift); > > + > > + count->ena += delta; > > + if (idx) > > + count->run += delta; > > + > > + cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > > Does this still suffer the divide by zero if multiplexing hasn't run > the counter? If so, we still need to add something like: > https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@mail.gmail.com/ I don't think so because if we don't have a valid counter index, we exit before this if. > > > + } > > + > > + count->val = cnt; > > + > > + return 0; > > +} > > diff --git a/tools/lib/perf/tests/test-evsel.c b/tools/lib/perf/tests/test-evsel.c > > index 0ad82d7a2a51..54fb4809b9ee 100644 > > --- a/tools/lib/perf/tests/test-evsel.c > > +++ b/tools/lib/perf/tests/test-evsel.c > > @@ -120,6 +120,69 @@ static int test_stat_thread_enable(void) > > return 0; > > } > > > > +static int test_stat_user_read(int event) > > +{ > > + struct perf_counts_values counts = { .val = 0 }; > > + struct perf_thread_map *threads; > > + struct perf_evsel *evsel; > > + struct perf_event_mmap_page *pc; > > + struct perf_event_attr attr = { > > + .type = PERF_TYPE_HARDWARE, > > + .config = event, > > + }; > > A nit, previously test-evsel was able to run and pass on a hypervisor. > As now there is a reliance on hardware events the evsel open fails on > a hypervisor. It'd be nice if we could detect running on a hypervisor > and test software events in that case. I suppose we can just exit if open fails on this test. Rob