Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp4920141imw; Tue, 19 Jul 2022 16:18:14 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sg+2KuXtbHl82Q+xn5fumdlFPlWgSZeuwc/ZFictqTdmIy6mdRt0HwUQUZ4Zy4wF+V3J0/ X-Received: by 2002:a17:906:1315:b0:72c:5348:a153 with SMTP id w21-20020a170906131500b0072c5348a153mr31182640ejb.446.1658272694482; Tue, 19 Jul 2022 16:18:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1658272694; cv=none; d=google.com; s=arc-20160816; b=0cW75c5/MHKn41Ho8D7mVocAX55u1iZpun8395Sg7tTEHGOHoM3wKsufmMGsQAVGLY dSQW4DXtTrzxnXs7KTIkAeoB/L8XqlqwNesMATNJ08zmu8oujvKN7ZngWTxKATOA3Cz/ WbSivKorpb7CtBEXlNbPjWhYsPjZak4JJJYoho3QbldqA8txq9NQUVccWFn8wCXlWK7h y8nlZP1ERAtN8UTmdl5k7Xau950TWGThhPsZNKAYO2DAX3P7wA+d3yoG2m6mBSz/ddhb gttr+qGeKqzTQttPFDYIBf2yFe0hOooge2CnpngPlCKp1MVf3SV1vc6aBv8o9Jof2maS Oafg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=7Xp4OEUi6R+aBFLPRkw9TiBXnBGJkUzRpG9JK7ttIaI=; b=JhCLuwwIJ9IV/Dp+dJNOtd7SN6frOfosGvmYcEpxov9i9q0GMGafya7bR3R3lYoTSL MGGqZXrlS1K0rnU3Y49ApRjOL7lf1WxoM/wAOkKy9H+JybZ4MflWxwrCugi1Zwye+UQp /Bb5oJN9qv7BW5K2E0gx7OGib0gaCQwkkhmG6g8mt6lsOZGMgB/MJeBe+ekF0Ny3WMSv OvJY9x82aYLuPCbeTwH66WxaKhO3cWYm94yXmscDT43JibaZixKoaCsoYzdrXstV4q0E VrTMe+GsiJ2g8tNoHhJT5k81Lt+x7R3SmPL2vNyM0ADIPZ4WKtr4oTmDi7+KXHB1kR6b BrKA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=r96jTHk5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ji12-20020a170907980c00b0072f17b37231si16791063ejc.743.2022.07.19.16.17.50; Tue, 19 Jul 2022 16:18:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=r96jTHk5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240074AbiGSWmd (ORCPT + 99 others); Tue, 19 Jul 2022 18:42:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235239AbiGSWma (ORCPT ); Tue, 19 Jul 2022 18:42:30 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B796859249 for ; Tue, 19 Jul 2022 15:42:29 -0700 (PDT) Received: by mail-wm1-x32b.google.com with SMTP id ay11-20020a05600c1e0b00b003a3013da120so211842wmb.5 for ; Tue, 19 Jul 2022 15:42:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=7Xp4OEUi6R+aBFLPRkw9TiBXnBGJkUzRpG9JK7ttIaI=; b=r96jTHk5U7oIXsKC84CqWsfrUB696Ew4eLHpBMO/3N8Ny+QF0YneBpuYIr3/UhculI RD5JBm/ErIqRRjgrv2pGwReXfRm/uJbKP/I411c3q9Eah7nVr/4QldKH0CzqjYMirJrP KzU6X0gG2FbvV6rpe/WgsveopCv7bdNcoBUmsUgj73NJK6w9qHxvJe6GJXV80vOJXR72 aZQ4lBxzsReNWFANO2hkbFWn4b0CKGfAH4ksG2EpbeiI6Oey8p6RatHpsO4nRM+W7ji2 GfCau+1yCdm1mbyU0L8eDgHRQnfaeWthdCxQqnTAHJBxdpmRt+dBCLXSh3Fl8sf5Mej5 26eQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=7Xp4OEUi6R+aBFLPRkw9TiBXnBGJkUzRpG9JK7ttIaI=; b=WURhaGieX20rb5K8SCL3kGw8O0ELPskoJJCUDGhiXEJv1M4pcte9rkeqD57EFtonzU bJURUC4ees0gJxknSnvYlqSYHWmFLQj5TdvClDWtugbgliDFA3/ZNqTIVM8b7PRklG6D 4IqRO9qS4bdo/xHT/ZoBsM7iXOorIp76v5xBRQX2WiHXfynEcsROcU6nD4lx+/CwFMcP GooX/he2X3JmEPEL6B1UqCEIlV5nyrJE40fieYnqJKtCINDVUyh4S94HzyrQ65NZEeaH fm4vZ3hfUm/cdXFSXI3gloX9dIIz1+JDdSuYFPFYKxyPweAIzO8M607PM7KaSJhPpipe 3ffQ== X-Gm-Message-State: AJIora/ZCx6YMhfd2cHNccIq9Cid6sMrwrrV1X9aRa3BhglktBx7X8z0 AKhlE359v8Ewp3fdXSxmL/ATzpxtcURFqQUcKl0vPw== X-Received: by 2002:a7b:ce13:0:b0:3a3:102c:23d3 with SMTP id m19-20020a7bce13000000b003a3102c23d3mr1116893wmc.67.1658270548139; Tue, 19 Jul 2022 15:42:28 -0700 (PDT) MIME-Version: 1.0 References: <20220609052355.1300162-1-irogers@google.com> <20220609052355.1300162-3-irogers@google.com> In-Reply-To: From: Ian Rogers Date: Tue, 19 Jul 2022 15:42:15 -0700 Message-ID: Subject: Re: [PATCH v2 2/4] perf: Align user space counter reading with code To: Rob Herring Cc: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Kajol Jain , Andi Kleen , Adrian Hunter , Anshuman Khandual , "linux-kernel@vger.kernel.org" , linux-perf-users , Stephane Eranian Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jul 19, 2022 at 10:45 AM Rob Herring wrote: > > On Wed, Jun 8, 2022 at 11:24 PM Ian Rogers wrote: > > > > Align the user space counter reading documentation with the code in > > perf_mmap__read_self. Previously the documentation was based on the perf > > rdpmc test, but now general purpose code is provided by libperf. > > > > Signed-off-by: Ian Rogers > > --- > > include/uapi/linux/perf_event.h | 32 ++++++++++++++++----------- > > tools/include/uapi/linux/perf_event.h | 32 ++++++++++++++++----------- > > 2 files changed, 38 insertions(+), 26 deletions(-) > > > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > > index d37629dbad72..3b84e0ad0723 100644 > > --- a/include/uapi/linux/perf_event.h > > +++ b/include/uapi/linux/perf_event.h > > @@ -538,9 +538,13 @@ struct perf_event_mmap_page { > > * > > * if (pc->cap_usr_time && enabled != running) { > > * cyc = rdtsc(); > > - * time_offset = pc->time_offset; > > * time_mult = pc->time_mult; > > * time_shift = pc->time_shift; > > + * time_offset = pc->time_offset; > > + * if (pc->cap_user_time_short) { > > + * time_cycles = pc->time_cycles; > > + * time_mask = pc->time_mask; > > + * } > > * } > > * > > * index = pc->index; > > @@ -548,6 +552,9 @@ struct perf_event_mmap_page { > > * if (pc->cap_user_rdpmc && index) { > > * width = pc->pmc_width; > > * pmc = rdpmc(index - 1); > > + * pmc <<= 64 - width; > > + * pmc >>= 64 - width; > > + * count += pmc; > > * } > > * > > * barrier(); > > @@ -590,25 +597,24 @@ struct perf_event_mmap_page { > > * If cap_usr_time the below fields can be used to compute the time > > * delta since time_enabled (in ns) using rdtsc or similar. > > * > > - * u64 quot, rem; > > - * u64 delta; > > - * > > - * quot = (cyc >> time_shift); > > - * rem = cyc & (((u64)1 << time_shift) - 1); > > - * delta = time_offset + quot * time_mult + > > - * ((rem * time_mult) >> time_shift); > > + * cyc = time_cycles + ((cyc - time_cycles) & time_mask); > > + * delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift); > > I still think this chunk should stay as-is because mul_u64_u32_shr > isn't defined here. At least a comment as to what the C version does: > > /* time_offset + (u64)(((unsigned __int128)cyc * time_mult) >> time_shift) */ > > > * > > * Where time_offset,time_mult,time_shift and cyc are read in the > > * seqcount loop described above. This delta can then be added to > > - * enabled and possible running (if index), improving the scaling: > > + * enabled and possible running (if index) to improve the scaling. Due > > + * to event multiplexing, running maybe zero and so care is needed to > > s/maybe/may be/ > > > + * avoid division by zero. > > * > > * enabled += delta; > > - * if (index) > > + * if (idx) > > This should remain 'index'. Thanks, I've tried to incorporate all of this into v3: https://lore.kernel.org/lkml/20220719223946.176299-1-irogers@google.com/ Ian > > * running += delta; > > * > > - * quot = count / running; > > - * rem = count % running; > > - * count = quot * enabled + (rem * enabled) / running; > > + * if (running != 0) { > > + * quot = count / running; > > + * rem = count % running; > > + * count = quot * enabled + (rem * enabled) / running; > > + * } > > */ > > __u16 time_shift; > > __u32 time_mult;