Date: Thu, 30 Apr 2020 15:58:24 +0100
From: Will Deacon
To: Leo Yan
Cc: Mark Rutland, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Catalin Marinas,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    Mathieu Poirier, Mike Leach, Al Grant, James Clark, maz@kernel.org,
    tglx@linutronix.de
Subject: Re: [PATCH] arm64: perf_event: Fix time_offset for arch timer
Message-ID: <20200430145823.GA25258@willie-the-truck>
References: <20200320093545.28227-1-leo.yan@linaro.org>
In-Reply-To: <20200320093545.28227-1-leo.yan@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Leo,

[+Maz and tglx in case I'm barking up the wrong tree]

On Fri, Mar 20, 2020 at 05:35:45PM +0800, Leo Yan wrote:
> Between the system powering on and kernel's sched clock registration,
> the arch timer usually has been enabled at the early time and its
> counter is incremented during the period of the booting up. Thus the
> arch timer's counter is not completely accounted into the sched clock,
> and has a delta between the arch timer's counter and sched clock. This
> delta value should be stored into userpg->time_offset, which later can
> be retrieved by Perf tool in the user space for sample timestamp
> calculation.
>
> Now userpg->time_offset is assigned to the negative sched clock with
> '-now', this value cannot reflect the delta between arch timer's counter
> and sched clock, so Perf cannot use it to calculate the sample time.
>
> To fix this issue, this patch calculate the delta between the arch
> timer's and sched clock and assign the delta to userpg->time_offset.
> The detailed steps are firstly to convert counter to nanoseconds 'ns',
> then the offset is calculated as 'now' minus 'ns'.
>
>   |<------------------- 'ns' ---------------------->|
>                           |<-------- 'now' -------->|
>   |<---- time_offset ---->|
>   |-----------------------|-------------------------|
>   ^                       ^                         ^
> Power on system  sched clock registration      Perf starts

FWIW, I'm /really/ struggling to understand the problem here.
If I've grokked it correctly (big 'if'), then you can't just factor in
what you call "time_offset" in the diagram above, because there isn't a
guarantee that the counter is zero-initialised at the start.

> Signed-off-by: Leo Yan
> ---
>  arch/arm64/kernel/perf_event.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index e40b65645c86..226d25d77072 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -1143,6 +1143,7 @@ void arch_perf_update_userpage(struct perf_event *event,
>  {
>  	u32 freq;
>  	u32 shift;
> +	u64 count, ns, quot, rem;
>
>  	/*
>  	 * Internal timekeeping for enabled/running/stopped times
> @@ -1164,5 +1165,21 @@ void arch_perf_update_userpage(struct perf_event *event,
>  		userpg->time_mult >>= 1;
>  	}
>  	userpg->time_shift = (u16)shift;
> -	userpg->time_offset = -now;
> +
> +	/*
> +	 * Since arch timer is enabled ealier than sched clock registration,
> +	 * compuate the delta (in nanosecond unit) between the arch timer
> +	 * counter and sched clock, assign the delta to time_offset and
> +	 * perf tool can use it for timestamp calculation.
> +	 *
> +	 * The formula for conversion arch timer cycle to ns is:
> +	 *   quot = (cyc >> time_shift);
> +	 *   rem  = cyc & ((1 << time_shift) - 1);
> +	 *   ns   = quot * time_mult + ((rem * time_mult) >> time_shift);
> +	 */
> +	count = arch_timer_read_counter();
> +	quot = count >> shift;
> +	rem = count & ((1 << shift) - 1);
> +	ns = quot * userpg->time_mult + ((rem * userpg->time_mult) >> shift);
> +	userpg->time_offset = now - ns;

Hmm, reading the counter and calculating the delta feels horribly
approximate to me. It would be much better if we could get hold of the
initial epoch cycles from the point at which sched_clock was initialised
using the counter. This represents the true cycle delta between the
counter and what sched_clock uses for 0 ns.
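
To make the difference between the two approaches concrete, here is a
minimal user-space sketch in C. It is not kernel code: struct clock_epoch,
to_ns(), offset_from_epoch() and offset_from_live_read() are hypothetical
names invented for illustration, and the skew_cyc parameter simply models
the counter advancing between sampling 'now' and re-reading the counter.
The example values in the note after the code assume a 50 MHz counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for what sched_clock captures at registration. */
struct clock_epoch {
	uint64_t epoch_cyc;	/* counter value that maps to 0 ns */
	uint32_t mult;		/* ns = (cyc * mult) >> shift */
	uint32_t shift;
};

/* Cycle-to-ns conversion, split as in the patch comment to avoid overflow. */
static uint64_t to_ns(uint64_t cyc, const struct clock_epoch *ce)
{
	assert(ce->shift < 64);
	uint64_t quot = cyc >> ce->shift;
	uint64_t rem = cyc & ((UINT64_C(1) << ce->shift) - 1);

	return quot * ce->mult + ((rem * ce->mult) >> ce->shift);
}

/*
 * Offset derived from the epoch alone: no second counter read is needed,
 * so the only error left is conversion rounding. The result is a negative
 * value stored with unsigned wrap-around.
 */
static uint64_t offset_from_epoch(const struct clock_epoch *ce)
{
	return 0 - to_ns(ce->epoch_cyc, ce);
}

/*
 * Offset as computed in the patch: 'now' corresponds to cyc_at_now, but
 * the counter is read again skew_cyc cycles later.
 */
static uint64_t offset_from_live_read(const struct clock_epoch *ce,
				      uint64_t cyc_at_now, uint64_t skew_cyc)
{
	uint64_t now = to_ns(cyc_at_now - ce->epoch_cyc, ce);

	return now - to_ns(cyc_at_now + skew_cyc, ce);
}
```

With epoch_cyc = 1000000, mult = 20 << 24 and shift = 24 (20 ns per cycle),
the two functions agree when skew_cyc is 0, and every cycle of skew moves
the live-read result by 20 ns. A non-zero counter at sched_clock
registration is handled by both; the difference lies only in the read skew.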
Unfortunately, I can't see a straightforward way to grab that information.
It looks like x86 pulls this directly from the TSC driver.

Will
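
For reference, the cycle-to-ns conversion quoted in the patch comment can
be exercised in isolation. The following is a minimal sketch in C, written
for user space under stated assumptions: cyc_to_ns() and
apply_time_offset() are hypothetical helper names, and the mult/shift
values in the note after the code assume a 50 MHz counter. The mask uses a
64-bit 1, since a plain int '1 << shift' overflows once shift reaches 31.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert counter cycles to nanoseconds, i.e. (cyc * mult) >> shift.
 * Splitting cyc into quotient and remainder keeps the intermediate
 * products within 64 bits for large counter values.
 */
static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	uint64_t quot = cyc >> shift;
	uint64_t rem = cyc & ((UINT64_C(1) << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}

/*
 * How a reader of the user page would combine a raw counter value with
 * time_offset. The offset may be "negative", stored with unsigned
 * wrap-around, so the addition wraps back into range.
 */
static uint64_t apply_time_offset(uint64_t cyc, uint64_t time_offset,
				  uint32_t mult, uint32_t shift)
{
	return time_offset + cyc_to_ns(cyc, mult, shift);
}
```

With mult = 20 << 24 and shift = 24, cyc_to_ns(50000000, ...) yields
1000000000 ns, and a counter value of 1 << 40 still converts correctly
even though the naive product cyc * mult would not fit in 64 bits.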