Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp331297ybv; Thu, 13 Feb 2020 01:09:23 -0800 (PST) X-Google-Smtp-Source: APXvYqwqGqjujFRzYlt9rL9maeG7tRvhk+nph86ZZwD86HZmKcHlmEK6rse5iA1yNq9EWQBHyVor X-Received: by 2002:aca:b4c3:: with SMTP id d186mr2148504oif.131.1581584963205; Thu, 13 Feb 2020 01:09:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581584963; cv=none; d=google.com; s=arc-20160816; b=CT3+ZGmq9j8ltN6mWKAWHM33Tn5Nmu1XaehpDYgldtuMZJ6o7ndqQJIXBJ6QAeZXQB l+MIWOh5H4DHpvx/t/OQ9dpdBgnTtF57nSwy0AzJV/xmsejnnZ+QA7aHhGRx/Cc4i0OD QBdXJ+fXqURjPV4Ceca+dUTSXU/k1AiAwIV5Odnqt6AoeKlgGBb3WJifSBEkn3i5qSmT 9918WwMrual7uxpirVbbvva4PJZRyURf0hIovn4xHYtIbQNkig7nORiGXYie9NpZ44KU 75c7mxm7LcuGY5hS2cGopY7zWPYZH5i0MAsRhxMJsueptCw+WGLPr/xwZLJMbZmTdi2Q 65yQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=y3s6gaPNv6gh+GkzKyY3UtAQAchaJB9LiAGyIZUo9HA=; b=agOBdlrP1KqiqX76nuTn+AZ5rdeiJGY04y+DWJJLQH7vlHImET0w3kb+g2+HMPhaaW LilcasQUS4Uk7Pp4nx8xoJ9p0esZb7qLfMxYzS1P3kv+szMfh6nnP8ckhPTvNifbx+Yi gZUNQPDNKfAhnvpz7hqr4/OEjiQ7JFS3JVym7UJqp+x0WgF+TXnYsXgTHrDb4vXNgV2O 5lcHU3gXqAh0ZYy89Dnvu/JzBqlyfhEnAkrdIHYjUkccmj+anBs06mxqLPx3EeCYV0X7 GjuwRNlx2rlZEjptR48Rt3qQifOIHHB/Kg4Tu7qqBcb0ZXHymnGh2eiXLJ7U4sjsWEjd pvtw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=L1oIXlIZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c5si798257otp.108.2020.02.13.01.09.10; Thu, 13 Feb 2020 01:09:23 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=L1oIXlIZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729653AbgBMJIo (ORCPT + 99 others); Thu, 13 Feb 2020 04:08:44 -0500 Received: from mail-pl1-f195.google.com ([209.85.214.195]:39777 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729515AbgBMJIn (ORCPT ); Thu, 13 Feb 2020 04:08:43 -0500 Received: by mail-pl1-f195.google.com with SMTP id g6so2094758plp.6 for ; Thu, 13 Feb 2020 01:08:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=y3s6gaPNv6gh+GkzKyY3UtAQAchaJB9LiAGyIZUo9HA=; b=L1oIXlIZNksy4ObWQD8J12F1MQ4PdH6+bQ3sKhZweOunw9sTHubLYqqrpljDD0iay9 rO28jJD2pd2ck5nR2h63jFFnCPhIbDQ2+YXmsPdzwg5H2sxdFsKHSS+vmUr/qeKYy8X9 QeGViJCBpNnk+J7GXhdgxiK7rNHytD06p8ZKQL4qcbRxq2UbGrZtx0XqRfk6q/IaLXiQ O3DsSgcOSW7OjJTEjWZS9WhCojf3W4MNiH51gyaoHTU49TI4HOJCZElvQdLUcUQWqQwQ rNwX3macnKNHJy3dlq381tAJEBp+jm/58bN9iY3dLnGvRnMpOI+JicQOfnvL5WKzOkUz yPLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=y3s6gaPNv6gh+GkzKyY3UtAQAchaJB9LiAGyIZUo9HA=; b=URQ/L1HHlzcwmXOMGDqhQjWMMKLZQs1AmNrYs/t4m5lA/xq803DYEfjB1CKU/liz+Z iuxBZbIkYkEC5I68DdPYVFjbas88IBqUPU37dz7PD3SRmiH/cfZxhWWvoPuA+qyWZh2k FyyOZNsjjwEIEZR+RKoIupYtbJY4me6SEwGY/bO4PXWtXCre/ugqKzimWUQGMPjMHZRJ ptqz/+iIgYho1w+Na4zLhqcjuqsW4u8yPbayZhMdt0llRKbiZt249UUHIVhX5VfA0mzI e7sQ9W2Jy7jauk4f81a+R6jFHPg94exd2rCGRCwFFFG3LPUBrKIRrLq9S22QqbC5HD4Z Tzgg== X-Gm-Message-State: APjAAAXqPW8f7lPsgLOjGz5WrbjefdiZWzhrYtw8Mtvt2VIiTrTp4XVl agUYSfIW8Y4dVZk+ITD1PWAFGw== X-Received: by 2002:a17:902:6809:: with SMTP id h9mr27247972plk.32.1581584922048; Thu, 13 Feb 2020 01:08:42 -0800 (PST) Received: from leoy-ThinkPad-X240s ([2400:8902::f03c:91ff:fe3f:ee42]) by smtp.gmail.com with ESMTPSA id 13sm1970773pgo.13.2020.02.13.01.08.30 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 13 Feb 2020 01:08:41 -0800 (PST) Date: Thu, 13 Feb 2020 17:08:27 +0800 From: Leo Yan To: Mike Leach Cc: Arnaldo Carvalho de Melo , Mathieu Poirier , Suzuki K Poulose , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , linux-arm-kernel , linux-kernel@vger.kernel.org, Robert Walker , Coresight ML Subject: Re: [PATCH v4 5/5] perf cs-etm: Synchronize instruction sample with the thread stack Message-ID: <20200213090827.GA21618@leoy-ThinkPad-X240s> References: <20200203020716.31832-1-leo.yan@linaro.org> <20200203020716.31832-6-leo.yan@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Mike, On Thu, Feb 06, 2020 at 03:01:52PM +0000, Mike Leach wrote: > Hi Leo, > > On Mon, 3 Feb 2020 at 02:08, Leo Yan wrote: > > > > The synthesized flow use 'tidq->packet' for instruction samples; on the > > other hand, 'tidp->prev_packet' is used to generate the thread stack and > > the branch samples, this results in the instruction samples using one > > packet ahead than thread stack and branch samples ('tidp->prev_packet' > > vs 'tidq->packet'). > > > > This leads to an instruction's callchain error as shows in below > > example: > > > > main 1579 100 instructions: > > ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms]) > > ffff000010214850 perf_event_update_userpage+0x48 ([kernel.kallsyms]) > > ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms]) > > ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms]) > > ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms]) > > ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms]) > > ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms]) > > > > In the callchain log, for the two continuous lines the up line contains > > one child function info and the followed line contains the caller > > function info, and so forth. So the first two lines are: > > > > perf_event_update_userpage+0x4c => the sampled instruction > > perf_event_update_userpage+0x48 => the parent function's calling > > > > The child function and parent function both are the same function > > perf_event_update_userpage(), but this isn't a recursive function, thus > > the sequence for perf_event_update_userpage() calling itself shouldn't > > never happen. This callchain error is caused by the instruction sample > > using an ahead packet than the thread stack, the thread stack is deferred > > to process the new packet and misses to pop stack if it is just a return > > packet. > > > > To fix this issue, we can simply change to use 'tidq->prev_packet' to > > generate the instruction samples, this allows the thread stack to push > > and pop synchronously with instruction sample. Finally, the callchain > > can be displayed correctly as below: > > > > main 1579 100 instructions: > > ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms]) > > ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms]) > > ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms]) > > ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms]) > > ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms]) > > ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms]) > > > > Signed-off-by: Leo Yan > > --- > > tools/perf/util/cs-etm.c | 8 +++++--- > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c > > index 8f805657658d..410e40ce19f2 100644 > > --- a/tools/perf/util/cs-etm.c > > +++ b/tools/perf/util/cs-etm.c > > @@ -1414,7 +1414,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > > struct cs_etm_packet *tmp; > > int ret; > > u8 trace_chan_id = tidq->trace_chan_id; > > - u64 instrs_executed = tidq->packet->instr_count; > > + u64 instrs_executed = tidq->prev_packet->instr_count; > > > > tidq->period_instructions += instrs_executed; > > > > @@ -1505,7 +1505,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > > * instruction) > > */ > > addr = cs_etm__instr_addr(etmq, trace_chan_id, > > - tidq->packet, offset - 1); > > + tidq->prev_packet, > > + offset - 1); > > ret = cs_etm__synth_instruction_sample( > > etmq, tidq, addr, > > etm->instructions_sample_period); > > @@ -1525,7 +1526,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > > * instruction) > > */ > > addr = cs_etm__instr_addr(etmq, trace_chan_id, > > - tidq->packet, offset - 1); > > + tidq->prev_packet, > > + offset - 1); > > ret = cs_etm__synth_instruction_sample( > > etmq, tidq, addr, > > etm->instructions_sample_period); > > -- > > 2.17.1 > > > I am really not convinced that this is a correct solution. > > Consider a set of trace range packet inputs: > current: 0x3000-0x3050 > prev: 0x2000-0x2100 > prev-1: 0x1020-0x1080 > > Before your modification..... > cs_etm__sample() processes the current packet.... > > On entry, the branch stack will contain:0x1080=>0x2000; > > We add to this from the current packet to get: 0x1080=>0x2000; 0x2100=>0x3000; > > This is then copied by cs_etm__copy_last_branch_rb() > > We find the instruction sample address in the range 0x3000 to 0x3050, > e.g. 0x3010. > cs_etm__synth_instruction_sample() will then generate a sample with values > > sample.ip = 0x3010 > sample.branch_stack = 0x1080=>0x2000; 0x2100=>0x3000; > > to be passed to the perf session / injected as required. > This sample has the correct branch context for the sampled address - > i.e. how the code arrived @0x3010 > > After the modification..... > The branch stack will be the same, but the sample address will be from > the range 0x2000-0x2010, e.g. 0x2008 to give a sample in > cs_etm__synth_instruction_sample() of > sample.ip = 0x2008 > sample.branch_stack = 0x1080=>0x2000; 0x2100=>0x3000; > > This really does not make much sense - the branch stack no longer > relates to the sample.ip. > > Further - cs_etm__synth_instruction_sample() calls cs_etm__copy_insn() > using the _current_ packet and sample.ip. This is a clear mismatch. > > I don't know what is causing the apparent error in the callchain, but > given that the previous features added in this set, work without this > alteration, I feel there must be another solution. Good catch! Thanks a lot for very detailed analysis. I root caused this issue is relevant with the sequence between two functions thread_stack__event() and thread_stack__sample(). In this series, thread_stack__sample() is prior to thread_stack__event(), thus the thread stack event cannot be handled before thread stack generation. If move the function thread_stack__event() up and place it before instruction sample synthesizing; thread_stack__event() can be invoked prior to thread_stack__sample(), then I can see the thread stack can be popped properly and the issue can be fixed. Simply to say, patch 0002 should change the code as below: /* * Record a branch when the last instruction in * PREV_PACKET is a branch. */ if (etm->synth_opts.last_branch && tidq->prev_packet->sample_type == CS_ETM_RANGE && tidq->prev_packet->last_instr_taken_branch) cs_etm__update_last_branch_rb(etmq, tidq); /* * The stack event must be processed prior to synthesizing * instruction sample; this can ensure the instruction samples * to generate correct thread stack. */ if (tidq->prev_packet->last_instr_taken_branch) cs_etm__add_stack_event(etmq, tidq); if (etm->sample_instructions && tidq->period_instructions >= etm->instructions_sample_period) { cs_etm__synth_instruction_sample(); `-> thread_stack__sample(); } Does this make sense for you? Thanks, Leo Yan