Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp440579ybv; Wed, 5 Feb 2020 08:11:37 -0800 (PST) X-Google-Smtp-Source: APXvYqyUOTMsIk14HggAR1jRXi15Qx86pBGT0BRGQ8KRzVyFhCpY8x8jgc0otjyL8W0N56ONLYzW X-Received: by 2002:aca:3109:: with SMTP id x9mr3529452oix.107.1580919097679; Wed, 05 Feb 2020 08:11:37 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580919097; cv=none; d=google.com; s=arc-20160816; b=Bp5FWym0HfWmjitYc26w4oDmrXdTK3zxilLpjfPnAFqOvoNAuxvF+cU+Hio0U3bEut KvPKjtLUrRehcS5QG3BakJkjYp0FlbvNNJIJi8VCkV7tqoTgke0lTyiFd6S/kltq1U+R WctziMkgOwsy6k7IWLoHUtEQRAwiNRuHeezL1veVtMGmirzwYTXwrmuMHN7gbzcGrxq8 jYM6zpLWVrldPZHW7FY0Zvx7qeWU1xh4jKvjNoZYnz1qIWv4qNAokQtdCDOHGog+nboE AtEqZX6q9r8ZT5rFF1kg/EMvB8SbYX9f3w94/j695+JZxLlpnZ16/709keCUXC4xKgWU EmOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=PvMgtmCypa6dN2YRTNmRQCzXIANAtMCLKhGHOwYDoYY=; b=GxKHXcknhAO/+21uO7Y947v9VDiku7NiGbfySCUdYXyLDT3xRsdvR4ga8qIlhboePz KKzy/xs6tl7f1LzrWA6N3wykcJH7x+oLFKV/1tDXjpz2M/8BuxNrFl3kFsJ9oT87sYoJ Zp39oLbI3gRqHn4Yzay22K0Hwip8dOWN3I3nt3QahwOjl4Bhz1lC4lKPS8bIbm14iSF6 KFtLgGhPgMXAUCWF1duucbKrwki640bHp/P+RudIJF8jFov8pBuBkvY0M/Hz1GaLxbTa CSxWLoyVQAXlxj9D6sE8Z6rIRhQOQNzJWoEN93vxTj8sJxeNU6tdk1oKIP19LpJtSJ8w arGA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=EdrGDqRy; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v18si247598oic.188.2020.02.05.08.11.22; Wed, 05 Feb 2020 08:11:37 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=EdrGDqRy; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727341AbgBEQJP (ORCPT + 99 others); Wed, 5 Feb 2020 11:09:15 -0500 Received: from mail-qt1-f195.google.com ([209.85.160.195]:41435 "EHLO mail-qt1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726534AbgBEQJO (ORCPT ); Wed, 5 Feb 2020 11:09:14 -0500 Received: by mail-qt1-f195.google.com with SMTP id l19so1939033qtq.8 for ; Wed, 05 Feb 2020 08:09:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=PvMgtmCypa6dN2YRTNmRQCzXIANAtMCLKhGHOwYDoYY=; b=EdrGDqRyMunjHbYpc5wP887ggphFAl5426HJY9WayFNhFqCaCpknkK3F0SlcTYfGiI cgLKTtp1+9KAeJx00gF+Ef08HCs2IzX156DZAWGB++CbeHDxPfZWUauUZufYbiGLGekV 83q5BfXwUOuuDj1rwng3JpCrZL/abv51l5TexSgSBcblOAD4C0Ru0U/FMHRXUz4ax3s2 kyGjuJUJtFJxWT+T4BqKX+yVfMo3Rj2LrSWFQY7rxZ6CqY/kkvnp5kK51etPgzI9W3gh S5HYUmZjGUNZpPoGEyD6XvAkv7SpZOMK2pZI1JNHJeTRzSndGsjmu0lbdJZlxesgKGhg lhPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=PvMgtmCypa6dN2YRTNmRQCzXIANAtMCLKhGHOwYDoYY=; b=ud3joqPJ0kWge0g+6fQC6fb1tK5k8MsYiuWHi0MhFA20vR+NkuqSEQ1O2WuZ431QUI rWgdLEuZX5VpibywurXMosQo1Q3469UPdxnG4C6xVVuVbZjOlGp6Jsw6RBSFuVGNlswr vJHe8tcmMO2/lOpq4Azhb0kI09hSA9J6doezCGWXFzOFzzI+75T/M/WQYHu71Ya+gtC3 PZUTgK73L2J2ANwe4y+nzRROMFJ3lg0trAwBN9nyPtUu4fRkLA4VlN5OyXSlTWZOSxes 0iwhb7bSpslKJk3h6Q+jysTPwYEvzRr4xsXG7H6KAFTYm/lZo+EGwPLoMQrJs1Y4SR+p lcNg== X-Gm-Message-State: APjAAAU8lxtZbqjG0LH0R8TCB5Zy+e2O1ncJfh7V+n73FzVOGoFa9GbS 88rxZg0WfFuldtnAic6puuWPRWmUjuHajLzn7rpsEw== X-Received: by 2002:ac8:718e:: with SMTP id w14mr33945370qto.266.1580918952671; Wed, 05 Feb 2020 08:09:12 -0800 (PST) MIME-Version: 1.0 References: <20200203015203.27882-1-leo.yan@linaro.org> <20200203015203.27882-4-leo.yan@linaro.org> In-Reply-To: <20200203015203.27882-4-leo.yan@linaro.org> From: Mike Leach Date: Wed, 5 Feb 2020 16:09:01 +0000 Message-ID: Subject: Re: [PATCH v3 3/5] perf cs-etm: Correct synthesizing instruction samples To: Leo Yan Cc: Arnaldo Carvalho de Melo , Mathieu Poirier , Suzuki K Poulose , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , linux-arm-kernel , linux-kernel@vger.kernel.org, Robert Walker , Coresight ML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Leo, There are a couple of typos in the comments below, but I also believe that the sample loop could be considerably simplified On Mon, 3 Feb 2020 at 01:52, Leo Yan wrote: > > When 'etm->instructions_sample_period' is less than > 'tidq->period_instructions', the function cs_etm__sample() cannot handle > this case properly with its logic. > > Let's see below flow as an example: > > - If we set itrace option '--itrace=i4', then function cs_etm__sample() > has variables with initialized values: > > tidq->period_instructions = 0 > etm->instructions_sample_period = 4 > > - When the first packet is coming: > > packet->instr_count = 10; the number of instructions executed in this > packet is 10, thus update period_instructions as below: > > tidq->period_instructions = 0 + 10 = 10 > instrs_over = 10 - 4 = 6 > offset = 10 - 6 - 1 = 3 > tidq->period_instructions = instrs_over = 6 > > - When the second packet is coming: > > packet->instr_count = 10; in the second pass, assume 10 instructions > in the trace sample again: > > tidq->period_instructions = 6 + 10 = 16 > instrs_over = 16 - 4 = 12 > offset = 10 - 12 - 1 = -3 -> the negative value > tidq->period_instructions = instrs_over = 12 > > So after handle these two packets, there have below issues: > > The first issue is that cs_etm__instr_addr() returns the address within > the current trace sample of the instruction related to offset, so the > offset is supposed to be always unsigned value. But in fact, function > cs_etm__sample() might calculate a negative offset value (in handling > the second packet, the offset is -3) and pass to cs_etm__instr_addr() > with u64 type with a big positive integer. > > The second issue is it only synthesizes 2 samples for sample period = 4. > In theory, every packet has 10 instructions so the two packets have > total 20 instructions, 20 instructions should generate 5 samples > (4 x 5 = 20). This is because cs_etm__sample() only calls once > cs_etm__synth_instruction_sample() to generate instruction sample per > range packet. > > This patch fixes the logic in function cs_etm__sample(); the basic > idea is to divide into three parts for handling coming packet: > > - The first part is for synthesizing the first instruction sample, it > combines the instructions from the tail of previous packet and the > instructions from the head of the new packet; > - The second part is to simply generate samples with sample period > aligned; > - The third part is the tail of new packet, the rest instructions will > be left for the sequential sample handling. > > Suggested-by: Mike Leach > Signed-off-by: Leo Yan > --- > tools/perf/util/cs-etm.c | 105 ++++++++++++++++++++++++++++++++++----- > 1 file changed, 92 insertions(+), 13 deletions(-) > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c > index 3e28462609e7..c5a05f728eac 100644 > --- a/tools/perf/util/cs-etm.c > +++ b/tools/perf/util/cs-etm.c > @@ -1360,23 +1360,102 @@ static int cs_etm__sample(struct cs_etm_queue *etmq, > * TODO: allow period to be defined in cycles and clock time > */ > > - /* Get number of instructions executed after the sample point */ > - u64 instrs_over = tidq->period_instructions - > - etm->instructions_sample_period; > + /* > + * Below diagram demonstrates the instruction samples > + * generation flows: > + * > + * Instrs Instrs Instrs Instrs > + * Sample(n) Sample(n+1) Sample(n+2) Sample(n+3) > + * | | | | > + * V V V V > + * -------------------------------------------------- > + * ^ ^ > + * | | > + * Period Period > + * instructions(Pi) instructions(Pi') > + * > + * | | > + * \---------------- -----------------/ > + * V > + * instrs_executed > + * > + * Period instructions (Pi) contains the the number of > + * instructions executed after the sample point(n). When a new > + * instruction packet is coming and generate for the next sample > + * (n+1), it combines with two parts instructions, one is the > + * tail of the old packet and another is the head of the new > + * coming packet. So 'head' variable is used to cauclate the typo : s/cauclate/calculate > + * instruction numbers in the new packet for sample(n+1). > + * > + * Sample(n+2) and sample(n+3) consume the instructions with > + * sample period, so directly generate samples based on the > + * sampe period. > + * typo: s/sampe/sample > + * After sample(n+3), the rest instructions will be used by > + * later packet; so use 'instrs_over' to track the rest > + * instruction number and it is assigned to > + * 'tidq->period_instructions' for next round calculation. > + */ > + u64 head, offset = 0; > + u64 addr; > > /* > - * Calculate the address of the sampled instruction (-1 as > - * sample is reported as though instruction has just been > - * executed, but PC has not advanced to next instruction) > + * 'instrs_over' is the number of instructions executed after > + * sample points, initialise it to 'instrs_executed' and will > + * decrease it for consumed instructions in every synthesized > + * instruction sample. > */ > - u64 offset = (instrs_executed - instrs_over - 1); > - u64 addr = cs_etm__instr_addr(etmq, trace_chan_id, > - tidq->packet, offset); > + u64 instrs_over = instrs_executed; > > - ret = cs_etm__synth_instruction_sample( > - etmq, tidq, addr, etm->instructions_sample_period); > - if (ret) > - return ret; > + /* > + * 'head' is the instructions number of the head in the new > + * packet, it combines with the tail of previous packet to > + * generate a sample. So 'head' uses the sample period to > + * decrease the instruction number introduced by the previous > + * packet. > + */ > + head = etm->instructions_sample_period - > + (tidq->period_instructions - instrs_executed); > + > + if (head) { > + offset = head; > + > + /* > + * Calculate the address of the sampled instruction (-1 > + * as sample is reported as though instruction has just > + * been executed, but PC has not advanced to next > + * instruction) > + */ > + addr = cs_etm__instr_addr(etmq, trace_chan_id, > + tidq->packet, offset - 1); > + ret = cs_etm__synth_instruction_sample( > + etmq, tidq, addr, > + etm->instructions_sample_period); > + if (ret) > + return ret; > + > + instrs_over -= head; > + } > + > + while (instrs_over >= etm->instructions_sample_period) { > + offset += etm->instructions_sample_period; > + > + /* > + * Calculate the address of the sampled instruction (-1 > + * as sample is reported as though instruction has just > + * been executed, but PC has not advanced to next > + * instruction) > + */ > + addr = cs_etm__instr_addr(etmq, trace_chan_id, > + tidq->packet, offset - 1); > + ret = cs_etm__synth_instruction_sample( > + etmq, tidq, addr, > + etm->instructions_sample_period); > + if (ret) > + return ret; > + > + instrs_over -= etm->instructions_sample_period; > + } > > /* Carry remaining instructions into next sample period */ > tidq->period_instructions = instrs_over; > -- > 2.17.1 > I believe the following change would work and make for easier reading... .... at the start of the function remove instrs_executed and replace .... /* get instructions remainder from previous packet */ u64 instrs_prev = tidq->period_instructions; /* set available instructions to previous packet remainder + the current packet count */ tidq->period_instructions += tidq->packet->instr_count; .... within the if(etm->sample_instructions && ...) statement I would be more explicit what the elements of the diagram are .... /* * Below diagram demonstrates the instruction samples * generation flows: * * Instrs Instrs Instrs Instrs * Sample(n) Sample(n+1) Sample(n+2) Sample(n+3) * | | | | * V V V V * -------------------------------------------------- * ^ ^ * | | * Period Period * instructions(Pi) instructions(Pi') * * | | * \---------------- -----------------/ * V * tidq->packet->instr_count; * * Instrs Sample(n...) are the synthesised samples occuring every etm->instructions_sample_period * instructions - as defined on the perf command line. Sample(n) being the last sample before the * current etm packet, n+1 to n+3 samples generated from the current etm packet. * * tidq->packet->instr_count represents the number of instructions in the current etm packet. * * Period instructions (Pi) contains the the number of instructions executed after the sample point(n) * from the previous etm packet. This will always be less than etm->instructions_sample_period. * .... continue with explanation here .... .... then we can simplify the loop code removing some of the temporary variables .... /* get the initial offset into the current packet instructions (entry conditions ensure that instrs_prev < etm->instructions_sample_period) */ u64 offset = etm->instructions_sample_period - instrs_prev; u64 addr; /* Prepare last branches for instruction sample */ if (etm->synth_opts.last_branch) cs_etm__copy_last_branch_rb(etmq, tidq); while (tidq->period_instructions >= etm->instructions_sample_period) { /* * Calculate the address of the sampled instruction (-1 * as sample is reported as though instruction has just * been executed, but PC has not advanced to next * instruction) */ addr = cs_etm__instr_addr(etmq, trace_chan_id, tidq->packet, offset - 1); ret = cs_etm__synth_instruction_sample( etmq, tidq, addr, etm->instructions_sample_period); if (ret) return ret; offset += etm->instructions_sample_period; tidq->period_instructions -= etm->instructions_sample_period; } ..... I believe the above should work, but cannot claim to have tried it out. What do you think? Regards Mike -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK