Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp1073088ybx; Wed, 30 Oct 2019 09:26:14 -0700 (PDT) X-Google-Smtp-Source: APXvYqxauUxlwmX6izJEUOIRThZKLYO5GrvTsAhDyIoQ6ypIkuTjRozEDsfZAyc2zo+YJ3jcTx9x X-Received: by 2002:a05:6402:154e:: with SMTP id p14mr556191edx.145.1572452774466; Wed, 30 Oct 2019 09:26:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1572452774; cv=none; d=google.com; s=arc-20160816; b=MwnqUMcHcSHcW5bDXw2lJUUpKylkFMM2WcnJOLTTHHHA5FoSVnf/O+DyiymgtDpbCU 20An9ylEuEK8cuE66/1WUuJk/cQOIGy/1YB6oXRvIyWBn2tsvzrKFBabphW0uCVJZ8pk 8UyTwGzX2gTCVzbaTikD4Olyqu7ZhNv9f6KkICndg4SafM7vxUs0wlf6RjHvXkDNFBuu tlNu1hddvdRsSmYeKjpTuBEECZYjTLecEStnkRmWSGYIOuSyv9xKDV5oYxv43j1C+nlv 0lgT76cIjAKXsU6KtS1SH2dF9ufi5XWfbKTks0FxIqsuMeSWOo5qRaqJ96aDkqL+0kZY 0hzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=MhjZQUW87IyuTxjRvPIP8irpYJjRIJ0bB6+g7tP3LwI=; b=idEzZAhTHgvQqNMGzBbRuT1AA8t1ye3q56zXLM0h4RjPSBTMvFqjMhee/bb0loMtCp bh4RmgCLsq9jFvLg6DwVsKbLe8bXm2RMP2NE5SnF6frOfZezzuruyDILqpMZ5vBF4xRZ Yo4bZ88reF1SWt2yRHVcLD89fsQJs0EUPrMkNVpIciDMafY51+GCx0+g6kluNIeOTUxW +QnGMHH9xSC/0jVz4E+aR8SBjax4ZVnp09J+MIiY2VsB3NKAgZzTExIplUGGnY/xeglF dOXTRS0CBpbJHnRDI9Ij/n6rdTgABNR0MF7GxQdGs9MtZiHZMa3P9Hthe+iPCojxf87V FjAw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=s5cydnpv; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q5si1456433ejb.174.2019.10.30.09.25.22; Wed, 30 Oct 2019 09:26:14 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=s5cydnpv; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727391AbfJ3QXk (ORCPT + 99 others); Wed, 30 Oct 2019 12:23:40 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:55132 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726261AbfJ3QXk (ORCPT ); Wed, 30 Oct 2019 12:23:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=MhjZQUW87IyuTxjRvPIP8irpYJjRIJ0bB6+g7tP3LwI=; b=s5cydnpvAKQtVHwFgaFzmGt3i HB9X7PYRjKpNEkIdK648f/d/gOX+ehlhvXewdihcqwWlFeoa7keVtyf3iJE7us85zUsCGz/3UPbgj f8ad3H9TqqXHqq4hh//UvjLODmdOIrOxqKeI6ZVO+iMqJBjhiaZxeRZfllxjJivHzMsCs0t6xETKd MtbkXfHplWHO2R/ZBZY2ty+kDmwDOPnKguvYALR6vd3vbYaU0MyzzjU+t+E7B7NEvou0NebwiT2Xz msGP+R9/gNaSJFboylevRAhTsWd/Dj+m7dpr1N0yFtioS97wI5idjg+0vjt1Zb/s8Sq9PMh+4Za8u UGDVIq0xQ==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1iPqkq-0000hL-PJ; Wed, 30 Oct 2019 16:23:29 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id C82AF306098; Wed, 30 Oct 2019 17:22:24 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 09E7F2B444DB8; Wed, 30 Oct 2019 17:23:26 +0100 (CET) Date: Wed, 30 Oct 2019 17:23:25 +0100 From: Peter Zijlstra To: Leo Yan Cc: Mike Leach , Adrian Hunter , Ingo Molnar , Borislav Petkov , "H . Peter Anvin" , x86@kernel.org, Mark Rutland , Alexander Shishkin , Mathieu Poirier , Arnaldo Carvalho de Melo , Jiri Olsa , linux-kernel@vger.kernel.org Subject: Re: [PATCH RFC 1/6] perf/x86: Add perf text poke event Message-ID: <20191030162325.GT4114@hirez.programming.kicks-ass.net> References: <20191025130000.13032-1-adrian.hunter@intel.com> <20191025130000.13032-2-adrian.hunter@intel.com> <20191030104747.GA21153@leoy-ThinkPad-X240s> <20191030124659.GQ4114@hirez.programming.kicks-ass.net> <20191030141950.GB21153@leoy-ThinkPad-X240s> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191030141950.GB21153@leoy-ThinkPad-X240s> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 30, 2019 at 10:19:50PM +0800, Leo Yan wrote: > On Wed, Oct 30, 2019 at 01:46:59PM +0100, Peter Zijlstra wrote: > > On Wed, Oct 30, 2019 at 06:47:47PM +0800, Leo Yan wrote: > > Anyway, the below argument doesn't care much, it works for NOP/JMP just > > fine. > > We can support NOP/JMP case as the first step, but later should can > extend to support other transitions. Since all instructions (with the possible exception of RET) are unconditional branch instructions: NOP, JMP, CALL. It makes no read difference to the argument below. ( I'm thinking RET might be special in that it reads the return address from the stack and therefore must emit the whole IP into the stream, as we cannot know the stack state ) > > > we need to update dso cache for the > > > 'PERF_TEXT_POKE_UPDATE_PREV' event; if detect the instruction is > > > changed from branch to nop, we need to update dso cache for > > > 'PERF_TEXT_POKE_UPDATE_POST' event. The main idea is to ensure the > > > branch instructions can be safely contained in the dso file and any > > > branch samples can read out correct branch instruction. > > > > > > Could you confirm this is the same with your understanding? Or I miss > > > anything? I personally even think the pair events can be used for > > > different arches (e.g. the solution can be reused on Arm64/x86, etc). > > > > So the problem we have with PT is that it is a bit-stream of > > branch taken/not-taken decisions. In order to decode that we need to > > have an accurate view of the unconditional code flow. > > > > Both NOP/JMP are unconditional and we need to exactly know which of the > > two was encountered. > > If I understand correctly, PT decoder needs to read out instructions > from dso and decide the instruction type (NOP or JMP), and finally > generate the accurate code flow. > > So PT decoder relies on (cached) DSO for decoding. As I know, this > might be different from Arm CS, since Arm CS decoder is merely > generate packets and it doesn't need to rely on DSO for decoding. Given a start point (from a start or sync packet) we scan the instruction stream forward until the first conditional branch instruction. Then we consume the next available branch decision bit to know where to continue. So yes, we need to have a correct text image available for this to work. > > With your scheme, I don't see how we can ever actually know that. When > > we get the PRE event, all we really know is that we're going to change > > a specific instruction into another. And at the POST event we know it > > has been done. But in between these two events, we have no clue which of > > the two instructions is live on which CPU (two CPUs might in fact have a > > different live instruction at the same time). > > > > This means we _cannot_ unambiguously decode a taken/not-taken decision > > stream. > > > > Does CS have this same problem, and how would the PRE/POST events help > > with that? > > My purpose is to use PRE event and POST event to update cached DSO, > thus perf tool can read out 'correct' instructions and fill them into > instruction/branch samples. The thing is, as I argued, the instruction state between PRE and POST is ambiguous. This makes it impossible to decode the branch decision stream. Suppose CPU0 emits the PRE event at T1 and the POST event at T5, but we have CPU1 covering the instruction at T3. How do you decide where CPU1 goes and what the next conditional branch is?