From: Namhyung Kim
Date: Tue, 3 Nov 2020 13:22:23 +0900
Subject: Re: Additional debug info to aid cacheline analysis
To: Masami Hiramatsu
Cc: Peter Zijlstra, Mark Wielaard, Stephane Eranian, linux-toolchains@vger.kernel.org, Arnaldo Carvalho de Melo, linux-kernel, Ingo Molnar, Jiri Olsa, Ian Rogers, "Phillips, Kim", Mark Rutland, Andi Kleen

Hi Masami,

On Mon, Nov 2, 2020 at 5:27 PM Masami Hiramatsu wrote:
>
> Hi,
>
> On Fri, 30 Oct 2020 11:10:04 +0100
> Peter Zijlstra wrote:
>
> > On Fri, Oct 30, 2020 at 10:16:49AM +0100, Mark Wielaard wrote:
> > > Hi Namhyung,
> > >
> > > On Fri, Oct 30, 2020 at 02:26:19PM +0900, Namhyung Kim wrote:
> > > > On Thu, Oct 8, 2020 at 6:38 PM Mark Wielaard wrote:
> > > > > GCC using -fvar-tracking and -fvar-tracking-assignments is pretty good
> > > > > at keeping track of where variables are held (in memory or registers)
> > > > > when in the program, even through various optimizations.
> > > > >
> > > > > -fvar-tracking-assignments is the default with -g -O2.
> > > > > Except for the upstream linux kernel code. Most distros enable it
> > > > > again, but you do want to enable it by hand when building from the
> > > > > upstream linux git repo.
> > > >
> > > > Please correct me if I'm wrong. This seems to track local variables.
> > > > But I'm not sure it's enough for this purpose as we want to know
> > > > types of any memory references (not directly from a variable).
> > > >
> > > > Let's say we have a variable like below:
> > > >
> > > >   struct xxx a;
> > > >
> > > >   a.b->c->d++;
> > > >
> > > > And we have a sample where 'd' is updated, then how can we know
> > > > it's from the variable 'a'? Maybe we don't need to know it, but we
> > > > should know it accesses the 'd' field in the struct 'c'.
> > > >
> > > > Probably we can analyze the asm code and figure out it's from 'a'
> > > > and accessing 'd' at the moment. I'm curious if there's a way in
> > > > the DWARF to help this kind of work.
> > >
> > > DWARF does have that information, but it stores it in a way that is
> > > kind of opposite to how you want to access it. Given a variable and an
> > > address, you can easily get the location where that variable is
> > > stored. But if you want to map back from a given (memory) location and
> > > address to the variable, that is more work.
> >
> > The principal idea in this thread doesn't care about the address of the
> > variables. The idea was to get the data type and member information from
> > the instruction.
> >
> > So in the above example: a.b->c->d++; what we'll end up with is
> > something like:
> >
> >   inc 8(%rax)
> >
> > Where %rax contains c, and the offset of d in c is 8.
>
> For this simple case, it is possible.
>
> This offset information is stored in the DWARF as a data-structure type
> information. (perf-probe uses it to find how to get the given local var's
> fields)
>
> So if we do this off-line, I think it is possible if it is recorded with
> instruction-pointers. For each place, we can do
>
>  - decode instruction and get the access address.
>  - get var assignment of %rax at that IP.
>  - get type information of var and find the field from offset.
>
> However, the problem is that if the DWARF has only assignment of "a",
> we need to decode the function body. (and usually this happens)
>
> func() {
>   struct xxx a;
>   ...
>   a.b->c->d++;
> }
>
> In this case, only "a" is the local variable. So DWARF records assignment of
> "a", not "b" nor "c" (since those are not a name of variables, just a name
> of fields). GCC may generate something like
>
>   mov 16(%rsp),%rdx // rdx = a.b
>   mov 8(%rdx),%rax  // rax = b->c
>   inc 8(%rax)       // c->d++

Right, it'd be really nice if the compiler could add information about
the (hidden) assignments to rdx and rax here.

Thanks
Namhyung
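
For illustration only, here is a rough sketch of the last step in the list quoted above ("find the field from offset"), assuming the type of the pointer held in the register is already known. The struct layouts, the member tables and the find_member() helper are all hypothetical stand-ins; a real tool would read the member offsets from the DWARF type information rather than from offsetof().

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t, NULL */
#include <stdint.h>

/* Hypothetical types standing in for 'struct xxx a; a.b->c->d++;'. */
struct zzz { uint64_t e; uint64_t d; };      /* what 'c' points to */
struct yyy { uint64_t f; struct zzz *c; };   /* what 'b' points to */

/* One row per struct member: the kind of (name, offset, size) data a
 * tool could derive from debug info. Built by hand here (assumption). */
struct member {
    const char *name;
    size_t offset;
    size_t size;
};

static const struct member yyy_members[] = {
    { "f", offsetof(struct yyy, f), sizeof(uint64_t) },
    { "c", offsetof(struct yyy, c), sizeof(struct zzz *) },
};

static const struct member zzz_members[] = {
    { "e", offsetof(struct zzz, e), sizeof(uint64_t) },
    { "d", offsetof(struct zzz, d), sizeof(uint64_t) },
};

/* The sketch mirrors 'inc 8(%rax)': d is assumed to sit at offset 8. */
static_assert(offsetof(struct zzz, d) == 8, "sketch assumes d at offset 8");

/* Return the member whose bytes cover offset 'off', or NULL if none. */
static const struct member *
find_member(const struct member *tbl, size_t n, size_t off)
{
    for (size_t i = 0; i < n; i++)
        if (off >= tbl[i].offset && off - tbl[i].offset < tbl[i].size)
            return &tbl[i];
    return NULL;
}
```

With these tables, find_member(zzz_members, 2, 8) resolves the displacement in "inc 8(%rax)" to member "d", and find_member(yyy_members, 2, 8) resolves "mov 8(%rdx),%rax" to member "c". The missing piece is still knowing that %rdx and %rax hold a 'struct yyy *' and a 'struct zzz *' at those instructions, which is exactly the (hidden) assignment information discussed above.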