Date: Fri, 26 Apr 2019 11:26:38 +0100
From: Quentin Perret
To: Qais Yousef
Cc: rostedt@goodmis.org, peterz@infradead.org, dietmar.eggemann@arm.com, bristot@redhat.com, juri.lelli@redhat.com, williams@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: Tracehooks in scheduler
Message-ID: <20190426102635.almrj7bbjqlbt77n@queper01-lin>
References: <20190407175235.5c2livciovwgq7mm@e107158-lin.cambridge.arm.com> <20190409082450.mkcobfbmohhxqk6k@e107158-lin.cambridge.arm.com> <20190415144945.tumeop4djyj45v6k@e107158-lin.cambridge.arm.com>
In-Reply-To: <20190415144945.tumeop4djyj45v6k@e107158-lin.cambridge.arm.com>

Hi Qais,

On Monday 15 Apr 2019 at 15:49:45 (+0100), Qais Yousef wrote:
> Hi Steve, Peter
>
> > On 04/07/19 18:52, Qais Yousef wrote:
> > > Hi Steve, Peter
> > >
> > > I know the topic has sprung up in the past but I couldn't find anything that
> > > points into any conclusion.
> > >
> > > As far as I understand new TRACE_EVENTS() in the scheduler (and probably other
> > > subsystems) isn't desirable as it introduces a sort of ABI that can be painful
> > > to maintain.
> > >
> > > But for us to be able to test various aspects of EAS, we rely on some events
> > > that track load_avg, util_avg and some other metrics in the scheduler.
> > > Example of such patches that are in android and we maintain out of tree can be
> > > found here:
> > >
> > > https://android.googlesource.com/kernel/common/+/42903694913697da88a4ac627a92bbfdf44f0a2e
> > > https://android.googlesource.com/kernel/common/+/6dfaed989ea4ca223f0913dfc11cdafd9664fc1c
> > >
> > > Dietmar and Quentin pointed me to a discussion you guys had with Daniel Bristot
> > > in the last LPC when he had a similar need. So it is something that could
> > > benefit other users as well.
> > >
> > > What is the best way forward to be able to add tracehooks into the scheduler
> > > and any other subsystem for that matter?
> > >
> > > We tried using DECLARE_TRACE() to create a tracepoint which doesn't export
> > > anything in /sys/kernel/debug/tracing/events and hoped that we can use eBPF or
> > > a kernel module to attach to this tracepoint and access the args to inject our
> > > own trace_printks() but this didn't work. The glue logic necessary to attach
> > > to this tracepoint in a similar manner to how RAW_TRACEPOINT() in eBPF works
> > > isn't there AFAICT.
> > >
> > > I can post the full example if the above doesn't make sense. I am still
> > > familiarizing myself with the different aspects of this code as well.
> > > There might be support for what we want but I failed to figure out the magic
> > > combination to get it to work.
> > >
> > > If I got this glue logic done, would this be an acceptable solution? If not, do
> > > you have any suggestions on how to progress?
>
> I have written some patches in hope it'll clarify further what we are trying to
> achieve here and what would be the best possible approach about it.
>
> I have taken two approaches to solve the problem.
>
>
> 1.
>
> https://github.com/qais-yousef/linux/commit/e7d0aa7ff1328195f314b0730c4cc744dec4261e
>
> In this approach everything we need is already available and we just
> need to create new tracepoints as described in
> Documentation/trace/tracepoints.rst and export it with
> EXPORT_TRACEPOINT_SYMBOL_GPL().
>
> A user then can have an out of tree module to probe this tp and
> manipulate it as they like.
>
> Example of such a module is here, the pelt_se tp is to demo the
> approach:
>
> https://github.com/qais-yousef/tracepoints-helpers/blob/master/module-pelt-se/probe_tp_pelt_se.c
>
> Googling around I can see that the use of
> EXPORT_TRACEPOINT_SYMBOL_GPL() is not desired unless the module is
> in-tree which I doubt will be the case here.
>
> https://lore.kernel.org/lkml/20150422130052.4996e231@gandalf.local.home/
>
> 2.
> https://github.com/qais-yousef/linux/commit/fb9fea29edb8af327e6b2bf3bc41469a8e66df8b
> https://github.com/qais-yousef/linux/commit/edd2498c5bbfca1a26acd151a4e3323e511f3455
>
> In this approach I try to allow attaching to a TP using eBPF. Sadly the
> current infrastructure is lacking so I hacked the above up to create a
> new DECLARE_TRACE_HOOK() macro which will allow using eBPF but without
> exporting anything in debugfs that can constitute an ABI.
>
> The following eBPF program can be used then to attach and access some
> info at the TP:
>
> https://github.com/qais-yousef/tracepoints-helpers/blob/master/bpf/tp_trace_printk_pelt_se
>
>
> Do any of the above approaches make sense?

For the EAS-testing use-case you mentioned earlier, it's really for
debugging so we don't actually need the eBPF safety. None of this is
supposed to run in production I would say. So I tend to prefer option 1
if that works for everybody interested in this thing.

And then what would be the story? We would carry a module out-of-tree in
our test suite to extract scheduler data and then post-process it in
userspace or something? Since that would be an out-of-tree module,
upstream doesn't commit to anything towards userspace, so perhaps that
could work.

Another thing, should these sched tracepoints be guarded by sched_debug?

Thanks,
Quentin