Date: Wed, 4 Sep 2019 11:43:33 +0100
From: Qais Yousef
To: Joel Fernandes
Cc: Valentin Schneider, Radim Krčmář, linux-kernel@vger.kernel.org,
    Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Borislav Petkov,
    Dave Hansen, Steven Rostedt, "H. Peter Anvin", Andy Lutomirski,
    Jirka Hladký, Jiří Vozár, x86@kernel.org
Subject: Re: [PATCH 2/2] sched/debug: add sched_update_nr_running tracepoint
Message-ID: <20190904104332.ogsjtbtuadhsglxh@e107158-lin.cambridge.arm.com>
References: <20190903154340.860299-1-rkrcmar@redhat.com>
    <20190903154340.860299-3-rkrcmar@redhat.com>
    <20190904042310.GA159235@google.com>
In-Reply-To: <20190904042310.GA159235@google.com>

On 09/04/19 00:23, Joel Fernandes wrote:
> On Tue, Sep 03, 2019 at 05:05:47PM +0100, Valentin Schneider wrote:
> > On 03/09/2019 16:43, Radim Krčmář wrote:
> > > The paper "The Linux Scheduler: a Decade of Wasted Cores" used several
> > > custom data gathering points to better understand what was going on in
> > > the scheduler.
> > > Red Hat adapted one of them for the tracepoint framework and created a
> > > tool to plot a heatmap of nr_running, where the sched_update_nr_running
> > > tracepoint is being used for fine grained monitoring of scheduling
> > > imbalance.
> > > The tool is available from https://github.com/jirvoz/plot-nr-running.
> > >
> > > The best place for the tracepoints is inside the add/sub_nr_running,
> > > which requires some shenanigans to make it work as they are defined
> > > inside sched.h.
> > > The tracepoints have to be included from sched.h, which means that
> > > CREATE_TRACE_POINTS has to be defined for the whole header and this
> > > might cause problems if tree-wide headers expose tracepoints in sched.h
> > > dependencies, but I'd argue it's the other side's misuse of tracepoints.
> > >
> > > Moving the import sched.h line lower would require fixes in s390 and ppc
> > > headers, because they don't include dependencies properly and expect
> > > sched.h to do it, so it is simpler to keep sched.h there and
> > > preventively undefine CREATE_TRACE_POINTS right after.
> > >
> > > Exports of the pelt tracepoints remain because they don't need to be
> > > protected by CREATE_TRACE_POINTS and moving them closer would be
> > > unsightly.
> > >
> > >
> > Pure trace events are frowned upon in scheduler world, try going with
> > trace points. Qais did something very similar recently:
> >
> > https://lore.kernel.org/lkml/20190604111459.2862-1-qais.yousef@arm.com/
> >
> > You'll have to implement the associated trace events in a module, which
> > lets you define your own event format and doesn't form an ABI :).
>
> Is that really true? eBPF programs loaded from userspace can access
> tracepoints through BPF_RAW_TRACEPOINT_OPEN, which is UAPI:
> https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L103
>
> I don't have a strong opinion about considering tracepoints as ABI / API or
> not, but just want to get the facts straight :)

It is actually true. But you need to make the distinction between a
tracepoint and a trace event first. What Valentin is talking about here is
the *bare* tracepoint, without any event associated with it, like the ones
I added to the scheduler recently. These are not accessible via eBPF,
unless something has changed since I last tried.

The current infrastructure needs to be expanded to allow eBPF to attach to
these bare tracepoints. Something similar to what I have in [1] is needed,
but instead of creating a new macro it needs to expand the current macro.
[2] might give full context of when I was trying to come up with
alternatives to using trace events.
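
To make the tracepoint vs. trace event distinction concrete, here is a toy
userspace model in C. It is only a sketch and not kernel code: every name in
it (bare_tracepoint, register_probe, trace_nr_running, event_probe,
run_demo) is hypothetical and none of it is the kernel's tracepoint API. The
hook itself only hands raw arguments to whatever probes are attached; the
output format lives entirely in a separately supplied probe, which is the
role a module-defined trace event would play.

```c
/* Toy userspace model of a bare tracepoint; all names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_PROBES 4

/* A probe gets its private data plus the raw arguments of the hook. */
typedef void (*nr_running_probe_t)(void *data, int cpu, unsigned int nr_running);

/* The bare tracepoint: a list of probes and nothing else, no format. */
struct bare_tracepoint {
    nr_running_probe_t probes[MAX_PROBES];
    void *data[MAX_PROBES];
    size_t count;
};

/* Attach a probe. Returns 0 on success, -1 if the table is full. */
static int register_probe(struct bare_tracepoint *tp,
                          nr_running_probe_t fn, void *data)
{
    if (tp->count >= MAX_PROBES)
        return -1;
    tp->probes[tp->count] = fn;
    tp->data[tp->count] = data;
    tp->count++;
    return 0;
}

/* The hook site: a no-op until somebody attaches a probe. */
static void trace_nr_running(struct bare_tracepoint *tp,
                             int cpu, unsigned int nr_running)
{
    for (size_t i = 0; i < tp->count; i++)
        tp->probes[i](tp->data[i], cpu, nr_running);
}

/* The "event" layer, kept apart from the hook: it alone picks the format. */
struct event_log {
    char line[64];
    unsigned int hits;
};

static void event_probe(void *data, int cpu, unsigned int nr_running)
{
    struct event_log *log = data;

    snprintf(log->line, sizeof(log->line),
             "cpu=%d nr_running=%u", cpu, nr_running);
    log->hits++;
}

/* Fire once with no probe attached, then once with event_probe attached. */
unsigned int run_demo(struct event_log *log)
{
    struct bare_tracepoint tp = {0};

    trace_nr_running(&tp, 0, 1);            /* nothing attached, no record */
    register_probe(&tp, event_probe, log);
    trace_nr_running(&tp, 2, 3);            /* formatted by event_probe */
    return log->hits;
}
```

Calling run_demo() with a zeroed struct event_log records only the second
firing. Changing the format string touches event_probe alone and leaves the
hook site as it was, which is the property being argued for above.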

[1] https://github.com/qais-yousef/linux/commit/fb9fea29edb8af327e6b2bf3bc41469a8e66df8b
[2] https://lore.kernel.org/lkml/20190415144945.tumeop4djyj45v6k@e107158-lin.cambridge.arm.com/

HTH

--
Qais Yousef