Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp4944441ybv; Wed, 26 Feb 2020 05:47:06 -0800 (PST) X-Google-Smtp-Source: APXvYqwqwYeeqTTzBeizVSGysT4zLNPynU9ZIk6QcPaWfr8WKQam2G2OCVspg6b/QczDTJvgj54Q X-Received: by 2002:a54:4705:: with SMTP id k5mr3022064oik.154.1582724826140; Wed, 26 Feb 2020 05:47:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582724826; cv=none; d=google.com; s=arc-20160816; b=GzFLC/6Z4dgSFE+jwCDZZjMWyc4PJxuqycptnD1eB+VbSEcqYzJ1VDiG/bqLjNfpVi sBCVd+5nrTd4234przbUuoDjGGc+cLoKLanNM2De0TbKxIai9fMn+RaZ38SMFfPrf6Xu Qq/ArX2NAXfP5GttEhIjPHfXMXcQwE2fNUTi2Vw1jd5D/qDkNVvbQEqI3Og2iBkL7rfy doT+0x16Dg5d7lfEENTVQC0UVNFtLPpxoEI0XRqj32A17Mh4S1b8jti3IJS9MIfteg5d KXkcJFpf8yGBepNOW6KyRu/u2hqZhHPSY33XhVoVxD4w/zCeiyeu0Sn16FgDic59xwuk eX0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:from:subject:mime-version :message-id:date:dkim-signature; bh=qeSh50iw2mfGwCVy8a6hYA44UG8luPXQGKb+YPVPRpk=; b=NfjliO+583Fqi5CGdkZXehBbjvFEtHAnwhhJ1ZhswQloz1slsUX5huLo8l3nGx6We5 yZYMi46/Ei00q21m1uCJTh/5MDIOEux1muljjoQrRoXDKSJ9+Xt6e7xNHbho2J8uzh6k /s4QCS/+aOtZyS74jnjmTnmoj1v43O1W249A6JlN3AWx3HjxS57gdPWzbEuvsHuefTaa /cRXUAqYDK3DI4vblm0F+Be/ultTueYa7uvuWiMPLM8GCee4wLXv4TBxzBeRqcBGdWxc e1k/XgwLKCNKinwtkfNwvCJDOlR3HlGgTRxc3/BAXiTwjsnR071YqzL+Wlqy8aY03CtB ik+Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=ZAiEwNKr; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v23si1231912oto.81.2020.02.26.05.46.53; Wed, 26 Feb 2020 05:47:06 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=ZAiEwNKr; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726988AbgBZNqq (ORCPT + 99 others); Wed, 26 Feb 2020 08:46:46 -0500 Received: from mail-pj1-f73.google.com ([209.85.216.73]:35842 "EHLO mail-pj1-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726277AbgBZNqq (ORCPT ); Wed, 26 Feb 2020 08:46:46 -0500 Received: by mail-pj1-f73.google.com with SMTP id m61so2043887pjb.1 for ; Wed, 26 Feb 2020 05:46:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:to:cc; bh=qeSh50iw2mfGwCVy8a6hYA44UG8luPXQGKb+YPVPRpk=; b=ZAiEwNKr7fkZcNBKQPqqQhDt4zbN5d2YaTnEL33HQlp2rROu6W6T6qRVf3H3DexcwC 7hv1Yt1NAwo/4R9HMvtS67Y+aYU76cMDvz4uAbSP26Wcq9lQoET3FxmyQ2ALIGBKaIU6 6JB4mGEeT4ikFHRpDaJddOub471mvuiVqFRtr0wuFZsmOs+QDdKH6hmUe9Ioc9odsqcO Fo3mROTi9PlKNhtlTKNnJv1g2RbHWtwOrkD5KH42ppYodOuBb4GkGphYkvnJY1BuotNL 7Di9MUE3VhC1sLbq7Sgjmh5ISPZOgYToa9AJWe0MFszX+E6l39BhUGvcTEuPJx0f0x/I JTsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=qeSh50iw2mfGwCVy8a6hYA44UG8luPXQGKb+YPVPRpk=; b=dMeYOjBHURWovZQCi7BDHXGH8ZbQy7aXsoY30HIA+CdRVaPeQI0nhWP/zZmJSqU7zA Xz1SfJgUd4an0AiozPpTmQelb3rtuh8FsmPRpgRMbdmRCX12aBzEvUVfAiw16LVTWR3K G/6tVCwbDnx1hU3/iGBxnoECui2UQYFEoWcyb4vakhpuaSt+gPd3zRItlON6S+pGuAdM szloNjIZUSCJzNWC8hn8mSHGCKW4wmd8LnmQhiui4H846mBfQlm7J7vLr0ZdDY51uhCF HLS6g0HnoJfcBE0zsIDKgfGmrnTxFsSf1lPfFSGrYDQbSP4nHpuv21jeS5naEHky8rVu 6OrQ== X-Gm-Message-State: APjAAAVlHqJ36WkU+lkkqoigP3r3NgMXmR6Ytb/YosxGCWNDKsJn/DVE rC8TGjZb8hXEP67bF0Uc/6pyBEZhMlu6CODeX2bInBd1Z74mRx0jUE+qzEDH0Q9g3sw+2hq7c+2 c0iRfDvnCZcuBb+IIn0uy172oAe+xreyR/i3Anlc24vLhUXue/GPUp12C5h4nw7onEDlz1g== X-Received: by 2002:a63:d241:: with SMTP id t1mr3822851pgi.283.1582724804738; Wed, 26 Feb 2020 05:46:44 -0800 (PST) Date: Wed, 26 Feb 2020 05:46:35 -0800 Message-Id: <20200226134637.31670-1-lrizzo@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.25.0.265.gbab2e86ba0-goog Subject: [PATCH v2 0/2] kstats: kernel metric collector From: Luigi Rizzo To: linux-kernel@vger.kernel.org, mhiramat@kernel.org, akpm@linux-foundation.org, gregkh@linuxfoundation.org, naveen.n.rao@linux.ibm.com, ardb@kernel.org, rizzo@iet.unipi.it, pabeni@redhat.com, giuseppe.lettieri@unipi.it, toke@redhat.com, hawk@kernel.org, mingo@redhat.com, acme@kernel.org, rostedt@goodmis.org, peterz@infradead.org Cc: Luigi Rizzo Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patchset introduces a small library to collect per-cpu samples and accumulate distributions to be exported through debugfs. Note that use case and performance are orthogonal to those of the perf/tracing architecture, which is why this is proposed as a standalone component. I have used this to trace the execution times of small sections of code (network stack functions, drivers and xdp processing, ...), whole functions, or notification latencies (e.g. IPI dispatch, etc.). The fine granularity of the aggregation helps identifying outliers, multimodal distributions, caching effects and general performance-related issues. Samples are 64-bit values that are aggregated into per-cpu logarithmic buckets with configurable density (typical sample collectors use one buckets for each power of 2, powers of 2, but that is very coarse and corresponds to 1 significant bit per value; in quickstats one can specify the number of significant bits, up to 5, which is useful for finer grain measurements). There are two ways to trace a block of code: manual annotations has the best performance at runtime and is done as follows: // create metric and entry in debugfs struct kstats *key = kstats_new("foo", frac_bits); ... // instrument the code u64 t0 = ktime_get_ns(); // about 20ns
t0 = ktime_get_ns() - t0; // about 20ns kstats_record(key, t0); // 5ns hot cache, 300ns cold This method has an accuracy of about 20-30ns (inherited from the clock) and an overhead with hot/cold cache as show above. Values can be read from debugfs in an easy to parse format # cat /sys/kernel/debug/kstats/foo ... slot 55 CPU 0 count 589 avg 480 p 0.027613 slot 55 CPU 1 count 18 avg 480 p 0.002572 slot 55 CPU 2 count 25 avg 480 p 0.003325 ... slot 55 CPUS 28 count 814 avg 480 p 0.002474 ... slot 97 CPU 13 count 1150 avg 20130 p 0.447442 slot 97 CPUS 28 count 152585 avg 19809 p 0.651747 ... Writing STOP, START, RESET to the file executes the corresponding action echo RESET > /sys/kernel/debug/kstats/foo The second instrumentation mechanism uses kretrprobes or tracepoints and lets tracing be enabled or removed at runtime from the command line for any globally visible function echo trace some_function bits 3 > /sys/kernel/debug/kstats/_control Data are exported or controlled in the same way as above. Accuracy is worse due to the presence of kretprobe trampolines: 90ns with hot cache, 500ns with cold cache. The overhead on the traced function is 250ns hot, 1500ns cold. Hope you find this useful. Luigi Rizzo (2): kstats: kernel metric collector kstats: kretprobe and tracepoint support include/linux/kstats.h | 62 ++++ lib/Kconfig.debug | 7 + lib/Makefile | 1 + lib/kstats.c | 654 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 724 insertions(+) create mode 100644 include/linux/kstats.h create mode 100644 lib/kstats.c -- 2.25.0.265.gbab2e86ba0-goog