Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp4947778ybv; Wed, 26 Feb 2020 05:50:59 -0800 (PST) X-Google-Smtp-Source: APXvYqyTU48thvvrNDR8bH8WZb1N9bSS5DDjFGoajq9PAsEZHJ9IkAy2rec2u+QkzcPZKol8Zsps X-Received: by 2002:aca:4b46:: with SMTP id y67mr2965689oia.122.1582725059307; Wed, 26 Feb 2020 05:50:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582725059; cv=none; d=google.com; s=arc-20160816; b=pXfGNtQzDN+75TEw8YerKkZcFbjUIqkQO4pRws1menEnCJqj4eFAmeGVNn+bDfYive OR487sFNWXC2+86xWP4QxWvS7+PoRs5iOdYbFETdKpFs2Peakp4paxnZtKdTl+HIDylq 5yeludMdQg45bEN7IMJhPH8WfYTdb0+kZHuOPtR6IolckbibMzfrNKjgMkDWGDVKtEX4 8j2fjHeSh1M5d2roxBKGhjmfU8XidgTHwtjeLA7SUvO4mV4ZBWlSxDNyuYuDAmYA35iE DZT4Gu7YR26k++9xK08J9rMskZ6WL0jHlrIgMc0umbgZyiy61/u9I8BGbQMmxSdeQgCg WTzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:from:subject:mime-version :message-id:date:dkim-signature; bh=PYZaUdOqYqx4M6998+NOPh3hPZwThKUsmp2WTmT8hDA=; b=kIcM0Xz8ntwpCj8LpkM91UkgtJ+846lR09uEzE1jTv0w0Y5/JB3/WRbUfQNxEXRopH G+11onZfcNEKHXIa/suddgrzsLvc4TScfQKiEjxvGP2BGt0yYbZNSbKw/09VNoWfs6NC cP7ryGaSgsdDion1RoFseRH/h+RmfsrP8raVdKmDBHzfnFJ0eTX7/lZM7jWljHLcKAIv T8rV8AdvfBh5zwy6R4olB2TKY1qf6Lt091ZBxThkdW4S5CBRjbdK5qLJ+SkOr8+/BDq8 b0CDcHc+/PzLKz0zXxezeWLDVLKO4Us4DSCrTCG2AkBtSyNZW9PAagKajAoJ3OnEitzI HMfA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=GFvTYLsE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a5si1263463oti.22.2020.02.26.05.50.46; Wed, 26 Feb 2020 05:50:59 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=GFvTYLsE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727334AbgBZNuh (ORCPT + 99 others); Wed, 26 Feb 2020 08:50:37 -0500 Received: from mail-pj1-f73.google.com ([209.85.216.73]:40261 "EHLO mail-pj1-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726388AbgBZNug (ORCPT ); Wed, 26 Feb 2020 08:50:36 -0500 Received: by mail-pj1-f73.google.com with SMTP id ev1so2035294pjb.5 for ; Wed, 26 Feb 2020 05:50:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:to:cc; bh=PYZaUdOqYqx4M6998+NOPh3hPZwThKUsmp2WTmT8hDA=; b=GFvTYLsE6VDPPWH5nvS1z761QY8oQVdzcepn+Uk4V6pSJIY8x5BmlLNo/agcXyeXzV YMULG18hCo/qyYwo5OiiDQGEjXv8uqa0gQ/0Ptu7HmLC2hIxy/AWGM8mfU4j/kLQjKSW WYZ3pkWXAqSd4VorOaBWeettOfQEh++HKQRfPiBzjd1UJRc3hM4Rx5HOCq8+zbd5j43O uzLfn+1XDdwq00dXVDyEYBxxJ/DB+NyehSCXLHvjW2pM3N6OpAcIO44/0EmtyNmWRu+f I6Dc1i0xhn/ObfBCIS7x1HNlBhojCeLGqAmF7aR6irvWcK/+u27Y1EzKWmROPuT724AQ DFTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=PYZaUdOqYqx4M6998+NOPh3hPZwThKUsmp2WTmT8hDA=; b=Mzxo4iypMmSgN0sw44IXNeZxDwYWCNgyuJHLmg8fjOfcw2FRq89af0ROLmsWJDQrzm RQqDyRKe1xP6PXomwlhtaQEfunFpputLF7/XC/ZlT4Q1nBBSqzadkIVLIOrPHVtDgG7Y iCFzFAzjMhI3UGJ9Hu+xQ1nNKLrJaQ3CFDmGUY38S+xyfIyKoimy+xy9q5xXfjBTMIk3 rgacqWBUcwqPG1AD9S6t946wO4y3eRhLMqusRKTSFUZD5BPb6T++Qf0+CKsDcpMpYDWC 4UH8DoYuKrMfWR3ESAeodZcfZYPSi4mUHtCCLc+Zdc2axy+OsRwb9S75QfMgAZHS6kCm ZPew== X-Gm-Message-State: APjAAAXJFnKjIDGZavkvkMVO8fH3hlL16ZE17rZHrJNDVvJaEwtX9QGg 0mrcWeBYBmTP79P5rAXqTAXFI82OiFaIBzW7v4enUcZl/dvqxnDVX0yKuKM5EhMqJ3CROQuQ0rV 6NOQ1YECqOhoKYBcW14SQsg/mQ3gZlkZ37t/5QkLtc8O9U32hafHe/xhq+24SWnjyJkIxKg== X-Received: by 2002:a65:6718:: with SMTP id u24mr4088545pgf.289.1582725035541; Wed, 26 Feb 2020 05:50:35 -0800 (PST) Date: Wed, 26 Feb 2020 05:50:25 -0800 Message-Id: <20200226135027.34538-1-lrizzo@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.25.0.265.gbab2e86ba0-goog Subject: [PATCH v3 0/2] kstats: kernel metric collector From: Luigi Rizzo To: linux-kernel@vger.kernel.org, mhiramat@kernel.org, akpm@linux-foundation.org, gregkh@linuxfoundation.org, naveen.n.rao@linux.ibm.com, ardb@kernel.org, rizzo@iet.unipi.it, pabeni@redhat.com, giuseppe.lettieri@unipi.it, toke@redhat.com, hawk@kernel.org, mingo@redhat.com, acme@kernel.org, rostedt@goodmis.org, peterz@infradead.org Cc: Luigi Rizzo Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patchset introduces a small library to collect per-cpu samples and accumulate distributions to be exported through debugfs. This v3 series addresses some initial comments (mostly style fixes in the code) and revises commit logs. Note that use case and performance are orthogonal to those of the perf/tracing architecture, which is why this is proposed as a standalone component. I have used this to trace the execution times of small sections of code (network stack functions, drivers and xdp processing, ...), whole functions, or notification latencies (e.g. IPI dispatch, etc.). The fine granularity of the aggregation helps identifying outliers, multimodal distributions, caching effects and general performance-related issues. Samples are 64-bit values that are aggregated into per-cpu logarithmic buckets with configurable density (typical sample collectors use one buckets for each power of 2, powers of 2, but that is very coarse and corresponds to 1 significant bit per value; in quickstats one can specify the number of significant bits, up to 5, which is useful for finer grain measurements). There are two ways to trace a block of code: manual annotations has the best performance at runtime and is done as follows: // create metric and entry in debugfs struct kstats *key = kstats_new("foo", frac_bits); ... // instrument the code u64 t0 = ktime_get_ns(); // about 20ns
t0 = ktime_get_ns() - t0; // about 20ns kstats_record(key, t0); // 5ns hot cache, 300ns cold This method has an accuracy of about 20-30ns (inherited from the clock) and an overhead with hot/cold cache as show above. Values can be read from debugfs in an easy to parse format # cat /sys/kernel/debug/kstats/foo ... slot 55 CPU 0 count 589 avg 480 p 0.027613 slot 55 CPU 1 count 18 avg 480 p 0.002572 slot 55 CPU 2 count 25 avg 480 p 0.003325 ... slot 55 CPUS 28 count 814 avg 480 p 0.002474 ... slot 97 CPU 13 count 1150 avg 20130 p 0.447442 slot 97 CPUS 28 count 152585 avg 19809 p 0.651747 ... Writing STOP, START, RESET to the file executes the corresponding action echo RESET > /sys/kernel/debug/kstats/foo The second instrumentation mechanism uses kretrprobes or tracepoints and lets tracing be enabled or removed at runtime from the command line for any globally visible function echo trace some_function bits 3 > /sys/kernel/debug/kstats/_control Data are exported or controlled in the same way as above. Accuracy is worse due to the presence of kretprobe trampolines: 90ns with hot cache, 500ns with cold cache. The overhead on the traced function is 250ns hot, 1500ns cold. Hope you find this useful. Luigi Rizzo (2): kstats: kernel metric collector kstats: kretprobe and tracepoint support include/linux/kstats.h | 62 ++++ lib/Kconfig.debug | 7 + lib/Makefile | 1 + lib/kstats.c | 654 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 724 insertions(+) create mode 100644 include/linux/kstats.h create mode 100644 lib/kstats.c -- 2.25.0.265.gbab2e86ba0-goog