From: Daniel Colascione
Date: Thu, 5 Sep 2019 13:24:49 -0700
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
To: Tom Zanussi
Cc: Joel Fernandes, Steven Rostedt, Suren Baghdasaryan, Michal Hocko,
    LKML, Tim Murray, Carmen Jackson, Mayank Gupta, Minchan Kim,
    Andrew Morton, kernel-team, "Aneesh Kumar K.V", Dan Williams,
    Jerome Glisse, linux-mm, Matthew Wilcox, Ralph Campbell,
    Vlastimil Babka
In-Reply-To: <1567713403.16718.25.camel@kernel.org>
References: <20190903200905.198642-1-joel@joelfernandes.org>
    <20190904084508.GL3838@dhcp22.suse.cz>
    <20190904153258.GH240514@google.com>
    <20190904153759.GC3838@dhcp22.suse.cz>
    <20190904162808.GO240514@google.com>
    <20190905144310.GA14491@dhcp22.suse.cz>
    <20190905133507.783c6c61@oasis.local.home>
    <20190905174705.GA106117@google.com>
    <20190905175108.GB106117@google.com>
    <1567713403.16718.25.camel@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 5, 2019 at 12:56 PM Tom Zanussi wrote:
> On Thu, 2019-09-05 at 13:51 -0400, Joel Fernandes wrote:
> > On Thu, Sep 05, 2019 at 01:47:05PM -0400, Joel Fernandes wrote:
> > > On Thu, Sep 05, 2019 at 01:35:07PM -0400, Steven Rostedt wrote:
> > > >
> > > > [ Added Tom ]
> > > >
> > > > On Thu, 5 Sep 2019 09:03:01 -0700
> > > > Suren Baghdasaryan wrote:
> > > >
> > > > > On Thu, Sep 5, 2019 at 7:43 AM Michal Hocko wrote:
> > > > > >
> > > > > > [Add Steven]
> > > > > >
> > > > > > On Wed 04-09-19 12:28:08, Joel Fernandes wrote:
> > > > > > > On Wed, Sep 4, 2019 at 11:38 AM Michal Hocko wrote:
> > > > > > > >
> > > > > > > > On Wed 04-09-19 11:32:58, Joel Fernandes wrote:
> > > > > >
> > > > > > [...]
> > > > > > > > > but also for reducing tracing noise. Flooding the
> > > > > > > > > traces makes it less useful for long traces and
> > > > > > > > > post-processing of traces. IOW, the overhead
> > > > > > > > > reduction is a bonus.
> > > > > > > >
> > > > > > > > This is not really anything special for this
> > > > > > > > tracepoint though. Basically any tracepoint in a hot
> > > > > > > > path is in the same situation and I do not see a point
> > > > > > > > why each of them should really invent its own way to
> > > > > > > > throttle. Maybe there is some way to do that in the
> > > > > > > > tracing subsystem directly.
> > > > > > >
> > > > > > > I am not sure if there is a way to do this easily. Add to
> > > > > > > that, the fact that you still have to call into trace
> > > > > > > events. Why call into it at all, if you can filter in
> > > > > > > advance and have a sane filtering default?
> > > > > > >
> > > > > > > The bigger improvement with the threshold is the number
> > > > > > > of trace records are almost halved by using a threshold.
> > > > > > > The number of records went from 4.6K to 2.6K.
> > > > > >
> > > > > > Steven, would it be feasible to add a generic tracepoint
> > > > > > throttling?
> > > > >
> > > > > I might misunderstand this but is the issue here actually
> > > > > throttling of the sheer number of trace records or tracing
> > > > > large enough changes to RSS that user might care about? Small
> > > > > changes happen all the time but we are likely not interested
> > > > > in those. Surely we could postprocess the traces to extract
> > > > > changes large enough to be interesting but why capture
> > > > > uninteresting information in the first place? IOW the
> > > > > throttling here should be based not on the time between
> > > > > traces but on the amount of change of the traced signal.
> > > > > Maybe a generic facility like that would be a good idea?
> > > >
> > > > You mean like add a trigger (or filter) that only traces if a
> > > > field has changed since the last time the trace was hit? Hmm, I
> > > > think we could possibly do that. Perhaps even now with
> > > > histogram triggers?
> > >
> > > Hey Steve,
> > >
> > > Something like an analog to digital conversion function where you
> > > lose the granularity of the signal depending on how much trace
> > > data:
> > > https://www.globalspec.com/ImageRepository/LearnMore/20142/9ee38d1a85d37fa23f86a14d3a9776ff67b0ec0f3b.gif
> >
> > s/how much trace data/what the resolution is/
> >
> > > so like, if you had a counter incrementing with values after the
> > > increments as: 1,3,4,8,12,14,30 and say 5 is the threshold at
> > > which to emit a trace, then you would get 1,8,12,30.
> > >
> > > So I guess what is needed is a way to reduce the quantity of trace
> > > data this way. For this usecase, the user mostly cares about
> > > spikes in the counter changing that accurate values of the
> > > different points.
> >
> > s/that accurate/than accurate/
> >
> > I think Tim, Suren, Dan and Michal are all saying the same thing as
> > well.
> >
>
> There's not a way to do this using existing triggers (histogram
> triggers have an onchange() that fires on any change, but that doesn't
> help here), and I wouldn't expect there to be - these sound like very
> specific cases that would never have support in the simple trigger
> 'language'.

I don't see the filtering under discussion as some "very specific"
esoteric need. You need this general kind of mechanism any time you
want to monitor at low frequency a thing that changes at high
frequency. The general pattern isn't specific to RSS or even memory in
general. One might imagine, say, wanting to trace large changes in TCP
window sizes. Any time something in the kernel has a "level" and that
level changes at high frequency and we want to learn about big swings
in that level, the mechanism we're talking about becomes useful. I
don't think it should be out of bounds for the histogram mechanism,
which is *almost* there right now. We already have the ability to
accumulate values derived from ftrace events into tables keyed on
various fields in these events and things like onmax().

> On the other hand, I have been working on something that should give
> you the ability to do something like this, by writing a module that
> hooks into arbitrary trace events, accessing their fields, building up
> any needed state across events, and then generating synthetic events as
> needed:

You might as well say we shouldn't have tracepoints at all and that
people should just write modules that kprobe what they need. :-) You
can reject *any* kernel interface by suggesting that people write a
module to do that thing. (You could also probably do something with
eBPF.) But there's a lot of value to having an easy-to-use
general-purpose mechanism that doesn't make people break out the
kernel headers and a C compiler.
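
To make the "level" filtering discussed in this thread concrete, here is a
minimal userspace sketch in Python. It is not the code from the patch and it
is not an existing ftrace trigger. The function names (quantized_changes,
keyed_quantized_changes) are hypothetical and exist only for illustration.
The sketch assumes the analog-to-digital reading of the quoted example: a
sample is emitted only when it falls into a different threshold-sized bucket
than the previous sample. The second function assumes one such level per key,
loosely mirroring the idea of a table keyed on an event field.

```python
from typing import Dict, Hashable, Iterable, Iterator, Tuple


def quantized_changes(values: Iterable[int], threshold: int) -> Iterator[int]:
    """Yield a sample only when its threshold-sized bucket differs from
    the bucket of the previous sample (hypothetical helper)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    last_bucket = None  # no sample seen yet, so the first one is emitted
    for value in values:
        bucket = value // threshold  # drop resolution below the threshold
        if bucket != last_bucket:
            last_bucket = bucket
            yield value


def keyed_quantized_changes(
    events: Iterable[Tuple[Hashable, int]], threshold: int
) -> Iterator[Tuple[Hashable, int]]:
    """Same filter, but with one tracked level per key (assumed here to be
    something like a pid). Hypothetical helper, not a kernel interface."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    last_bucket: Dict[Hashable, int] = {}
    for key, value in events:
        bucket = value // threshold
        if last_bucket.get(key) != bucket:
            last_bucket[key] = bucket
            yield key, value


if __name__ == "__main__":
    # The counter sequence and threshold quoted in the thread.
    print(list(quantized_changes([1, 3, 4, 8, 12, 14, 30], 5)))
```

With the quoted sequence and a threshold of 5, the first function yields
1, 8, 12, 30, which matches the example in the thread. A filter that instead
compared each sample against the last emitted value would give a different
result, so bucket quantization is an assumption of this sketch.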