Message-ID: <1567715553.16718.29.camel@kernel.org>
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
From: Tom Zanussi
To: Daniel Colascione
Cc: Joel Fernandes, Steven Rostedt, Suren Baghdasaryan, Michal Hocko, LKML,
    Tim Murray, Carmen Jackson, Mayank Gupta, Minchan Kim, Andrew Morton,
    kernel-team, "Aneesh Kumar K.V",
    Dan Williams, Jerome Glisse, linux-mm, Matthew Wilcox, Ralph Campbell,
    Vlastimil Babka
Date: Thu, 05 Sep 2019 15:32:33 -0500
In-Reply-To:
References: <20190903200905.198642-1-joel@joelfernandes.org>
    <20190904084508.GL3838@dhcp22.suse.cz>
    <20190904153258.GH240514@google.com>
    <20190904153759.GC3838@dhcp22.suse.cz>
    <20190904162808.GO240514@google.com>
    <20190905144310.GA14491@dhcp22.suse.cz>
    <20190905133507.783c6c61@oasis.local.home>
    <20190905174705.GA106117@google.com>
    <20190905175108.GB106117@google.com>
    <1567713403.16718.25.camel@kernel.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.26.1-1
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Thu, 2019-09-05 at 13:24 -0700, Daniel Colascione wrote:
> On Thu, Sep 5, 2019 at 12:56 PM Tom Zanussi wrote:
> > On Thu, 2019-09-05 at 13:51 -0400, Joel Fernandes wrote:
> > > On Thu, Sep 05, 2019 at 01:47:05PM -0400, Joel Fernandes wrote:
> > > > On Thu, Sep 05, 2019 at 01:35:07PM -0400, Steven Rostedt wrote:
> > > > >
> > > > > [ Added Tom ]
> > > > >
> > > > > On Thu, 5 Sep 2019 09:03:01 -0700
> > > > > Suren Baghdasaryan wrote:
> > > > >
> > > > > > On Thu, Sep 5, 2019 at 7:43 AM Michal Hocko wrote:
> > > > > > >
> > > > > > > [Add Steven]
> > > > > > >
> > > > > > > On Wed 04-09-19 12:28:08, Joel Fernandes wrote:
> > > > > > > > On Wed, Sep 4, 2019 at 11:38 AM Michal Hocko wrote:
> > > > > > > > >
> > > > > > > > > On Wed 04-09-19 11:32:58, Joel Fernandes wrote:
> > > > > > > [...]
> > > > > > > > > > but also for reducing tracing noise. Flooding the
> > > > > > > > > > traces makes it less useful for long traces and
> > > > > > > > > > post-processing of traces. IOW, the overhead
> > > > > > > > > > reduction is a bonus.
> > > > > > > > >
> > > > > > > > > This is not really anything special for this
> > > > > > > > > tracepoint though. Basically any tracepoint in a hot
> > > > > > > > > path is in the same situation and I do not see a
> > > > > > > > > point why each of them should really invent its own
> > > > > > > > > way to throttle. Maybe there is some way to do that
> > > > > > > > > in the tracing subsystem directly.
> > > > > > > >
> > > > > > > > I am not sure if there is a way to do this easily. Add
> > > > > > > > to that, the fact that you still have to call into
> > > > > > > > trace events. Why call into it at all, if you can
> > > > > > > > filter in advance and have a sane filtering default?
> > > > > > > >
> > > > > > > > The bigger improvement with the threshold is the number
> > > > > > > > of trace records are almost halved by using a
> > > > > > > > threshold. The number of records went from 4.6K to
> > > > > > > > 2.6K.
> > > > > > >
> > > > > > > Steven, would it be feasible to add a generic tracepoint
> > > > > > > throttling?
> > > > > >
> > > > > > I might misunderstand this but is the issue here actually
> > > > > > throttling of the sheer number of trace records or tracing
> > > > > > large enough changes to RSS that user might care about?
> > > > > > Small changes happen all the time but we are likely not
> > > > > > interested in those. Surely we could postprocess the traces
> > > > > > to extract changes large enough to be interesting but why
> > > > > > capture uninteresting information in the first place? IOW
> > > > > > the throttling here should be based not on the time between
> > > > > > traces but on the amount of change of the traced signal.
> > > > > > Maybe a generic facility like that would be a good idea?
> > > > >
> > > > > You mean like add a trigger (or filter) that only traces if a
> > > > > field has changed since the last time the trace was hit? Hmm,
> > > > > I think we could possibly do that. Perhaps even now with
> > > > > histogram triggers?
> > > >
> > > >
> > > > Hey Steve,
> > > >
> > > > Something like an analog to digital conversion function where
> > > > you lose the granularity of the signal depending on how much
> > > > trace data:
> > > > https://www.globalspec.com/ImageRepository/LearnMore/20142/9ee38d1a85d37fa23f86a14d3a9776ff67b0ec0f3b.gif
> > >
> > > s/how much trace data/what the resolution is/
> > >
> > > > so like, if you had a counter incrementing with values after
> > > > the increments as: 1,3,4,8,12,14,30 and say 5 is the threshold
> > > > at which to emit a trace, then you would get 1,8,12,30.
> > > >
> > > > So I guess what is needed is a way to reduce the quantity of
> > > > trace data this way. For this usecase, the user mostly cares
> > > > about spikes in the counter changing that accurate values of
> > > > the different points.
> > >
> > > s/that accurate/than accurate/
> > >
> > > I think Tim, Suren, Dan and Michal are all saying the same thing
> > > as well.
> > >
> >
> > There's not a way to do this using existing triggers (histogram
> > triggers have an onchange() that fires on any change, but that
> > doesn't help here), and I wouldn't expect there to be - these sound
> > like very specific cases that would never have support in the
> > simple trigger 'language'.
>
> I don't see the filtering under discussion as some "very specific"
> esoteric need. You need this general kind of mechanism any time you
> want to monitor at low frequency a thing that changes at high
> frequency.
> The general pattern isn't specific to RSS or even memory in general.
> One might imagine, say, wanting to trace large changes in TCP window
> sizes. Any time something in the kernel has a "level" and that level
> changes at high frequency and we want to learn about big swings in
> that level, the mechanism we're talking about becomes useful. I don't
> think it should be out of bounds for the histogram mechanism, which
> is *almost* there right now. We already have the ability to
> accumulate values derived from ftrace events into tables keyed on
> various fields in these events and things like onmax().
>
> > On the other hand, I have been working on something that should
> > give you the ability to do something like this, by writing a module
> > that hooks into arbitrary trace events, accessing their fields,
> > building up any needed state across events, and then generating
> > synthetic events as needed:
>
> You might as well say we shouldn't have tracepoints at all and that
> people should just write modules that kprobe what they need. :-) You
> can reject *any* kernel interface by suggesting that people write a
> module to do that thing. (You could also probably do something with
> eBPF.) But there's a lot of value to having an easy-to-use
> general-purpose mechanism that doesn't make people break out the
> kernel headers and a C compiler.

Oh, I didn't mean to reject any interface - I guess I should go read
the whole thread then, and find the interface you're talking about.

Tom
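
For reference, the thresholding behaviour in Joel's quoted example
(counter values 1,3,4,8,12,14,30 with a threshold of 5 giving
1,8,12,30) can be sketched in userspace as below. This is only an
illustration, not code from the patch or from the tracing subsystem.
The function name threshold_filter is hypothetical, and the
bucket-crossing rule is an assumption chosen because it reproduces the
quoted output.

```python
def threshold_filter(values, threshold):
    """Yield a value only when it falls in a different threshold-sized
    bucket than the previously emitted value (hypothetical helper)."""
    last_bucket = None  # no value emitted yet
    for value in values:
        bucket = value // threshold  # quantize, like an ADC step
        if bucket != last_bucket:
            last_bucket = bucket
            yield value


# Counter values from the quoted example, threshold of 5.
samples = [1, 3, 4, 8, 12, 14, 30]
emitted = list(threshold_filter(samples, 5))
print(emitted)  # [1, 8, 12, 30]
```

A different rule, emitting whenever the value has moved by at least the
threshold since the last emitted value, would give 1,8,14,30 for the
same input, so it does not match the example in the thread.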