Date: Wed, 4 Sep 2019 10:59:41 -0400
From: Joel Fernandes
To: Daniel Colascione
Cc: Suren Baghdasaryan, LKML, Tim Murray, Carmen Jackson, Mayank Gupta,
 Steven Rostedt, Minchan Kim, Andrew Morton, kernel-team,
 "Aneesh Kumar K.V", Dan Williams, Jerome Glisse, linux-mm,
 Matthew Wilcox, Michal Hocko, Ralph Campbell, Vlastimil Babka
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Message-ID: <20190904145941.GF240514@google.com>
References: <20190903200905.198642-1-joel@joelfernandes.org> <20190904051549.GB256568@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 03, 2019 at 10:42:53PM -0700, Daniel Colascione wrote:
> On Tue, Sep 3, 2019 at 10:15 PM Joel Fernandes wrote:
> >
> > On Tue, Sep 03, 2019 at 09:51:20PM -0700, Daniel Colascione wrote:
> > > On Tue, Sep 3, 2019 at 9:45 PM Suren Baghdasaryan wrote:
> > > >
> > > > On Tue, Sep 3, 2019 at 1:09 PM Joel Fernandes (Google)
> > > > wrote:
> > > > >
> > > > > Useful to track how RSS is changing per TGID to detect spikes in RSS and
> > > > > memory hogs. Several Android teams have been using this patch in various
> > > > > kernel trees for half a year now. Many reported to me it is really
> > > > > useful so I'm posting it upstream.
> > >
> > > It's also worth being able to turn off the per-task memory counter
> > > caching, otherwise you'll have two levels of batching before the
> > > counter gets updated, IIUC.
> >
> > I prefer to keep split RSS accounting turned on if it is available.
>
> Why? AFAIK, nobody's produced numbers showing that split accounting
> has a real benefit.

I am not too sure. Have you checked the original patches that added this
stuff though? It seems to me the main win would be on big systems that have
to pay for atomic updates.

> > I think
> > discussing split RSS accounting is a bit out of scope of this patch as well.
>
> It's in-scope, because with split RSS accounting, allocated memory can
> stay accumulated in task structs for an indefinite time without being
> flushed to the mm. As a result, if you take the stream of virtual
> memory management system calls that program makes on one hand, and VM
> counter values on the other, the two don't add up. For various kinds
> of robustness (trace self-checking, say) it's important that various
> sources of data add up.
>
> If we're adding a configuration knob that controls how often VM
> counters get reflected in system trace points, we should also have a
> knob to control delayed VM counter operations. The whole point is for
> users to be able to specify how precisely they want VM counter changes
> reported to analysis tools.

We're not adding more configuration knobs.

> > Any improvements on that front can be a follow-up.
> >
> > Curious, has split RSS accounting shown you any issue with this patch?
>
> Split accounting has been a source of confusion for a while now: it
> causes that numbers-don't-add-up problem even when sampling from
> procfs instead of reading memory tracepoint data.

I think you can just disable split RSS accounting if it does not work well
for your configuration. It sounds like the problems you describe are common
to all existing ways of getting RSS accounting data, and not specific to this
one, which is why I said it is a bit out of scope.

Also AFAIU, the page fault code does sync the counters every
TASK_RSS_EVENTS_THRESH events, so the cached values do not lurk indefinitely.
The tracepoint's main intended use is to detect spikes, which provide ample
opportunity to sync the cache.

You could reduce TASK_RSS_EVENTS_THRESH in your kernel, or even just disable
split RSS accounting if that suits you better. That would solve all the
issues you raised, not just the potential ones you raised here for this
tracepoint.

thanks,

 - Joel
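
Editorial illustration: the "two levels of batching" discussed in this thread
can be pictured with a toy model. This is only a sketch in Python, not kernel
code. The class names, the sync interval of 64 events (a stand-in for
TASK_RSS_EVENTS_THRESH) and the trace threshold of 128 pages are all
assumptions chosen for illustration; the thread itself does not state these
values.

```python
from dataclasses import dataclass, field


@dataclass
class Mm:
    """Shared per-process counter (stands in for the mm's RSS counter)."""
    rss: int = 0
    trace_threshold: int = 128  # hypothetical value, in pages
    traced: list = field(default_factory=list)

    def add(self, pages: int) -> None:
        old = self.rss
        self.rss += pages
        # Second level of batching: record a trace event only when the
        # counter crosses a multiple of the threshold.
        if old // self.trace_threshold != self.rss // self.trace_threshold:
            self.traced.append(self.rss)


@dataclass
class Task:
    """Per-task cache in front of the shared counter (split accounting)."""
    mm: Mm
    sync_every: int = 64  # hypothetical stand-in for TASK_RSS_EVENTS_THRESH
    cached: int = 0
    events: int = 0

    def fault_in_page(self) -> None:
        # First level of batching: accumulate locally, flush periodically.
        self.cached += 1
        self.events += 1
        if self.events >= self.sync_every:
            self.sync()

    def sync(self) -> None:
        self.mm.add(self.cached)
        self.cached = 0
        self.events = 0


mm = Mm()
task = Task(mm)
for _ in range(100):
    task.fault_in_page()

# 100 pages were faulted in, but only the first 64 have reached the shared
# counter; the remaining 36 still sit in the task's cache and no trace
# event has been recorded yet.
print(mm.rss, task.cached, mm.traced)
```

In this model the shared counter lags the real page count by at most one sync
interval, which is the bounded delay Joel describes, and it is also why a
tool comparing individual events against the counter can see numbers that do
not add up at a given instant, which is Daniel's concern.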