Date: Wed, 4 Sep 2019 11:32:58 -0400
From: Joel Fernandes
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, Tim Murray, carmenjackson@google.com,
 mayankgupta@google.com, dancol@google.com, rostedt@goodmis.org,
 minchan@kernel.org, akpm@linux-foundation.org, kernel-team@android.com,
 "Aneesh Kumar K.V", Dan Williams, Jerome Glisse, linux-mm@kvack.org,
 Matthew Wilcox, Ralph Campbell, Vlastimil Babka
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Message-ID: <20190904153258.GH240514@google.com>
References: <20190903200905.198642-1-joel@joelfernandes.org>
 <20190904084508.GL3838@dhcp22.suse.cz>
In-Reply-To: <20190904084508.GL3838@dhcp22.suse.cz>

On Wed, Sep 04, 2019 at 10:45:08AM +0200, Michal Hocko wrote:
> On Tue 03-09-19 16:09:05, Joel Fernandes (Google) wrote:
> > Useful to track how RSS is changing per TGID to detect spikes in RSS and
> > memory hogs. Several Android teams have been using this patch in various
> > kernel trees for half a year now. Many reported to me it is really
> > useful so I'm posting it upstream.
> >
> > Initial patch developed by Tim Murray.
> > Changes I made from original patch:
> > o Prevent any additional space consumed by mm_struct.
> > o Keep overhead low by checking if tracing is enabled.
> > o Add some noise reduction and lower overhead by emitting only on
> >   threshold changes.
>
> Does this have any pre-requisite? I do not see trace_rss_stat_enabled in
> the Linus tree (nor in linux-next).

No, this is generated automatically by the tracepoint infrastructure when a
tracepoint is added.

> Besides that why do we need batching in the first place. Does this have a
> measurable overhead? How does it differ from any other tracepoints that we
> have in other hotpaths (e.g. page allocator doesn't do any checks).

We need batching not only for overhead reduction, but also for reducing
tracing noise. Flooding the traces makes them less useful for long traces
and for post-processing. IOW, the overhead reduction is a bonus.

I have not looked at the page allocator paths; we don't currently use them
for the purposes of this rss_stat tracepoint.

> Other than that this looks reasonable to me.

Thanks!

 - Joel

>
> > Co-developed-by: Tim Murray
> > Signed-off-by: Tim Murray
> > Signed-off-by: Joel Fernandes (Google)
> >
> > ---
> >
> > v1->v2: Added more commit message.
> >
> > Cc: carmenjackson@google.com
> > Cc: mayankgupta@google.com
> > Cc: dancol@google.com
> > Cc: rostedt@goodmis.org
> > Cc: minchan@kernel.org
> > Cc: akpm@linux-foundation.org
> > Cc: kernel-team@android.com
> >
> >  include/linux/mm.h          | 14 +++++++++++---
> >  include/trace/events/kmem.h | 21 +++++++++++++++++++++
> >  mm/memory.c                 | 20 ++++++++++++++++++++
> >  3 files changed, 52 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0334ca97c584..823aaf759bdb 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1671,19 +1671,27 @@ static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
> >  	return (unsigned long)val;
> >  }
> >
> > +void mm_trace_rss_stat(int member, long count, long value);
> > +
> >  static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
> >  {
> > -	atomic_long_add(value, &mm->rss_stat.count[member]);
> > +	long count = atomic_long_add_return(value, &mm->rss_stat.count[member]);
> > +
> > +	mm_trace_rss_stat(member, count, value);
> >  }
> >
> >  static inline void inc_mm_counter(struct mm_struct *mm, int member)
> >  {
> > -	atomic_long_inc(&mm->rss_stat.count[member]);
> > +	long count = atomic_long_inc_return(&mm->rss_stat.count[member]);
> > +
> > +	mm_trace_rss_stat(member, count, 1);
> >  }
> >
> >  static inline void dec_mm_counter(struct mm_struct *mm, int member)
> >  {
> > -	atomic_long_dec(&mm->rss_stat.count[member]);
> > +	long count = atomic_long_dec_return(&mm->rss_stat.count[member]);
> > +
> > +	mm_trace_rss_stat(member, count, -1);
> >  }
> >
> >  /* Optimized variant when page is already known not to be PageAnon */
> > diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> > index eb57e3037deb..8b88e04fafbf 100644
> > --- a/include/trace/events/kmem.h
> > +++ b/include/trace/events/kmem.h
> > @@ -315,6 +315,27 @@ TRACE_EVENT(mm_page_alloc_extfrag,
> >  		__entry->change_ownership)
> >  );
> >
> > +TRACE_EVENT(rss_stat,
> > +
> > +	TP_PROTO(int member,
> > +		long count),
> > +
> > +	TP_ARGS(member, count),
> > +
> > +	TP_STRUCT__entry(
> > +		__field(int, member)
> > +		__field(long, size)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__entry->member = member;
> > +		__entry->size = (count << PAGE_SHIFT);
> > +	),
> > +
> > +	TP_printk("member=%d size=%ldB",
> > +		__entry->member,
> > +		__entry->size)
> > +	);
> >  #endif /* _TRACE_KMEM_H */
> >
> >  /* This part must be outside protection */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e2bb51b6242e..9d81322c24a3 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -72,6 +72,8 @@
> >  #include
> >  #include
> >
> > +#include
> > +
> >  #include
> >  #include
> >  #include
> > @@ -140,6 +142,24 @@ static int __init init_zero_pfn(void)
> >  }
> >  core_initcall(init_zero_pfn);
> >
> > +/*
> > + * This threshold is the boundary in the value space, that the counter has to
> > + * advance before we trace it. Should be a power of 2. It is to reduce unwanted
> > + * trace overhead. The counter is in units of number of pages.
> > + */
> > +#define TRACE_MM_COUNTER_THRESHOLD 128
> > +
> > +void mm_trace_rss_stat(int member, long count, long value)
> > +{
> > +	long thresh_mask = ~(TRACE_MM_COUNTER_THRESHOLD - 1);
> > +
> > +	if (!trace_rss_stat_enabled())
> > +		return;
> > +
> > +	/* Threshold roll-over, trace it */
> > +	if ((count & thresh_mask) != ((count - value) & thresh_mask))
> > +		trace_rss_stat(member, count);
> > +}
> >
> >  #if defined(SPLIT_RSS_COUNTING)
> >
> > --
> > 2.23.0.187.g17f5b7556c-goog
>
> --
> Michal Hocko
> SUSE Labs
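
For readers following the batching discussion, here is a minimal userspace sketch of the threshold roll-over check from mm_trace_rss_stat() in the quoted patch. It is not kernel code: the helper names crosses_threshold() and count_trace_events() are hypothetical and exist only for this illustration, and the trace_rss_stat_enabled() check is left out because it has no userspace equivalent. The sketch assumes a power-of-2 threshold, as the patch comment requires.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as in the quoted patch; must be a power of 2. Units: pages. */
#define TRACE_MM_COUNTER_THRESHOLD 128

/*
 * Hypothetical helper mirroring the mask comparison in the patch.
 * `count` is the counter after the update and `value` is the delta that
 * was just applied. Returns true when the update moved the counter into
 * a different 128-page bucket, i.e. when the patch would emit an event.
 */
static bool crosses_threshold(long count, long value)
{
	long thresh_mask = ~(TRACE_MM_COUNTER_THRESHOLD - 1L);

	return (count & thresh_mask) != ((count - value) & thresh_mask);
}

/*
 * Hypothetical helper: apply `n` single-page increments starting from
 * zero and count how many of them would have emitted a trace event.
 */
static int count_trace_events(long n)
{
	long count = 0;
	int emitted = 0;

	assert(n >= 0);

	for (long i = 0; i < n; i++) {
		count += 1;
		if (crosses_threshold(count, 1))
			emitted++;
	}
	return emitted;
}
```

Under these assumptions, 300 single-page increments cross a bucket boundary only at 128 and 256 pages, so count_trace_events(300) reports two events rather than 300. That is the noise reduction described in the reply above.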