Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754915AbdIHHLV (ORCPT ); Fri, 8 Sep 2017 03:11:21 -0400 Received: from LGEAMRELO13.lge.com ([156.147.23.53]:34765 "EHLO lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754510AbdIHHLT (ORCPT ); Fri, 8 Sep 2017 03:11:19 -0400 X-Original-SENDERIP: 156.147.1.121 X-Original-MAILFROM: kyeongdon.kim@lge.com X-Original-SENDERIP: 10.174.78.84 X-Original-MAILFROM: kyeongdon.kim@lge.com Subject: Re: Re: [PATCH] mm/vmstats: add counters for the page frag cache To: Konstantin Khlebnikov , akpm@linux-foundation.org, sfr@canb.auug.org.au Cc: ying.huang@intel.com, vbabka@suse.cz, hannes@cmpxchg.org, xieyisheng1@huawei.com, luto@kernel.org, shli@fb.com, mhocko@suse.com, mgorman@techsingularity.net, hillf.zj@alibaba-inc.com, kemi.wang@intel.com, rientjes@google.com, bigeasy@linutronix.de, iamjoonsoo.kim@lge.com, bongkyu.kim@lge.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, netdev References: <1504222631-2635-1-git-send-email-kyeongdon.kim@lge.com> From: Kyeongdon Kim Message-ID: Date: Fri, 8 Sep 2017 16:11:15 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3888 Lines: 106 On 2017-09-04 오후 5:30, Konstantin Khlebnikov wrote: > On 04.09.2017 04:35, Kyeongdon Kim wrote: > > Thanks for your reply, > > But I couldn't find "NR_FRAGMENT_PAGES" in linux-next.git .. is that > vmstat counter? or others? > > > > I mean rather than adding bunch vmstat counters for operations it > might be > worth to add page counter which will show current amount of these pages. > But this seems too low-level for tracking, common counters for all > network > buffers would be more useful but much harder to implement. > Ok, thanks for the comment. > As I can see page owner is able to save stacktrace where allocation > happened, > this makes debugging mostly trivial without any counters. If it adds > too much > overhead - just track random 1% of pages, should be enough for finding > leak. > As I said, we already used page owner tools to resolve the leak issue. But that's extremely difficult to figure it out, too much callstack and too much allocation orders(0 or more). We couldn't easily get a clue event if we track 1% of pages.. In fact, I was writing another email to send a new patch with debug config. We added a hash function to pick out address with callstack by using debugfs, It could be showing the only page_frag_cache leak with owner. for exmaple code : +++ /mm/page_alloc.c @@ -4371,7 +4371,9 @@ void *page_frag_alloc(struct page_frag_cache *nc,         nc->pagecnt_bias--;         nc->offset = offset; +#ifdef CONFIG_PGFRAG_DEBUG +       page_frag_debug_alloc(nc->va + offset); +#endif         return nc->va + offset;  }  EXPORT_SYMBOL(page_frag_alloc); @@ -4382,7 +4384,9 @@ EXPORT_SYMBOL(page_frag_alloc);  void page_frag_free(void *addr)  {         struct page *page = virt_to_head_page(addr); +#ifdef CONFIG_PGFRAG_DEBUG +       page_frag_debug_free(addr); +#endif         if (unlikely(put_page_testzero(page))) Those counters that I added may be too much for the linux server or something. However, I think the other systems may need to simple debugging method. (like Android OS) So if you can accept the patch with debug feature, I will send it including counters. but still thinking those counters don't need. I won't. Anyway, I'm grateful for your feedback, means a lot to me. Thanks, Kyeongdon Kim > > As you know, page_frag_alloc() directly calls > __alloc_pages_nodemask() function, > > so that makes too difficult to see memory usage in real time even > though we have "/meminfo or /slabinfo.." information. > > If there was a way already to figure out the memory leakage from > page_frag_cache in mainline, I agree your opinion > > but I think we don't have it now. > > > > If those counters too much in my patch, > > I can say two values (pgfrag_alloc and pgfrag_free) are enough to > guess what will happen > > and would remove pgfrag_alloc_calls and pgfrag_free_calls. > > > > Thanks, > > Kyeongdon Kim > > > > >> IMHO that's too much counters. > >> Per-node NR_FRAGMENT_PAGES should be enough for guessing what's > going on. > >> Perf probes provides enough features for furhter debugging. > >> > >> On 01.09.2017 02:37, Kyeongdon Kim wrote: > >> > There was a memory leak problem when we did stressful test > >> > on Android device. > >> > The root cause of this was from page_frag_cache alloc > >> > and it was very hard to find out. > >> > > >> > We add to count the page frag allocation and free with function > call. > >> > The gap between pgfrag_alloc and pgfrag_free is good to to calculate > >> > for the amount of page. > >> > The gap between pgfrag_alloc_calls and pgfrag_free_calls is for > >> > sub-indicator. > >> > They can see trends of memory usage during the test. > >> > Without it, it's difficult to check page frag usage so I believe we > >> > should add it. > >> > > >> > Signed-off-by: Kyeongdon Kim > >> > ---