From: PINTU KUMAR
To: "'Michal Hocko'"
Cc: "'Andrew Morton'", linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 minchan@kernel.org, dave@stgolabs.net, koct9i@gmail.com, mgorman@suse.de,
 vbabka@suse.cz, js1304@gmail.com, hannes@cmpxchg.org,
 alexander.h.duyck@redhat.com, sasha.levin@oracle.com, cl@linux.com,
 fengguang.wu@intel.com, cpgs@samsung.com, pintu_agarwal@yahoo.com,
 pintu.k@outlook.com, vishnu.ps@samsung.com, rohit.kr@samsung.com,
 iqbal.ams@samsung.com
References: <1438931334-25894-1-git-send-email-pintu.k@samsung.com>
 <20150807074422.GE26566@dhcp22.suse.cz>
 <0f2101d0d10f$594e4240$0beac6c0$@samsung.com>
 <20150807153547.04cf3a12ae095fcdd19da670@linux-foundation.org>
 <012e01d0d351$5dc752a0$1955f7e0$@samsung.com>
 <20150811105512.GD18998@dhcp22.suse.cz>
In-reply-to: <20150811105512.GD18998@dhcp22.suse.cz>
Subject: RE: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
Date: Wed, 12 Aug 2015 20:22:10 +0530
Message-id: <077101d0d50e$a37310f0$ea5932d0$@samsung.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

> -----Original Message-----
> From: Michal Hocko [mailto:mhocko@kernel.org]
> Sent: Tuesday, August 11, 2015 4:25 PM
> To: PINTU KUMAR
> Cc: 'Andrew Morton'; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> minchan@kernel.org; dave@stgolabs.net; koct9i@gmail.com;
> mgorman@suse.de; vbabka@suse.cz; js1304@gmail.com;
> hannes@cmpxchg.org; alexander.h.duyck@redhat.com;
> sasha.levin@oracle.com; cl@linux.com; fengguang.wu@intel.com;
> cpgs@samsung.com; pintu_agarwal@yahoo.com; pintu.k@outlook.com;
> vishnu.ps@samsung.com; rohit.kr@samsung.com
> Subject: Re: [PATCH 1/1] mm: vmstat: introducing vm counter for slowpath
>
> On Mon 10-08-15 15:15:06, PINTU KUMAR wrote:
> [...]
> > > > Regarding trace points, I am not sure if we can attach counter to it.
> > > > Also trace may have more over-head and requires additional configs
> > > > to be enabled to debug.
> > > > Mostly these configs will not be enabled by default (at least in
> > > > embedded, low memory device).
> > > > I found the vmstat interface more easy and useful.
> > >
> > > This does seem like a pretty basic and sensible thing to expose in
> > > vmstat. It probably makes more sense than some of the other things we
> > > have in there.
>
> I still fail to see what exactly this number says. The allocator slowpath (aka
> __alloc_pages_slowpath) is more an organizational split up of the code than
> anything that would tell us about how costly the allocation is - e.g. zone_reclaim
> might happen before we enter the slowpath.
>
> > Thanks Andrew.
> > Yes, as par my analysis, I feel that this is one of the useful and
> > important interface.
> > I added it in one of our internal product and found it to be very useful.
> > Specially during shrink_memory and compact_nodes analysis I found it
> > really useful.
> > It helps me to prove that if higher-order pages are present, it can
> > reduce the slowpath drastically.
>
> I am not sure I understand but this is kind of obvious, no?
>
Yes, but it's hard to prove to management that the slowpath count is reduced.
As we have seen, most of the time these kinds of performance issues are hard
to reproduce.

> > Also during my ELC presentation people asked me how to monitor the
> > slowpath counts.
>
> Isn't the allocation latency a much well defined metric? What does the slowpath
> without compaction/reclaim tell to user?
>
The current metrics for the slowpath tell only half the story.

> > > Yes, it could be a tracepoint but practically speaking, a tracepoint
> > > makes it developer-only. You can ask a bug reporter or a customer
> > > "what is /proc/vmstat:slowpath_entered" doing, but it's harder to
> > > ask them to set up tracing.
> > >
> > Yes, at times tracing are painful to analyze.
> > Also, in commercial user binaries, most of tracing support are
> > disabled (with no root privileges).
> > However, /proc/vmstat works with normal user binaries.
> > When memory issues are reported, we just get log dumps and few
> > interfaces like this.
> > Most of the time these memory issues are hard to reproduce because it
> > may happen after long usage.
>
> Yes, I do understand that vmstat is much more convenient. No question about
> that. But the counter should be generally usable.
>
> When I see COMPACTSTALL increasing I know that the direct compaction had to
> be invoked and that tells me that the system is getting fragmented and
> COMPACTFAIL/COMPACTSUCCESS will tell me how successful the compaction is.
>
> Similarly when I see ALLOCSTALL I know that kswapd doesn't catch up and
> scan/reclaim will tell me how effective it is. Snapshoting ALLOCSTALL/time
> helped me to narrow down memory pressure peaks to further investigate other
> counters in a more detail.
>
> What will entered-slowpath without triggering neither compaction nor direct
> reclaim tell me?
>
The slowpath count gives the actual number of slowpath entries, irrespective
of compaction, reclaim, or kswapd.
Other things also happen in the slowpath for which we don't have counters.
Thus a single _slowpath_ counter is enough for all situations, even when
KSWAP/COMPACTION is disabled or not used.

> [...]
> > >
> > > Two things:
> > >
> > > - we appear to have forgotten to document /proc/vmstat
> > >
> > Yes, I could not find any document on vmstat under kernel/Documentation.
> > I think it's a nice think to have.
> > May be, I can start this initiative to create one :)
>
> That would be more than appreciated.
>
OK, I will start a basic vmstat.txt in Documentation and release a first
version.
Thanks.

> > If respective owner can update, it will be great.
> >
> > > - How does one actually use slowpath_entered? Obviously we'd like to
> > > know "what proportion of allocations entered the slowpath", so we
> > > calculate
> > >
> > > slowpath_entered/X
> > >
> > > how do we obtain "X"? Is it by adding up all the pgalloc_*?
>
> It's not because pgalloc_ count number of pages while slowpath_entered counts
> allocations requests.
>
> > > If
> > > so, perhaps we should really have slowpath_entered_dma,
> > > slowpath_entered_dma32, ...?
> >
> > I think the slowpath for other zones may not be required.
> > We just need to know how many times we entered slowpath and possibly
> > do something to reduce it.
> > But, I think, pgalloc_* count may also include success for fastpath.
> >
> > How I use slowpath for analysis is:
> > VMSTAT             BEFORE      AFTER       %DIFF
> > ----------         ----------  ----------  ------------
> > nr_free_pages      6726        12494       46.17%
> > pgalloc_normal     985836      1549333     36.37%
> > pageoutrun         2699        529         80.40%
> > allocstall         298         98          67.11%
> > slowpath_entered   16659       739         95.56%
> > compact_stall      244         21          91.39%
> > compact_fail       178         11          93.82%
> > compact_success    52          7           86.54%
> >
> > The above values are from 512MB system with only NORMAL zone.
> > Before, the slowpath count was 16659.
> > After (memory shrinker + compaction), the slowpath reduced by 95%, for
> > the same scenario.
> > This is just an example.
>
> But what additional information does it give to us? We can see that the direct
> reclaim has been reduced as well as the compaction which was even more
> effective so the overall memory pressure was lighter and memory less
> fragmented. I assume that your test has requested the same amount of high
> order allocations and pgalloc_normal much higher in the second case suggests
> they were more effective but we can see that clearly even without
> slowpath_entered.
>
The thing to note here is that the slowpath count is 16659 (which is the
actual count, with no ambiguity).
However, if you look at the other slowpath-related counters
(pageoutrun:2699, allocstall:298, compact_stall:244) and add them all up,
(2699+298+244)=3241, the total is much lower than the actual slowpath count.
So these counters don't really tell what actually happened in the slowpath.
There are other factors that affect the slowpath (like alloc without
watermarks).
Moreover, with the _retry_ and _rebalance_ mechanisms, the
allocstall/compact_stall counters will keep increasing, but the slowpath
count will remain the same.
Also, on some systems KSWAP can be disabled, so pageoutrun will always be 0.
Similarly, COMPACTION can be disabled, so compact_stall will not be present.
In this scenario, we are left with only allocstall.
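
As a rough cross-check of the numbers quoted above, here is a minimal Python
sketch. It assumes the %DIFF column is |after - before| / max(before, after);
that formula is not stated anywhere in this thread, it is only inferred
because it reproduces the table. The names pct_diff and SAMPLES are
hypothetical helpers for illustration.

```python
def pct_diff(before, after):
    """Change between two counter values, as a percentage of the larger one.

    Assumption: this is how the %DIFF column was derived (inferred, not stated).
    """
    larger = max(before, after)
    if larger == 0:
        return 0.0
    return abs(after - before) / larger * 100.0


# (BEFORE, AFTER) pairs copied from the table in this mail.
SAMPLES = {
    "nr_free_pages": (6726, 12494),
    "pgalloc_normal": (985836, 1549333),
    "pageoutrun": (2699, 529),
    "allocstall": (298, 98),
    "slowpath_entered": (16659, 739),
    "compact_stall": (244, 21),
    "compact_fail": (178, 11),
    "compact_success": (52, 7),
}

for name, (before, after) in SAMPLES.items():
    print(f"{name:<18}{before:>10}{after:>10}{pct_diff(before, after):>9.2f}%")

# Slowpath-related counters from the BEFORE column, summed as in the mail.
explained = 2699 + 298 + 244
unexplained = 16659 - explained
print(f"explained by other counters: {explained}, unexplained: {unexplained}")
```

The last two lines only restate the argument above: the three existing
counters together account for a fraction of the slowpath entries that
slowpath_entered reports.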
Also, as I said earlier, allocstall can also be incremented from other
places, such as shrink_all_memory.

Consider another situation, like below:
VMSTAT
-------------------------------------
nr_free_pages      59982
pgalloc_normal     364163
pgalloc_high       2046
pageoutrun         1
allocstall         0
compact_stall      0
compact_fail       0
compact_success    0
-------------------------------------
From the above, is it possible to tell how many times it entered the
slowpath?
Now, I will add slowpath here and check again. I don't have that data right
now.
Thus, the point is that just one counter is enough to quickly analyze the
behavior in the slowpath.

More suggestions are welcome!

> So I would argue that we do not need slowpath_entered. We already have it,
> even specialized depending on which _slow_ path has been executed.
> What we are missing is a number of all requests to have a reasonable base.
> Whether adding such a counter in the hot path is justified is a question. I haven't
> really needed it so far and I am looking into vmstat and meminfo to debug
> memory reclaim related issues quite often.
>
> > If we are interested to know even allocation success/fail ratio in
> > slowpath, then I think we need more counters.
> > Such as; direct_reclaim_success/fail, kswapd_success/fail (just like
> > compaction success/fail).
> > OR, we can have pgalloc_success_fastpath counter.
>
> This all sounds like exposing more and more details about internal
> implementation. This all fits into tracepoints world IMO.
>
> --
> Michal Hocko
> SUSE Labs
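
For anyone who wants to snapshot these counters over time, as discussed
earlier in the thread, a minimal Python sketch follows. It assumes the usual
/proc/vmstat layout of one "name value" pair per line. The helper names
(parse_vmstat, read_vmstat, deltas, sample) and the WATCHED list are
hypothetical. Counter names vary with kernel version and config, and
slowpath_entered exists only on a kernel carrying this patch, so missing
counters are treated as 0.

```python
import time

# Counters discussed in this thread; slowpath_entered needs the proposed patch.
WATCHED = ("allocstall", "compact_stall", "pageoutrun", "slowpath_entered")


def parse_vmstat(text):
    """Parse 'name value' lines into a dict, skipping anything malformed."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the current counter values."""
    with open(path) as f:
        return parse_vmstat(f.read())


def deltas(old, new, names=WATCHED):
    """Per-counter increase between two snapshots; absent counters count as 0."""
    return {name: new.get(name, 0) - old.get(name, 0) for name in names}


def sample(interval=1.0):
    """Take two snapshots 'interval' seconds apart and return the deltas."""
    first = read_vmstat()
    time.sleep(interval)
    return deltas(first, read_vmstat())
```

Calling sample(5.0) on a running system would give the increase of each
watched counter over five seconds, which can be logged without root
privileges or tracing support.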