Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752296AbdGFPvc (ORCPT ); Thu, 6 Jul 2017 11:51:32 -0400
Received: from outbound-smtp04.blacknight.com ([81.17.249.35]:41670 "EHLO outbound-smtp04.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751729AbdGFPva (ORCPT ); Thu, 6 Jul 2017 11:51:30 -0400
Date: Thu, 6 Jul 2017 16:51:23 +0100
From: Mel Gorman
To: Debabrata Banerjee
Cc: Roman Gushchin, linux-mm@kvack.org, Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov, Rik van Riel, kernel-team@fb.com, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] mm: make allocation counters per-order
Message-ID: <20170706155123.cyyjpvraifu5ptmr@techsingularity.net>
References: <1499346271-15653-1-git-send-email-guro@fb.com> <20170706131941.omod4zl4cyuscmjo@techsingularity.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To:
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2349
Lines: 42

On Thu, Jul 06, 2017 at 10:54:24AM -0400, Debabrata Banerjee wrote:
> On Thu, Jul 6, 2017 at 9:19 AM, Mel Gorman wrote:
>
> > The alloc counter updates are themselves a surprisingly heavy cost to
> > the allocation path and this makes it worse for a debugging case that is
> > relatively rare. I'm extremely reluctant for such a patch to be added
> > given that the tracepoints can be used to assemble such a monitor even
> > if it means running a userspace daemon to keep track of it. Would such a
> > solution be suitable? Failing that if this is a severe issue, would it be
> > possible to at least make this a compile-time or static tracepoint option?
> > That way, only people that really need it have to take the penalty.
> >
> > --
> > Mel Gorman
>
> We (Akamai) have been struggling with memory fragmentation issues for
> years, and especially the inability to track positive or negative
> changes to fragmentation between allocator changes and kernels without
> simply looking for how many allocations are failing. We've had someone
> toying with trying to report the same data via scanning all pages at
> report time versus keeping running stats, although we don't have
> working code yet. If it did work it would avoid the runtime overhead.
> I don't believe tracepoints are a workable solution for us, since we
> would have to be collecting the data from boot, as well as continually
> processing the data in userspace at high cost. Ultimately the
> locations and other properties (merge-ability) of the allocations in
> the buddy groups are also important, which would be interesting to add
> on-top of Roman's patch.

These counters do not actually help you solve that particular problem.
Knowing how many allocations happened since the system booted doesn't tell
you much about how many failed or why they failed. You don't even know
what frequency they occurred at unless you monitor it constantly, so you're
back to square one whether this information is available from /proc or not.
There is even a tracepoint that can be used to track information related
to events that worsen fragmentation (trace_mm_page_alloc_extfrag), although
the primary thing it tells you is that "the probability that an allocation
will fail due to fragmentation in the future is potentially higher".

-- 
Mel Gorman
SUSE Labs
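
The point about frequency can be illustrated with a rough sketch. A cumulative counter only yields a rate when it is sampled twice and the difference is divided by the elapsed time, so something has to keep sampling regardless of where the counter is exported. The sketch below assumes the existing per-zone pgalloc_* lines in /proc/vmstat as the counter source; the helper names (parse_vmstat, alloc_rates, sample) are hypothetical and are not part of the patch under discussion.

```python
import time


def parse_vmstat(text):
    """Extract the cumulative pgalloc_* counters from /proc/vmstat-style text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Each vmstat line is expected to be "<name> <value>".
        if len(parts) == 2 and parts[0].startswith("pgalloc_"):
            counters[parts[0]] = int(parts[1])
    return counters


def alloc_rates(before, after, interval_s):
    """Counter increments per second between two snapshots.

    A single snapshot says nothing about frequency; only the
    difference between two snapshots does.
    """
    return {
        name: (after[name] - before[name]) / interval_s
        for name in after
        if name in before
    }


def sample(path="/proc/vmstat"):
    """Read one snapshot (the path is an assumption; Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


if __name__ == "__main__":
    interval = 5.0
    first = sample()
    time.sleep(interval)
    second = sample()
    for name, rate in sorted(alloc_rates(first, second, interval).items()):
        print(f"{name}: {rate:.1f}/s")
```

Even with this, the output only describes how fast allocations are happening in the sampled window. It says nothing about how many allocations failed or why, which is the limitation described above.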