From: Debabrata Banerjee
Date: Thu, 6 Jul 2017 10:54:24 -0400
Subject: Re: [PATCH] mm: make allocation counters per-order
To: Mel Gorman
Cc: Roman Gushchin, linux-mm@kvack.org, Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov, Rik van Riel, kernel-team@fb.com, linux-kernel@vger.kernel.org
In-Reply-To: <20170706131941.omod4zl4cyuscmjo@techsingularity.net>
References: <1499346271-15653-1-git-send-email-guro@fb.com> <20170706131941.omod4zl4cyuscmjo@techsingularity.net>

On Thu, Jul 6, 2017 at 9:19 AM, Mel Gorman wrote:
> The alloc counter updates are themselves a surprisingly heavy cost to
> the allocation path and this makes it worse for a debugging case that is
> relatively rare. I'm extremely reluctant for such a patch to be added
> given that the tracepoints can be used to assemble such a monitor even
> if it means running a userspace daemon to keep track of it. Would such a
> solution be suitable? Failing that if this is a severe issue, would it be
> possible to at least make this a compile-time or static tracepoint option?
> That way, only people that really need it have to take the penalty.
>
> --
> Mel Gorman

We (Akamai) have been struggling with memory fragmentation issues for years. In particular, we have no way to tell whether fragmentation gets better or worse across allocator changes and kernel versions, other than counting how many allocations fail.
We've had someone experimenting with reporting the same data by scanning all pages at report time rather than keeping running stats, although we don't have working code yet. If that worked, it would avoid the runtime overhead.

I don't believe tracepoints are a workable solution for us, since we would have to collect the data from boot and continually process it in userspace at high cost.

Ultimately, the locations and other properties (mergeability) of the allocations in the buddy groups are also important, and would be interesting to add on top of Roman's patch.
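To make the "report time versus running stats" distinction concrete, here is a minimal sketch of a report-time snapshot. It is neither Roman's per-order allocation counters nor the page-scanning code mentioned above. It assumes the existing /proc/buddyinfo interface, which lists free blocks per order for each zone, and derives from it the fraction of free memory that cannot satisfy a request of a given order. The function names and the sample numbers are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Text in the layout of /proc/buddyinfo. The counts are made up for
# illustration; column i is the number of free blocks of order i.
SAMPLE = """\
Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3
Node 0, zone   Normal    216     55    189    101     84     38     37     27      5      3      0
"""


def parse_buddyinfo(text: str) -> Dict[str, List[int]]:
    """Map 'node<N>/<zone>' to its list of free-block counts per order."""
    zones: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        # Expected shape: Node <N> zone <name> <count order 0> <count order 1> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        zones[f"node{fields[1]}/{fields[3]}"] = [int(n) for n in fields[4:]]
    return zones


def free_pages_at_or_above(counts: List[int], order: int) -> int:
    """Free pages held in blocks of at least the given order."""
    return sum(n << o for o, n in enumerate(counts) if o >= order)


def unusable_fraction(counts: List[int], order: int) -> float:
    """Fraction of free pages sitting in blocks too small for `order`."""
    total = free_pages_at_or_above(counts, 0)
    if total == 0:
        return 0.0
    return 1.0 - free_pages_at_or_above(counts, order) / total


if __name__ == "__main__":
    # On a live system, replace SAMPLE with open("/proc/buddyinfo").read().
    for zone, counts in parse_buddyinfo(SAMPLE).items():
        print(zone, f"unusable for order 4: {unusable_fraction(counts, 4):.2f}")
```

A snapshot like this costs nothing on the allocation path, but it only describes free memory at the moment of the read. It says nothing about how many allocations of each order were made or failed since boot, which is what the running counters in the patch would provide.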