From: Debabrata Banerjee
Date: Thu, 6 Jul 2017 14:00:00 -0400
Subject: Re: [PATCH] mm: make allocation counters per-order
To: Mel Gorman
Cc: Roman Gushchin, linux-mm@kvack.org, Andrew Morton, Johannes Weiner,
    Michal Hocko, Vladimir Davydov, Rik van Riel, kernel-team@fb.com,
    linux-kernel@vger.kernel.org
In-Reply-To: <20170706171658.mohgkjcefql4wekz@techsingularity.net>

On Thu, Jul 6, 2017 at 1:16 PM, Mel Gorman wrote:
>
> I'm still struggling to see how counters help when an agent that monitors
> for high CPU usage could be activated
>

I suspect Roman has the same problem set as us: the CPU usage is either
always high, or high and service critical, likely when something
interesting is happening. We'd like to collect data on 200k machines and
study the results statistically and with respect to time, based on kernel
versions, build configs, hardware types, process types, load patterns,
etc. Even finding good candidate machines, and at the right time of day,
to manually debug with ftrace is problematic. Granted, we could be
utilizing existing counters like compact_fail better.
Ultimately the data either leads to dealing with certain bad actors, different vm tunings, or patches to mm.