From: Taras Kondratiuk
To: Michal Hocko
Cc: linux-mm@kvack.org, xe-linux-external@cisco.com, Ruslan Ruslichenko, linux-kernel@vger.kernel.org
Subject: Re: Detecting page cache trashing state
Date: Fri, 15 Sep 2017 10:28:30 -0700
Message-ID: <150549651001.4512.15084374619358055097@takondra-t460s>
In-Reply-To: <20170915143619.2ifgex2jxck2xt5u@dhcp22.suse.cz>
References: <150543458765.3781.10192373650821598320@takondra-t460s> <20170915143619.2ifgex2jxck2xt5u@dhcp22.suse.cz>

Quoting Michal Hocko (2017-09-15 07:36:19)
> On Thu 14-09-17 17:16:27, Taras Kondratiuk wrote:
> > Hi
> >
> > In our devices under low memory conditions we often get into a thrashing
> > state where the system spends most of its time re-reading pages of .text
> > sections from a file system (squashfs in our case). The working set doesn't
> > fit into the available page cache, so this is expected. The issue is that
> > the OOM killer doesn't get triggered because there is still memory to
> > reclaim. The system may be stuck in this state for quite some time and
> > usually dies because of watchdogs.
> >
> > We are trying to detect such a thrashing state early to take some
> > preventive actions. It should be a pretty common issue, but for now we
> > haven't found any existing VM/IO statistics that can reliably detect such
> > a state.
> >
> > Most metrics provide absolute values: number/rate of page faults,
> > rate of IO operations, number of stolen pages, etc. For a specific
> > device configuration we can determine threshold values for those
> > parameters that will detect a thrashing state, but that is not feasible for
> > hundreds of device configurations.
> >
> > We are looking for some relative metric like "percent of CPU time spent
> > handling major page faults". With such a relative metric we could use a
> > common threshold across all devices. For now we have added such a metric
> > to /proc/stat in our kernel, but we would like to find some mechanism
> > available in the upstream kernel.
> >
> > Has somebody faced a similar issue? How are you solving it?
>
> Yes, this has been a pain point for a _long_ time. And we still do not have a
> good answer upstream. Johannes has been playing in this area [1].
> The main problem is that our OOM detection logic is based on the ability
> to reclaim memory to allocate new memory. And that is pretty much true
> for the pagecache when you are thrashing. So we do not know that
> basically the whole time is spent refaulting the memory back and forth.
> We do have some refault stats for the page cache but they are not
> integrated into the OOM detection logic because this is really a
> non-trivial problem to solve without triggering early OOM killer
> invocations.
>
> [1] http://lkml.kernel.org/r/20170727153010.23347-1-hannes@cmpxchg.org

Thanks Michal. memdelay looks promising. We will check it.
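
For reference, the absolute counters discussed above (major faults and the page cache refault stats) can be sampled from userspace. Below is a minimal sketch in Python, not a thrashing detector. It assumes the counters appear in /proc/vmstat as "pgmajfault" and "workingset_refault"; the exact refault counter name depends on the kernel version, so treat the default names as an assumption and check your own /proc/vmstat. The helper names parse_vmstat and counter_rates are made up for this illustration.

```python
import time

# Counter names are an assumption: "pgmajfault" is long-standing, while the
# refault counter name varies between kernel versions.
DEFAULT_COUNTERS = ("pgmajfault", "workingset_refault")


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def counter_rates(before, after, interval_s, names=DEFAULT_COUNTERS):
    """Per-second increase of each named counter between two samples.

    Counters missing from either sample are skipped rather than guessed.
    """
    rates = {}
    for name in names:
        if name in before and name in after:
            rates[name] = (after[name] - before[name]) / interval_s
    return rates


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


if __name__ == "__main__":
    interval = 5.0  # sampling window in seconds, chosen arbitrarily
    first = read_vmstat()
    time.sleep(interval)
    second = read_vmstat()
    print(counter_rates(first, second, interval))
```

These are still absolute rates, so they have the per-device threshold problem described in the original mail; the sketch only shows how to observe the counters.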