From: Taras Kondratiuk
To: linux-mm@kvack.org
Cc: xe-linux-external@cisco.com, "Ruslan Ruslichenko", linux-kernel@vger.kernel.org
Subject: Detecting page cache thrashing state
Date: Thu, 14 Sep 2017 17:16:27 -0700
Message-ID: <150543458765.3781.10192373650821598320@takondra-t460s>

Hi,

Under low memory conditions our devices often get into a thrashing state in which the system spends most of its time re-reading pages of .text sections from the file system (squashfs in our case). This is expected, since the working set doesn't fit into the available page cache. The problem is that the OOM killer doesn't get triggered because there is still memory left to reclaim. The system can stay stuck in this state for quite some time and usually dies from watchdog timeouts.

We are trying to detect such a thrashing state early so we can take preventive action. This should be a fairly common problem, but so far we haven't found any existing VM/IO statistics that detect it reliably. Most metrics provide absolute values: the number/rate of page faults, the rate of IO operations, the number of stolen pages, and so on.
For a specific device configuration we can determine threshold values for those parameters that detect the thrashing state, but that is not feasible across hundreds of device configurations. Instead we are looking for a relative metric, something like "percent of CPU time spent handling major page faults", so that a common threshold could be used on all devices. For now we have added such a metric to /proc/stat in our kernel, but we would like to find a mechanism available in the upstream kernel.

Has anybody faced a similar issue? How are you solving it?
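To make the idea of a relative metric concrete, here is a minimal userspace sketch (an illustration only, not our /proc/stat change): it samples /proc/vmstat twice and reports major faults as a percentage of all faults over the interval. The counter names `pgfault` and `pgmajfault` are standard /proc/vmstat fields; the function and threshold are purely hypothetical.

```python
import time

def read_vmstat():
    """Parse /proc/vmstat into a dict of counter name -> value."""
    stats = {}
    with open("/proc/vmstat") as f:
        for line in f:
            key, val = line.split()
            stats[key] = int(val)
    return stats

def majfault_share(interval=1.0):
    """Return major page faults as a percentage of all page faults
    observed during `interval` seconds (0.0 if no faults occurred)."""
    before = read_vmstat()
    time.sleep(interval)
    after = read_vmstat()
    faults = after["pgfault"] - before["pgfault"]
    major = after["pgmajfault"] - before["pgmajfault"]
    return 100.0 * major / faults if faults else 0.0
```

A sustained high ratio behaves similarly across devices with very different absolute fault rates, which is the point of a relative metric. It still ignores how long each major fault stalls the CPU, though, which is why a time-based metric such as "percent of CPU time in major faults" would be preferable.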