Date: Thu, 6 Apr 2017 10:49:22 -0400
From: Johannes Weiner
To: Rik van Riel
Cc: Andrew Morton, Mel Gorman, Michal Hocko, Vladimir Davydov,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: vmscan: fix IO/refault regression in cache
	workingset transition
Message-ID: <20170406144922.GA32364@cmpxchg.org>
References: <20170404220052.27593-1-hannes@cmpxchg.org>
	<1491430264.16856.43.camel@redhat.com>
In-Reply-To: <1491430264.16856.43.camel@redhat.com>

On Wed, Apr 05, 2017 at 06:11:04PM -0400, Rik van Riel wrote:
> On Tue, 2017-04-04 at 18:00 -0400, Johannes Weiner wrote:
>
> > +
> > +	/*
> > +	 * When refaults are being observed, it means a new workingset
> > +	 * is being established. Disable active list protection to get
> > +	 * rid of the stale workingset quickly.
> > +	 */
>
> This looks a little aggressive. What is this
> expected to do when you have multiple workloads
> sharing the same LRU, and one of the workloads
> is doing refaults, while the other workload is
> continuing to use the same working set as before?

It is aggressive, but it seems to be a trade-off between three things:
maximizing workingset protection during stable periods; minimizing
repeat refaults during workingset transitions; both of those when the
LRU is shared.

The data point we would need to balance optimally between these cases
is whether the active list is hot or stale, but we only have that once
we disable active list protection and challenge those pages. The more
conservative we go about this, the more IO it costs to establish the
incoming workingset pages.

I actually did experiment with this. Instead of disabling active list
protection entirely, I reverted to the more conservative 50/50 ratio
during refaults. The 50/50 split addressed the regression, but the
aggressive behavior fared measurably better across three different
services I tested this on (one of them *is* multi-workingset, but the
jobs are cgrouped so they don't *really* share LRUs).

That win was intriguing, but it would be bad if it came out of the
budget of truly shared LRUs (for which I have no quantification).

Since this is a regression fix, it would be fair to be conservative
and use the 50/50 split for transitions here; keep the more adaptive
behavior for a future optimization.

What do you think?
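
For illustration only, the two variants discussed above (dropping
active list protection entirely vs. falling back to a 50/50 split
while refaults are observed) could be modeled in userspace roughly
like this. It is a sketch, not the patch: the enum, the function name
and the stable_ratio parameter are hypothetical, and stable_ratio just
stands in for whatever inactive:active ratio applies when no refaults
are seen.

```c
#include <stdbool.h>

/* Hypothetical policy names for this sketch; not kernel identifiers. */
enum refault_policy {
	REFAULT_DISABLE_PROTECTION,	/* aggressive: ratio 0 while refaulting */
	REFAULT_EVEN_SPLIT,		/* conservative: 50/50 while refaulting */
};

/*
 * Return true when the inactive list counts as "low", i.e. when
 * active pages should be deactivated and challenged.
 *
 * stable_ratio is an assumed input: how many times larger than the
 * inactive list the active list may grow while no refaults occur.
 */
static bool inactive_is_low(unsigned long inactive, unsigned long active,
			    bool refaulting, enum refault_policy policy,
			    unsigned long stable_ratio)
{
	unsigned long ratio = stable_ratio;

	/*
	 * During refaults, ratio 0 makes any non-empty active list
	 * eligible for deactivation; ratio 1 only deactivates while
	 * the active list is larger than the inactive list.
	 */
	if (refaulting)
		ratio = (policy == REFAULT_DISABLE_PROTECTION) ? 0 : 1;

	return inactive * ratio < active;
}
```

With made-up counts of 6000 inactive and 4000 active pages during
refaults, the first policy keeps deactivating while the 50/50 policy
leaves the active list alone, which is the difference in
aggressiveness in question.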