Message-ID: <1491497476.8850.156.camel@redhat.com>
Subject: Re: [PATCH] mm: vmscan: fix IO/refault regression in cache workingset transition
From: Rik van Riel
Organization: Red Hat, Inc
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, Michal Hocko, Vladimir Davydov, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Date: Thu, 06 Apr 2017 12:51:16 -0400
In-Reply-To: <20170406144922.GA32364@cmpxchg.org>
References: <20170404220052.27593-1-hannes@cmpxchg.org> <1491430264.16856.43.camel@redhat.com> <20170406144922.GA32364@cmpxchg.org>

On Thu, 2017-04-06 at 10:49 -0400, Johannes Weiner wrote:
> On Wed, Apr 05, 2017 at 06:11:04PM -0400, Rik van Riel wrote:
> > On Tue, 2017-04-04 at 18:00 -0400, Johannes Weiner wrote:
> > >
> > > +
> > > +	/*
> > > +	 * When refaults are being observed, it means a new workingset
> > > +	 * is being established. Disable active list protection to get
> > > +	 * rid of the stale workingset quickly.
> > > +	 */
> >
> > This looks a little aggressive. What is this
> > expected to do when you have multiple workloads
> > sharing the same LRU, and one of the workloads
> > is doing refaults, while the other workload is
> > continuing to use the same working set as before?
>
> That win was intriguing, but it would be bad if it came out of the
> budget of truly shared LRUs (for which I have no quantification).
>
> Since this is a regression fix, it would be fair to be conservative
> and use the 50/50 split for transitions here; keep the more adaptive
> behavior for a future optimization.
>
> What do you think?

Let's try your patch, and see what happens. After all,
it only affects the file cache, and does not lead to
anonymous pages being swapped out and causing major
pain.

A fast workload transition seems like it could be in
everybody's best interest.

If this approach leads to trouble, we can always try
to soften it later. One potential way of softening
would be to look at the number of refaults, vs the
number of working set re-confirmations, and determine
a target based on that.
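
To make the softening idea a bit more concrete, here is a rough
userspace sketch. It is not kernel code and not part of the patch:
the function name, the percentage scale, and the 50/50 fallback when
there is no signal are all assumptions made purely for illustration.
It maps the two counters onto how much of the file LRU the active
list might keep protected.

```c
#include <stdio.h>

/*
 * Hypothetical helper (illustration only, not a kernel API).
 *
 * refaults:   pages that were evicted and faulted back in
 * reconfirms: active pages that were referenced again, i.e. the
 *             old working set re-confirming itself
 *
 * Returns the assumed share (0..100) of the file LRU that the
 * active list keeps protected.
 */
static unsigned int active_protect_percent(unsigned long refaults,
					   unsigned long reconfirms)
{
	unsigned long long total =
		(unsigned long long)refaults + reconfirms;

	/* No signal either way: assume the conservative 50/50 split. */
	if (total == 0)
		return 50;

	/* More re-confirmations: protect more. More refaults: protect less. */
	return (unsigned int)((reconfirms * 100ULL) / total);
}
```

With only refaults this degenerates to the patch's behavior (no
active list protection), and with 30 refaults against 10
re-confirmations it would leave 25% protected. Whether a linear
mapping like this is the right target is an open question.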