Date: Mon, 11 Jan 2021 11:11:50 +1100
From: Dave Chinner <david@fromorbit.com>
To: Andrew Morton
Cc: Sudarshan Rajagopalan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Vladimir Davydov
Subject: Re: [PATCH] mm: vmscan: support complete shrinker reclaim
Message-ID: <20210111001150.GB164110@dread.disaster.area>
References: <2d1f1dbb7e018ad02a9e7af36a8c86397a1598a7.1609892546.git.sudaraja@codeaurora.org>
 <20210106155602.6ce48dfe88ca7b94986b329b@linux-foundation.org>
In-Reply-To: <20210106155602.6ce48dfe88ca7b94986b329b@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 06, 2021 at 03:56:02PM -0800, Andrew Morton wrote:
> (cc's added)
>
> On Tue, 5 Jan 2021 16:43:38 -0800 Sudarshan Rajagopalan wrote:
>
> > Ensure that shrinkers are given the option to completely drop
> > their caches even when their caches are smaller than the batch size.
> > This change helps improve memory headroom by ensuring that under
> > significant memory pressure shrinkers can drop all of their caches.
> > This change only attempts to more aggressively call the shrinkers
> > during background memory reclaim, in order to avoid hurting the
> > performance of direct memory reclaim.
>
> Why isn't the residual scan count accrual (nr_deferred) triggering
> the total_scan >= freeable condition that is supposed to allow
> shrinkers to completely empty under ongoing memory pressure events?
>
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -424,6 +424,10 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
> >  	long batch_size = shrinker->batch ? shrinker->batch
> >  					  : SHRINK_BATCH;
> >  	long scanned = 0, next_deferred;
> > +	long min_cache_size = batch_size;
> > +
> > +	if (current_is_kswapd())
> > +		min_cache_size = 0;
> >
> >  	if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
> >  		nid = 0;
> > @@ -503,7 +507,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
> >  	 * scanning at high prio and therefore should try to reclaim as much as
> >  	 * possible.
> >  	 */
> > -	while (total_scan >= batch_size ||
> > +	while (total_scan > min_cache_size ||
> >  	       total_scan >= freeable) {
> >  		unsigned long ret;
> >  		unsigned long nr_to_scan = min(batch_size, total_scan);
>
> I don't really see the need to exclude direct reclaim from this fix.
> And if we're leaving unscanned objects behind in this situation, the
> current code simply isn't working as intended, and 0b1fb40a3b1 ("mm:
> vmscan: shrink all slab objects if tight on memory") either failed to
> achieve its objective or was later broken?

This looks to me like just another symptom of the fact that
nr_deferred needs to be tracked per-memcg. i.e. the work deferred
because total_scan < batch_size is not being aggregated against that
specific memcg, and hence the accrual of deferred work over multiple
calls is not occurring correctly. Therefore we never meet the
condition (total_scan > freeable) under which the memcg shrinker can
drain the last few freeable entries in the cache.

i.e. see this patchset, which makes the deferral of work be
accounted per-memcg:

https://lore.kernel.org/lkml/20210105225817.1036378-1-shy828301@gmail.com/

That should also allow the work skipped on each memcg to be
accumulated across multiple calls to the shrinkers for the same
memcg. Hence, as memory pressure within the memcg goes up, the
repeated calls to direct reclaim within that memcg will result in
all of the freeable items in each cache eventually being freed.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com