Date: Tue, 4 Sep 2018 18:14:31 +0200
From: Michal Hocko
To: Roman Gushchin
Cc: Rik van Riel, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel-team@fb.com, Josef Bacik, Johannes Weiner, Andrew Morton
Subject: Re: [PATCH] mm: slowly shrink slabs with a relatively small number of objects
Message-ID: <20180904161431.GP14951@dhcp22.suse.cz>
References: <20180831203450.2536-1-guro@fb.com>
    <3b05579f964cca1d44551913f1a9ee79d96f198e.camel@surriel.com>
    <20180831213138.GA9159@tower.DHCP.thefacebook.com>
    <20180903182956.GE15074@dhcp22.suse.cz>
    <20180903202803.GA6227@castle.DHCP.thefacebook.com>
    <20180904070005.GG14951@dhcp22.suse.cz>
    <20180904153445.GA22328@tower.DHCP.thefacebook.com>
In-Reply-To: <20180904153445.GA22328@tower.DHCP.thefacebook.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 04-09-18 08:34:49, Roman Gushchin wrote:
> On Tue, Sep 04, 2018 at 09:00:05AM +0200, Michal Hocko wrote:
> > On Mon 03-09-18 13:28:06, Roman Gushchin wrote:
> > > On Mon, Sep 03, 2018 at 08:29:56PM +0200, Michal Hocko wrote:
> > > > On Fri 31-08-18 14:31:41, Roman Gushchin wrote:
> > > > > On Fri, Aug 31, 2018 at 05:15:39PM -0400, Rik van Riel wrote:
> > > > > > On Fri, 2018-08-31 at 13:34 -0700, Roman Gushchin wrote:
> > > > > >
> > > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > > index fa2c150ab7b9..c910cf6bf606 100644
> > > > > > > --- a/mm/vmscan.c
> > > > > > > +++ b/mm/vmscan.c
> > > > > > > @@ -476,6 +476,10 @@ static unsigned long do_shrink_slab(struct
> > > > > > > shrink_control *shrinkctl,
> > > > > > >  	delta = freeable >> priority;
> > > > > > >  	delta *= 4;
> > > > > > >  	do_div(delta, shrinker->seeks);
> > > > > > > +
> > > > > > > +	if (delta == 0 && freeable > 0)
> > > > > > > +		delta = min(freeable, batch_size);
> > > > > > > +
> > > > > > >  	total_scan += delta;
> > > > > > >  	if (total_scan < 0) {
> > > > > > >  		pr_err("shrink_slab: %pF negative objects to delete
> > > > > > > nr=%ld\n",
> > > > > >
> > > > > > I agree that we need to shrink slabs with fewer than
> > > > > > 4096 objects, but do we want to put more pressure on
> > > > > > a slab the moment it drops below 4096 than we applied
> > > > > > when it had just over 4096 objects on it?
> > > > > >
> > > > > > With this patch, a slab with 5000 objects on it will
> > > > > > get 1 item scanned, while a slab with 4000 objects on
> > > > > > it will see shrinker->batch or SHRINK_BATCH objects
> > > > > > scanned every time.
> > > > > >
> > > > > > I don't know if this would cause any issues, just
> > > > > > something to ponder.
> > > > >
> > > > > Hm, fair enough. So, basically we can always do
> > > > >
> > > > > delta = max(delta, min(freeable, batch_size));
> > > > >
> > > > > Does it look better?
> > > >
> > > > Why don't you use the same heuristic we use for the normal LRU reclaim?
> > >
> > > Because we do reparent kmem lru lists on offlining.
> > > Take a look at memcg_offline_kmem().
> >
> > Then I must be missing something. Why are we growing the number of dead
> > cgroups then?
>
> We do reparent LRU lists, but not objects. Objects (or, more precisely, pages)
> are still holding a reference to the memcg.

OK, this is what I missed. I thought that the reparenting includes all
the pages as well. Is there any strong reason that we cannot do that?
Performance/Locking/etc.?

Or maybe do not reparent at all and rely on the same reclaim heuristic
we do for normal pages?

I am not opposing your patch but I am trying to figure out whether that
is the best approach.
-- 
Michal Hocko
SUSE Labs