Subject: Re: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs
To: Roman Gushchin
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
    Andrew Morton, Alexander Viro, Jonathan Corbet, Luis Chamberlain,
    Kees Cook, Johannes Weiner, Michal Hocko, Vladimir Davydov,
    "linux-mm@kvack.org", "linux-doc@vger.kernel.org", "linux-fsdevel@vger.kernel.org",
"cgroups@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Shakeel Butt , Andrea Arcangeli References: <20190624174219.25513-1-longman@redhat.com> <20190624174219.25513-3-longman@redhat.com> <20190626201900.GC24698@tower.DHCP.thefacebook.com> From: Waiman Long Organization: Red Hat Message-ID: <063752b2-4f1a-d198-36e7-3e642d4fcf19@redhat.com> Date: Thu, 27 Jun 2019 16:57:50 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0 MIME-Version: 1.0 In-Reply-To: <20190626201900.GC24698@tower.DHCP.thefacebook.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Content-Language: en-US X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Thu, 27 Jun 2019 20:58:17 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/26/19 4:19 PM, Roman Gushchin wrote: >> >> +#ifdef CONFIG_MEMCG_KMEM >> +static void kmem_cache_shrink_memcg(struct mem_cgroup *memcg, >> + void __maybe_unused *arg) >> +{ >> + struct kmem_cache *s; >> + >> + if (memcg == root_mem_cgroup) >> + return; >> + mutex_lock(&slab_mutex); >> + list_for_each_entry(s, &memcg->kmem_caches, >> + memcg_params.kmem_caches_node) { >> + kmem_cache_shrink(s); >> + } >> + mutex_unlock(&slab_mutex); >> + cond_resched(); >> +} > A couple of questions: > 1) how about skipping already offlined kmem_caches? They are already shrunk, > so you probably won't get much out of them. Or isn't it true? I have been thinking about that. This patch is based on the linux tree and so don't have an easy to find out if the kmem caches have been shrinked. Rebasing this on top of linux-next, I can use the SLAB_DEACTIVATED flag as a marker for skipping the shrink. With all the latest patches, I am still seeing 121 out of a total of 726 memcg kmem caches (1/6) that are deactivated caches after system bootup one of the test systems. My system is still using cgroup v1 and so the number may be different in a v2 setup. The next step is probably to figure out why those deactivated caches are still there. > 2) what's your long-term vision here? do you think that we need to shrink > kmem_caches periodically, depending on memory pressure? how a user > will use this new sysctl? Shrinking the kmem caches under extreme memory pressure can be one way to free up extra pages, but the effect will probably be temporary. > What's the problem you're trying to solve in general? At least for the slub allocator, shrinking the caches allow the number of active objects reported in slabinfo to be more accurate. In addition, this allow to know the real slab memory consumption. I have been working on a BZ about continuous memory leaks with a container based workloads. The ability to shrink caches allow us to get a more accurate memory consumption picture. Another alternative is to turn on slub_debug which will then disables all the per-cpu slabs. Anyway, I think this can be useful to others that is why I posted the patch. Cheers, Longman