From: Shakeel Butt
Date: Wed, 17 Apr 2019 18:55:12 -0700
Subject: Re: [PATCH 4/5] mm: rework non-root kmem_cache lifecycle management
To: Roman Gushchin
Cc: Roman Gushchin, Andrew Morton, Linux MM, LKML, Kernel Team,
    Johannes Weiner, Michal Hocko, Rik van Riel, "david@fromorbit.com",
    Christoph Lameter, Pekka Enberg, Vladimir Davydov, Cgroups
References: <20190417215434.25897-1-guro@fb.com>
    <20190417215434.25897-5-guro@fb.com>
    <20190418003850.GA13977@tower.DHCP.thefacebook.com>
In-Reply-To: <20190418003850.GA13977@tower.DHCP.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 17, 2019 at 5:39 PM Roman Gushchin wrote:
>
> On Wed, Apr 17, 2019 at 04:41:01PM -0700, Shakeel Butt wrote:
> > On Wed, Apr 17, 2019 at 2:55 PM Roman Gushchin wrote:
> > >
> > > This commit makes several important changes in the lifecycle
> > > of a non-root kmem_cache, which also affect the lifecycle
> > > of a memory cgroup.
> > >
> > > Currently each charged slab page has a page->mem_cgroup pointer
> > > to the memory cgroup and holds a reference to it.
> > > Kmem_caches are held by the cgroup. On offlining empty kmem_caches
> > > are freed, all others are freed on cgroup release.
> >
> > No, they are not freed (i.e. destroyed) on offlining, only
> > deactivated. All memcg kmem_caches are freed/destroyed on memcg's
> > css_free.
>
> You're right, my bad. I was thinking about the corresponding sysfs entry
> when I was writing it. We try to free it from the deactivation path too.
>
> >
> > >
> > > So the current scheme can be illustrated as:
> > > page->mem_cgroup->kmem_cache.
> > >
> > > To implement the slab memory reparenting we need to invert the scheme
> > > into: page->kmem_cache->mem_cgroup.
> > >
> > > Let's make every page hold a reference to the kmem_cache (we
> > > already have a stable pointer), and make kmem_caches hold a single
> > > reference to the memory cgroup.
> >
> > What about memcg_kmem_get_cache()? That function assumes that by
> > taking a reference on the memcg, its kmem_caches will stay. I think you
> > need to get a reference on the kmem_cache in memcg_kmem_get_cache()
> > within the rcu lock where you get the memcg through css_tryget_online.
>
> Yeah, a very good question.
>
> I believe it's safe because css_tryget_online() guarantees that
> the cgroup is online and won't go offline before css_free() in
> slab_post_alloc_hook(). I do initialize kmem_cache's refcount to 1
> and drop it on offlining, so it protects the online kmem_cache.
>

Let's suppose a thread doing remote charging calls
memcg_kmem_get_cache() and gets an empty kmem_cache of the remote
memcg whose refcnt is 1. That thread holds a reference on the remote
memcg but no reference on the kmem_cache. Now suppose the thread gets
stuck in reclaim and is scheduled away. In the meantime the remote
memcg is offlined and decrements the refcnt of all of its kmem_caches.
The empty kmem_cache that the stuck thread still has a pointer to can
then be deleted, and the thread may end up using an already destroyed
kmem_cache once it comes back from reclaim.

I think the above situation is possible unless the thread gets a
reference on the kmem_cache in memcg_kmem_get_cache().

Shakeel
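
The interleaving described above can be replayed in a minimal userspace sketch. This is a toy model, not kernel code: struct toy_cache, toy_cache_tryget(), toy_cache_put() and toy_replay_race() are hypothetical names made up for illustration, the "thread" and the "offlining" are just sequential steps, and the only assumption taken from the thread is that a kmem_cache starts with a refcount of 1 which is dropped when the memcg goes offline.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a per-memcg kmem_cache (hypothetical, not the kernel struct). */
struct toy_cache {
	int refcnt;      /* starts at 1: the reference dropped on memcg offlining */
	bool destroyed;  /* set once the last reference has been dropped */
};

static void toy_cache_init(struct toy_cache *c)
{
	c->refcnt = 1;
	c->destroyed = false;
}

/* Take a reference only if the cache still has at least one. */
static bool toy_cache_tryget(struct toy_cache *c)
{
	if (c->refcnt == 0)
		return false;
	c->refcnt++;
	return true;
}

/* Drop a reference; the last put "destroys" the cache. */
static void toy_cache_put(struct toy_cache *c)
{
	assert(c->refcnt > 0);
	if (--c->refcnt == 0)
		c->destroyed = true;
}

/*
 * Replay the interleaving from the mail in a fixed order.
 * Returns true if the allocating thread ends up touching a
 * destroyed cache after it comes back from reclaim.
 */
bool toy_replay_race(bool pin_cache_in_lookup)
{
	struct toy_cache cache;
	bool pinned, used_destroyed;

	toy_cache_init(&cache);

	/* 1. Lookup (models memcg_kmem_get_cache()), optionally pinning the cache. */
	pinned = pin_cache_in_lookup && toy_cache_tryget(&cache);

	/* 2. Thread sleeps in reclaim; the memcg goes offline and drops its reference. */
	toy_cache_put(&cache);

	/* 3. Thread resumes and uses the cache pointer it looked up in step 1. */
	used_destroyed = cache.destroyed;

	/* 4. Thread releases the reference it took in the lookup, if any. */
	if (pinned)
		toy_cache_put(&cache);

	return used_destroyed;
}
```

toy_replay_race(false) models the lookup without an extra reference and reports that the resumed thread sees a destroyed cache; toy_replay_race(true) models taking the reference inside the lookup, in which case the cache survives until the thread drops it.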