From: Shakeel Butt
Date: Sat, 26 May 2018 15:43:34 -0700
Subject: Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
To: Vladimir Davydov
Cc: Michal Hocko, Andrew Morton, Greg Thelen, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Johannes Weiner, Tejun Heo, Linux MM, Cgroups, LKML
In-Reply-To: <20180526185837.k5ztrillokpi65qj@esperanza>
References: <20180522201336.196994-1-shakeelb@google.com> <20180526185837.k5ztrillokpi65qj@esperanza>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, May 26, 2018 at 11:58 AM, Vladimir Davydov wrote:
> On Tue, May 22, 2018 at 01:13:36PM -0700, Shakeel Butt wrote:
>> The memcg kmem cache creation and deactivation (SLUB only) is
>> asynchronous. If a root kmem cache is destroyed whose memcg cache is in
>> the process of creation or deactivation, the kernel may crash.
>>
>> Example of one such crash:
>>   general protection fault: 0000 [#1] SMP PTI
>>   CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp
>>   ...
>>   Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn
>>   RIP: 0010:has_cpu_slab
>>   ...
>>   Call Trace:
>>   ? on_each_cpu_cond
>>   __kmem_cache_shrink
>>   kmemcg_cache_deact_after_rcu
>>   kmemcg_deactivate_workfn
>>   process_one_work
>>   worker_thread
>>   kthread
>>   ret_from_fork+0x35/0x40
>>
>> This issue is due to the lack of real reference counting for the root
>> kmem_caches. Currently kmem_cache does have a field named refcount which
>> has been used for multiple purposes, i.e. shared count, reference count
>> and noshare flag. Due to its conflated nature, it cannot be used for
>> reference counting by other subsystems.
>>
>> This patch decouples the reference counting from shared count and
>> noshare flag. The new field 'shared_count' represents the shared count
>> and noshare flag while 'refcount' is converted into a real reference
>> counter.
>>
>> The reference counting is only implemented for root kmem_caches for
>> simplicity. The reference of a root kmem_cache is elevated on sharing or
>> while its memcg kmem_cache creation or deactivation request is in
>> flight, and thus it is made sure that the root kmem_cache is not destroyed
>> in the middle. As the reference of kmem_cache is elevated on sharing,
>> the 'shared_count' does not need any locking protection as at worst it
>> can be outdated for a small window which is tolerable.
>
> I wonder if we could fix this problem without introducing reference
> counting for kmem caches (which seems a bit of an overkill to me TBO),
> e.g. by flushing memcg_kmem_cache_wq before root cache destruction?

Thanks, I will look into workqueue flushing.
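
For illustration only, the flush-before-destroy idea suggested above can be modeled in user space roughly as follows. This is a minimal single-threaded sketch, not kernel code: every toy_* name is hypothetical, the pending[] array merely stands in for memcg_kmem_cache_wq, and the RCU step hinted at by kmemcg_cache_deact_after_rcu in the trace is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Toy stand-in for a root kmem_cache (hypothetical, not the kernel struct). */
struct toy_cache {
    bool alive;       /* false once the cache has been destroyed */
    int deactivated;  /* number of deferred deactivations that ran */
};

/* One deferred work item: a callback plus the cache it operates on. */
struct toy_work {
    void (*fn)(struct toy_cache *);
    struct toy_cache *cache;
};

/* Toy stand-in for memcg_kmem_cache_wq: a simple FIFO of pending work. */
static struct toy_work pending[MAX_PENDING];
static size_t n_pending;

/* Queue a work item; returns false if the toy queue is full. */
static bool toy_queue_work(void (*fn)(struct toy_cache *), struct toy_cache *c)
{
    if (n_pending == MAX_PENDING)
        return false;
    pending[n_pending].fn = fn;
    pending[n_pending].cache = c;
    n_pending++;
    return true;
}

/* Run every queued item to completion, in order, then empty the queue. */
static void toy_flush_workqueue(void)
{
    for (size_t i = 0; i < n_pending; i++)
        pending[i].fn(pending[i].cache);
    n_pending = 0;
}

/* Deferred deactivation: only valid while the cache is still alive. */
static void toy_deactivate(struct toy_cache *c)
{
    assert(c->alive);  /* touching a dead cache models the reported crash */
    c->deactivated++;
}

/* Destroy: drain outstanding work first, then mark the cache dead. */
static void toy_cache_destroy(struct toy_cache *c)
{
    toy_flush_workqueue();
    c->alive = false;
}
```

In this model, queuing two toy_deactivate() items and then calling toy_cache_destroy() runs both items while the cache is still alive. Without the flush inside toy_cache_destroy(), a later flush would trip the assert in toy_deactivate(), which is the toy analogue of the crash in the trace. Whether flushing alone is sufficient in the real kernel depends on details this sketch leaves out.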