Date: Tue, 7 Jun 2022 11:47:00 -0700
To: Michal Koutný
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Christian Brauner,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh,
 cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
 stable@vger.kernel.org, linux-kernel@vger.kernel.org,
 syzbot+e42ae441c3b10acf9e9d@syzkaller.appspotmail.com
References:
 <20220603173455.441537-1-tadeusz.struk@linaro.org>
 <20220603181321.443716-1-tadeusz.struk@linaro.org>
 <20220606123910.GF6928@blackbody.suse.cz>
From: Tadeusz Struk
Subject: Re: [PATCH v2] cgroup: serialize css kill and release paths
In-Reply-To: <20220606123910.GF6928@blackbody.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/6/22 05:39, Michal Koutný wrote:
> On Fri, Jun 03, 2022 at 11:13:21AM -0700, Tadeusz Struk wrote:
>> In such scenario the css_killed_work_fn will be en-queued via
>> cgroup_apply_control_disable(cgrp)->kill_css(css), and bail out to
>> cgroup_kn_unlock(). Then cgroup_kn_unlock() will call:
>> cgroup_put(cgrp)->css_put(&cgrp->self), which will try to enqueue
>> css_release_work_fn for the same css instance, causing a list_add
>> corruption bug, as can be seen in the syzkaller report [1].
> This hypothesis doesn't add up to me (I am sorry).
>
> The kill_css(css) would be a css associated with a subsys (css.ss !=
> NULL) whereas css_put(&cgrp->self) is a different css just for the
> cgroup (css.ss == NULL).

Yes, you are right. I couldn't figure out where the extra css_put() is
called from, and the only place that fitted my theory was the
cgroup_kn_unlock() in cgroup_apply_control_disable().
After some more debugging I can see that, as you said, cgrp->self is
a different css. The offending _put() is actually called by
percpu_ref_kill_and_confirm(), as it not only calls the passed
confirm_kill percpu_ref_func_t, but also puts the refcnt itself.
Because cgroup_apply_control_disable() will loop for_each_live_descendant
and call kill_css() on all css'es, and css_killed_work_fn() will also loop
and call css_put() on all parents, css_release() will be called on the
first parent prematurely, causing the BUG().

What I think should be done to balance put/get is to call css_get() for
all the parents in kill_css():

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c1e1a5c34e77..3ca61325bc4e 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5527,6 +5527,8 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
  */
 static void kill_css(struct cgroup_subsys_state *css)
 {
+	struct cgroup_subsys_state *_css = css;
+
 	lockdep_assert_held(&cgroup_mutex);

 	if (css->flags & CSS_DYING)
@@ -5541,10 +5543,13 @@ static void kill_css(struct cgroup_subsys_state *css)
 	css_clear_dir(css);

 	/*
-	 * Killing would put the base ref, but we need to keep it alive
-	 * until after ->css_offline().
+	 * Killing would put the base ref, but we need to keep it alive,
+	 * and all its parents, until after ->css_offline().
 	 */
-	css_get(css);
+	do {
+		css_get(_css);
+		_css = _css->parent;
+	} while (_css && atomic_read(&_css->online_cnt));

 	/*
 	 * cgroup core guarantees that, by the time ->css_offline() is

This will then be "reverted" in css_killed_work_fn().
Please let me know if it makes sense to you.
I'm still testing it, but syzbot is very slow today.

--
Thanks,
Tadeusz
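
For reference, the put/get accounting described in this message can be
mimicked with a toy userspace model. This is only a sketch under loud
assumptions, not kernel code: struct toy_css and every toy_*() function
are hypothetical names, the refcount is a plain int rather than a
percpu_ref, there is no workqueue or RCU, and the online_cnt checks are
left out entirely. It therefore reproduces the theory as stated above and
says nothing about whether the real kernel behaves this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a css: plain int refcount, nothing else. */
struct toy_css {
	struct toy_css *parent;
	int refcnt;	/* starts at 1, the "base" reference */
	int released;	/* number of times the count reached zero */
};

static void toy_get(struct toy_css *css)
{
	css->refcnt++;
}

static void toy_put(struct toy_css *css)
{
	if (--css->refcnt == 0)
		css->released++;
}

/*
 * Assumed model of kill_css(). With pin_parents == false only the css
 * itself is pinned; with true the whole parent chain is pinned, as in
 * the proposal (minus its online_cnt check). The final toy_put() stands
 * in for percpu_ref_kill_and_confirm() dropping the base reference.
 */
static void toy_kill_css(struct toy_css *css, bool pin_parents)
{
	struct toy_css *pos = css;

	do {
		toy_get(pos);
		pos = pos->parent;
	} while (pin_parents && pos);

	toy_put(css);
}

/* Assumed model of css_killed_work_fn(): put the css and every parent. */
static void toy_killed_work(struct toy_css *css)
{
	for (; css; css = css->parent)
		toy_put(css);
}

/* Kill child and parent, run both work items, return parent's refcnt. */
int toy_scenario(bool pin_parents)
{
	struct toy_css parent = { .parent = NULL, .refcnt = 1, .released = 0 };
	struct toy_css child = { .parent = &parent, .refcnt = 1, .released = 0 };

	toy_kill_css(&child, pin_parents);
	toy_kill_css(&parent, pin_parents);
	toy_killed_work(&child);
	toy_killed_work(&parent);

	return parent.refcnt;
}
```

In this model toy_scenario(false) leaves the parent at -1, i.e. one put
too many after it already hit zero while its own work item was still
pending, whereas toy_scenario(true) ends at 0 with every get matched by
exactly one put.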