Date: Thu, 2 Jun 2022 13:47:05 +0200
From: Michal Koutný
To: Tadeusz Struk
Cc: Tejun Heo, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    Zefan Li, Johannes Weiner, Bui Quang Minh
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
Message-ID: <20220602114705.GB21320@blackbody.suse.cz>
References: <0babd7df-bdef-9edc-3682-1144bc0c2d2b@linaro.org>
    <1fb4d8d7-ccc0-b020-715e-38c2dfd94c23@linaro.org>
    <416dc60a-f0e5-7d05-1613-3cd0ca415768@linaro.org>
    <0fd1c3fd-fa86-dbed-f3f0-74c91b1efa11@linaro.org>
In-Reply-To: <0fd1c3fd-fa86-dbed-f3f0-74c91b1efa11@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 01, 2022 at 05:40:51PM -0700, Tadeusz Struk wrote:
> css_killed_ref_fn() will be called regardless of the value of refcnt (via percpu_ref_kill_and_confirm())
> and it will only enqueue the css_killed_work_fn() to be called later.
> Then css_put()->css_release() will be called before the css_killed_work_fn() will even
> get a chance to run, and it will also *only* enqueue css_release_work_fn() to be called later.
> The problem happens on the second enqueue. So there need to be something in place that
> will make sure that css_killed_work_fn() is done before css_release() can enqueue
> the second job.

IIUC, here you describe the same scenario I broke down at [1].

> Does it sound right?

I added a parameter A there (that is, the sum of base and percpu references
before kill_css()). I thought it fails because A == 1 (i.e. killing the base
reference); however, that seems an unlikely situation (because the cgroup code
uses a "fuse" reference to pin the css for offline_css()).

So the remaining option (at least I find it more likely now) is that A == 0
(A < 0 would trigger the warning in percpu_ref_switch_to_atomic_rcu()), aka
the ref imbalance. I hope we can get to the bottom of this with detailed
enough tracing of gets/puts.

Splitting the work struct contradicts the existing approach with the
"fuse" reference.

(BTW you also wrote

On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk wrote:
> The fact the css_release() is called (via cgroup_kn_unlock()) just after
> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
> (cgroup_destroy_wq), just with different function. This results in the
> BUG: corrupted list in insert_work issue.

Where do you see a critical css_release called from cgroup_kn_unlock()?
I always observed the css_release() being called via
percpu_ref_call_confirm_rcu() (in the original and subsequent syzbot logs).)

Thanks,
Michal

[1] https://lore.kernel.org/r/Yo7KfEOz92kS2z5Y@blackbook/
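
For readers following the A == 1 versus A == 0 argument, here is a rough toy model in Python. It is not kernel code. The class ToyCss, the helper simulate() and the single-counter view of "A" are all assumptions made for illustration, and the real percpu_ref and workqueue machinery is far more involved. The sketch only shows why a "fuse" reference defers the release enqueue when the count is balanced, and why one missing reference (A == 0) would defeat it.

```python
# Toy model only: the names mirror functions mentioned in the thread, but
# nothing here is kernel code and the refcounting is heavily simplified.

class ToyCss:
    def __init__(self, refs_before_kill):
        self.refs = refs_before_kill    # "A": base + percpu refs before the kill
        self.pending = []               # work queued on the one shared work struct
        self.double_enqueue = False

    def _enqueue(self, fn_name):
        # A single work struct may be pending only once; queueing it again
        # while still pending stands in for the list corruption.
        if self.pending:
            self.double_enqueue = True
        self.pending.append(fn_name)

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self._enqueue("css_release_work_fn")

    def kill(self, use_fuse):
        if use_fuse:
            self.refs += 1              # fuse reference pins the css
        self._enqueue("css_killed_work_fn")   # queued regardless of the count
        self.put()                      # the kill drops the base reference

    def run_killed_work(self, use_fuse):
        self.pending.remove("css_killed_work_fn")
        if use_fuse:
            self.put()                  # fuse is dropped only after offlining


def simulate(refs_before_kill, use_fuse=True):
    """Return True if the shared work struct was enqueued while still pending."""
    css = ToyCss(refs_before_kill)
    css.kill(use_fuse)
    css.run_killed_work(use_fuse)
    return css.double_enqueue
```

In this model a balanced count with the fuse (simulate(1)) never double-enqueues, while dropping the fuse (simulate(1, use_fuse=False)) or starting from an imbalanced count (simulate(0)) does. That matches the reasoning above only as far as the simplifications hold.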