Received: by 2002:a05:6602:18e:0:0:0:0 with SMTP id m14csp740906ioo; Thu, 26 May 2022 13:45:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz4+3uPNnCeiP0YRl8kad+fqjIHWGbylm+Epf/q2hiHmLzoEvR8brA5ODcn+Fjeh2IdlaTu X-Received: by 2002:a17:907:8689:b0:6fe:e525:ea9c with SMTP id qa9-20020a170907868900b006fee525ea9cmr19656183ejc.720.1653597941601; Thu, 26 May 2022 13:45:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1653597941; cv=none; d=google.com; s=arc-20160816; b=0R36pHPbYJa8XNK6A4ebeJEBAWc/zLvz78DbUgMWC/mfuY8fydOAmEPl/D0kySX9oE wkbwCDeOPyiUk0EmQKhKIb3Bgt/p90gIEvo+Fjjd7n17Kn1TVnwExbEBZO82E/eQfRfH nkNPZLWXwbEtzKCb8tjX10M3QhfTk2KqHn4uS6Bb5Y5TxyOfXz0kBONrAoIVoMCEkl4a m1izh2CvQMS6d/fT/zSkCD0sKVF9l3GqRm+XZIGQHAe7r0HgmE68XHs/rh15/14XPPKR iQ/DibnSAK5txpsySBvK3UNmRbfEqOWX19L7uiuLZ9hM8qF+rHsJyKmYlJpPLa8cSkCj KZTQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=OFhmgixX5LOgkqqC8+umKFYLrjndJW5kicXnZWzhHcM=; b=Pol75ywf1ZgXl1Gd11yj8ubNdNlM2eHKQbts4mWoWH4Fp4WUAF1eqCDgt9AUwquZaE x7hUxo/SwCWn7U37jo0v//KKie7EX9+P6KMsTTPj/vRWi9bmvGOSsAvWmum1PkXSDFFm vV88Dx5L0h1blIhSNkMEsXaEe0Ustl9BCRXpcQaRXMoiAN86imSYuwHnrorxYst4E7L8 bnbxaPXg4DsByvj+VWKVbFDiaygEbvMElu3i0xu2FOV7AVhOdnog1+8Bx6UFFAOXBLvm 4kgsM4ivBDk7U+9/oKbctlChyyHNBEh+fhOsIxYB8SCYGYQFLeDr04G2WZABtw0DOe/b YbXQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=IFtKrgBp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l6-20020a170906644600b006fe95bb93b3si154385ejn.861.2022.05.26.13.45.15; Thu, 26 May 2022 13:45:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=IFtKrgBp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244377AbiEZJ4k (ORCPT + 99 others); Thu, 26 May 2022 05:56:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231512AbiEZJ4i (ORCPT ); Thu, 26 May 2022 05:56:38 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82F7BC9676; Thu, 26 May 2022 02:56:37 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id C63901F91C; Thu, 26 May 2022 09:56:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1653558995; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OFhmgixX5LOgkqqC8+umKFYLrjndJW5kicXnZWzhHcM=; b=IFtKrgBpUSxaIqVCvIEJxuP0+8BajYar9PUK+it7szWAOhxuqgh3gsI50vxQ5v2+PAk+v5 HmhTw77kSpoGw0/3EJ8WyMbDUAVjADlDKKvwZcwDnjw6FA5moc+3dGsd+gFfKPR7xHm1RT FKNfMRfPE3vE2jk8xlz9H4AnRMv+oTs= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 968381351F; Thu, 26 May 2022 09:56:35 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id gi+FI9NOj2IBHwAAMHmgww (envelope-from ); Thu, 26 May 2022 09:56:35 +0000 Date: Thu, 26 May 2022 11:56:34 +0200 From: Michal =?iso-8859-1?Q?Koutn=FD?= To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Tejun Heo , Zefan Li , Johannes Weiner , Bui Quang Minh , Tadeusz Struk Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path Message-ID: References: <20220525151517.8430-1-mkoutny@suse.com> <20220525151517.8430-3-mkoutny@suse.com> <20220525161455.GA16134@blackbody.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20220525161455.GA16134@blackbody.suse.cz> X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 25, 2022 at 06:14:55PM +0200, Michal Koutn? wrote: > But the above is not correct. I've looked at the stack trace [1] and the > offending percpu_ref_put_many is called from an RCU callback > percpu_ref_switch_to_atomic_rcu(), so I can't actually see why it drops > to zero there... The link [1] should have been [1]. After some more thought, the following is possible sequencing of involved functions. // ref=A: initial state kill_css() css_get // ref+=F == A+F: fuse percpu_ref_kill_and_confirm __percpu_ref_switch_to_atomic percpu_ref_get // ref += 1 == A+F+1: atomic mode, self-protection percpu_ref_put // ref -= 1 == A+F: kill the base reference [via rcu] percpu_ref_switch_to_atomic_rcu percpu_ref_call_confirm_rcu css_killed_ref_fn == refcnt.confirm_switch queue_work(css->destroy_work) (1) [via css->destroy_work] css_killed_work_fn == wq.func offline_css() // needs fuse css_put // ref -= F == A: de-fuse percpu_ref_put // ref -= 1 == A-1: remove self-protection css_release // A <= 1 -> 2nd queue_work explodes! queue_work(css->destroy_work) (2) [via css->destroy_work] css_release_work_fn == wq.func Another CPU would have to dispatch and run the css_killed_work_fn callback in parallel to percpu_ref_switch_to_atomic_rcu. It's a more correct explanation, however, its likelihood does seem very low. Perhaps some debug prints of percpu_ref_data.data in percpu_ref_call_confirm_rcu could shed more light onto this [2]. HTH, Michal [1] https://syzkaller.appspot.com/text?tag=CrashReport&x=162b5781f00000 [2] I tried notifying syzbot about [3] moments ago. [3] https://github.com/Werkov/linux/tree/cgroup-ml/css-lifecycle-syzbot