Date: Tue, 5 Apr 2022 11:11:58 +0200
From: Michal Koutný
To: Tejun Heo
Cc: Bui Quang Minh, cgroups@vger.kernel.org, kernel test robot, Zefan Li,
 Johannes Weiner, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH v2] cgroup: Kill the parent controller when its last child is killed
Message-ID: <20220405091158.GA13806@blackbody.suse.cz>
References: <20220404142535.145975-1-minhquangbui99@gmail.com>

On Mon, Apr 04, 2022 at 07:37:24AM -1000, Tejun Heo wrote:
> And the
> suggested behavior doesn't make much sense to me. It doesn't
> actually solve the underlying problem but instead always make css
> destructions recursive which can lead to surprises for normal use cases.

I also don't like the nested special-case use of percpu_ref_kill().

I looked at this and my supposed solution turned out to be a revert of
commit 3c606d35fe97 ("cgroup: prevent mount hang due to memory
controller lifetime"). So at unmount time it's necessary to distinguish
children that are in the process of removal from children that are online
or pinned indefinitely.

What about:

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2205,11 +2205,14 @@ static void cgroup_kill_sb(struct super_block *sb)
 	struct cgroup_root *root = cgroup_root_from_kf(kf_root);
 
 	/*
-	 * If @root doesn't have any children, start killing it.
+	 * If @root doesn't have any children held by residual state (e.g.
+	 * memory controller), start killing it, flush workqueue to filter out
+	 * transiently offlined children.
 	 * This prevents new mounts by disabling percpu_ref_tryget_live().
 	 *
 	 * And don't kill the default root.
 	 */
+	flush_workqueue(cgroup_destroy_wq);
 	if (list_empty(&root->cgrp.self.children) && root != &cgrp_dfl_root &&
 	    !percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
 		cgroup_bpf_offline(&root->cgrp);

(I suspect a race between a concurrent unmount and the last rmdir is
technically still possible, but the flush on the kill_sb path should be
affordable and it prevents unnecessarily conserved cgroup roots.)

Michal
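
PS: To make the intent of the flush easier to see, here is a minimal
userspace sketch. It is a toy model, not kernel code: every name in it
(toy_root, toy_child, toy_flush_destroy_work, toy_can_kill_root) is
hypothetical, and the "workqueue" is just a flag per child. It assumes
that a child being removed stays linked to the root until its queued
release work has run, while a pinned child has no such work queued.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CHILDREN 8

/* Hypothetical stand-in for a child cgroup hanging off a root. */
struct toy_child {
	bool linked;         /* still on the root's children list */
	bool release_queued; /* removal started, release work is pending */
};

struct toy_root {
	struct toy_child children[TOY_MAX_CHILDREN];
	int nr;
};

/* Toy counterpart of list_empty(&root->cgrp.self.children). */
bool toy_children_empty(const struct toy_root *root)
{
	for (int i = 0; i < root->nr; i++)
		if (root->children[i].linked)
			return false;
	return true;
}

/* Toy counterpart of flush_workqueue(): run all pending release work. */
void toy_flush_destroy_work(struct toy_root *root)
{
	for (int i = 0; i < root->nr; i++) {
		struct toy_child *c = &root->children[i];

		if (c->release_queued) {
			c->linked = false;
			c->release_queued = false;
		}
	}
}

/* Returns true when the root would be killed at unmount time. */
bool toy_can_kill_root(struct toy_root *root, bool flush_first)
{
	if (flush_first)
		toy_flush_destroy_work(root);
	return toy_children_empty(root);
}

/* Walks through the three cases discussed in the mail. */
void toy_demo(void)
{
	struct toy_root transient = { .children = { { true, true } }, .nr = 1 };
	struct toy_root pinned = { .children = { { true, false } }, .nr = 1 };

	/* Without the flush, a child in mid-removal keeps the root alive. */
	assert(!toy_can_kill_root(&transient, false));
	/* With the flush, the transient child is gone and the root can die. */
	assert(toy_can_kill_root(&transient, true));
	/* A pinned child survives the flush, so the root is still kept. */
	assert(!toy_can_kill_root(&pinned, true));
}
```

The model ignores concurrency entirely, so it says nothing about the
unmount vs. last rmdir race mentioned above. It only shows why the
emptiness check is more meaningful after the pending work has drained.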