Date: Thu, 6 Apr 2023 11:44:20 +0200
From: Christian Brauner
To: Waiman Long
Cc: Tejun Heo, Zefan Li, Johannes Weiner, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Juri Lelli, Dietmar Eggemann,
    gscrivan@redhat.com
Subject: Re: [PATCH 3/3] cgroup/cpuset: Allow only one active attach operation per cpuset
Message-ID: <20230406-haselnuss-baumhaus-83dc05f869df@brauner>
References: <20230331145045.2251683-1-longman@redhat.com> <20230331145045.2251683-4-longman@redhat.com>
In-Reply-To: <20230331145045.2251683-4-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 31, 2023 at 10:50:45AM -0400, Waiman Long wrote:
> The current cpuset code uses the global cpuset_attach_old_cs variable
> to store the old cpuset value between consecutive cpuset_can_attach()
> and cpuset_attach() calls. Since a caller of cpuset_can_attach() may
> not need to hold the global cgroup_threadgroup_rwsem, parallel cpuset
> attach operations are possible.
>
> When there are concurrent cpuset attach operations in progress,
> cpuset_attach() may fetch the wrong value from cpuset_attach_old_cs
> causing incorrect result. To avoid this problem while still allowing
> certain level of parallelism, drop cpuset_attach_old_cs and use a
> per-cpuset attach_old_cs value. Also restrict to at most one active
> attach operation per cpuset to avoid corrupting the value of the
> per-cpuset attach_old_cs value.
>
> Signed-off-by: Waiman Long
> ---
>  kernel/cgroup/cpuset.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 2367de611c42..3f925c261513 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -198,6 +198,8 @@ struct cpuset {
>
>  	/* Handle for cpuset.cpus.partition */
>  	struct cgroup_file partition_file;
> +
> +	struct cpuset *attach_old_cs;
>  };
>
>  /*
> @@ -2456,22 +2458,27 @@ static int fmeter_getrate(struct fmeter *fmp)
>  	return val;
>  }
>
> -static struct cpuset *cpuset_attach_old_cs;
> -
>  /* Called by cgroups to determine if a cpuset is usable; cpuset_rwsem held */
>  static int cpuset_can_attach(struct cgroup_taskset *tset)
>  {
>  	struct cgroup_subsys_state *css;
> -	struct cpuset *cs;
> +	struct cpuset *cs, *oldcs;
>  	struct task_struct *task;
>  	int ret;
>
>  	/* used later by cpuset_attach() */
> -	cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
> +	oldcs = task_cs(cgroup_taskset_first(tset, &css));
>  	cs = css_cs(css);
>
>  	percpu_down_write(&cpuset_rwsem);
>
> +	/*
> +	 * Only one cpuset attach operation is allowed for each cpuset.
> +	 */
> +	ret = -EBUSY;
> +	if (cs->attach_in_progress)
> +		goto out_unlock;

That'll mean CLONE_INTO_CGROUP becomes even more interesting because it
isn't subject to this restriction in contrast to fork()+migrate, right?
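
For illustration only, here is a rough user-space model of the behaviour the
quoted hunk introduces: the destination remembers its own source cpuset, and
a second attach to the same destination is refused with -EBUSY until the
first one completes. This is not kernel code. struct cpuset_model,
model_can_attach() and model_attach() are hypothetical names, the in-progress
counter is reduced to a flag, and the caller is assumed to hold whatever lock
serializes the two steps (cpuset_rwsem in the kernel).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct cpuset; not the kernel type. */
struct cpuset_model {
	const char *name;
	struct cpuset_model *attach_old_cs;	/* source of the in-flight attach */
	int attach_in_progress;			/* nonzero while an attach is in flight */
};

/*
 * Model of the can_attach step: claim the destination and record the source.
 * Assumption: the caller holds the lock that serializes attach operations.
 */
static int model_can_attach(struct cpuset_model *dst, struct cpuset_model *src)
{
	/* Only one attach operation is allowed per destination at a time. */
	if (dst->attach_in_progress)
		return -EBUSY;

	dst->attach_old_cs = src;
	dst->attach_in_progress = 1;
	return 0;
}

/* Model of the attach step: consume the saved source and release the claim. */
static struct cpuset_model *model_attach(struct cpuset_model *dst)
{
	struct cpuset_model *old = dst->attach_old_cs;

	assert(dst->attach_in_progress);
	dst->attach_old_cs = NULL;
	dst->attach_in_progress = 0;
	return old;
}
```

In this model a caller that gets -EBUSY has to retry later, which is the
restriction being asked about above for the fork()+migrate path.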