Date: Tue, 4 Feb 2020 12:53:51 +0100
From: Peter Zijlstra
To: Christian Brauner
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, Tejun Heo, Oleg Nesterov, Ingo Molnar, Johannes Weiner, Li Zefan, cgroups@vger.kernel.org
Subject: Re: [PATCH v5 5/6] clone3: allow spawning processes into cgroups
Message-ID: <20200204115351.GD14879@hirez.programming.kicks-ass.net>
References: <20200121154844.411-1-christian.brauner@ubuntu.com> <20200121154844.411-6-christian.brauner@ubuntu.com>
In-Reply-To: <20200121154844.411-6-christian.brauner@ubuntu.com>

On Tue, Jan 21, 2020 at 04:48:43PM +0100, Christian Brauner wrote:
> This adds support for creating a process in a different cgroup than its
> parent. Callers can limit and account processes and threads right from
> the moment they are spawned:
> - A service manager can directly spawn new services into dedicated
>   cgroups.
> - A process can be directly created in a frozen cgroup and will be
>   frozen as well.
> - The initial accounting jitter experienced by process supervisors and
>   daemons is eliminated with this.
> - Threaded applications or even thread implementations can choose to
>   create a specific cgroup layout where each thread is spawned
>   directly into a dedicated cgroup.
>
> This feature is limited to the unified hierarchy. Callers need to pass
> a directory file descriptor for the target cgroup. The caller can
> choose to pass an O_PATH file descriptor.
> All usual migration
> restrictions apply, i.e. there can be no processes in inner nodes. In
> general, creating a process directly in a target cgroup adheres to all
> migration restrictions.

AFAICT, the *big* win here is avoiding the write side of the
cgroup_threadgroup_rwsem. Or am I misreading the patch?

That global lock is what makes moving tasks/threads around super
expensive; avoiding that by use of this clone() variant wins the day.
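
For readers following along, here is a rough userspace sketch of the call the quoted description is about, as I understand it: open the target cgroup directory (O_PATH is enough), hand that fd to clone3(), and the child starts life in that cgroup instead of being forked and then migrated by a write to cgroup.procs. The flag name CLONE_INTO_CGROUP, its value, and the `cgroup` field are what I believe the uapi <linux/sched.h> carries once this series is applied; `spawn_into_cgroup` and `clone3_args_sketch` are hypothetical names made up for this example, and the struct is a local mirror so the sketch builds against older headers. Treat it as an illustration, not as the patch's own test code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed values, matching the uapi header as far as I know. */
#ifndef CLONE_INTO_CGROUP
#define CLONE_INTO_CGROUP 0x200000000ULL
#endif
#ifndef __NR_clone3
#define __NR_clone3 435
#endif

/*
 * Hypothetical local mirror of struct clone_args up to and including
 * the cgroup field, so this builds even if the installed headers
 * predate the series.
 */
struct clone3_args_sketch {
	uint64_t flags;
	uint64_t pidfd;
	uint64_t child_tid;
	uint64_t parent_tid;
	uint64_t exit_signal;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t tls;
	uint64_t set_tid;
	uint64_t set_tid_size;
	uint64_t cgroup;	/* fd of the target cgroup directory */
};
static_assert(sizeof(struct clone3_args_sketch) == 88,
	      "eleven 64-bit fields expected");

/*
 * Fork-like helper: returns the child's pid in the parent, 0 in the
 * child, and -1 with errno set on failure (including kernels that do
 * not know the flag or the larger struct).
 */
pid_t spawn_into_cgroup(const char *cgroup_dir)
{
	struct clone3_args_sketch args;
	int cgroup_fd, saved_errno;
	long ret;

	/* A directory fd is required; O_PATH is sufficient. */
	cgroup_fd = open(cgroup_dir, O_PATH | O_DIRECTORY | O_CLOEXEC);
	if (cgroup_fd < 0)
		return -1;

	memset(&args, 0, sizeof(args));
	args.flags = CLONE_INTO_CGROUP;
	args.exit_signal = SIGCHLD;
	args.cgroup = (uint64_t)cgroup_fd;

	ret = syscall(__NR_clone3, &args, sizeof(args));

	/* Both parent and child drop their copy of the fd. */
	saved_errno = errno;
	close(cgroup_fd);
	errno = saved_errno;

	return (pid_t)ret;
}
```

A caller would use it like fork(): on a return of 0 it is the child and would typically exec the service binary; on a positive return it is the parent and can waitpid() on the result. The usual migration restrictions from the quoted text still apply, so a target with processes in an inner node is expected to fail with -1.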