Date: Tue, 4 Feb 2020 16:01:44 +0100
From: Christian Brauner
To: Peter Zijlstra
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, Tejun Heo, Oleg Nesterov, Ingo Molnar, Johannes Weiner, Li Zefan, cgroups@vger.kernel.org
Subject: Re: [PATCH v5 5/6] clone3: allow spawning processes into cgroups
Message-ID: <20200204150144.fojbdmuyr7bnvgnj@wittgenstein>
In-Reply-To: <20200204115351.GD14879@hirez.programming.kicks-ass.net>

On Tue, Feb 04, 2020 at 12:53:51PM +0100, Peter
Zijlstra wrote:
> On Tue, Jan 21, 2020 at 04:48:43PM +0100, Christian Brauner wrote:
> > This adds support for creating a process in a different cgroup than its
> > parent. Callers can limit and account processes and threads right from
> > the moment they are spawned:
> > - A service manager can directly spawn new services into dedicated
> >   cgroups.
> > - A process can be directly created in a frozen cgroup and will be
> >   frozen as well.
> > - The initial accounting jitter experienced by process supervisors and
> >   daemons is eliminated with this.
> > - Threaded applications or even thread implementations can choose to
> >   create a specific cgroup layout where each thread is spawned
> >   directly into a dedicated cgroup.
> >
> > This feature is limited to the unified hierarchy. Callers need to pass
> > a directory file descriptor for the target cgroup. The caller can
> > choose to pass an O_PATH file descriptor. All usual migration
> > restrictions apply, i.e. there can be no processes in inner nodes. In
> > general, creating a process directly in a target cgroup adheres to all
> > migration restrictions.
>
> AFAICT, the *big* win here is avoiding the write side of the
> cgroup_threadgroup_rwsem. Or am I mis-reading the patch?

No, you're absolutely right. I just didn't bother putting implementation
specifics in the cover letter and I probably should have. So thanks for
pointing that out!

>
> That global lock is what makes moving tasks/threads around super
> expensive, avoiding that by use of this clone() variant wins the day.

:)

Christian
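
For readers following the thread, here is a minimal userspace sketch in C of
spawning a child directly into a target cgroup. It assumes the interface as it
was later merged in Linux 5.7 (a CLONE_INTO_CGROUP flag plus a trailing cgroup
field in struct clone_args), which may differ in detail from this v5 revision.
The struct name cg_clone_args, the helpers cg_clone_args_init() and
spawn_into_cgroup(), and the fallback syscall number 435 are illustrative
assumptions and not part of the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older userspace headers (values as merged in Linux 5.7). */
#ifndef CLONE_INTO_CGROUP
#define CLONE_INTO_CGROUP 0x200000000ULL
#endif
#ifndef SYS_clone3
#define SYS_clone3 435 /* assumption: generic syscall number for clone3 */
#endif

/*
 * Hypothetical local mirror of the kernel's struct clone_args, so the
 * sketch builds even when <linux/sched.h> predates the cgroup field.
 */
struct cg_clone_args {
	uint64_t flags;
	uint64_t pidfd;
	uint64_t child_tid;
	uint64_t parent_tid;
	uint64_t exit_signal;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t tls;
	uint64_t set_tid;
	uint64_t set_tid_size;
	uint64_t cgroup; /* fd of the target cgroup directory */
};

_Static_assert(sizeof(struct cg_clone_args) == 88,
	       "layout must match the 88-byte clone_args variant");

/* Build fork()-like arguments that place the child into cgroup_fd. */
struct cg_clone_args cg_clone_args_init(int cgroup_fd)
{
	struct cg_clone_args args;

	assert(cgroup_fd >= 0);
	memset(&args, 0, sizeof(args));
	args.flags = CLONE_INTO_CGROUP;
	args.exit_signal = SIGCHLD;
	args.cgroup = (uint64_t)cgroup_fd;
	return args;
}

/*
 * Open the cgroup directory with O_PATH and clone into it.
 * Returns 0 in the child, the child's pid in the parent, -1 on error.
 */
pid_t spawn_into_cgroup(const char *cgroup_path)
{
	struct cg_clone_args args;
	pid_t pid;
	int saved_errno;
	int fd;

	fd = open(cgroup_path, O_PATH | O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	args = cg_clone_args_init(fd);
	pid = (pid_t)syscall(SYS_clone3, &args, sizeof(args));

	/* Both parent and child drop the directory fd; keep errno intact. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return pid;
}
```

spawn_into_cgroup() is meant to be used like fork(): call it with the path of
a cgroup v2 directory (for example a hypothetical /sys/fs/cgroup/mygroup),
then branch on the return value. The usual migration restrictions described in
the quoted cover letter still apply, so the call fails with -1 and errno set
if the target cgroup cannot accept the new process.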