Date: Wed, 14 Aug 2019 12:04:03 +0200
From: Christian Brauner
To: Pavel Emelianov, Oleg Nesterov
Cc: Adrian Reber, Eric Biederman, Jann Horn, Dmitry Safonov <0x7f454c46@gmail.com>, "linux-kernel@vger.kernel.org", "Andrei Vagin (C)", Mike Rapoport, Radostin Stoyanov
Subject: Re: [PATCH v6 1/2] fork: extend clone3() to support setting a PID
Message-ID: <20190814100402.jf5p2wsqngeuazaj@wittgenstein>
In-Reply-To: <6a65c4c4-6860-b042-a0bf-b3f8e9b277af@virtuozzo.com> <20190813143023.GC6971@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 14, 2019 at 09:50:03AM +0000, Pavel Emelianov wrote:
> On 8/12/19 11:09 PM, Adrian Reber wrote:
> > The main motivation to add set_tid to clone3() is CRIU.
> >
> > To restore a process with the same PID/TID CRIU currently uses
> > /proc/sys/kernel/ns_last_pid. It writes the desired (PID - 1) to
> > ns_last_pid and then (quickly) does a clone(). This works most of the
> > time, but it is racy. It is also slow as it requires multiple syscalls.
> >
> > Extending clone3() to support set_tid makes it possible restore a
> > process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
> > race free (as long as the desired PID/TID is available).
> >
> > This clone3() extension places the same restrictions (CAP_SYS_ADMIN)
> > on clone3() with set_tid as they are currently in place for ns_last_pid.
> >
> > Signed-off-by: Adrian Reber
>
> Acked-by: Pavel Emelyanov

On Tue, Aug 13, 2019 at 04:30:24PM +0200, Oleg Nesterov wrote:
> On 08/12, Adrian Reber wrote:
> >
> > The main motivation to add set_tid to clone3() is CRIU.
> >
> > To restore a process with the same PID/TID CRIU currently uses
> > /proc/sys/kernel/ns_last_pid. It writes the desired (PID - 1) to
> > ns_last_pid and then (quickly) does a clone(). This works most of the
> > time, but it is racy. It is also slow as it requires multiple syscalls.
> >
> > Extending clone3() to support set_tid makes it possible restore a
> > process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
> > race free (as long as the desired PID/TID is available).
> >
> > This clone3() extension places the same restrictions (CAP_SYS_ADMIN)
> > on clone3() with set_tid as they are currently in place for ns_last_pid.
> >
> > Signed-off-by: Adrian Reber
>
> Reviewed-by: Oleg Nesterov

Added-to: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=pidfd
Merged-into: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=for-next

Thanks!
Christian
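
For reference, the ns_last_pid procedure described in the quoted commit
message can be sketched roughly as below. This is only an illustration, not
CRIU's actual code: the helper names ns_last_pid_value() and fork_with_pid()
are made up, the flock() call is an assumption about how cooperating
restorers might serialize, and fork() stands in for the clone() call. The
caller needs CAP_SYS_ADMIN, and the sketch remains racy with respect to any
process that forks without taking the lock, which is the problem the
set_tid extension is meant to remove.

```python
import fcntl
import os

# Path taken from the commit message; everything else is illustrative.
NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"


def ns_last_pid_value(target_pid):
    """Return the bytes to write to ns_last_pid: the desired PID minus one."""
    if target_pid < 2:
        raise ValueError("target PID must be at least 2")
    return str(target_pid - 1).encode("ascii")


def fork_with_pid(target_pid, path=NS_LAST_PID):
    """Legacy approach: write (PID - 1), fork quickly, check what we got.

    Returns True if the child received target_pid, False otherwise.
    Hypothetical helper; requires CAP_SYS_ADMIN to write ns_last_pid.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # Assumption: an advisory lock keeps cooperating restorers apart.
        # It does nothing against unrelated processes that fork meanwhile.
        fcntl.flock(fd, fcntl.LOCK_EX)
        os.write(fd, ns_last_pid_value(target_pid))
        child = os.fork()
        if child == 0:
            # Child: exit immediately; a real restorer would continue here.
            os._exit(0)
    finally:
        # Closing the descriptor also drops the advisory lock.
        os.close(fd)
    os.waitpid(child, 0)
    return child == target_pid
```

Note the open, lock, write, fork and close steps: several syscalls for one
process creation, and the result still has to be checked afterwards because
another fork may have slipped in between the write and the fork.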