Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp869018ybi; Fri, 2 Aug 2019 05:42:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqxtbX7dm04LQabVxp/LHi9YkyKzmQqYKvuNET8+g9n7S1D4jr2RLRk2fg1gLOGseGJYoLZf X-Received: by 2002:a63:c006:: with SMTP id h6mr90167830pgg.290.1564749768467; Fri, 02 Aug 2019 05:42:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564749768; cv=none; d=google.com; s=arc-20160816; b=J5qut0TPndOXEknuaEtpGsPUl5DjXbJ09B7rkYSi2MgJlAu8K4O/baBAVRf+pZS3I3 5g37GuaoznAObLD5MBOd8oib+C57XgY6fWZhYuAghmM48+LeEQ6Ta/aUmENef/RNgHDh diuhneXuboxtab7N8WacXZeNHXS9i9TaMddaVyKYCJSpeUDm19sozuG0TbB0jMbwE0qS rutPzfgece5xm4IYXNOTRtVr4OZibnDaHowYZ5Mk/SCzG1Wqv+HGodfVHpOCq232xKvT A+TdjXiXu6JxmYz73uErzRTXJA83XVMXwJ0WO8Z23Ez8s99lKGbex9siYWUndLR76ahM gB+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:organization:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=2y4aLG61dr56RAuDYvEqgIQ7aI5z3w2PqV3GKeD1kmE=; b=IpkvJaSFSV0bYzaJFS35/MLrK20MG+rmF6ew6Jay0G4v/pQFMP/JKJL7iu+eFfV6XY ynhsdhdT/yszTWuXvxyR8qKqdUwM1EdnC3+QlpbeRmLpqksGzrV38VLXeYLTZxOAH0Rj pjayuWsdNhwiG18v1B4jhPLT4pXBlGWMrzl9askc8vbEIwXVeowvS1lBWKV1hao8awR1 dMQHV0s6C40v+1HeSjuU1cZ4j06/goTgA5Cnfh7rhmRr9QNdVnjR5ro2v73BpPlNDvJZ Ailx335Aa1razgUy/tv2CiYq0Xg70bWRsvr1aB9Jl/eo2Pn1xBOQ1N848F8tS3YZWu+w 4IWw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f33si6556337pjg.81.2019.08.02.05.42.33; Fri, 02 Aug 2019 05:42:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389568AbfHBHZT (ORCPT + 99 others); Fri, 2 Aug 2019 03:25:19 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39244 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388008AbfHBHZT (ORCPT ); Fri, 2 Aug 2019 03:25:19 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AD8B7B5C27; Fri, 2 Aug 2019 07:25:18 +0000 (UTC) Received: from dcbz.redhat.com (ovpn-116-74.ams2.redhat.com [10.36.116.74]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 738985D9CD; Fri, 2 Aug 2019 07:25:13 +0000 (UTC) Date: Fri, 2 Aug 2019 09:25:11 +0200 From: Adrian Reber To: Oleg Nesterov Cc: Christian Brauner , Eric Biederman , Pavel Emelianov , Jann Horn , Dmitry Safonov <0x7f454c46@gmail.com>, linux-kernel@vger.kernel.org, Andrei Vagin , Mike Rapoport , Radostin Stoyanov Subject: Re: [PATCH v2 1/2] fork: extend clone3() to support CLONE_SET_TID Message-ID: <20190802072511.GD18263@dcbz.redhat.com> References: <20190731161223.2928-1-areber@redhat.com> <20190731174135.GA30225@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190731174135.GA30225@redhat.com> X-Operating-System: Linux (5.1.19-300.fc30.x86_64) X-Load-Average: 1.75 1.91 1.91 X-Unexpected: The Spanish Inquisition X-GnuPG-Key: gpg --recv-keys D3C4906A Organization: Red Hat User-Agent: Mutt/1.12.0 (2019-05-25) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 02 Aug 2019 07:25:19 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 31, 2019 at 07:41:36PM +0200, Oleg Nesterov wrote: > On 07/31, Adrian Reber wrote: > > > > Extending clone3() to support CLONE_SET_TID makes it possible restore a > > process using CRIU without accessing /proc/sys/kernel/ns_last_pid and > > race free (as long as the desired PID/TID is available). > > I personally like this... but please see the question below. > > > +struct pid *alloc_pid(struct pid_namespace *ns, int set_tid) > > { > > struct pid *pid; > > enum pid_type type; > > @@ -186,12 +186,28 @@ struct pid *alloc_pid(struct pid_namespace *ns) > > if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS) > > pid_min = RESERVED_PIDS; > > > > - /* > > - * Store a null pointer so find_pid_ns does not find > > - * a partially initialized PID (see below). > > - */ > > - nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min, > > - pid_max, GFP_ATOMIC); > > + if (set_tid) { > > + /* > > + * Also fail if a PID != 1 is requested > > + * and no PID 1 exists. > > + */ > > + if ((set_tid >= pid_max) || ((set_tid != 1) && > > + (idr_get_cursor(&tmp->idr) <= 1))) > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Ah, I forgot to mention... this should work but only because > RESERVED_PIDS > 0. How about idr_is_empty() ? > > > But the main question is how it can really help if ns->level > 0, unlikely > CRIU will ever need to clone the process with the same pid_nr == set_tid > in the ns->parent chain. Not sure I understand what you mean. For CRIU only the PID in the PID namespace is relevant. > So may be kernel_clone_args->set_tid should be pid_t __user *set_tid_array? > Or I missed something ? Not sure why and how an array would be needed. Could you give me some more details why you think this is needed. Adrian