Date: Wed, 31 Jul 2019 19:41:36 +0200
From: Oleg Nesterov
To: Adrian Reber
Cc: Christian Brauner, Eric Biederman, Pavel Emelianov, Jann Horn,
    Dmitry Safonov <0x7f454c46@gmail.com>, linux-kernel@vger.kernel.org,
    Andrei Vagin, Mike Rapoport, Radostin Stoyanov
Subject: Re: [PATCH v2 1/2] fork: extend clone3() to support CLONE_SET_TID
Message-ID: <20190731174135.GA30225@redhat.com>
In-Reply-To: <20190731161223.2928-1-areber@redhat.com>

On 07/31, Adrian Reber wrote:
>
> Extending clone3() to support CLONE_SET_TID makes it possible restore a
> process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
> race free (as long as the desired PID/TID is available).

I personally like this... but please see the question below.

> +struct pid *alloc_pid(struct pid_namespace *ns, int set_tid)
>  {
>          struct pid *pid;
>          enum pid_type type;
> @@ -186,12 +186,28 @@ struct pid *alloc_pid(struct pid_namespace *ns)
>                  if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS)
>                          pid_min = RESERVED_PIDS;
>
> -                /*
> -                 * Store a null pointer so find_pid_ns does not find
> -                 * a partially initialized PID (see below).
> -                 */
> -                nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
> -                                      pid_max, GFP_ATOMIC);
> +                if (set_tid) {
> +                        /*
> +                         * Also fail if a PID != 1 is requested
> +                         * and no PID 1 exists.
> +                         */
> +                        if ((set_tid >= pid_max) || ((set_tid != 1) &&
> +                                (idr_get_cursor(&tmp->idr) <= 1)))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Ah, I forgot to mention... this should work, but only because
RESERVED_PIDS > 0. How about idr_is_empty()?

But the main question is how this can really help if ns->level > 0: it is
unlikely that CRIU will ever need to clone a process with the same
pid_nr == set_tid at every level of the ns->parent chain.

So maybe kernel_clone_args->set_tid should be pid_t __user *set_tid_array?

Or did I miss something?

Oleg.