Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp6380576ybx; Mon, 11 Nov 2019 08:16:12 -0800 (PST) X-Google-Smtp-Source: APXvYqxqIIHHat+LWrYVX4vcqmzOK9tYMr/BtObuHOT79q11BvbRTBvoDv1z1J61JOrZZBOx5dYn X-Received: by 2002:a17:906:4c93:: with SMTP id q19mr17498036eju.91.1573488972216; Mon, 11 Nov 2019 08:16:12 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573488972; cv=none; d=google.com; s=arc-20160816; b=X2LoowyuSJufNaIyyH0rPEFGhURmfZ0eCpir3EdDeqNJFRmbZ1++GOKp53BWkpuJVb fKOQogEFdZ2eL0TJPRymDef4o8w06biKsvANyZNcQH1D5eySAeyQMJY2Yv12igEm9ovx Q8V25THEsAySslAQLdGBc7uDYTJw6cB7qTnUzUbxb1C11ew+jXJm9j0qRrf9ywFZzqP6 ORXvMhrfujqX0L3agIR8JLIg/ZUhz7V245/6mKQy4kq6wsh1PXziIWqsNfBSp+rRHIMU wWrtHL3Y6IbB1ijlzJiwWfmgn99qK/QcAADWNlHZazxSFN89cwn0TySM0gqlitVcBsmN lWjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=9FpCKjQ2qrSGFuAOAWaryEFrYtfuBB5lXhdLnxjwMeI=; b=YffPqMfWt2hG4L0RFjYAfagX6YZHndnC07SkbUGppD5r0itNcZuo29Fk8FZmgRHxLQ Q9AmjJT8Rfsj7/20aPut843Qlv/Q8L0hO/30JPK3V876J8JQtYul22cwQTEyudeaEaRQ zf2Zo3j89R79Jg3ru0uRW1e+l1SqwrklTPf6uZQsw8WjonvzJLUhGmV+ZNUATpn/HLzT A6D7HkaHNT7FMmDPr9nWz5D+pivebD3e0orGcjzGHXXTbPjoeZhudNH6MdANTgTQ8tGc QrTC1Qwxv8Rv3RMD9z0F5QMfmyP6bN0ElwSBIFUsZpBt+rITszEUDVbLIwdsaIruYqeU 0EAw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f3si9961066edx.449.2019.11.11.08.15.48; Mon, 11 Nov 2019 08:16:12 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726910AbfKKQPI (ORCPT + 99 others); Mon, 11 Nov 2019 11:15:08 -0500 Received: from youngberry.canonical.com ([91.189.89.112]:52245 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726878AbfKKQPH (ORCPT ); Mon, 11 Nov 2019 11:15:07 -0500 Received: from [213.220.153.21] (helo=wittgenstein) by youngberry.canonical.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1iUCLE-0006lT-FV; Mon, 11 Nov 2019 16:15:00 +0000 Date: Mon, 11 Nov 2019 17:14:59 +0100 From: Christian Brauner To: Adrian Reber Cc: Oleg Nesterov , Eric Biederman , Pavel Emelyanov , Jann Horn , Dmitry Safonov <0x7f454c46@gmail.com>, linux-kernel@vger.kernel.org, Andrei Vagin , Mike Rapoport , Radostin Stoyanov Subject: Re: [PATCH v7 1/2] fork: extend clone3() to support setting a PID Message-ID: <20191111161458.fjodxyx566dar6ob@wittgenstein> References: <20191111131704.656169-1-areber@redhat.com> <20191111152514.GA11389@redhat.com> <20191111154028.GF514519@dcbz.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20191111154028.GF514519@dcbz.redhat.com> User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 11, 2019 at 04:40:28PM +0100, Adrian Reber wrote: > On Mon, Nov 11, 2019 at 04:25:15PM +0100, Oleg Nesterov wrote: > > On 11/11, Adrian Reber wrote: > > > > > > v7: > > > - changed set_tid to be an array to set the PID of a process > > > in multiple nested PID namespaces at the same time as discussed > > > at LPC 2019 (container MC) > > > > cough... iirc you convinced me this is not needed when we discussed > > the previous version ;) Nevermind, probably my memory fools me. > > You are right. You suggested the same thing and we didn't listen ;) > > > So far I only have some cosmetic nits, > > Thanks for the quick review. I will try to apply your suggestions. > > > > @@ -175,6 +187,18 @@ struct pid *alloc_pid(struct pid_namespace *ns) > > > > > > for (i = ns->level; i >= 0; i--) { > > > int pid_min = 1; > > > + int t_pos = 0; > > ^^^^^ > > > > I won't insist, but I'd suggest to cache set_tid[t_pos] instead to make > > the code a bit more simple. > > > > > @@ -186,12 +210,24 @@ struct pid *alloc_pid(struct pid_namespace *ns) > > > if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS) > > > pid_min = RESERVED_PIDS; > > > > You can probably move this code into the "else" branch below. > > > > IOW, something like > > > > > > for (i = ns->level; i >= 0; i--) { > > int xxx = 0; > > > > if (set_tid_size) { > > int pos = ns->level - i; > > > > xxx = set_tid[pos]; > > if (xxx < 1 || xxx >= pid_max) > > return ERR_PTR(-EINVAL); > > /* Also fail if a PID != 1 is requested and no PID 1 exists */ > > if (xxx != 1 && !tmp->child_reaper) > > return ERR_PTR(-EINVAL); > > if (!ns_capable(tmp->user_ns, CAP_SYS_ADMIN)) > > return ERR_PTR(-EPERM); > > set_tid_size--; > > } > > > > idr_preload(GFP_KERNEL); > > spin_lock_irq(&pidmap_lock); > > > > if (xxx) { > > nr = idr_alloc(&tmp->idr, NULL, xxx, xxx + 1, > > GFP_ATOMIC); > > /* > > * If ENOSPC is returned it means that the PID is > > * alreay in use. Return EEXIST in that case. > > */ > > if (nr == -ENOSPC) > > nr = -EEXIST; > > } else { > > int pid_min = 1; > > /* > > * init really needs pid 1, but after reaching the > > * maximum wrap back to RESERVED_PIDS > > */ > > if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS) > > pid_min = RESERVED_PIDS; > > /* > > * Store a null pointer so find_pid_ns does not find > > * a partially initialized PID (see below). > > */ > > nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min, > > pid_max, GFP_ATOMIC); > > } > > > > ... > > > > This way only the "if (set_tid_size)" block has to play with set_tid_size/set_tid. > > > > note also that this way we can easily allow set_tid[some_level] == 0, we can > > simply do > > > > - if (xxx < 1 || xxx >= pid_max) > > + if (xxx < 0 || xxx >= pid_max) > > > > although I don't think this is really useful. > > Yes. I explicitly didn't allow 0 as a PID as I didn't thought it would > be useful (or maybe even valid). How do you express: I don't care about a specific pid in pidns level , just give me a random one? For example, set_tid[0] = 1234 set_tid[1] = 5678 set_tid[2] = random_pid() set_tid[3] = 9 Wouldn't that be potentially useful? Christian