Subject: Re: [RESEND RFC] translate_pid API
To: Jann Horn
Cc: kernel list, Linux API, Konstantin Khlebnikov,
    Nagarajan.Muthukrishnan@oracle.com, Prakash Sangappa, Andy Lutomirski,
    Andrew Morton, Oleg Nesterov, Serge Hallyn, "Eric W. Biederman",
    Eugene Syromiatnikov, xemul@parallels.com
References: <1520875093-18174-1-git-send-email-nagarathnam.muthusamy@oracle.com>
    <69f13674-7f84-5dc7-0bd7-e5e65e9cb3b0@oracle.com>
    <1a8cac8b-22cc-e194-4244-b20428c8a9c2@oracle.com>
From: Nagarathnam Muthusamy
Message-ID: <65625f8c-d701-7407-3999-6ca30a59c236@oracle.com>
Date: Tue, 13 Mar 2018 16:52:40 -0700
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/13/2018 04:10 PM, Jann Horn wrote:
> On Tue, Mar 13, 2018 at 3:45 PM, Nagarathnam Muthusamy
> wrote:
>> On 03/13/2018 03:00 PM, Jann Horn wrote:
>>> On Tue, Mar 13, 2018 at 2:44 PM, Nagarathnam Muthusamy
>>> wrote:
>>>> On 03/13/2018 02:28 PM, Jann Horn wrote:
>>>>> On Tue, Mar 13, 2018 at 2:20 PM, Nagarathnam Muthusamy
>>>>> wrote:
>>>>>> On 03/13/2018 01:47 PM, Jann Horn wrote:
>>>>>>> On Mon, Mar 12, 2018 at 10:18 AM,
>>>>>>> wrote:
> [...]
>>>>>>>> + */
>>>>>>>> +SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
>>>>>>>> +                u64, target)
>>>>>>>> +{
>>>>>>>> +        struct pid_namespace *source_ns = NULL, *target_ns = NULL;
>>>>>>>> +        struct pid *struct_pid;
>>>>>>>> +        struct pid_namespace *ph;
>>>>>>>> +        struct hlist_bl_head *shead = NULL;
>>>>>>>> +        struct hlist_bl_head *thead = NULL;
>>>>>>>> +        struct hlist_bl_node *dup_node;
>>>>>>>> +        pid_t result;
>>>>>>>> +
>>>>>>>> +        if (!source) {
>>>>>>>> +                source_ns = &init_pid_ns;
>>>>>>>> +        } else {
>>>>>>>> +                shead = pid_ns_hash_head(pid_ns_hash, source);
>>>>>>>> +                hlist_bl_lock(shead);
>>>>>>>> +                hlist_bl_for_each_entry(ph, dup_node, shead, node) {
>>>>>>>> +                        if (source == ph->ns.ns_id) {
>>>>>>>> +                                source_ns = ph;
>>>>>>>> +                                break;
>>>>>>>> +                        }
>>>>>>>> +                }
>>>>>>>> +                if (!source_ns) {
>>>>>>>> +                        hlist_bl_unlock(shead);
>>>>>>>> +                        return -EINVAL;
>>>>>>>> +                }
>>>>>>>> +        }
>>>>>>>> +        if (!ptrace_may_access(source_ns->child_reaper,
>>>>>>>> +                               PTRACE_MODE_READ_FSCREDS)) {
>>>>>>> AFAICS this proposal breaks the visibility restrictions that
>>>>>>> namespaces normally create. If there are two namespaces-based
>>>>>>> containers that use the same UID range, I don't think they should be
>>>>>>> able to learn information about each other, such as which PIDs are in
>>>>>>> use in the other container; but as far as I can tell, your proposal
>>>>>>> makes it possible to do that (unless an LSM or so is interfering). I
>>>>>>> would prefer it if this API required visibility of the targeted PID
>>>>>>> namespaces in the caller's PID namespace.
>>>>>>
>>>>>> I am trying to simulate the same access restrictions allowed
>>>>>> on a process's /proc//ns/pid file. If the translator has
>>>>>> access to /proc//ns/pid file of both source and destination
>>>>>> namespaces, shouldn't it be allowed to translate the pid between
>>>>>> them?
>>>>> But the translator doesn't actually need to have access to those
>>>>> procfs files, right?
>>>> I thought it should have access to those procfs files to satisfy the
>>>> visibility constraint that targeted PID namespaces should be visible
>>>> in caller's PID namespace and ptrace_may_access checks that
>>>> constraint.
>>> If there are two containers that use the same UID range,
>>> ptrace_may_access() checks from a process in one container on a
>>> process in another container can pass. Normally, you just can't even
>>> reach the ptrace_may_access() checks because you can't reference
>>> processes in another container in any way.
>>
>> If there is no way to reference the process in another container,
>> there is no way to get to the /proc//ns/pidns_id file which
>> exports the ID of that container right? So, a translator has to
>> first guess the container ID then try translate. Even after translation,
>> unless the translator has proper privileges, I believe it cant do
>> anything with just the pid right?
> Well, yes to both. You'd have to guess the ID of the container, and
> you wouldn't be able to do much with it, apart from finding valid PIDs
> and their mapping between namespaces.
>
>>> By the way, a related concern: The use of global identifiers will
>>> probably also negatively affect Checkpoint/Restore In Userspace?
>> Will look into this. Can you point me to the specifics of the
>> usecase which could be negatively affected?
> AFAICS you won't be able to reliably recreate namespace IDs when a
> process is checkpointed and resumed, meaning that checkpoint/resume
> won't work on processes that use these namespace IDs.

I agree. When the process is resumed, the namespace IDs might be obsolete.

Thanks,
Nagarathnam.
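
The procfs-based access model discussed in this thread can be sketched in
userspace C. This is an illustration only, not code from the patch. The helper
names parse_pid_ns_link and read_pid_ns_ino are hypothetical, and the sketch
assumes the identifier of interest is the inode number that Linux already
exposes in the target of the /proc/<pid>/ns/pid symlink (for example
"pid:[4026531836]"), which may differ from the ns_id that the patch hashes on.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Parse a namespace symlink target such as "pid:[4026531836]".
 * Returns 0 and stores the number on success, -1 on malformed input.
 * (Hypothetical helper, not part of the proposed patch.)
 */
static int parse_pid_ns_link(const char *target, unsigned long long *ino)
{
    char tail = '\0';

    /* The trailing %c must read the closing bracket, so truncated
     * input such as "pid:[123" is rejected. */
    if (sscanf(target, "pid:[%llu%c", ino, &tail) != 2 || tail != ']')
        return -1;
    return 0;
}

/*
 * Read the PID namespace identifier of a process through procfs.
 * proc_pid is "self" or a decimal PID as seen in the procfs instance
 * mounted at /proc. Returns 0 on success, -1 on any failure.
 * (Hypothetical helper, not part of the proposed patch.)
 */
static int read_pid_ns_ino(const char *proc_pid, unsigned long long *ino)
{
    char path[64];
    char target[64];
    ssize_t len;

    snprintf(path, sizeof(path), "/proc/%s/ns/pid", proc_pid);
    len = readlink(path, target, sizeof(target) - 1);
    if (len < 0)
        return -1;  /* process not visible here, or access denied */
    target[len] = '\0';
    return parse_pid_ns_link(target, ino);
}
```

Calling read_pid_ns_ino("self", &ino) yields the caller's own namespace
identifier. For another process, the lookup can only succeed if that process
appears in the procfs instance being read, which is the visibility property
Jann asks the new API to preserve; a lookup keyed on a bare numeric ID has no
such path-based precondition.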