From: Jann Horn
Date: Tue, 13 Mar 2018 16:10:33 -0700
Subject: Re: [RESEND RFC] translate_pid API
To: Nagarathnam Muthusamy
Cc: kernel list, Linux API, Konstantin Khlebnikov,
    Nagarajan.Muthukrishnan@oracle.com, Prakash Sangappa, Andy Lutomirski,
    Andrew Morton, Oleg Nesterov, Serge Hallyn, "Eric W. Biederman",
    Eugene Syromiatnikov, xemul@parallels.com
References: <1520875093-18174-1-git-send-email-nagarathnam.muthusamy@oracle.com>
    <69f13674-7f84-5dc7-0bd7-e5e65e9cb3b0@oracle.com>
    <1a8cac8b-22cc-e194-4244-b20428c8a9c2@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 13, 2018 at 3:45 PM, Nagarathnam Muthusamy wrote:
> On 03/13/2018 03:00 PM, Jann Horn wrote:
>> On Tue, Mar 13, 2018 at 2:44 PM, Nagarathnam Muthusamy
>> wrote:
>>> On 03/13/2018 02:28 PM, Jann Horn wrote:
>>>> On Tue, Mar 13, 2018 at 2:20 PM, Nagarathnam Muthusamy
>>>> wrote:
>>>>> On 03/13/2018 01:47 PM, Jann Horn wrote:
>>>>>> On Mon, Mar 12, 2018 at 10:18 AM,
>>>>>> wrote:
[...]
>>>>>>> + */
>>>>>>> +SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
>>>>>>> +		u64, target)
>>>>>>> +{
>>>>>>> +	struct pid_namespace *source_ns = NULL, *target_ns = NULL;
>>>>>>> +	struct pid *struct_pid;
>>>>>>> +	struct pid_namespace *ph;
>>>>>>> +	struct hlist_bl_head *shead = NULL;
>>>>>>> +	struct hlist_bl_head *thead = NULL;
>>>>>>> +	struct hlist_bl_node *dup_node;
>>>>>>> +	pid_t result;
>>>>>>> +
>>>>>>> +	if (!source) {
>>>>>>> +		source_ns = &init_pid_ns;
>>>>>>> +	} else {
>>>>>>> +		shead = pid_ns_hash_head(pid_ns_hash, source);
>>>>>>> +		hlist_bl_lock(shead);
>>>>>>> +		hlist_bl_for_each_entry(ph, dup_node, shead, node) {
>>>>>>> +			if (source == ph->ns.ns_id) {
>>>>>>> +				source_ns = ph;
>>>>>>> +				break;
>>>>>>> +			}
>>>>>>> +		}
>>>>>>> +		if (!source_ns) {
>>>>>>> +			hlist_bl_unlock(shead);
>>>>>>> +			return -EINVAL;
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +	if (!ptrace_may_access(source_ns->child_reaper,
>>>>>>> +				PTRACE_MODE_READ_FSCREDS)) {
>>>>>>
>>>>>> AFAICS this proposal breaks the visibility restrictions that
>>>>>> namespaces normally create. If there are two namespaces-based
>>>>>> containers that use the same UID range, I don't think they should be
>>>>>> able to learn information about each other, such as which PIDs are in
>>>>>> use in the other container; but as far as I can tell, your proposal
>>>>>> makes it possible to do that (unless an LSM or so is interfering). I
>>>>>> would prefer it if this API required visibility of the targeted PID
>>>>>> namespaces in the caller's PID namespace.
>>>>>
>>>>>
>>>>> I am trying to simulate the same access restrictions allowed
>>>>> on a process's /proc//ns/pid file. If the translator has
>>>>> access to /proc//ns/pid file of both source and destination
>>>>> namespaces, shouldn't it be allowed to translate the pid between
>>>>> them?
>>>>
>>>> But the translator doesn't actually need to have access to those
>>>> procfs files, right?
>>>
>>> I thought it should have access to those procfs files to satisfy the
>>> visibility constraint that targeted PID namespaces should be visible
>>> in caller's PID namespace and ptrace_may_access checks that
>>> constraint.
>>
>> If there are two containers that use the same UID range,
>> ptrace_may_access() checks from a process in one container on a
>> process in another container can pass. Normally, you just can't even
>> reach the ptrace_may_access() checks because you can't reference
>> processes in another container in any way.
>
>
> If there is no way to reference the process in another container,
> there is no way to get to the /proc//ns/pidns_id file which
> exports the ID of that container right? So, a translator has to
> first guess the container ID then try translate. Even after translation,
> unless the translator has proper privileges, I believe it can't do
> anything with just the pid right?

Well, yes to both. You'd have to guess the ID of the container, and
you wouldn't be able to do much with it, apart from finding valid PIDs
and their mapping between namespaces.

>> By the way, a related concern: The use of global identifiers will
>> probably also negatively affect Checkpoint/Restore In Userspace?
>
> Will look into this. Can you point me to the specifics of the
> usecase which could be negatively affected?

AFAICS you won't be able to reliably recreate namespace IDs when a
process is checkpointed and resumed, meaning that checkpoint/resume
won't work on processes that use these namespace IDs.
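
To illustrate the visibility point, here is a minimal userspace sketch.
It is not part of the patch and does not use the proposed syscall. It
relies only on existing behaviour: a process can name another PID
namespace through /proc/<pid>/ns/pid of a process it can see in its own
procfs, and namespaces(7) documents comparing st_dev and st_ino of that
file to tell whether two processes share a namespace. The helper names
pidns_identity(), same_pidns() and same_pidns_as_self() are hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Look up the identity of the PID namespace of `pid` through the caller's
 * procfs. Returns 0 on success, or -1 if the process cannot be referenced
 * from here (not visible in this procfs, or access is denied).
 */
int pidns_identity(pid_t pid, struct stat *st)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long)pid);
	/* stat() follows the ns link, so st describes the namespace itself. */
	return stat(path, st);
}

/*
 * Returns 1 if both processes are in the same PID namespace, 0 if they are
 * in different ones, and -1 if either process cannot be referenced at all.
 */
int same_pidns(pid_t a, pid_t b)
{
	struct stat sa, sb;

	if (pidns_identity(a, &sa) != 0 || pidns_identity(b, &sb) != 0)
		return -1;
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Convenience wrapper: compare another process against the caller. */
int same_pidns_as_self(pid_t other)
{
	return same_pidns(getpid(), other);
}
```

In this sketch, a process in another container that is not visible in
the caller's procfs ends up in the -1 case: there is no path through
which to name its namespace, so no permission check is reached at all.
A lookup keyed by a global namespace ID does not have that property.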