Reply-To: prakash.sangappa@oracle.com
Subject: Re: [PATCH v4] pidns: introduce syscall translate_pid
References: <150788678482.924140.11785205105514746135.stgit@buzz> <20171013160514.GA27812@redhat.com>
 <3bdb5341-9ae6-265a-ce5b-45c2cfc76fad@yandex-team.ru> <20171016143628.b2ef80a9ef16d4345889b4d9@linux-foundation.org>
To: Nagarathnam Muthusamy, Andrew Morton, Konstantin Khlebnikov
Cc: Oleg Nesterov, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, Serge Hallyn, "Eric W. Biederman", Eugene Syromiatnikov
From: "prakash.sangappa"
Date: Mon, 16 Oct 2017 15:54:24 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote:
>
>
> On 10/16/2017 02:36 PM, Andrew Morton wrote:
>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov wrote:
>>
>>>>>> pid_t translate_pid(pid_t pid, int source, int target);
>>>>>>
>>>>>> This syscall converts pid from source pid-ns into pid in target
>>>>>> pid-ns. If pid is unreachable from target pid-ns it returns zero.
>>>>>>
>>>>>> Pid-namespaces are referred file descriptors opened to proc files
>>>>>> /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children. Negative
>>>>>> argument refers to current pid namespace, same as file
>>>>>> /proc/self/ns/pid.
>>>>>>
>>>>>> Kernel expose virtual pids in /proc/[pid]/status:NSpid, but backward
>>>>>> translation requires scanning all tasks. Also pids could be
>>>>>> translated by sending them through unix socket between namespaces,
>>>>>> this method is slow and insecure because other side is exposed
>>>>>> inside pid namespace.
>>> Andrew asked why we might need this.
>>>
>>> Such conversion is required for interaction between processes across
>>> pid-namespaces.
>>> For example to identify process in container by pid file looking from
>>> outside.
>>>
>>> Two years ago I've solved this in project of mine with monstrous code
>>> which forks couple times just to convert pid, lucky for me performance
>>> wasn't important.
>> That's a single user who needed this a single time, and found a
>> userspace-based solution anyway. This is not exactly compelling!
>>
>> Is there a stronger case to be made? How does this change benefit our
>> users? Sell it to us!
> Oracle database is planning to use pid namespace for sandboxing
> database instances and they need an API similar to translate_pid to
> effectively translate process IDs from other pid namespaces. Prakash
> (cced in mail) can provide more details on this usecase.

As Nagarathnam indicated, Oracle Database will be using pid namespaces
and needs a direct method of converting pids of processes in the pid
namespace hierarchy. In this use case multiple nested pid namespaces
will be used. The currently available mechanisms are not very efficient
for this use case. For example, as Konstantin described, using
/proc/[pid]/status would require the application to scan the status
files of all pids to determine the pid of a given process in a child
namespace.

Using an SCM_CREDENTIALS socket message is another way, but it would
require every process starting inside a pid namespace to send this
message, and the receiving process in the target namespace would have
to save the converted pid and reference it later. This mechanism
becomes cumbersome, especially if the application has to deal with
multiple nested pid namespaces. Also, the Database needs to be able to
convert a thread's global pid (gettid()). Passing the thread's pid
(gettid()) in an SCM_CREDENTIALS message requires CAP_SYS_ADMIN, which
is an issue.

So having a direct method, like the API that Konstantin is proposing,
will work best for the Database, since the pid of a process in any of
the nested pid namespaces can be converted as and when required.
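
To make the cost of the /proc approach concrete, here is a rough
userspace sketch of the backward translation via the NSpid field. It is
only an illustration, not code from the patch or from the Database: the
function names and the proc_root parameter are hypothetical, it looks
only at thread-group leaders (threads live under /proc/[pid]/task/),
and it returns the first match, so it cannot tell apart sibling
namespaces at the same nesting level that reuse the same inner pid.

```python
import os
from typing import List, Optional


def parse_nspid(status_text: str) -> List[int]:
    """Return the NSpid values from the text of a /proc/[pid]/status file.

    The leftmost value is the id as seen from the pid namespace of the
    procfs mount; each following value is the id in the next nested
    namespace. Returns an empty list if the field is absent.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def find_outer_pid(inner_pid: int, level: int,
                   proc_root: str = "/proc") -> Optional[int]:
    """Brute-force backward translation (hypothetical helper).

    Scan every numeric entry under proc_root and return the outermost
    pid of the first process whose NSpid entry at nesting depth `level`
    (0 = the procfs mount's namespace) equals inner_pid.
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                nspid = parse_nspid(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if len(nspid) > level and nspid[level] == inner_pid:
            return nspid[0]
    return None
```

Every lookup walks all of /proc, which is the scan cost referred to
above; a direct call would avoid it.
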
I think with the proposed API, the application should be able to
convert the pid of a process or the tid (gettid()) of a thread as well.

-Prakash

>
> Thanks,
> Nagarathnam.
>
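
For completeness, a minimal sketch of the SCM_CREDENTIALS mechanism
discussed in this thread, assuming Linux and Python's standard socket
module. The helper names are made up for illustration. Both ends of the
socket live in one process here, so the pid comes back unchanged; the
translation only becomes visible when the sender and the receiver are
in different pid namespaces, which this demo does not set up.

```python
import os
import socket
import struct

# Layout of struct ucred on Linux: pid, uid, gid (three 32-bit ints).
UCRED = struct.Struct("3i")


def recv_peer_pid(sock):
    """Receive one message and return the sender pid from SCM_CREDENTIALS.

    The kernel reports the pid as seen from the receiver's pid namespace.
    Returns None if no credentials were attached.
    """
    _data, ancdata, _flags, _addr = sock.recvmsg(
        16, socket.CMSG_SPACE(UCRED.size))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            pid, _uid, _gid = UCRED.unpack(cdata[:UCRED.size])
            return pid
    return None


def demo():
    """Send a datagram over a socketpair and read back the sender's pid."""
    receiver, sender = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with receiver, sender:
        # Ask the kernel to attach the sender's credentials to each message.
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
        sender.send(b"hello")
        return recv_peer_pid(receiver)
```

Note that every process of interest has to send such a message itself,
and the receiver has to remember the result, which is the bookkeeping
burden described above.
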