Subject: Re: [REVIEW][PATCH 09/11] ipc/shm: Fix shmctl(..., IPC_STAT, ...) between pid namespaces.
To: "Eric W. Biederman", Linux Containers
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, khlebnikov@yandex-team.ru, prakash.sangappa@oracle.com, luto@kernel.org, akpm@linux-foundation.org, oleg@redhat.com, serge.hallyn@ubuntu.com, esyr@redhat.com, jannh@google.com, linux-security-module@vger.kernel.org, Pavel Emelyanov
References: <87vadmobdw.fsf_-_@xmission.com> <20180323191614.32489-9-ebiederm@xmission.com>
From: NAGARATHNAM MUTHUSAMY
Organization: Oracle Corporation
Message-ID: <7df62190-2407-bfd4-d144-7304a8ea8ae3@oracle.com>
Date: Fri, 23 Mar 2018 14:17:35 -0700
In-Reply-To: <20180323191614.32489-9-ebiederm@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/23/2018 12:16 PM, Eric W. Biederman wrote:
> Today shm_cpid and shm_lpid are remembered in the pid namespace of the
> creator and the processes that last touched a sysvipc shared memory
> segment. If you have processes in multiple pid namespaces that
> is just wrong, and I don't know how this has been overlooked for
> so long.
>
> As only creation, shared memory attach, and shared memory detach
> update the pids, I do not expect there to be a repeat of the issues
> when struct pid was attached to each af_unix skb, which in some
> notable cases cut the performance in half. The problem was threads of
> the same process updating the same struct pid from different cpus,
> causing the cache line to be highly contended and bounce between cpus.
>
> As creation, attach, and detach are expected to be rare operations for
> sysvipc shared memory segments, I do not expect that kind of cache line
> ping pong to cause problems. In addition, because the pid is at a fixed
> location in the structure instead of being dynamic on a skb, the
> reference count of the pid does not need to be updated on each
> operation if the pid is the same. This ability to simply skip the pid
> reference count changes if the pid is unchanging further reduces the
> likelihood of a cache line holding a pid reference count
> ping-ponging between cpus.
>
> Fixes: b488893a390e ("pid namespaces: changes to show virtual ids to user")
> Signed-off-by: "Eric W. Biederman"

Thanks!

Reviewed-by: Nagarathnam Muthusamy

> ---
>  ipc/shm.c | 25 +++++++++++++++----------
>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 0565669ebe5c..932b7e411c6c 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -57,8 +57,8 @@ struct shmid_kernel /* private to the kernel */
>  	time64_t shm_atim;
>  	time64_t shm_dtim;
>  	time64_t shm_ctim;
> -	pid_t shm_cprid;
> -	pid_t shm_lprid;
> +	struct pid *shm_cprid;
> +	struct pid *shm_lprid;
>  	struct user_struct *mlock_user;
>
>  	/* The task created the shm object. NULL if the task is dead. */
> @@ -226,7 +226,7 @@ static int __shm_open(struct vm_area_struct *vma)
>  		return PTR_ERR(shp);
>
>  	shp->shm_atim = ktime_get_real_seconds();
> -	shp->shm_lprid = task_tgid_vnr(current);
> +	ipc_update_pid(&shp->shm_lprid, task_tgid(current));
>  	shp->shm_nattch++;
>  	shm_unlock(shp);
>  	return 0;
> @@ -267,6 +267,8 @@ static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp)
>  		user_shm_unlock(i_size_read(file_inode(shm_file)),
>  				shp->mlock_user);
>  	fput(shm_file);
> +	ipc_update_pid(&shp->shm_cprid, NULL);
> +	ipc_update_pid(&shp->shm_lprid, NULL);
>  	ipc_rcu_putref(&shp->shm_perm, shm_rcu_free);
>  }
>
> @@ -311,7 +313,7 @@ static void shm_close(struct vm_area_struct *vma)
>  	if (WARN_ON_ONCE(IS_ERR(shp)))
>  		goto done; /* no-op */
>
> -	shp->shm_lprid = task_tgid_vnr(current);
> +	ipc_update_pid(&shp->shm_lprid, task_tgid(current));
>  	shp->shm_dtim = ktime_get_real_seconds();
>  	shp->shm_nattch--;
>  	if (shm_may_destroy(ns, shp))
> @@ -614,8 +616,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>  	if (IS_ERR(file))
>  		goto no_file;
>
> -	shp->shm_cprid = task_tgid_vnr(current);
> -	shp->shm_lprid = 0;
> +	shp->shm_cprid = get_pid(task_tgid(current));
> +	shp->shm_lprid = NULL;
>  	shp->shm_atim = shp->shm_dtim = 0;
>  	shp->shm_ctim = ktime_get_real_seconds();
>  	shp->shm_segsz = size;
> @@ -648,6 +650,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>  		user_shm_unlock(size, shp->mlock_user);
>  	fput(file);
>  no_file:
> +	ipc_update_pid(&shp->shm_cprid, NULL);
> +	ipc_update_pid(&shp->shm_lprid, NULL);
>  	call_rcu(&shp->shm_perm.rcu, shm_rcu_free);
>  	return error;
>  }
> @@ -970,8 +974,8 @@ static int shmctl_stat(struct ipc_namespace *ns, int shmid,
>  	tbuf->shm_atime = shp->shm_atim;
>  	tbuf->shm_dtime = shp->shm_dtim;
>  	tbuf->shm_ctime = shp->shm_ctim;
> -	tbuf->shm_cpid = shp->shm_cprid;
> -	tbuf->shm_lpid = shp->shm_lprid;
> +	tbuf->shm_cpid = pid_vnr(shp->shm_cprid);
> +	tbuf->shm_lpid = pid_vnr(shp->shm_lprid);
>  	tbuf->shm_nattch = shp->shm_nattch;
>
>  	ipc_unlock_object(&shp->shm_perm);
> @@ -1605,6 +1609,7 @@ SYSCALL_DEFINE1(shmdt, char __user *, shmaddr)
>  #ifdef CONFIG_PROC_FS
>  static int sysvipc_shm_proc_show(struct seq_file *s, void *it)
>  {
> +	struct pid_namespace *pid_ns = ipc_seq_pid_ns(s);
>  	struct user_namespace *user_ns = seq_user_ns(s);
>  	struct kern_ipc_perm *ipcp = it;
>  	struct shmid_kernel *shp;
> @@ -1627,8 +1632,8 @@ static int sysvipc_shm_proc_show(struct seq_file *s, void *it)
>  		   shp->shm_perm.id,
>  		   shp->shm_perm.mode,
>  		   shp->shm_segsz,
> -		   shp->shm_cprid,
> -		   shp->shm_lprid,
> +		   pid_nr_ns(shp->shm_cprid, pid_ns),
> +		   pid_nr_ns(shp->shm_lprid, pid_ns),
>  		   shp->shm_nattch,
>  		   from_kuid_munged(user_ns, shp->shm_perm.uid),
>  		   from_kgid_munged(user_ns, shp->shm_perm.gid),
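
For anyone reading the thread without the rest of the series at hand, the two ideas in the quoted patch (keep a reference to the pid object rather than a number, and translate it into a number only when someone reads it) can be modelled in plain userspace C. This is only a rough sketch and not kernel code: model_pid, model_get_pid, model_put_pid, model_update_pid and model_pid_nr_ns are hypothetical stand-ins invented for illustration, and the fixed two-entry namespace table is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a refcounted pid object.
 * nr[i] is the number this process has when viewed from namespace i;
 * 0 means "not visible from that namespace". */
#define MODEL_MAX_NS 2

struct model_pid {
	int refcount;
	int nr[MODEL_MAX_NS];
};

/* Take a reference; NULL is passed through untouched. */
static struct model_pid *model_get_pid(struct model_pid *pid)
{
	if (pid)
		pid->refcount++;
	return pid;
}

/* Drop a reference; NULL is ignored. */
static void model_put_pid(struct model_pid *pid)
{
	if (pid) {
		assert(pid->refcount > 0);
		pid->refcount--;
	}
}

/* Store pid in *pos. The reference counts are touched only when the
 * stored pointer actually changes, which is the "skip the refcount
 * update if the pid is the same" behaviour the commit message describes. */
static void model_update_pid(struct model_pid **pos, struct model_pid *pid)
{
	struct model_pid *old = *pos;

	if (old != pid) {
		*pos = model_get_pid(pid);
		model_put_pid(old);
	}
}

/* Translate at read time, from the point of view of namespace ns.
 * An unset (NULL) pid or an out-of-range namespace reads as 0. */
static int model_pid_nr_ns(const struct model_pid *pid, int ns)
{
	if (!pid || ns < 0 || ns >= MODEL_MAX_NS)
		return 0;
	return pid->nr[ns];
}
```

In this model a repeated model_update_pid() with the same pointer leaves the count alone, and the same stored object yields a different number depending on which namespace index the reader passes, so no single number has to be chosen at attach or detach time.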