From: Stephen Brennan
To: "Eric W. Biederman"
Cc: Alexey Dobriyan, James Morris, "Serge E. Hallyn", linux-security-module@vger.kernel.org, Paul Moore, Stephen Smalley, Eric Paris, selinux@vger.kernel.org, Casey Schaufler, Alexander Viro, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Matthew Wilcox
Subject: Re: [PATCH] proc: Allow pid_revalidate() during LOOKUP_RCU
In-Reply-To: <87zh2yh8ti.fsf@x220.int.ebiederm.org>
References: <20201130200619.84819-1-stephen.s.brennan@oracle.com> <87zh2yh8ti.fsf@x220.int.ebiederm.org>
Date: Tue, 01 Dec 2020 15:49:07 -0800
Message-ID: <87zh2xjde4.fsf@stepbren-lnx.us.oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

ebiederm@xmission.com (Eric W. Biederman) writes:

> Stephen Brennan writes:
>
>> The pid_revalidate() function requires dropping from RCU into REF lookup
>> mode. When many threads are resolving paths within /proc in parallel,
>> this can result in heavy spinlock contention as each thread tries to
>> grab a reference to the /proc dentry (and drop it shortly thereafter).
>>
>> Allow the pid_revalidate() function to execute under LOOKUP_RCU. When
>> updates must be made to the inode due to the owning task performing
>> setuid(), drop out of RCU and into REF mode.
>
> So rather than get_task_rcu_user. I think what we want is a function
> that verifies task->rcu_users > 0.
>
> Which frankly is just "pid_task(proc_pid(inode), PIDTYPE_PID)".
>
> Which is something that we can do unconditionally in pid_revalidate.
>
> Skipping the update of the inode is probably the only thing that needs
> to be skipped.
>
> It looks like the code can safely rely on the security_task_to_inode
> in proc_pid_make_inode and remove the security_task_to_inode in
> pid_update_inode.
>

This makes sense. I'll get rid of the get_task_rcu_user() stuff in a v2.

>
>> Signed-off-by: Stephen Brennan
>> ---
>>
>> I'd like to use this patch as an RFC on this approach for reducing spinlock
>> contention during many parallel path lookups in the /proc filesystem. The
>> contention can be triggered by, for example, running ~100 parallel instances of
>> "TZ=/etc/localtime ps -fe >/dev/null" on a 100CPU machine. The %sys utilization
>> in such a case reaches around 90%, and profiles show two code paths with high
>> utilization:
>
> Do you have a real world work-load that behaves something like this
> micro benchmark? I am just curious how severe the problem you are
> trying to solve is.
>

We have seen this issue occur internally with monitoring scripts (perhaps
a bit misconfigured, I'll admit). However, I don't have an exact sample
workload that I can give you.

Thanks,
Stephen
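
For anyone wanting to reproduce the contention, the micro-benchmark quoted
above (about 100 parallel instances of ps -fe) could be approximated with a
small shell sketch like the one below. The helper name run_parallel is made
up for illustration and is not part of the patch; the instance count and the
ps command line are the ones from the patch notes.

```shell
#!/bin/sh
# Sketch only: start N concurrent copies of a command and wait for all of them.
# run_parallel is a hypothetical helper name, not something from the patch.
run_parallel() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # Discard output so the terminal is not the bottleneck.
        "$@" >/dev/null 2>&1 &
        i=$((i + 1))
    done
    # Block until every background instance has exited.
    wait
}
```

Assuming a machine with enough CPUs, something like
"run_parallel 100 env TZ=/etc/localtime ps -fe" should generate the load
described above; %sys can then be watched from another terminal with top or
mpstat. The env wrapper is there only because a variable assignment cannot
be passed through "$@" directly.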