From: Prakash Sangappa
To: linux-kernel@vger.kernel.org
Cc: ebiederm@xmission.com, tglx@linutronix.de, peterz@infradead.org, serge@hallyn.com, prakash.sangappa@oracle.com
Subject: [RFC PATCH 0/1] CAP_SYS_NICE inside user namespace
Date: Fri, 1 Nov 2019 11:18:27 -0700
Message-Id: <1572632308-7071-1-git-send-email-prakash.sangappa@oracle.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

Some of the capabilities(7) that affect system-wide resources are
ineffective inside user namespaces. This restriction applies even to the
root user (uid 0) from the init namespace when mapped into a user
namespace. One such capability is CAP_SYS_NICE, which is required to
raise process priority. As a result, the root user cannot increase a
process's priority with a negative nice value, or set an RT priority on
processes, inside the user namespace. The workaround is to rely on a
process or daemon running outside the user namespace to change process
priority, which is an inconvenience.

We could allow these restricted capabilities to take effect only for the
root user from the init namespace mapped inside a user namespace, and
limit the effect with cgroups. It seems reasonable to address each of
these restricted capabilities on a case-by-case basis. This patch
concerns the CAP_SYS_NICE capability.

The proposal is to selectively allow CAP_SYS_NICE to take effect inside
a user namespace only for a root user mapped from the init namespace.
Which user ID may map the root user (uid 0) from the init namespace into
its user namespaces is authorized through /etc/subuid and /etc/subgid
entries. Only the system administrator (root) can add these entries, so
an ordinary user cannot simply map the root user (uid 0) into the user
namespaces it creates. cgroup CPU bandwidth control can be used to limit
CPU usage for such user namespaces.

The capabilities(7) manpage lists all the operations and system calls
that are subject to the CAP_SYS_NICE check. This patch currently allows
CAP_SYS_NICE to take effect inside a user namespace only for system
calls affecting process priority. For completeness, should the memory
operations mentioned in the manpage (migrate_pages(2), move_pages(2),
mbind(2)) also be permitted? There are no cgroup controls to limit the
effect of these memory operations.

Looking for feedback on this approach.
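
To make the /etc/subuid authorization described above concrete, here is
a rough userspace sketch (not part of the patch) of how one might check
whether a given user has been delegated host uid 0. It assumes the
"owner:start:count" line format from subuid(5). The helper names
parse_subid_lines() and may_map_host_id(), and the sample entries, are
hypothetical and for illustration only.

```python
from typing import Iterable, List, Tuple

SubidEntry = Tuple[str, int, int]  # (owner, first id, number of ids)


def parse_subid_lines(lines: Iterable[str]) -> List[SubidEntry]:
    """Parse 'owner:start:count' lines (subuid(5) format), skipping blanks."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        owner, start, count = line.split(":")
        entries.append((owner, int(start), int(count)))
    return entries


def may_map_host_id(entries: List[SubidEntry], owner: str, host_id: int = 0) -> bool:
    """Return True if `owner` has a delegated range that covers `host_id`."""
    return any(
        name == owner and start <= host_id < start + count
        for name, start, count in entries
    )


# Hypothetical entries: only "svcadmin" has been granted host uid 0.
sample = [
    "alice:100000:65536",
    "svcadmin:0:1",
    "svcadmin:100000:65536",
]
entries = parse_subid_lines(sample)
print(may_map_host_id(entries, "alice"))     # False
print(may_map_host_id(entries, "svcadmin"))  # True
```

With entries like these, only svcadmin could map uid 0 from the init
namespace into a user namespace it creates. Under this proposal that
would be the only case where CAP_SYS_NICE takes effect there.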

Prakash Sangappa (1):
  Selectively allow CAP_SYS_NICE capability inside user namespaces

 kernel/sched/core.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
2.7.4