From: Alexey Gladkov
To: LKML, Linux Containers, Kernel Hardening
Cc: Alexey Gladkov, "Eric W . Biederman", Kees Cook, Christian Brauner, Linus Torvalds
Subject: [RFC PATCH v2 0/8] Count rlimits in each user namespace
Date: Sun, 10 Jan 2021 18:33:39 +0100
X-Mailer: git-send-email 2.29.2
X-Mailing-List: linux-kernel@vger.kernel.org

Preface
-------
These patches bind the rlimit counters to a user in a user namespace.
This patch set can be applied on top of:

git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.11-rc2

Problem
-------
Some rlimits are set per user: RLIMIT_NPROC, RLIMIT_MEMLOCK,
RLIMIT_SIGPENDING, RLIMIT_MSGQUEUE. When one user creates several
containers, the processes inside those containers influence each other.
Eric W. Biederman mentioned this issue [1][2][3].

For example, there are two containers (A and B) created by one user.
Container A sets RLIMIT_NPROC=1 and starts one process. Everything is
fine, but when container B tries to do the same it fails, because the
number of processes is counted globally for each user and the user
already has one process.

On the other hand, we cannot simply count the rlimits for each container
separately. That would allow a user who creates a new user namespace to
create a fork bomb.

Introduced changes
------------------
To address the problem, we bind rlimit counters to each user namespace.
The result is a tree of rlimit counters with the biggest value at the
root (aka init_user_ns). The rlimit counter is incremented and
decremented in the current user namespace and in all of its parent user
namespaces.

ToDo
----
* No documentation.
* No tests.

[1] https://lore.kernel.org/containers/87imd2incs.fsf@x220.int.ebiederm.org/
[2] https://lists.linuxfoundation.org/pipermail/containers/2020-August/042096.html
[3] https://lists.linuxfoundation.org/pipermail/containers/2020-October/042524.html

Changelog
---------
v2:
* RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and RLIMIT_MSGQUEUE are migrated to
  ucounts.
* Added ucounts for the (uid, user namespace) pair to cred.
* Added the ability to increase ucount by more than 1.

v1:
* After discussion with Eric W. Biederman, I increased the size of
  ucounts to atomic_long_t.
* Added ucount_max to avoid the fork bomb.
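For readers who want the counting scheme from "Introduced changes" in a
concrete form, here is a minimal user-space sketch. It is not code from
this series: struct ns_counter, charge() and uncharge() are hypothetical
names, the counters are plain longs rather than atomics, and there is no
locking. It only models the idea that a charge is applied to the current
namespace and to every parent, and is rolled back if any level is already
at its maximum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; names are illustrative, not the kernel's. */
struct ns_counter {
    struct ns_counter *parent; /* NULL for the root (models init_user_ns) */
    long count;                /* units currently charged at this level */
    long max;                  /* limit enforced at this level */
};

/* Undo one charge on every level from `from` up to, but excluding, `stop`. */
static void uncharge_range(struct ns_counter *from, struct ns_counter *stop)
{
    for (struct ns_counter *c = from; c != stop; c = c->parent) {
        assert(c->count > 0);
        c->count--;
    }
}

/*
 * Charge one unit in `ns` and in all of its ancestors.
 * If any level is already at its limit, roll back the levels charged
 * so far and report failure.
 */
static bool charge(struct ns_counter *ns)
{
    for (struct ns_counter *c = ns; c != NULL; c = c->parent) {
        if (c->count >= c->max) {
            uncharge_range(ns, c);
            return false;
        }
        c->count++;
    }
    return true;
}

/* Release one unit from `ns` and from all of its ancestors. */
static void uncharge(struct ns_counter *ns)
{
    uncharge_range(ns, NULL);
}
```

With a root whose max is 3 and two children A and B whose max is 1 each,
charge(&a) and charge(&b) both succeed, so B is no longer blocked by A's
process. A third child with a larger local limit is still stopped once
the root reaches 3, which is what prevents the fork bomb case.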
--

Alexey Gladkov (8):
  Use atomic type for ucounts reference counting
  Add a reference to ucounts for each user
  Increase size of ucounts to atomic_long_t
  Move RLIMIT_NPROC counter to ucounts
  Move RLIMIT_MSGQUEUE counter to ucounts
  Move RLIMIT_SIGPENDING counter to ucounts
  Move RLIMIT_MEMLOCK counter to ucounts
  Move RLIMIT_NPROC check to the place where we increment the counter

 fs/exec.c                      |  2 +-
 fs/hugetlbfs/inode.c           | 17 +++---
 fs/io-wq.c                     | 22 ++++----
 fs/io-wq.h                     |  2 +-
 fs/io_uring.c                  |  2 +-
 fs/proc/array.c                |  2 +-
 include/linux/cred.h           |  3 ++
 include/linux/hugetlb.h        |  3 +-
 include/linux/mm.h             |  4 +-
 include/linux/sched/user.h     |  6 ---
 include/linux/shmem_fs.h       |  2 +-
 include/linux/signal_types.h   |  4 +-
 include/linux/user_namespace.h | 31 +++++++++--
 ipc/mqueue.c                   | 29 +++++-----
 ipc/shm.c                      | 31 ++++++-----
 kernel/cred.c                  | 43 +++++++++++----
 kernel/exit.c                  |  2 +-
 kernel/fork.c                  | 12 +++--
 kernel/signal.c                | 53 ++++++++----------
 kernel/sys.c                   | 13 -----
 kernel/ucount.c                | 99 +++++++++++++++++++++++++++++-----
 kernel/user.c                  |  2 -
 kernel/user_namespace.c        |  7 ++-
 mm/memfd.c                     |  4 +-
 mm/mlock.c                     | 35 +++++-------
 mm/mmap.c                      |  3 +-
 mm/shmem.c                     |  8 +--
 27 files changed, 268 insertions(+), 173 deletions(-)

--
2.29.2