Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp4568279pxf; Tue, 23 Mar 2021 14:01:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxRCROZboVSWZFxVzDETpuWolh8+eklAeTj8DvAzjhFjDYgTcwC8gad7HT2NlQzuPsksX6M X-Received: by 2002:a17:906:37db:: with SMTP id o27mr142438ejc.60.1616533307397; Tue, 23 Mar 2021 14:01:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1616533307; cv=none; d=google.com; s=arc-20160816; b=uGiOfQaZrwj03lxS6UcCbpe4g/G0mt1J2i3jjNXIl/L9wpsLdhpY8S3YQ2jL6Llxx5 u+dwwyA9qKKu1m4k9zHqjFD2aCUSlY+F5RdNrj1kN3t77gTNvvAKDVudPWKSWj6khTs9 tw/FX8QZPXv9JCfgoBjXuW2qGelbERX5phh5jU80DbosaPcUc4QvM5zbGHkvkC4vr+La lvFl5qpVXY7hRQ4kLzktYTqHQZgKl40gnFCJtJxn3qsaCypgwEJU2Gn5WkwzbZw3bdw4 4WK6r1pPKs9CcrYFBdsREslzqhk9gn3wRJ9PuLJ9gwG1pda4Y1OKmnMDKW8DgnoW5rGF qgjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=ckMDE6l+vmHFCjyRcftG7bVn0rwBAaWk1/FtyB2KM24=; b=jGswsu0a0tnf/omuUzJ8ac3ok+V7viI9fegN/LVakCUD0WwYNK2UXar7I+jEXpMm1l x9a0G9dE5MmxO/M9Cjg4gRUnUAPap+7GZ5GvQ0lBV2q7zudqnSCXNHy5xSSxp+mfGa+H IJDb6sN7AX+amiZcE6phsB8KSNQ6Go2hu6yS1f0DYzB6kvBrPwB9q+0gbm4TnVevQUyB HBYVqDfK8JADcUC7reKWL11z55GTkpgTD/22Bu8xOFof3iHLgdXxskI2Ab/MK/At+MaW hzHj7nq2KTzzfCMyvVQtXjggEPnV/0pBwLMiNu9EL4LCllM1dTNIU9Wij7Q11kc6MdcB Exwg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b10si132573edd.9.2021.03.23.14.01.19; Tue, 23 Mar 2021 14:01:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233488AbhCWVAg (ORCPT + 99 others); Tue, 23 Mar 2021 17:00:36 -0400 Received: from raptor.unsafe.ru ([5.9.43.93]:50172 "EHLO raptor.unsafe.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233473AbhCWVAN (ORCPT ); Tue, 23 Mar 2021 17:00:13 -0400 Received: from comp-core-i7-2640m-0182e6.redhat.com (ip-94-113-225-162.net.upcbroadband.cz [94.113.225.162]) by raptor.unsafe.ru (Postfix) with ESMTPSA id 1A8E540C95; Tue, 23 Mar 2021 21:00:11 +0000 (UTC) From: Alexey Gladkov To: LKML , Kernel Hardening , Linux Containers , linux-mm@kvack.org Cc: Alexey Gladkov , Andrew Morton , Christian Brauner , "Eric W . Biederman" , Jann Horn , Jens Axboe , Kees Cook , Linus Torvalds , Oleg Nesterov Subject: [PATCH v9 6/8] Reimplement RLIMIT_SIGPENDING on top of ucounts Date: Tue, 23 Mar 2021 21:59:15 +0100 Message-Id: <9c8b24db40a030347652ee58f44028ca1081acfa.1616533074.git.gladkov.alexey@gmail.com> X-Mailer: git-send-email 2.29.3 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.6.4 (raptor.unsafe.ru [5.9.43.93]); Tue, 23 Mar 2021 21:00:11 +0000 (UTC) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The rlimit counter is tied to uid in the user_namespace. This allows rlimit values to be specified in userns even if they are already globally exceeded by the user. However, the value of the previous user_namespaces cannot be exceeded. Signed-off-by: Alexey Gladkov --- fs/proc/array.c | 2 +- include/linux/sched/user.h | 1 - include/linux/signal_types.h | 4 ++- include/linux/user_namespace.h | 1 + kernel/fork.c | 1 + kernel/signal.c | 57 ++++++++++++++++------------------ kernel/ucount.c | 1 + kernel/user.c | 1 - kernel/user_namespace.c | 1 + 9 files changed, 34 insertions(+), 35 deletions(-) diff --git a/fs/proc/array.c b/fs/proc/array.c index bb87e4d89cd8..74b0ea4b7e38 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -284,7 +284,7 @@ static inline void task_sig(struct seq_file *m, struct task_struct *p) collect_sigign_sigcatch(p, &ignored, &caught); num_threads = get_nr_threads(p); rcu_read_lock(); /* FIXME: is this correct? */ - qsize = atomic_read(&__task_cred(p)->user->sigpending); + qsize = get_ucounts_value(task_ucounts(p), UCOUNT_RLIMIT_SIGPENDING); rcu_read_unlock(); qlim = task_rlimit(p, RLIMIT_SIGPENDING); unlock_task_sighand(p, &flags); diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h index 8a34446681aa..8ba9cec4fb99 100644 --- a/include/linux/sched/user.h +++ b/include/linux/sched/user.h @@ -12,7 +12,6 @@ */ struct user_struct { refcount_t __count; /* reference count */ - atomic_t sigpending; /* How many pending signals does this user have? */ #ifdef CONFIG_FANOTIFY atomic_t fanotify_listeners; #endif diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h index 68e06c75c5b2..34cb28b8f16c 100644 --- a/include/linux/signal_types.h +++ b/include/linux/signal_types.h @@ -13,6 +13,8 @@ typedef struct kernel_siginfo { __SIGINFO; } kernel_siginfo_t; +struct ucounts; + /* * Real Time signals may be queued. */ @@ -21,7 +23,7 @@ struct sigqueue { struct list_head list; int flags; kernel_siginfo_t info; - struct user_struct *user; + struct ucounts *ucounts; }; /* flags values. */ diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h index d0fea0306394..6e8736c7aa29 100644 --- a/include/linux/user_namespace.h +++ b/include/linux/user_namespace.h @@ -52,6 +52,7 @@ enum ucount_type { #endif UCOUNT_RLIMIT_NPROC, UCOUNT_RLIMIT_MSGQUEUE, + UCOUNT_RLIMIT_SIGPENDING, UCOUNT_COUNTS, }; diff --git a/kernel/fork.c b/kernel/fork.c index 85c6094f5a48..741f896c156e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -824,6 +824,7 @@ void __init fork_init(void) init_user_ns.ucount_max[UCOUNT_RLIMIT_NPROC] = task_rlimit(&init_task, RLIMIT_NPROC); init_user_ns.ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = task_rlimit(&init_task, RLIMIT_MSGQUEUE); + init_user_ns.ucount_max[UCOUNT_RLIMIT_SIGPENDING] = task_rlimit(&init_task, RLIMIT_SIGPENDING); #ifdef CONFIG_VMAP_STACK cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache", diff --git a/kernel/signal.c b/kernel/signal.c index f2a1b898da29..1b537d9de447 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -413,49 +413,44 @@ void task_join_group_stop(struct task_struct *task) static struct sigqueue * __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimit) { - struct sigqueue *q = NULL; - struct user_struct *user; - int sigpending; + struct sigqueue *q = kmem_cache_alloc(sigqueue_cachep, flags); - /* - * Protect access to @t credentials. This can go away when all - * callers hold rcu read lock. - * - * NOTE! A pending signal will hold on to the user refcount, - * and we get/put the refcount only when the sigpending count - * changes from/to zero. - */ - rcu_read_lock(); - user = __task_cred(t)->user; - sigpending = atomic_inc_return(&user->sigpending); - if (sigpending == 1) - get_uid(user); - rcu_read_unlock(); + if (likely(q != NULL)) { + bool overlimit; - if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) { - q = kmem_cache_alloc(sigqueue_cachep, flags); - } else { - print_dropped_signal(sig); - } - - if (unlikely(q == NULL)) { - if (atomic_dec_and_test(&user->sigpending)) - free_uid(user); - } else { INIT_LIST_HEAD(&q->list); q->flags = 0; - q->user = user; + + /* + * Protect access to @t credentials. This can go away when all + * callers hold rcu read lock. + */ + rcu_read_lock(); + q->ucounts = get_ucounts(task_ucounts(t)); + if (q->ucounts) { + overlimit = inc_rlimit_ucounts_and_test(q->ucounts, UCOUNT_RLIMIT_SIGPENDING, + 1, task_rlimit(t, RLIMIT_SIGPENDING)); + + if (override_rlimit || likely(!overlimit)) { + rcu_read_unlock(); + return q; + } + } + rcu_read_unlock(); } - return q; + print_dropped_signal(sig); + return NULL; } static void __sigqueue_free(struct sigqueue *q) { if (q->flags & SIGQUEUE_PREALLOC) return; - if (atomic_dec_and_test(&q->user->sigpending)) - free_uid(q->user); + if (q->ucounts) { + dec_rlimit_ucounts(q->ucounts, UCOUNT_RLIMIT_SIGPENDING, 1); + put_ucounts(q->ucounts); + } kmem_cache_free(sigqueue_cachep, q); } diff --git a/kernel/ucount.c b/kernel/ucount.c index 03ed158e810f..29976c8942f3 100644 --- a/kernel/ucount.c +++ b/kernel/ucount.c @@ -81,6 +81,7 @@ static struct ctl_table user_table[] = { UCOUNT_ENTRY("max_inotify_instances"), UCOUNT_ENTRY("max_inotify_watches"), #endif + { }, { }, { }, { } diff --git a/kernel/user.c b/kernel/user.c index 7f5ff498207a..6737327f83be 100644 --- a/kernel/user.c +++ b/kernel/user.c @@ -98,7 +98,6 @@ static DEFINE_SPINLOCK(uidhash_lock); /* root_user.__count is 1, for init task cred */ struct user_struct root_user = { .__count = REFCOUNT_INIT(1), - .sigpending = ATOMIC_INIT(0), .locked_shm = 0, .uid = GLOBAL_ROOT_UID, .ratelimit = RATELIMIT_STATE_INIT(root_user.ratelimit, 0, 0), diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index cc90d5203acf..df1bed32dd48 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -123,6 +123,7 @@ int create_user_ns(struct cred *new) } ns->ucount_max[UCOUNT_RLIMIT_NPROC] = rlimit(RLIMIT_NPROC); ns->ucount_max[UCOUNT_RLIMIT_MSGQUEUE] = rlimit(RLIMIT_MSGQUEUE); + ns->ucount_max[UCOUNT_RLIMIT_SIGPENDING] = rlimit(RLIMIT_SIGPENDING); ns->ucounts = ucounts; /* Inherit USERNS_SETGROUPS_ALLOWED from our parent */ -- 2.29.3