From: Michal Koutný
To: Eric Biederman, Alexey Gladkov
Cc: Kees Cook, linux-kernel@vger.kernel.org
Subject: [PATCH] ucounts: Do not allow RLIMIT_NPROC+1 tasks
Date: Fri, 4 Feb 2022 19:11:44 +0100
Message-Id: <20220204181144.24462-1-mkoutny@suse.com>

It was reported that v5.14 behaves differently when enforcing the
RLIMIT_NPROC limit: it allows one more task than before.
This is a consequence of commit 21d1c5e386bc ("Reimplement RLIMIT_NPROC
on top of ucounts"), which missed that the comparison in the forking
path must not be strict.

To accommodate the other existing RLIMIT_NPROC checks, the fix extends
the result domain of the ucount vs. limit comparison: it now returns a
signed value instead of a boolean.
Forks by, or uid changes to, a saturated user are denied.

(The other per-user RLIMIT_* limits already use the correct comparison.)

Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
Reported-by: TBD
Signed-off-by: Michal Koutný
---
 fs/exec.c                      |  2 +-
 include/linux/user_namespace.h |  2 +-
 kernel/fork.c                  |  2 +-
 kernel/sys.c                   |  2 +-
 kernel/ucount.c                | 11 +++++++----
 5 files changed, 11 insertions(+), 8 deletions(-)

This change breaks tools/testing/selftests/rlimits/rlimits-per-userns.c
between v5.14..v5.15-rc1~172^2. Commit 2863643fb8b9 ("set_user: add
capability check when rlimit(RLIMIT_NPROC) exceeds") is an inadvertent
"fix".

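
For readers who want to see the three-way result in isolation, here is a
minimal userspace sketch of the comparison this patch introduces. It is
not kernel code: struct model_ucounts, model_limit_cmp() and the two
model_*_denied() helpers are hypothetical stand-ins, the namespace
hierarchy is reduced to a parent pointer, and the capability and
INIT_USER checks of the real callers are left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one level of the ucounts hierarchy. */
struct model_ucounts {
	long value;                   /* current task count at this level */
	long ns_max;                  /* models ns->ucount_max[type]; applied to the parent level */
	struct model_ucounts *parent; /* models iter->ns->ucounts */
};

/*
 * Mirrors the shape of ucounts_limit_cmp() from the patch:
 *   > 0  some level is over its limit,
 *   == 0 some level sits exactly at its limit (and none is over),
 *   < 0  every level is below its limit.
 */
long model_limit_cmp(const struct model_ucounts *uc, unsigned long rlimit)
{
	const struct model_ucounts *iter;
	long max = rlimit > LONG_MAX ? LONG_MAX : (long)rlimit;
	long excess = LONG_MIN;

	for (iter = uc; iter; iter = iter->parent) {
		/* counts are assumed non-negative, so the subtraction fits */
		long diff = iter->value - max;

		if (diff > excess)
			excess = diff;
		if (excess > 0)
			return excess;
		max = iter->ns_max;
	}
	return excess;
}

/* Fork path after the patch: deny once the limit has been reached. */
int model_fork_denied(const struct model_ucounts *uc, unsigned long rlimit)
{
	return model_limit_cmp(uc, rlimit) >= 0;
}

/* Exec path: deny only when the limit is strictly exceeded. */
int model_exec_denied(const struct model_ucounts *uc, unsigned long rlimit)
{
	return model_limit_cmp(uc, rlimit) > 0;
}

/* Usage: a user with 4 tasks and RLIMIT_NPROC == 4 is saturated. */
void model_demo(void)
{
	struct model_ucounts user = { .value = 4, .ns_max = LONG_MAX, .parent = NULL };

	assert(model_limit_cmp(&user, 4) == 0);
	assert(model_fork_denied(&user, 4));   /* no fifth task */
	assert(!model_exec_denied(&user, 4));  /* exec is still allowed */
}
```

With a boolean "count > limit" result the saturated case above is
indistinguishable from "below the limit", which is how the extra task
slipped through in the forking path. Returning the signed excess lets
each caller choose between > 0 and >= 0.
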
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..fc598c2652b2 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1881,7 +1881,7 @@ static int do_execveat_common(int fd, struct filename *filename,
 	 * whether NPROC limit is still exceeded.
 	 */
 	if ((current->flags & PF_NPROC_EXCEEDED) &&
-	    is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
+	    ucounts_limit_cmp(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) > 0) {
 		retval = -EAGAIN;
 		goto out_ret;
 	}
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 33a4240e6a6f..9ccc336196f7 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -129,7 +129,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum ucount_type type, long v);
 bool dec_rlimit_ucounts(struct ucounts *ucounts, enum ucount_type type, long v);
 long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum ucount_type type);
 void dec_rlimit_put_ucounts(struct ucounts *ucounts, enum ucount_type type);
-bool is_ucounts_overlimit(struct ucounts *ucounts, enum ucount_type type, unsigned long max);
+long ucounts_limit_cmp(struct ucounts *ucounts, enum ucount_type type, unsigned long max);
 
 static inline void set_rlimit_ucount_max(struct user_namespace *ns,
 		enum ucount_type type, unsigned long max)
diff --git a/kernel/fork.c b/kernel/fork.c
index d75a528f7b21..7cb21a70737d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2022,7 +2022,7 @@ static __latent_entropy struct task_struct *copy_process(
 	DEBUG_LOCKS_WARN_ON(!p->softirqs_enabled);
 #endif
 	retval = -EAGAIN;
-	if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
+	if (ucounts_limit_cmp(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0) {
 		if (p->real_cred->user != INIT_USER &&
 		    !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
 			goto bad_fork_free;
diff --git a/kernel/sys.c b/kernel/sys.c
index ecc4cf019242..8ea20912103a 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -479,7 +479,7 @@ static int set_user(struct cred *new)
 	 * for programs doing set*uid()+execve() by harmlessly deferring the
 	 * failure to the execve() stage.
 	 */
-	if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) &&
+	if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 &&
 			new_user != INIT_USER &&
 			!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
 		current->flags |= PF_NPROC_EXCEEDED;
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 65b597431c86..53ccd96387dd 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -343,18 +343,21 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum ucount_type type)
 	return 0;
 }
 
-bool is_ucounts_overlimit(struct ucounts *ucounts, enum ucount_type type, unsigned long rlimit)
+long ucounts_limit_cmp(struct ucounts *ucounts, enum ucount_type type, unsigned long rlimit)
 {
 	struct ucounts *iter;
 	long max = rlimit;
+	long excess = LONG_MIN;
 	if (rlimit > LONG_MAX)
 		max = LONG_MAX;
 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-		if (get_ucounts_value(iter, type) > max)
-			return true;
+		/* we already WARN_ON negative ucounts, the subtraction result fits */
+		excess = max_t(long, excess, get_ucounts_value(iter, type) - max);
+		if (excess > 0)
+			return excess;
 		max = READ_ONCE(iter->ns->ucount_max[type]);
 	}
-	return false;
+	return excess;
 }
 
 static __init int user_namespace_sysctl_init(void)
-- 
2.34.1