Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp613319rwb; Thu, 6 Oct 2022 01:53:12 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4gz+uVIKh/0/E+j7KtU8d9ERUnQhSjuBruwnfqpLa4iSHR6iyx7m3j6ZG6ctfSa2qk05IB X-Received: by 2002:a63:4b02:0:b0:43c:1909:d350 with SMTP id y2-20020a634b02000000b0043c1909d350mr3526716pga.188.1665046392221; Thu, 06 Oct 2022 01:53:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1665046392; cv=none; d=google.com; s=arc-20160816; b=yFf8rqejoA/M0PDv4qwel8pGy+a+JYMo1PTWjHzjYUBbt37dqqxCr2SOsgz3bllQ1D +KNjmSttdN3ZdwQ+HvBc3YV/Z1WkDzys3LGQodlwGyuzscBirruX4ze7XNxzZkg+dULb kgaTFZDSKy5uvRHhkLnP8fV8na6ShTKUcMOZa17pRjfpLHmB//QXgF/ZV1kQ653OUx/6 jQuQImnR+AUQh+QjKfckpmiv1EzU5B6r6FVZLlqLhM6t+vTb9eDvnpaANZWuURct5A0x xv/RGjKRQ8h7OsKpgz6zZTlY8jItHh/U/izgMcqv/yLbVERKvsVPj8rr5gLp1+psp926 tcHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=6chMDy7sdAp7fbcB3rx+WX3mNPHiNX0s/cFRhM3ieOg=; b=U2/rwlzTKQPj5BRYwDVvEfA9pw1uc/lkW735rc/jaK1RTDJqDbjzEmZ1yvbbB2Sax/ U3Y35VDBX8dwzftwyXgfwiQ+6hL/DvqljkBhLcpfUlTABaooKPwQycpBI+XfW8T2ihAJ 1rkYGlhoQzPnF/KBj93kuKuVUhCqVRGRNo8G7/T9oqj8956+EZAWwJR+CvIY5v1wwBT7 xa7Oxie7JX3KUMQwk7r7PRGOa58QRY4vXZYkbZu73SPJhIYFN2owbEqX/sfojAoHCNsH hGB02PMPmdVw6FTLvEvIQWzpr5oio1vSuTCzikJyHtZoh/KuVFVb9W1z9etwwyiQmaIL gSfw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=LEDgMUcW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k10-20020a63560a000000b0043a20d3388esi18367883pgb.321.2022.10.06.01.52.59; Thu, 06 Oct 2022 01:53:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=LEDgMUcW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230264AbiJFI2b (ORCPT + 99 others); Thu, 6 Oct 2022 04:28:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229766AbiJFI1v (ORCPT ); Thu, 6 Oct 2022 04:27:51 -0400 Received: from mail-pj1-x1035.google.com (mail-pj1-x1035.google.com [IPv6:2607:f8b0:4864:20::1035]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3BC61183A for ; Thu, 6 Oct 2022 01:27:42 -0700 (PDT) Received: by mail-pj1-x1035.google.com with SMTP id 70so1131542pjo.4 for ; Thu, 06 Oct 2022 01:27:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date; bh=6chMDy7sdAp7fbcB3rx+WX3mNPHiNX0s/cFRhM3ieOg=; b=LEDgMUcWBRzmi9Mo0+9qOenVfeL77iuZJWbz/l6KBHO9OnXx6oo4g63CPC4mRrRTek VdFOJa1Yb5jMjiimaPf2n+eeRbuQO4DDWl5V+/M07tajLIckBZFpcXq+ubCsS3ZX/aht M8noy9tEF8k2Cvub9V50VkmCeEeuaJzybH8cY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=6chMDy7sdAp7fbcB3rx+WX3mNPHiNX0s/cFRhM3ieOg=; b=nlVmrOoLnwgyp4vIJGbWeI1EZ0hV6jWQ+2+6uNMa3Av680gcuwYYwOSo6dcSMub8HA ZmsznNwOq3x2YwCpzj0CVODq7kauyXAlJv5eKsd2vU7E7qEAdKxHxt2didkLwwLob4LJ w4qzV1sf6gyYJFsq1ycyrzAMIZquFKYCHRPuUsMusg7lCRNzRXQmzAzsNles0YrtlytH 4FTpTh/6kgy7kzjOUXZrcF/zPT50tFCGVe40HIxxt3Axx664G0johXqVb/NsSxzu04b8 dtJ8k3hU6dwwNHiu7quFtoJ34UudFra3z2cUQnzQAlgj/IzXoM2+/ZmduvCgntAaYbHM +qfw== X-Gm-Message-State: ACrzQf0ZVk3lxoiBlDQ6wNg7bZsGpukijysqt20SIF5pgfOFQGhxUuRp 2KqnvpwibjW+LqW1/95ynbKbQg== X-Received: by 2002:a17:90b:4b03:b0:20a:c33f:ca47 with SMTP id lx3-20020a17090b4b0300b0020ac33fca47mr4129164pjb.10.1665044861286; Thu, 06 Oct 2022 01:27:41 -0700 (PDT) Received: from www.outflux.net (smtp.outflux.net. [198.145.64.163]) by smtp.gmail.com with ESMTPSA id k6-20020a170902c40600b0017f8094a52asm2330200plk.29.2022.10.06.01.27.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Oct 2022 01:27:39 -0700 (PDT) From: Kees Cook To: Eric Biederman Cc: Kees Cook , Jorge Merlino , Alexander Viro , "Christian Brauner (Microsoft)" , Thomas Gleixner , Andy Lutomirski , Sebastian Andrzej Siewior , Andrew Morton , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, John Johansen , Paul Moore , James Morris , "Serge E. Hallyn" , Stephen Smalley , Eric Paris , Richard Haines , Casey Schaufler , Xin Long , "David S. Miller" , Todd Kjos , Ondrej Mosnacek , Prashanth Prahlad , Micah Morton , Fenghua Yu , Andrei Vagin , linux-kernel@vger.kernel.org, apparmor@lists.ubuntu.com, linux-security-module@vger.kernel.org, selinux@vger.kernel.org, linux-hardening@vger.kernel.org Subject: [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec Date: Thu, 6 Oct 2022 01:27:34 -0700 Message-Id: <20221006082735.1321612-2-keescook@chromium.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221006082735.1321612-1-keescook@chromium.org> References: <20221006082735.1321612-1-keescook@chromium.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=8132; h=from:subject; bh=pn442Q+K4NVJzK4Ogn1OJE0Kf/fr+fLN7d/sv7w7wIQ=; b=owEBbQKS/ZANAwAKAYly9N/cbcAmAcsmYgBjPpF3G5PQJoJ0+WU4egcCbptwwSoJJwtYzGLz8/uw x6+wiOSJAjMEAAEKAB0WIQSlw/aPIp3WD3I+bhOJcvTf3G3AJgUCYz6RdwAKCRCJcvTf3G3AJjwwD/ 9AqmiFkR2fsSZYPEIflwSZT7OI5y0/NnvllzZijYCc4qJTyuATyZSV+rAw52JGZI+UqhwIWo7Rw3tL 0W6e8cdMFRts0lRnFAPBRbdCsvl/YhcO97MwHkXJiO8t7lYOJ49tSGrhJYFwb3zq1CihY8b5VjSGVK ynVnjgw0Aoj4C7QEy470dR/keFQc7ubnCJGalsUwSmj3oT/WleAcuy4/N4YretzlLd/s0OYeU7oGzc CP/TmfFOPNaB4UkhfILY+ZcowRsqvPz4GX1CF0syEVrwtJcmQd5H8N60VDdQfc4Ge2p07/ysSRog3W VSb84rDk8sqk3Vx3lj3cK2Z7Z60xZZ7cG9+PsJjD6EasD3k/3MQAOzFZLqlhMr72dSbuYf3V1M70xJ 8dliwMLT2aH74NOjxeLjMWsO+JJYjnmOabVIvFRzSTU3zlNwm0vu7JJX6W/MAfHjtVsDCDyDde4c3A Kqn9wIBt2UjMVZCorgxKTKEj8LHwC5OdxCNdtbIAnB4YqGdsNmQbWpwX45m4i6Yl3XLbFVbSzNT5xC 3939YiCArJArhAqIkw8Fk0jRBqH1ogvt1e6kNTuu94/O5w/TRidPhivZndR2gjgX4PTr5duo/9VGSq kciNr8P/ei//btCcC+4kU3tOLQ15QtK11gc5XyK8JReuFgDI99wTCnGabozQ== X-Developer-Key: i=keescook@chromium.org; a=openpgp; fpr=A5C3F68F229DD60F723E6E138972F4DFDC6DC026 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The check_unsafe_exec() counting of n_fs would not add up under a heavily threaded process trying to perform a suid exec, causing the suid portion to fail. This counting error appears to be unneeded, but to catch any possible conditions, explicitly unshare fs_struct on exec, if it ends up needing to happen. This means fs->in_exec must be removed (since it would never get cleared in the old copy), and is specifically no longer needed. See also commit 498052bba55e ("New locking/refcounting for fs_struct"). This additionally allows the "in_exec" member to be dropped, showing an 8 bytes savings per fs_struct on 64-bit. Before: struct fs_struct { int users; /* 0 4 */ spinlock_t lock; /* 4 4 */ seqcount_spinlock_t seq; /* 8 4 */ int umask; /* 12 4 */ int in_exec; /* 16 4 */ /* XXX 4 bytes hole, try to pack */ struct path root; /* 24 16 */ struct path pwd; /* 40 16 */ /* size: 56, cachelines: 1, members: 7 */ /* sum members: 52, holes: 1, sum holes: 4 */ /* last cacheline: 56 bytes */ }; After: struct fs_struct { int users; /* 0 4 */ spinlock_t lock; /* 4 4 */ seqcount_spinlock_t seq; /* 8 4 */ int umask; /* 12 4 */ struct path root; /* 16 16 */ struct path pwd; /* 32 16 */ /* size: 48, cachelines: 1, members: 6 */ /* last cacheline: 48 bytes */ }; Reported-by: Jorge Merlino Link: https://lore.kernel.org/lkml/20220910211215.140270-1-jorge.merlino@canonical.com/ Cc: Eric Biederman Cc: Alexander Viro Cc: "Christian Brauner (Microsoft)" Cc: Thomas Gleixner Cc: Andy Lutomirski Cc: Sebastian Andrzej Siewior Cc: Andrew Morton Cc: linux-mm@kvack.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kees Cook --- fs/exec.c | 9 +++--- fs/fs_struct.c | 1 - include/linux/fdtable.h | 1 + include/linux/fs_struct.h | 1 - kernel/fork.c | 62 ++++++++++++++++++++++++++------------- 5 files changed, 48 insertions(+), 26 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 902bce45b116..7d5f63f03c58 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1272,6 +1272,11 @@ int begin_new_exec(struct linux_binprm * bprm) if (retval) goto out; + /* Ensure the fs_struct is not shared. */ + retval = unshare_fs(); + if (retval) + goto out; + /* * Must be called _before_ exec_mmap() as bprm->mm is * not visible until then. This also enables the update @@ -1583,8 +1588,6 @@ static void check_unsafe_exec(struct linux_binprm *bprm) if (p->fs->users > n_fs) bprm->unsafe |= LSM_UNSAFE_SHARE; - else - p->fs->in_exec = 1; spin_unlock(&p->fs->lock); } @@ -1843,7 +1846,6 @@ static int bprm_execve(struct linux_binprm *bprm, goto out; /* execve succeeded */ - current->fs->in_exec = 0; current->in_execve = 0; rseq_execve(current); acct_update_integrals(current); @@ -1861,7 +1863,6 @@ static int bprm_execve(struct linux_binprm *bprm, force_fatal_sig(SIGSEGV); out_unmark: - current->fs->in_exec = 0; current->in_execve = 0; return retval; diff --git a/fs/fs_struct.c b/fs/fs_struct.c index 04b3f5b9c629..c27c67023d01 100644 --- a/fs/fs_struct.c +++ b/fs/fs_struct.c @@ -115,7 +115,6 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old) /* We don't need to lock fs - think why ;-) */ if (fs) { fs->users = 1; - fs->in_exec = 0; spin_lock_init(&fs->lock); seqcount_spinlock_init(&fs->seq, &fs->lock); fs->umask = old->umask; diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index e066816f3519..fbfb89ee3bda 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -117,6 +117,7 @@ struct task_struct; void put_files_struct(struct files_struct *fs); int unshare_files(void); +int unshare_fs(void); struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy; void do_close_on_exec(struct files_struct *); int iterate_fd(struct files_struct *, unsigned, diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h index 783b48dedb72..08b82a90ceae 100644 --- a/include/linux/fs_struct.h +++ b/include/linux/fs_struct.h @@ -11,7 +11,6 @@ struct fs_struct { spinlock_t lock; seqcount_spinlock_t seq; int umask; - int in_exec; struct path root, pwd; } __randomize_layout; diff --git a/kernel/fork.c b/kernel/fork.c index b4a799d9c50f..53b7248f7a4b 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1589,10 +1589,6 @@ static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) if (clone_flags & CLONE_FS) { /* tsk->fs is already what we want */ spin_lock(&fs->lock); - if (fs->in_exec) { - spin_unlock(&fs->lock); - return -EAGAIN; - } fs->users++; spin_unlock(&fs->lock); return 0; @@ -3070,10 +3066,9 @@ static int check_unshare_flags(unsigned long unshare_flags) return 0; } -/* - * Unshare the filesystem structure if it is being shared - */ -static int unshare_fs(unsigned long unshare_flags, struct fs_struct **new_fsp) +/* Allocate an fs_struct if it is currently shared and CLONE_FS requested. */ +static int unshare_fs_alloc(unsigned long unshare_flags, + struct fs_struct **new_fsp) { struct fs_struct *fs = current->fs; @@ -3091,6 +3086,41 @@ static int unshare_fs(unsigned long unshare_flags, struct fs_struct **new_fsp) return 0; } +/* Swap out fs_struct, returning old fs if it needs freeing. */ +static void unshare_fs_finalize(struct fs_struct **new_fsp) +{ + struct task_struct *task = current; + struct fs_struct *fs = task->fs; + + spin_lock(&fs->lock); + task->fs = *new_fsp; + if (--fs->users) + *new_fsp = NULL; + else + *new_fsp = fs; + spin_unlock(&fs->lock); +} + +/* + * Unshare the filesystem structure if it is being shared + */ +int unshare_fs(void) +{ + struct fs_struct *new_fs = NULL; + int error; + + error = unshare_fs_alloc(CLONE_FS, &new_fs); + if (error || !new_fs) + return error; + + unshare_fs_finalize(&new_fs); + + if (new_fs) + free_fs_struct(new_fs); + + return 0; +} + /* * Unshare file descriptor table if it is being shared */ @@ -3120,7 +3150,7 @@ int unshare_fd(unsigned long unshare_flags, unsigned int max_fds, */ int ksys_unshare(unsigned long unshare_flags) { - struct fs_struct *fs, *new_fs = NULL; + struct fs_struct *new_fs = NULL; struct files_struct *new_fd = NULL; struct cred *new_cred = NULL; struct nsproxy *new_nsproxy = NULL; @@ -3159,7 +3189,7 @@ int ksys_unshare(unsigned long unshare_flags) */ if (unshare_flags & (CLONE_NEWIPC|CLONE_SYSVSEM)) do_sysvsem = 1; - err = unshare_fs(unshare_flags, &new_fs); + err = unshare_fs_alloc(unshare_flags, &new_fs); if (err) goto bad_unshare_out; err = unshare_fd(unshare_flags, NR_OPEN_MAX, &new_fd); @@ -3197,16 +3227,8 @@ int ksys_unshare(unsigned long unshare_flags) task_lock(current); - if (new_fs) { - fs = current->fs; - spin_lock(&fs->lock); - current->fs = new_fs; - if (--fs->users) - new_fs = NULL; - else - new_fs = fs; - spin_unlock(&fs->lock); - } + if (new_fs) + unshare_fs_finalize(&new_fs); if (new_fd) swap(current->files, new_fd); -- 2.34.1