Received: by 2002:a05:6358:489b:b0:bb:da1:e618 with SMTP id x27csp3545231rwn; Sat, 10 Sep 2022 14:17:28 -0700 (PDT) X-Google-Smtp-Source: AA6agR6HaH9gM0PuHF1SS+bjeF9fLNMmp2fqLpO0eEHhEHoLaHDRfiWKtk5cTuVzUj4/4GO5ywmR X-Received: by 2002:a17:90b:1bc2:b0:200:a97b:4ae5 with SMTP id oa2-20020a17090b1bc200b00200a97b4ae5mr16840369pjb.147.1662844647964; Sat, 10 Sep 2022 14:17:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662844647; cv=none; d=google.com; s=arc-20160816; b=yIXorBKO8cXk6Y1iat/lOWVEWstBhr7/oLrHNUb8GLAWkRMTqlINRujrQ+2EUPvjNW OB5/IDjqNN7or3UU1vb3bt0TqE1BaWo8LR9I5fdd7CSdZbS3S/OTPqhXxds2dUghRp4g G+phjdmnw5aCwzYgD/2xZ5SnhopY7aTVgF8vfyT5SWXT1rcCQpbRgdukzt8IHdiqkFRQ VZ4k2Uo6ZMUwQxpCoFYrSSl82xvhQWYu0WNhhSkgg/COHrgctyAR7mNrPXs6TYBI+C1v qIjUmyEKYzXSx7fR48YqNUBfXgRfmkp/skx3EZAQkkvkLJQ85JYr0zTfP6gkobI5+sj6 hRJA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=YIXqxtSbGe6vWCX7g0eeSDhgSl7mx2fz9QQkdP6SCjU=; b=jMPjYtFlXflvwtp5MZG0+mgdlkHZcjxnWIEr/hUiYvuhPD3o49JaRoXWaa2oOb+vCz 36WKUKLPsq6RgBf25i5cKjrbAjFQfNZCHPpM6V/hDBiME3hYJb4P5PF4qMC5eWPHSX7D lmKn7x2fnc8yd3DxvkUwjP7YnYsgs2vSjgv6x6alk43C07xjfCKFL+h1a5Bl8XYC2xxE dAvvRqMJn8sLjscF0GuNztuzqRwdITlnEueJhlBvXcH+SGeaAJhKNKws4E6+c7cMXlES kV11zTsRVMXkzZH7RdHpeh9NsFoADvcVV4otg7ZcUSxmylo1emJd8AOSQzDl7gjPG1k1 gcuA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b=OkyKYvAp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l20-20020a637c54000000b00434d8692854si4460844pgn.541.2022.09.10.14.17.16; Sat, 10 Sep 2022 14:17:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b=OkyKYvAp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229697AbiIJVNo (ORCPT + 99 others); Sat, 10 Sep 2022 17:13:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229629AbiIJVNn (ORCPT ); Sat, 10 Sep 2022 17:13:43 -0400 Received: from smtp-relay-internal-1.canonical.com (smtp-relay-internal-1.canonical.com [185.125.188.123]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EDD64B4AA for ; Sat, 10 Sep 2022 14:13:41 -0700 (PDT) Received: from mail-ot1-f72.google.com (mail-ot1-f72.google.com [209.85.210.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 99822402D4 for ; Sat, 10 Sep 2022 21:13:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1662844417; bh=YIXqxtSbGe6vWCX7g0eeSDhgSl7mx2fz9QQkdP6SCjU=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=OkyKYvApXY8hT4qnfzeHHlyqCeSGBUEbF3qpXIuQ1qHxBzhrUnzF9N4Nf2FtWFfCN LZ8LV7F8IP19idSEy99YBXzq45XVWnE6fYzZ8lyPATDHiJJ2SFr66jjr2aeOxsmqEo cXZbBo/edTqaPSUQkCyk174Ue9duQJZgqiCAD2QNh8i+ubcI/7ITN/TDwjxZR7VGb9 ha03RNXI+jGy9gBhGLj7x4ukryOamOdazzrehgECajzWZCKR1UbUiUXbi3EXpPLeYc r85AGysjJxJSMsolEy7jSidSafYDJCahxZGTRnH37zb9kORyoAzCpozCZkaRZf7yJi p+fd9I+bka5+Q== Received: by mail-ot1-f72.google.com with SMTP id s24-20020a0568301e1800b0063b341613f2so2642110otr.19 for ; Sat, 10 Sep 2022 14:13:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date; bh=YIXqxtSbGe6vWCX7g0eeSDhgSl7mx2fz9QQkdP6SCjU=; b=SiWWe9ew4UsVc3fcwlPQ+7PyVspQdMO6U3x9D6SxehLVUTZ/H2ReDuqbaNsaBgmwUx CHO4dG3ckTuaMGXNj3p4mF9a3+fCKS/6oHJdwwrIqMZ5LA4wb+hK2541vWw2L+SHf6U2 z2UG6EISo5Uev+RMvtSBgiRfIhgthOy0g2RjDDoiobYIEl83spId8TGTg3wfIevbQfim lgKFkkakjiSNcYv2516oIuLIXEVw/cWraO58EEn6DSP4FnRmUuJCb+QWfXGwMiFjvYHj Swd58TmVfA5DdwlW5tl+sa74tIRSOjW2bH5TKcax575NPvdlWAjspA1nN4pquLPE6m9G UukA== X-Gm-Message-State: ACgBeo2YCX/0RLTXRYcvyajfuraqxpVz14uwSf8+8EciTP/MaPyDa7qW o/k98+Ri+hyf2oPWRF6KdZ14i0Hnb8v6D9SA6T25ops9ieoDy94Mk4j6W95pYDtGENg01TX3YRN Jh0EiqqoH/gs7hhpVPoCLn/YuPh4yErbtj3iMv/AsBw== X-Received: by 2002:a05:6830:1f2e:b0:655:ca83:ddf2 with SMTP id e14-20020a0568301f2e00b00655ca83ddf2mr1151079oth.235.1662844414871; Sat, 10 Sep 2022 14:13:34 -0700 (PDT) X-Received: by 2002:a05:6830:1f2e:b0:655:ca83:ddf2 with SMTP id e14-20020a0568301f2e00b00655ca83ddf2mr1151054oth.235.1662844414569; Sat, 10 Sep 2022 14:13:34 -0700 (PDT) Received: from localhost.localdomain ([2001:67c:1562:8007::aac:4084]) by smtp.gmail.com with ESMTPSA id z14-20020a056870e30e00b0012769122387sm2892567oad.54.2022.09.10.14.13.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 10 Sep 2022 14:13:34 -0700 (PDT) From: Jorge Merlino To: Alexander Viro , Eric Biederman , Kees Cook , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Daniel Bristot de Oliveira , Valentin Schneider Cc: Jorge Merlino , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH] Fix race condition when exec'ing setuid files Date: Sat, 10 Sep 2022 18:12:14 -0300 Message-Id: <20220910211215.140270-1-jorge.merlino@canonical.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch fixes a race condition in check_unsafe_exec when a heavily threaded program tries to exec a setuid file. check_unsafe_exec counts the number of threads sharing the same fs_struct and compares it to the total number of users of the fs_struct by looking at its users counter. If there are more users than process threads using it the setuid exec fails with LSM_UNSAFE_SHARE. The problem is that, during the kernel_clone code execution, the fs_struct users counter is incremented before the new thread is added to the thread_group list. So there is a race when the counter has been incremented but the thread is not visible to the while_each_tread loop in check_unsafe_exec. This patch sort of fixes this by setting a process flag to the parent process during the time this race is possible. Thus, if a process is forking, it counts an extra user fo the fs_struct as the counter might be incremented before the thread is visible. But this is not great as this could generate the opposite problem as there may be an external process sharing the fs_struct that is masked by some thread that is being counted twice. I submit this patch just as an idea but mainly I want to introduce this issue and see if someone comes up with a better solution. This is a simple code to reproduce this issue: $ cat Makefile ALL=a b all: $(ALL) a: LDFLAGS=-pthread b: b.c $(CC) b.c -o b sudo chown root:root b sudo chmod u+s b test: for I in $$(seq 1000); do echo $I; ./a ; done clean: rm -vf $(ALL) $ cat a.c void *nothing(void *p) { return NULL; } void *target(void *p) { for (;;) { pthread_t t; if (pthread_create(&t, NULL, nothing, NULL) == 0) pthread_join(t, NULL); } return NULL; } int main(void) { struct timespec tv; int i; for (i = 0; i < 10; i++) { pthread_t t; pthread_create(&t, NULL, target, NULL); } tv.tv_sec = 0; tv.tv_nsec = 100000; nanosleep(&tv, NULL); if (execl("./b", "./b", NULL) < 0) perror("execl"); return 0; } $ cat b.c int main(void) { const uid_t euid = geteuid(); if (euid != 0) { printf("Failed, got euid %d (expecting 0)\n", euid); return 1; } return 0; } $ make make cc -pthread a.c -o a cc b.c -o b sudo chown root:root b sudo chmod u+s b $ make test Without this fix, one will see 'Failed, got euid 1000 (expecting 0)' messages --- fs/exec.c | 2 ++ include/linux/sched.h | 1 + kernel/fork.c | 3 +++ 3 files changed, 6 insertions(+) diff --git a/fs/exec.c b/fs/exec.c index 9a5ca7b82bfc..a6f949a899d5 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1581,6 +1581,8 @@ static void check_unsafe_exec(struct linux_binprm *bprm) while_each_thread(p, t) { if (t->fs == p->fs) n_fs++; + if (t->flags & PF_IN_FORK) + n_fs++; } rcu_read_unlock(); diff --git a/include/linux/sched.h b/include/linux/sched.h index e7b2f8a5c711..f307165a434a 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1722,6 +1722,7 @@ extern struct pid *cad_pid; * I am cleaning dirty pages from some other bdi. */ #define PF_KTHREAD 0x00200000 /* I am a kernel thread */ #define PF_RANDOMIZE 0x00400000 /* Randomize virtual address space */ +#define PF_IN_FORK 0x02000000 /* Process is forking, prevents race condition on fs_struct users value */ #define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_mask */ #define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */ #define PF_MEMALLOC_PIN 0x10000000 /* Allocation context constrained to zones which allow long term pinning. */ diff --git a/kernel/fork.c b/kernel/fork.c index 8a9e92068b15..54e1e1fbe0bd 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2245,6 +2245,7 @@ static __latent_entropy struct task_struct *copy_process( retval = copy_files(clone_flags, p); if (retval) goto bad_fork_cleanup_semundo; + current->flags |= PF_IN_FORK; retval = copy_fs(clone_flags, p); if (retval) goto bad_fork_cleanup_files; @@ -2474,6 +2475,7 @@ static __latent_entropy struct task_struct *copy_process( attach_pid(p, PIDTYPE_PID); nr_threads++; } + current->flags &= ~PF_IN_FORK; total_forks++; hlist_del_init(&delayed.node); spin_unlock(¤t->sighand->siglock); @@ -2556,6 +2558,7 @@ static __latent_entropy struct task_struct *copy_process( spin_lock_irq(¤t->sighand->siglock); hlist_del_init(&delayed.node); spin_unlock_irq(¤t->sighand->siglock); + current->flags &= ~PF_IN_FORK; return ERR_PTR(retval); } -- 2.34.1