From: Yu-cheng Yu
To: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org,
 linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
 Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
 Eugene Syromiatnikov, Florian Weimer, "H.J. Lu", Jann Horn,
 Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
 Pavel Machek, Peter Zijlstra, Randy Dunlap, "Ravi V. Shankar",
 Vedvyas Shanbhogue, Dave Martin, Weijiang Yang, Pengfei Xu
Cc: Yu-cheng Yu
Subject: [PATCH v19 23/25] x86/cet/shstk: Handle thread shadow stack
Date: Wed, 3 Feb 2021 14:55:45 -0800
Message-Id: <20210203225547.32221-24-yu-cheng.yu@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20210203225547.32221-1-yu-cheng.yu@intel.com>
References: <20210203225547.32221-1-yu-cheng.yu@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

For a pthread child, the kernel allocates a new shadow stack at clone
time and frees it on thread exit.

The alternative would be for the kernel to complete the clone syscall
with the child's shadow stack pointer set to NULL and let the child
thread allocate a shadow stack for itself. That approach has two
problems: it is not compatible with existing code that makes inline
system calls, and it cannot handle a signal that arrives before the
child has allocated its shadow stack.

A 64-bit thread shadow stack has a size of min(RLIMIT_STACK, 4 GB). A
compat-mode thread shadow stack has a quarter of that size,
min(RLIMIT_STACK, 4 GB) / 4, which allows more threads to run in a
32-bit address space.

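As a rough illustration of the sizing rule, here is a minimal user-space
sketch of the arithmetic. It is not kernel code: thread_shstk_size() and
the SKETCH_* constants are hypothetical names introduced for this
example, and the 4 KiB page size is an assumption (the kernel uses
PAGE_SIZE). The round-up to a whole page follows the round_up() call in
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel code uses PAGE_SIZE instead. */
#define SKETCH_PAGE_SIZE 4096ULL
/* 4 GB cap on the shadow stack size, as stated in the patch. */
#define SKETCH_SHSTK_CAP (1ULL << 32)

/* The mask-based round-up below only works for power-of-two sizes. */
static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/*
 * Hypothetical helper: returns the shadow stack size for a new thread,
 * given the RLIMIT_STACK value and whether the thread is compat-mode.
 */
static uint64_t thread_shstk_size(uint64_t rlimit_stack, bool compat)
{
	uint64_t size = rlimit_stack < SKETCH_SHSTK_CAP ?
			rlimit_stack : SKETCH_SHSTK_CAP;

	/* Compat-mode threads get a quarter of the capped size. */
	if (compat)
		size /= 4;

	/* Round up to a whole number of pages. */
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

With an 8 MiB stack limit this gives 8 MiB for a 64-bit thread and 2 MiB
for a compat-mode thread. With an unlimited stack the cap applies, which
gives 4 GiB and 1 GiB respectively.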
Signed-off-by: Yu-cheng Yu
---
 arch/x86/include/asm/cet.h         |  3 ++
 arch/x86/include/asm/mmu_context.h |  3 ++
 arch/x86/kernel/cet.c              | 44 ++++++++++++++++++++++++++++++
 arch/x86/kernel/process.c          |  8 ++++++
 4 files changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
index 73435856ce54..ec4b5e62d0ce 100644
--- a/arch/x86/include/asm/cet.h
+++ b/arch/x86/include/asm/cet.h
@@ -18,12 +18,15 @@ struct cet_status {
 
 #ifdef CONFIG_X86_CET
 int cet_setup_shstk(void);
+int cet_setup_thread_shstk(struct task_struct *p, unsigned long clone_flags);
 void cet_disable_shstk(void);
 void cet_free_shstk(struct task_struct *p);
 int cet_verify_rstor_token(bool ia32, unsigned long ssp, unsigned long *new_ssp);
 void cet_restore_signal(struct sc_ext *sc);
 int cet_setup_signal(bool ia32, unsigned long rstor, struct sc_ext *sc);
 #else
+static inline int cet_setup_thread_shstk(struct task_struct *p,
+					 unsigned long clone_flags) { return 0; }
 static inline void cet_disable_shstk(void) {}
 static inline void cet_free_shstk(struct task_struct *p) {}
 static inline void cet_restore_signal(struct sc_ext *sc) { return; }
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 27516046117a..e90bd2ee8498 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -11,6 +11,7 @@
 
 #include
 #include
+#include
 #include
 
 extern atomic64_t last_mm_ctx_id;
@@ -146,6 +147,8 @@ do { \
 #else
 #define deactivate_mm(tsk, mm) \
 do { \
+	if (!tsk->vfork_done) \
+		cet_free_shstk(tsk); \
 	load_gs_index(0); \
 	loadsegment(fs, 0); \
 } while (0)
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index c3da4f59bd17..feb466dc2ea8 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -172,6 +172,50 @@ int cet_setup_shstk(void)
 	return 0;
 }
 
+int cet_setup_thread_shstk(struct task_struct *tsk, unsigned long clone_flags)
+{
+	unsigned long addr, size;
+	struct cet_user_state *state;
+	struct cet_status *cet = &tsk->thread.cet;
+
+	if (!cet->shstk_size)
+		return 0;
+
+	if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM)
+		return 0;
+
+	state = get_xsave_addr(&tsk->thread.fpu.state.xsave,
+			       XFEATURE_CET_USER);
+
+	if (!state)
+		return -EINVAL;
+
+	/* Cap shadow stack size to 4 GB */
+	size = min(rlimit(RLIMIT_STACK), 1UL << 32);
+
+	/*
+	 * Compat-mode pthreads share a limited address space.
+	 * If each function call takes an average of four slots
+	 * stack space, allocate 1/4 of stack size for shadow stack.
+	 */
+	if (in_compat_syscall())
+		size /= 4;
+	size = round_up(size, PAGE_SIZE);
+	addr = alloc_shstk(size, 0);
+
+	if (IS_ERR_VALUE(addr)) {
+		cet->shstk_base = 0;
+		cet->shstk_size = 0;
+		return PTR_ERR((void *)addr);
+	}
+
+	fpu__prepare_write(&tsk->thread.fpu);
+	state->user_ssp = (u64)(addr + size);
+	cet->shstk_base = addr;
+	cet->shstk_size = size;
+	return 0;
+}
+
 void cet_disable_shstk(void)
 {
 	struct cet_status *cet = &current->thread.cet;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 145a7ac0c19a..3af6b36e1a5c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 
 #include "process.h"
 
@@ -109,6 +110,7 @@ void exit_thread(struct task_struct *tsk)
 
 	free_vm86(t);
 
+	cet_free_shstk(tsk);
 	fpu__drop(fpu);
 }
 
@@ -181,6 +183,12 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 	if (clone_flags & CLONE_SETTLS)
 		ret = set_new_tls(p, tls);
 
+#ifdef CONFIG_X86_64
+	/* Allocate a new shadow stack for pthread */
+	if (!ret)
+		ret = cet_setup_thread_shstk(p, clone_flags);
+#endif
+
 	if (!ret && unlikely(test_tsk_thread_flag(current, TIF_IO_BITMAP)))
 		io_bitmap_share(p);
 
-- 
2.21.0
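
A note on the clone_flags test in cet_setup_thread_shstk(): the check
(clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM returns early for
every case except "CLONE_VM set and CLONE_VFORK clear". The user-space
sketch below restates that test as a predicate. needs_new_shstk() and
the SKETCH_* names are hypothetical and exist only for this example; the
numeric flag values are copied from the Linux UAPI sched.h so that the
sketch builds without kernel headers. The separate shstk_size check in
the patch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Values of CLONE_VM, CLONE_VFORK and CLONE_THREAD in the Linux UAPI. */
#define SKETCH_CLONE_VM		0x00000100UL
#define SKETCH_CLONE_VFORK	0x00004000UL
#define SKETCH_CLONE_THREAD	0x00010000UL

static_assert((SKETCH_CLONE_VM & SKETCH_CLONE_VFORK) == 0,
	      "flags must be distinct bits");

/*
 * Hypothetical predicate: true when the child shares the parent's
 * address space (CLONE_VM) and the parent is not suspended until the
 * child exits or execs (no CLONE_VFORK). Only then would the child
 * collide with the parent on a shared shadow stack.
 */
static bool needs_new_shstk(unsigned long clone_flags)
{
	return (clone_flags & (SKETCH_CLONE_VFORK | SKETCH_CLONE_VM)) ==
	       SKETCH_CLONE_VM;
}
```

Under this reading a plain fork (neither flag) gets its own copy of the
address space and needs no new allocation, a vfork-style child
(CLONE_VM | CLONE_VFORK) runs while the parent is suspended, and a
pthread-style clone (CLONE_VM without CLONE_VFORK) is the only case that
allocates.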