Date: Sun, 12 May 2024 18:50:18 +0200
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 14/29] riscv/mm: Implement map_shadow_stack() syscall
To: Deepak Gupta, paul.walmsley@sifive.com, rick.p.edgecombe@intel.com,
 broonie@kernel.org, Szabolcs.Nagy@arm.com, kito.cheng@sifive.com,
 keescook@chromium.org, ajones@ventanamicro.com, conor.dooley@microchip.com,
 cleger@rivosinc.com, atishp@atishpatra.org, bjorn@rivosinc.com,
 alexghiti@rivosinc.com, samuel.holland@sifive.com, conor@kernel.org
Cc: linux-doc@vger.kernel.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, corbet@lwn.net,
 palmer@dabbelt.com, aou@eecs.berkeley.edu,
 robh+dt@kernel.org, krzysztof.kozlowski+dt@linaro.org, oleg@redhat.com,
 akpm@linux-foundation.org, arnd@arndb.de, ebiederm@xmission.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, lstoakes@gmail.com, shuah@kernel.org,
 brauner@kernel.org, andy.chiu@sifive.com, jerry.shih@sifive.com,
 hankuan.chen@sifive.com, greentime.hu@sifive.com, evan@rivosinc.com,
 xiao.w.wang@intel.com, charlie@rivosinc.com, apatel@ventanamicro.com,
 mchitale@ventanamicro.com, dbarboza@ventanamicro.com, sameo@rivosinc.com,
 shikemeng@huaweicloud.com, willy@infradead.org, vincent.chen@sifive.com,
 guoren@kernel.org, samitolvanen@google.com, songshuaishuai@tinylab.org,
 gerg@kernel.org, heiko@sntech.de, bhe@redhat.com,
 jeeheng.sia@starfivetech.com, cyy@cyyself.name, maskray@google.com,
 ancientmodern4@gmail.com, mathis.salmen@matsal.de, cuiyunhui@bytedance.com,
 bgray@linux.ibm.com, mpe@ellerman.id.au, baruch@tkos.co.il, alx@kernel.org,
 david@redhat.com, catalin.marinas@arm.com, revest@chromium.org,
 josh@joshtriplett.org, shr@devkernel.io, deller@gmx.de, omosnace@redhat.com,
 ojeda@kernel.org, jhubbard@nvidia.com
References: <20240403234054.2020347-1-debug@rivosinc.com>
 <20240403234054.2020347-15-debug@rivosinc.com>
From: Alexandre Ghiti
In-Reply-To: <20240403234054.2020347-15-debug@rivosinc.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-GND-Sasl: alex@ghiti.fr

On 04/04/2024 01:35, Deepak Gupta wrote:
> As discussed extensively in the changelog for the addition of this
> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
> existing mmap() and madvise() syscalls do not map entirely well onto the
> security requirements for shadow stack memory since they lead to windows
> where memory is allocated but not yet protected or stacks which are not
> properly and safely initialised. Instead a new syscall map_shadow_stack()
> has been defined which allocates and initialises a shadow stack page.
>
> This patch implements this syscall for riscv.
> riscv doesn't require token
> to be setup by kernel because user mode can do that by itself. However to
> provide compatibility and portability with other architectures, user mode
> can specify token set flag.
>
> Signed-off-by: Deepak Gupta
> ---
>  arch/riscv/kernel/Makefile      |   2 +
>  arch/riscv/kernel/usercfi.c     | 149 ++++++++++++++++++++++++++++++++
>  include/uapi/asm-generic/mman.h |   1 +
>  3 files changed, 152 insertions(+)
>  create mode 100644 arch/riscv/kernel/usercfi.c
>
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index 604d6bf7e476..3bec82f4e94c 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -107,3 +107,5 @@ obj-$(CONFIG_COMPAT) += compat_vdso/
>
>  obj-$(CONFIG_64BIT) += pi/
>  obj-$(CONFIG_ACPI) += acpi.o
> +
> +obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o
> diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c
> new file mode 100644
> index 000000000000..c4ed0d4e33d6
> --- /dev/null
> +++ b/arch/riscv/kernel/usercfi.c
> @@ -0,0 +1,149 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2024 Rivos, Inc.
> + * Deepak Gupta
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#define SHSTK_ENTRY_SIZE sizeof(void *)
> +
> +/*
> + * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen
> + * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to
> + * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow
> + * stack.
> + */
> +static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val)
> +{
> +	/*
> +	 * Since shadow stack is supported only in 64bit configuration,
> +	 * ssamoswap.d is used below.
> +	 * CONFIG_RISCV_USER_CFI is dependent
> +	 * on 64BIT and compile of this file is dependent on CONFIG_RISCV_USER_CFI
> +	 * In case ssamoswap faults, return -1.

To me, this part of the comment is not needed.

> +	 * Never expect -1 on shadow stack. Expect return addresses and zero

In that case, should we BUG() instead?

> +	 */
> +	unsigned long swap = -1;
> +
> +	__enable_user_access();
> +	asm goto(
> +		".option push\n"
> +		".option arch, +zicfiss\n"
> +		"1: ssamoswap.d %[swap], %[val], %[addr]\n"
> +		_ASM_EXTABLE(1b, %l[fault])
> +		RISCV_ACQUIRE_BARRIER
> +		".option pop\n"
> +		: [swap] "=r" (swap), [addr] "+A" (*addr)
> +		: [val] "r" (val)
> +		: "memory"
> +		: fault
> +		);
> +	__disable_user_access();
> +	return swap;
> +fault:
> +	__disable_user_access();
> +	return -1;
> +}
> +
> +/*
> + * Create a restore token on the shadow stack. A token is always XLEN wide
> + * and aligned to XLEN.
> + */
> +static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
> +{
> +	unsigned long addr;
> +
> +	/* Token must be aligned */
> +	if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE))
> +		return -EINVAL;
> +
> +	/* On RISC-V we're constructing token to be function of address itself */
> +	addr = ssp - SHSTK_ENTRY_SIZE;
> +
> +	if (amo_user_shstk((unsigned long __user *)addr, (unsigned long) ssp) == -1)
> +		return -EFAULT;
> +
> +	if (token_addr)
> +		*token_addr = addr;
> +
> +	return 0;
> +}
> +
> +static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
> +				unsigned long token_offset,
> +				bool set_tok)
> +{
> +	int flags = MAP_ANONYMOUS | MAP_PRIVATE;
> +	struct mm_struct *mm = current->mm;
> +	unsigned long populate, tok_loc = 0;
> +
> +	if (addr)
> +		flags |= MAP_FIXED_NOREPLACE;
> +
> +	mmap_write_lock(mm);
> +	addr = do_mmap(NULL, addr, size, PROT_READ, flags,

Hmmm why do you map the shadow stack as PROT_READ here?

> +			VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL);
> +	mmap_write_unlock(mm);
> +
> +	if (!set_tok || IS_ERR_VALUE(addr))
> +		goto out;
> +
> +	if (create_rstor_token(addr + token_offset, &tok_loc)) {
> +		vm_munmap(addr, size);
> +		return -EINVAL;
> +	}
> +
> +	addr = tok_loc;
> +
> +out:
> +	return addr;
> +}
> +
> +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
> +{
> +	bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
> +	unsigned long aligned_size = 0;
> +
> +	if (!cpu_supports_shadow_stack())
> +		return -EOPNOTSUPP;
> +
> +	/* Anything other than set token should result in invalid param */
> +	if (flags & ~SHADOW_STACK_SET_TOKEN)
> +		return -EINVAL;
> +
> +	/*
> +	 * Unlike other architectures, on RISC-V, SSP pointer is held in CSR_SSP and is available
> +	 * CSR in all modes. CSR accesses are performed using 12bit index programmed in instruction
> +	 * itself. This provides static property on register programming and writes to CSR can't
> +	 * be unintentional from programmer's perspective. As long as programmer has guarded areas
> +	 * which perform writes to CSR_SSP properly, shadow stack pivoting is not possible. Since
> +	 * CSR_SSP is writeable by user mode, it itself can setup a shadow stack token subsequent
> +	 * to allocation. Although in order to provide portability with other architecture (because
> +	 * `map_shadow_stack` is arch agnostic syscall), RISC-V will follow expectation of a token
> +	 * flag in flags and if provided in flags, setup a token at the base.
> +	 */
> +
> +	/* If there isn't space for a token */
> +	if (set_tok && size < SHSTK_ENTRY_SIZE)
> +		return -ENOSPC;
> +
> +	if (addr && (addr % PAGE_SIZE))

I would use: if (addr && (addr & (PAGE_SIZE - 1)))

> +		return -EINVAL;
> +
> +	aligned_size = PAGE_ALIGN(size);
> +	if (aligned_size < size)
> +		return -EOVERFLOW;
> +
> +	return allocate_shadow_stack(addr, aligned_size, size, set_tok);
> +}
> diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
> index 57e8195d0b53..0c0ac6214de6 100644
> --- a/include/uapi/asm-generic/mman.h
> +++ b/include/uapi/asm-generic/mman.h
> @@ -19,4 +19,5 @@
>  #define MCL_FUTURE 2 /* lock all future mappings */
>  #define MCL_ONFAULT 4 /* lock all pages that are faulted in */
>
> +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
>  #endif /* __ASM_GENERIC_MMAN_H */

Don't we need to document this new syscall in the man pages?
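
To make the argument checks in the quoted syscall easier to reason about, here
is a rough userspace model of them, written with the mask form of the page
alignment test. This is only a sketch, not kernel code: the MODEL_* names, the
model_check_args() helper and the 4 KiB page size are assumptions made up for
illustration, and the token offset simply mirrors "addr + size minus one entry"
from the quoted create_rstor_token().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed values for illustration; the kernel uses its own PAGE_SIZE etc. */
#define MODEL_PAGE_SIZE		4096UL
#define MODEL_ENTRY_SIZE	sizeof(void *)
#define MODEL_SET_TOKEN		(1U << 0)

/* Round up to a page boundary; wraps around on overflow, like PAGE_ALIGN(). */
static unsigned long model_page_align(unsigned long size)
{
	return (size + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Hypothetical model of the checks done before and during allocation.
 * Returns 0 on success or a negative errno value. When a token is
 * requested, *tok_off receives the token offset inside the mapping.
 */
static long model_check_args(unsigned long addr, unsigned long size,
			     unsigned int flags, unsigned long *tok_off)
{
	int set_tok = flags & MODEL_SET_TOKEN;

	/* Only the set-token flag is accepted. */
	if (flags & ~MODEL_SET_TOKEN)
		return -EINVAL;

	/* There must be room for at least one token. */
	if (set_tok && size < MODEL_ENTRY_SIZE)
		return -ENOSPC;

	/* Mask form: same result as addr % PAGE_SIZE for a power-of-two size. */
	if (addr && (addr & (MODEL_PAGE_SIZE - 1)))
		return -EINVAL;

	/* A wrapped page-aligned size is smaller than the requested size. */
	if (model_page_align(size) < size)
		return -EOVERFLOW;

	if (set_tok) {
		assert(tok_off != NULL);
		/* The token slot must be aligned to the entry size. */
		if (size & (MODEL_ENTRY_SIZE - 1))
			return -EINVAL;
		/* Token sits one entry below addr + size. */
		*tok_off = size - MODEL_ENTRY_SIZE;
	}

	return 0;
}
```

For instance, model_check_args(0, 4096, MODEL_SET_TOKEN, &off) succeeds and
leaves off one entry below the end of the page, while an unaligned non-zero
addr is rejected with -EINVAL by the mask test.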