Date: Wed, 30 Nov 2022 17:38:13 +0100
From: "Jason A. Donenfeld"
To: Florian Weimer
Cc: linux-kernel@vger.kernel.org, patches@lists.linux.dev,
    tglx@linutronix.de, linux-crypto@vger.kernel.org,
    linux-api@vger.kernel.org, x86@kernel.org, Greg Kroah-Hartman,
    Adhemerval Zanella Netto, Carlos O'Donell, Arnd Bergmann,
    Christian Brauner
Subject: Re: [PATCH v10 1/4] random: add vgetrandom_alloc() syscall
References: <20221129210639.42233-1-Jason@zx2c4.com>
    <20221129210639.42233-2-Jason@zx2c4.com>
    <877czc7m0g.fsf@oldenburg.str.redhat.com>
X-Mailing-List: linux-crypto@vger.kernel.org

On Wed, Nov 30, 2022 at 04:39:55PM +0100, Jason A. Donenfeld wrote:
> 2) Convert vgetrandom_alloc() into a clone3-style syscall, as Christian
>    suggested earlier, which might allow for a bit more overloading
>    capability. That would be a struct that looks like:
>
>       struct vgetrandom_alloc_args {
>           __aligned_u64 flags;
>           __aligned_u64 states;
>           __aligned_u64 num;
>           __aligned_u64 size_of_each;
>       }
>
>    - If flags is VGRA_ALLOCATE, states and size_of_each must be zero on
>      input, while num is the hint, as is the case now. On output, states,
>      size_of_each, and num are filled in.
>
>    - If flags is VGRA_DEALLOCATE, states, size_of_each, and num must be as
>      they were originally, and then it deallocates.
>
> I suppose (2) would alleviate your concerns entirely, without future
> uncertainty over what it'd be like to add special cases to munmap(). And
> it'd add a bit more future proofing to the syscall, depending on what we
> do.
>
> So maybe I'm warming up to that approach a bit.

So I just did a little quick implementation to see what it'd feel like,
and actually, it's quite simple, and might address a lot of concerns all
at once. What do you think of the below? Documentation and such still
needs work obviously, but the bones should be there.

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4341c6a91207..dae6095b937d 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -189,44 +189,53 @@ int __cold execute_with_initialized_rng(struct notifier_block *nb)
 /**
  * sys_vgetrandom_alloc - Allocate opaque states for use with vDSO getrandom().
  *
- * @num: On input, a pointer to a suggested hint of how many states to
- *       allocate, and on output the number of states actually allocated.
- *
- * @size_per_each: The size of each state allocated, so that the caller can
- *                 split up the returned allocation into individual states.
- *
- * @flags: Currently always zero.
+ * @uargs: A vgetrandom_alloc_args which may be updated on return.
+ *       allocate, and on output the number of states actually allocated.
+ * @usize: The size of @uargs, which determines the version of the struct used.
  *
  * The getrandom() vDSO function in userspace requires an opaque state, which
  * this function allocates by mapping a certain number of special pages into
  * the calling process. It takes a hint as to the number of opaque states
  * desired, and provides the caller with the number of opaque states actually
  * allocated, the size of each one in bytes, and the address of the first
- * state.
+ * state. Alternatively, if the VGRA_DEALLOCATE flag is specified, the provided
+ * states parameter is unmapped.
  *
- * Returns the address of the first state in the allocation on success, or a
- * negative error value on failure.
+ * Returns 0 on success and an error value otherwise.
  */
-SYSCALL_DEFINE3(vgetrandom_alloc, unsigned int __user *, num,
-		unsigned int __user *, size_per_each, unsigned int, flags)
+SYSCALL_DEFINE2(vgetrandom_alloc, struct vgetrandom_alloc_args __user *, uargs, size_t, usize)
 {
 	const size_t state_size = sizeof(struct vgetrandom_state);
+	const size_t max_states = (SIZE_MAX & PAGE_MASK) / state_size;
+	struct vgetrandom_alloc_args args;
 	size_t alloc_size, num_states;
 	unsigned long pages_addr;
-	unsigned int num_hint;
 	int ret;

-	if (flags)
+	if (usize > PAGE_SIZE)
+		return -E2BIG;
+	if (usize < VGETRANDOM_ALLOC_ARGS_SIZE_VER0)
 		return -EINVAL;
+	ret = copy_struct_from_user(&args, sizeof(args), uargs, usize);
+	if (ret)
+		return ret;

-	if (get_user(num_hint, num))
-		return -EFAULT;
+	/* Currently only VGRA_DEALLOCATE is defined. */
+	if (args.flags & ~VGRA_DEALLOCATE)
+		return -EINVAL;

-	num_states = clamp_t(size_t, num_hint, 1, (SIZE_MAX & PAGE_MASK) / state_size);
-	alloc_size = PAGE_ALIGN(num_states * state_size);
+	if (args.flags & VGRA_DEALLOCATE) {
+		if (args.size_per_each != state_size || args.num > max_states || !args.states)
+			return -EINVAL;
+		return vm_munmap(args.states, args.num * state_size);
+	}

-	if (put_user(alloc_size / state_size, num) || put_user(state_size, size_per_each))
-		return -EFAULT;
+	/* These don't make sense as input values if allocating, so reject them. */
+	if (args.size_per_each || args.states)
+		return -EINVAL;
+
+	num_states = clamp_t(size_t, args.num, 1, max_states);
+	alloc_size = PAGE_ALIGN(num_states * state_size);

 	pages_addr = vm_mmap(NULL, 0, alloc_size, PROT_READ | PROT_WRITE,
 			     MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, 0);
@@ -237,7 +246,14 @@ SYSCALL_DEFINE3(vgetrandom_alloc, unsigned int __user *, num,
 	if (ret < 0)
 		goto err_unmap;

-	return pages_addr;
+	args.num = num_states;
+	args.size_per_each = state_size;
+	args.states = pages_addr;
+
+	ret = -EFAULT;
+	if (copy_to_user(uargs, &args, sizeof(args)))
+		goto err_unmap;
+	return 0;

 err_unmap:
 	vm_munmap(pages_addr, alloc_size);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 7741dc94f10c..de4338e26db0 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -72,6 +72,7 @@ struct open_how;
 struct mount_attr;
 struct landlock_ruleset_attr;
 enum landlock_rule_type;
+struct vgetrandom_alloc_args;

 #include
 #include
@@ -1006,9 +1007,8 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
 			    void __user *uargs);
 asmlinkage long sys_getrandom(char __user *buf, size_t count,
 			      unsigned int flags);
-asmlinkage long sys_vgetrandom_alloc(unsigned int __user *num,
-				     unsigned int __user *size_per_each,
-				     unsigned int flags);
+asmlinkage long sys_vgetrandom_alloc(struct vgetrandom_alloc_args __user *uargs,
+				     size_t size);
 asmlinkage long sys_memfd_create(const char __user *uname_ptr, unsigned int flags);
 asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
 asmlinkage long sys_execveat(int dfd, const char __user *filename,
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index e744c23582eb..49911ea2c343 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -55,4 +55,30 @@ struct rand_pool_info {
 #define GRND_RANDOM	0x0002
 #define GRND_INSECURE	0x0004

+/*
+ * Flags for vgetrandom_alloc(2)
+ *
+ * VGRA_DEALLOCATE	Deallocate supplied states.
+ */
+#define VGRA_DEALLOCATE 0x0001ULL
+
+/**
+ * struct vgetrandom_alloc_args - Arguments for the vgetrandom_alloc(2) syscall.
+ *
+ * @flags:	   Zero or more VGRA_* flags.
+ * @states:	   Zero on input if allocating, and filled in on successful
+ *		   return. An existing allocation, if deallocating.
+ * @num:	   A hint as to the desired number of states, if allocating. The
+ *		   number of existing states in @states, if deallocating
+ * @size_per_each: The size of each state in @states.
+ */
+struct vgetrandom_alloc_args {
+	__aligned_u64 flags;
+	__aligned_u64 states;
+	__aligned_u64 num;
+	__aligned_u64 size_per_each;
+};
+
+#define VGETRANDOM_ALLOC_ARGS_SIZE_VER0 32 /* sizeof first published struct */
+
 #endif /* _UAPI_LINUX_RANDOM_H */
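
For readers who want to poke at the argument handling without building a
kernel, here is a rough userspace model of the checks in the patch above.
It is only a sketch and not part of the patch. model_copy_struct() imitates
the documented zero-extend / reject-nonzero-tail behaviour of
copy_struct_from_user(), and model_vgetrandom_alloc() is a hypothetical
stand-in for the syscall body. MODEL_PAGE_SIZE and MODEL_STATE_SIZE are
made-up values chosen for illustration; the real state size is
sizeof(struct vgetrandom_state), which this mail does not give. Nothing is
mapped or unmapped, so states is never filled in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the struct proposed in the patch; not a kernel header. */
struct vgetrandom_alloc_args {
	uint64_t flags;
	uint64_t states;
	uint64_t num;
	uint64_t size_per_each;
};

#define VGRA_DEALLOCATE			0x0001ULL
#define VGETRANDOM_ALLOC_ARGS_SIZE_VER0	32

/* Assumptions for this sketch only: 4 KiB pages and a made-up state size. */
#define MODEL_PAGE_SIZE		((size_t)4096)
#define MODEL_STATE_SIZE	((size_t)256)

/*
 * Imitates copy_struct_from_user(): a shorter caller struct is
 * zero-extended, and a longer one is accepted only if the bytes beyond
 * what the callee knows about are all zero.
 */
static int model_copy_struct(void *dst, size_t ksize, const void *src, size_t usize)
{
	size_t size = usize < ksize ? usize : ksize;
	const unsigned char *bytes = src;
	size_t i;

	for (i = size; i < usize; i++)
		if (bytes[i])
			return -E2BIG;
	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}

/* Hypothetical stand-in for the syscall: validates and fills in, maps nothing. */
int model_vgetrandom_alloc(void *uargs, size_t usize)
{
	const size_t state_size = MODEL_STATE_SIZE;
	const size_t max_states = (SIZE_MAX & ~(MODEL_PAGE_SIZE - 1)) / state_size;
	struct vgetrandom_alloc_args args;
	size_t num_states;
	int ret;

	if (usize > MODEL_PAGE_SIZE)
		return -E2BIG;
	if (usize < VGETRANDOM_ALLOC_ARGS_SIZE_VER0)
		return -EINVAL;
	ret = model_copy_struct(&args, sizeof(args), uargs, usize);
	if (ret)
		return ret;

	if (args.flags & ~VGRA_DEALLOCATE)
		return -EINVAL;

	if (args.flags & VGRA_DEALLOCATE) {
		if (args.size_per_each != state_size || args.num > max_states || !args.states)
			return -EINVAL;
		return 0;	/* the patch calls vm_munmap() at this point */
	}

	if (args.size_per_each || args.states)
		return -EINVAL;

	/* Clamp the hint to [1, max_states], as clamp_t() does in the patch. */
	if (args.num < 1)
		num_states = 1;
	else if (args.num > max_states)
		num_states = max_states;
	else
		num_states = (size_t)args.num;

	args.num = num_states;
	args.size_per_each = state_size;
	/* args.states stays zero: this model does not map any memory. */
	memcpy(uargs, &args, sizeof(args));
	return 0;
}

/* Usage: a few calls and the results the checks above produce. */
void model_selftest(void)
{
	struct vgetrandom_alloc_args a = { .num = 0 };
	unsigned char newer[40] = { 0 };	/* a pretend future, larger struct */

	assert(model_vgetrandom_alloc(&a, sizeof(a)) == 0);
	assert(a.num == 1 && a.size_per_each == MODEL_STATE_SIZE);

	assert(model_vgetrandom_alloc(&a, sizeof(a)) == -EINVAL); /* size_per_each set */
	a.flags = VGRA_DEALLOCATE;
	assert(model_vgetrandom_alloc(&a, sizeof(a)) == -EINVAL); /* states is zero */
	a.states = 0x1000;
	assert(model_vgetrandom_alloc(&a, sizeof(a)) == 0);

	assert(model_vgetrandom_alloc(newer, sizeof(newer)) == 0);
	newer[39] = 1;
	assert(model_vgetrandom_alloc(newer, sizeof(newer)) == -E2BIG);
	assert(model_vgetrandom_alloc(&a, 16) == -EINVAL);
}
```

The size argument is what buys the clone3-style extensibility: an older
caller passing a shorter struct gets the missing fields treated as zero,
and a newer caller passing a longer one is accepted only while the fields
this version does not understand are left at zero.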