Date: Thu, 5 Sep 2019 19:26:22 +1000
From: Aleksa Sarai
To: Peter Zijlstra
Cc: Al Viro, Jeff Layton, "J. Bruce Fields", Arnd Bergmann, David Howells,
 Shuah Khan, Shuah Khan, Ingo Molnar, Christian Brauner, Rasmus Villemoes,
 Eric Biederman, Andy Lutomirski, Andrew Morton, Alexei Starovoitov,
 Kees Cook, Jann Horn, Tycho Andersen, David Drysdale, Chanho Min,
 Oleg Nesterov, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Aleksa Sarai,
 Linus Torvalds, containers@lists.linux-foundation.org,
 linux-alpha@vger.kernel.org, linux-api@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, linux-ia64@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
 linux-xtensa@linux-xtensa.org, sparclinux@vger.kernel.org
Subject: Re: [PATCH v12 01/12] lib: introduce copy_struct_{to,from}_user helpers
Message-ID: <20190905092622.tlb6nn3uisssdfbu@yavin.dot.cyphar.com>
References: <20190904201933.10736-1-cyphar@cyphar.com>
 <20190904201933.10736-2-cyphar@cyphar.com>
 <20190905073205.GY2332@hirez.programming.kicks-ass.net>
In-Reply-To: <20190905073205.GY2332@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019-09-05, Peter Zijlstra wrote:
> On Thu, Sep 05, 2019 at 06:19:22AM +1000, Aleksa Sarai wrote:
> > +/**
> > + * copy_struct_to_user: copy a struct to user space
> > + * @dst:   Destination address, in user space.
> > + * @usize: Size of @dst struct.
> > + * @src:   Source address, in kernel space.
> > + * @ksize: Size of @src struct.
> > + *
> > + * Copies a struct from kernel space to user space, in a way that guarantees
> > + * backwards-compatibility for struct syscall arguments (as long as future
> > + * struct extensions are made such that all new fields are *appended* to the
> > + * old struct, and zeroed-out new fields have the same meaning as the old
> > + * struct).
> > + *
> > + * @ksize is just sizeof(*dst), and @usize should've been passed by user space.
> > + * The recommended usage is something like the following:
> > + *
> > + *   SYSCALL_DEFINE2(foobar, struct foo __user *, uarg, size_t, usize)
> > + *   {
> > + *      int err;
> > + *      struct foo karg = {};
> > + *
> > + *      // do something with karg
> > + *
> > + *      err = copy_struct_to_user(uarg, usize, &karg, sizeof(karg));
> > + *      if (err)
> > + *        return err;
> > + *
> > + *      // ...
> > + *   }
> > + *
> > + * There are three cases to consider:
> > + *  * If @usize == @ksize, then it's copied verbatim.
> > + *  * If @usize < @ksize, then kernel space is "returning" a newer struct to an
> > + *    older user space. In order to avoid user space getting incomplete
> > + *    information (new fields might be important), all trailing bytes in @src
> > + *    (@ksize - @usize) must be zerored
>
> s/zerored/zero/, right?

It should've been "zeroed".

> > , otherwise -EFBIG is returned.
>
> 'Funny' that, copy_struct_from_user() below seems to use E2BIG.

This is a copy of the semantics that sched_[sg]etattr(2) uses -- E2BIG for
a "too big" struct passed to the kernel, and EFBIG for a "too big" struct
passed to user-space. I would personally have preferred EMSGSIZE instead
of EFBIG, but felt using the existing error codes would be less
confusing.

>
> > + *  * If @usize > @ksize, then the kernel is "returning" an older struct to a
> > + *    newer user space. The trailing bytes in @dst (@usize - @ksize) will be
> > + *    zero-filled.
> > + *
> > + * Returns (in all cases, some data may have been copied):
> > + *  * -EFBIG:  (@usize < @ksize) and there are non-zero trailing bytes in @src.
> > + *  * -EFAULT: access to user space failed.
> > + */
> > +int copy_struct_to_user(void __user *dst, size_t usize,
> > +			const void *src, size_t ksize)
> > +{
> > +	size_t size = min(ksize, usize);
> > +	size_t rest = abs(ksize - usize);
> > +
> > +	if (unlikely(usize > PAGE_SIZE))
> > +		return -EFAULT;
>
> Not documented above. Implementation consistent with *from*, but see
> below.

Will update the kernel-doc.

> > +	if (unlikely(!access_ok(dst, usize)))
> > +		return -EFAULT;
> > +
> > +	/* Deal with trailing bytes. */
> > +	if (usize < ksize) {
> > +		if (memchr_inv(src + size, 0, rest))
> > +			return -EFBIG;
> > +	} else if (usize > ksize) {
> > +		if (__memzero_user(dst + size, rest))
> > +			return -EFAULT;
> > +	}
> > +	/* Copy the interoperable parts of the struct. */
> > +	if (__copy_to_user(dst, src, size))
> > +		return -EFAULT;
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(copy_struct_to_user);
> > +
> > +/**
> > + * copy_struct_from_user: copy a struct from user space
> > + * @dst:   Destination address, in kernel space. This buffer must be @ksize
> > + *         bytes long.
> > + * @ksize: Size of @dst struct.
> > + * @src:   Source address, in user space.
> > + * @usize: (Alleged) size of @src struct.
> > + *
> > + * Copies a struct from user space to kernel space, in a way that guarantees
> > + * backwards-compatibility for struct syscall arguments (as long as future
> > + * struct extensions are made such that all new fields are *appended* to the
> > + * old struct, and zeroed-out new fields have the same meaning as the old
> > + * struct).
> > + *
> > + * @ksize is just sizeof(*dst), and @usize should've been passed by user space.
> > + * The recommended usage is something like the following:
> > + *
> > + *   SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> > + *   {
> > + *      int err;
> > + *      struct foo karg = {};
> > + *
> > + *      err = copy_struct_from_user(&karg, sizeof(karg), uarg, size);
> > + *      if (err)
> > + *        return err;
> > + *
> > + *      // ...
> > + *   }
> > + *
> > + * There are three cases to consider:
> > + *  * If @usize == @ksize, then it's copied verbatim.
> > + *  * If @usize < @ksize, then the user space has passed an old struct to a
> > + *    newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> > + *    are to be zero-filled.
> > + *  * If @usize > @ksize, then the user space has passed a new struct to an
> > + *    older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> > + *    are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> > + *
> > + * Returns (in all cases, some data may have been copied):
> > + *  * -E2BIG:  (@usize > @ksize) and there are non-zero trailing bytes in @src.
> > + *  * -E2BIG:  @usize is "too big" (at time of writing, >PAGE_SIZE).
> > + *  * -EFAULT: access to user space failed.
> > + */
> > +int copy_struct_from_user(void *dst, size_t ksize,
> > +			  const void __user *src, size_t usize)
> > +{
> > +	size_t size = min(ksize, usize);
> > +	size_t rest = abs(ksize - usize);
> > +
> > +	if (unlikely(usize > PAGE_SIZE))
> > +		return -EFAULT;
>
> Documented above as returning -E2BIG.

I will switch this (and the *to* variant) back to -E2BIG -- I must've had
a brain-fart when doing some refactoring.

>
> > +	if (unlikely(!access_ok(src, usize)))
> > +		return -EFAULT;
> > +
> > +	/* Deal with trailing bytes. */
> > +	if (usize < ksize)
> > +		memset(dst + size, 0, rest);
> > +	else if (usize > ksize) {
> > +		const void __user *addr = src + size;
> > +		char buffer[BUFFER_SIZE] = {};
>
> Isn't that too big for on-stack?

Is a 64-byte buffer too big?
I picked the number "at random" to be the size of a cache
line, but I could shrink it down to 32 bytes if the size is an issue (I
wanted to avoid needless allocations -- hence it being on-stack).

> > +
> > +		while (rest > 0) {
> > +			size_t bufsize = min(rest, sizeof(buffer));
> > +
> > +			if (__copy_from_user(buffer, addr, bufsize))
> > +				return -EFAULT;
> > +			if (memchr_inv(buffer, 0, bufsize))
> > +				return -E2BIG;
> > +
> > +			addr += bufsize;
> > +			rest -= bufsize;
> > +		}
>
> The perf implementation uses get_user(); but if that is too slow, surely
> we can do something with uaccess_try() here?

Is there a non-x86-specific way to do that (unless I'm mistaken, only x86
has uaccess_try() or the other *_try() wrappers)? The main "performance
improvement" (if you can even call it that) is that we use memchr_inv(),
which finds non-matching characters more efficiently than just doing a
loop.

> > +	}
> > +	/* Copy the interoperable parts of the struct. */
> > +	if (__copy_from_user(dst, src, size))
> > +		return -EFAULT;
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(copy_struct_from_user);
>
> And personally I'm not a big fan of EXPORT_SYMBOL().

I don't have much of an opinion (after all, it only really makes sense
for syscalls) -- though out-of-tree modules that define ioctl()s wouldn't
be able to make use of them.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
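
For readers following the thread, the three size cases described in the
copy_struct_from_user() kernel-doc can be sketched in plain user-space C.
This is only an illustrative model under stated assumptions, not the
kernel code: model_copy_struct_from() and all_zero() are hypothetical
names, memcpy() stands in for __copy_from_user() (so -EFAULT cannot
occur), a simple loop stands in for memchr_inv(), and the PAGE_SIZE and
access_ok() checks are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if all n bytes at p are zero.
 * Stands in for the kernel's !memchr_inv(p, 0, n). */
int all_zero(const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* Hypothetical user-space model of the copy_struct_from_user() semantics.
 * dst is the "kernel" struct of ksize bytes, src is the "user" struct
 * whose (alleged) size is usize. */
int model_copy_struct_from(void *dst, size_t ksize,
			   const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;

	if (usize < ksize) {
		/* Old caller, newer "kernel": zero-fill the unknown tail. */
		memset((unsigned char *)dst + size, 0, ksize - usize);
	} else if (usize > ksize) {
		/* New caller, older "kernel": the extra tail must be zero. */
		if (!all_zero((const unsigned char *)src + size, usize - ksize))
			return -E2BIG;
	}
	/* Copy the part of the struct both sides understand. */
	memcpy(dst, src, size);
	return 0;
}
```

With a two-int destination struct, passing a single int fills the first
field and zeroes the second; passing three ints succeeds only if the
third is zero, and otherwise fails with -E2BIG before anything is copied.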