Date: Thu, 11 Jun 2020 11:06:31 +0000
From: Sargun Dhillon
To: Christian Brauner
Cc: Kees Cook, containers@lists.linux-foundation.org, Giuseppe Scrivano,
 Robert Sesek, Chris Palmer, Jann Horn, Greg Kroah-Hartman, Daniel Wagner,
 linux-kernel@vger.kernel.org, Matt Denton, John Fastabend,
 linux-fsdevel@vger.kernel.org, Tejun Heo, Al Viro, cgroups@vger.kernel.org,
 stable@vger.kernel.org, "David S. Miller"
Subject: Re: [PATCH v3 1/4] fs, net: Standardize on file_receive helper to move fds across processes
Message-ID: <20200611110630.GB30103@ircssh-2.c.rugged-nimbus-611.internal>
In-Reply-To: <20200611100114.awdjswsd7fdm2uzr@wittgenstein>

On Thu, Jun 11, 2020 at 12:01:14PM +0200, Christian Brauner wrote:
> On Wed, Jun 10, 2020 at 07:59:55PM -0700, Kees Cook wrote:
> > On Wed, Jun 10, 2020 at 08:12:38AM +0000, Sargun Dhillon wrote:
> > > As an aside, all of this junk should be dropped:
> > > +	ret = get_user(size, &uaddfd->size);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	ret = copy_struct_from_user(&addfd, sizeof(addfd), uaddfd, size);
> > > +	if (ret)
> > > +		return ret;
> > >
> > > and the size member of the seccomp_notif_addfd struct. I brought this up
> > > off-list with Tycho that ioctls have the size of the struct embedded in them. We
> > > should just use that. The ioctl definition is based on this[2]:
> > > #define _IOC(dir,type,nr,size) \
> > > 	(((dir)  << _IOC_DIRSHIFT) | \
> > > 	 ((type) << _IOC_TYPESHIFT) | \
> > > 	 ((nr)   << _IOC_NRSHIFT) | \
> > > 	 ((size) << _IOC_SIZESHIFT))
> > >
> > >
> > > We should just use copy_from_user for now.
> > > In the future, we can either
> > > introduce new ioctl names for new structs, or extract the size dynamically from
> > > the ioctl (and mask it out on the switch statement in seccomp_notify_ioctl.
> >
> > Yeah, that seems reasonable. Here's the diff for that part:
>
> Why does it matter that the ioctl() has the size of the struct embedded
> within? Afaik, the kernel itself doesn't do anything with that size. It
> merely checks that the size is not pathological and it does so at
> compile time.
>
> #ifdef __CHECKER__
> #define _IOC_TYPECHECK(t) (sizeof(t))
> #else
> /* provoke compile error for invalid uses of size argument */
> extern unsigned int __invalid_size_argument_for_IOC;
> #define _IOC_TYPECHECK(t) \
> 	((sizeof(t) == sizeof(t[1]) && \
> 	  sizeof(t) < (1 << _IOC_SIZEBITS)) ? \
> 	  sizeof(t) : __invalid_size_argument_for_IOC)
> #endif
>
> The size itself is not verified at runtime. copy_struct_from_user()
> still makes sense at least if we're going to allow expanding the struct
> in the future.

Right, but if we simply change our headers and extend the struct, it will break
all existing programs compiled against those headers. To avoid that, if we
intend to extend this struct by appending to it, we need a backwards
compatibility mechanism. Just having copy_struct_from_user isn't enough. The
data structure must either be fixed size, or we need a way to handle multiple
ioctl numbers derived from headers with differently sized struct arguments.

The approaches I see are:

1. Use more indirection. This has prior art in drm[1]. That'd look
something like this:

struct seccomp_notif_addfd_ptr {
	__u64 size;
	__u64 addr;
};

...
And then it'd be up to us to dereference the addr and copy struct from user.

2. Expose one ioctl to the user, many internally

e.g., public api:

struct seccomp_notif {
	__u64 id;
	__u64 pid;
	struct seccomp_data data;
	__u64 fancy_new_field;
};

#define SECCOMP_IOCTL_NOTIF_RECV	SECCOMP_IOWR(0, struct seccomp_notif)

internally:

struct seccomp_notif_v1 {
	__u64 id;
	__u64 pid;
	struct seccomp_data data;
};

struct seccomp_notif_v2 {
	__u64 id;
	__u64 pid;
	struct seccomp_data data;
	__u64 fancy_new_field;
};

and we can switch like this:

	switch (cmd) {
	/* for example. We actually have to do this for any struct we intend to
	 * extend to get proper backwards compatibility
	 */
	case SECCOMP_IOWR(0, struct seccomp_notif_v1):
		return seccomp_notify_recv(filter, buf, sizeof(struct seccomp_notif_v1));
	case SECCOMP_IOWR(0, struct seccomp_notif_v2):
		return seccomp_notify_recv(filter, buf, sizeof(struct seccomp_notif_v2));
	...
	case SECCOMP_IOCTL_NOTIF_SEND:
		return seccomp_notify_send(filter, buf);
	case SECCOMP_IOCTL_NOTIF_ID_VALID:
		return seccomp_notify_id_valid(filter, buf);
	default:
		return -EINVAL;
	}

This has the downside that programs compiled against more modern kernel headers
will break on older kernels.

3. We can take the approach you suggested.

#define UNSIZED(cmd)	((cmd) & ~(_IOC_SIZEMASK << _IOC_SIZESHIFT))

static long seccomp_notify_ioctl(struct file *file, unsigned int cmd,
				 unsigned long arg)
{
	struct seccomp_filter *filter = file->private_data;
	void __user *buf = (void __user *)arg;
	int size = _IOC_SIZE(cmd);

	/* cmd loses its size bits here, so every case label must be UNSIZED() too */
	cmd = UNSIZED(cmd);

	switch (cmd) {
	/* for example. We actually have to do this for any struct we intend to
	 * extend to get proper backwards compatibility
	 */
	case UNSIZED(SECCOMP_IOCTL_NOTIF_RECV):
		return seccomp_notify_recv(filter, buf, size);
	...
	case UNSIZED(SECCOMP_IOCTL_NOTIF_SEND):
		return seccomp_notify_send(filter, buf);
	case UNSIZED(SECCOMP_IOCTL_NOTIF_ID_VALID):
		return seccomp_notify_id_valid(filter, buf);
	default:
		return -EINVAL;
	}
}

>
> Leaving that aside, the proposed direction here seems to mean that any
> change to the struct itself will immediately mean a new ioctl() but
> afaict, that also means a new struct. Since when you simply extend the
> struct for the sake of the new ioctl you also change the size for the
> ioctl.
>
> Sure, you can simply treat the struct coming through the old ioctl as
> being "capped" by e.g. hiding the size as suggested but then the gain
> by having two separate ioctls is 0 compared to simply versioning the
> struct with an explicit size member since the size encoded in the ioctl
> and the actual size of the struct don't line up anymore which is the
> only plus I can see for relying on _IOC_SIZE(). All this manages to do
> then is to make it more annoying for userspace since they now need to
> maintain multiple ioctls(). And if you have - however unlikely - say
> three different ioctls all to be used with a different struct size of
> the same struct I now need to know which ioctl() goes with which size of
> the struct (I guess you could append the size to the ioctl name?
> *shudder*). If you have the size in the struct itself you don't need to
> care about any of that.
> Maybe I'm not making sense or I misunderstand what's going on though.
>
> Christian
>

I don't understand why userspace has to have any knowledge of this. As soon as
we add the code above, and we use copy_struct_from_user based on _that_ size,
userspace will get free upgrades. If they are compiling against an older header
than the kernel, _IOC_SIZE(cmd) will yield a smaller number, and thus we will
zero out our trailing bytes; if their number is bigger, we just check that
their trailing bytes are zeroed. This approach would be forwards-and-backwards
compatible. There's a little bit of prior art here as well [2].
The approach is that we effectively do the thing we had earlier with passing a
size with copy_struct_from_user, but instead of the size being embedded in the
struct, it's embedded in the ioctl command itself.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/drm/radeon_drm.h?h=v5.7#n831
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/firewire/core-cdev.c?id=v5.7#n1621
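
To make the size-in-the-command idea concrete, here is a rough userspace sketch
of approach 3. It is only an illustration under stated assumptions: the structs
demo_arg_v1/demo_arg_v2, the DEMO_IOC_* numbers, demo_ioctl() and
copy_struct_emulated() are hypothetical names, not part of the seccomp ABI, and
copy_struct_emulated() merely imitates the zero-fill and trailing-zero check
that copy_struct_from_user() performs in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/ioctl.h>

/* Hypothetical old and new layouts of one ioctl argument (not the real ABI). */
struct demo_arg_v1 { uint64_t id; uint64_t pid; };
struct demo_arg_v2 { uint64_t id; uint64_t pid; uint64_t fancy_new_field; };

static_assert(sizeof(struct demo_arg_v1) < sizeof(struct demo_arg_v2),
	      "v2 must extend v1");

/* Same direction, type and nr; only the embedded size differs. 'D' is arbitrary. */
#define DEMO_IOC_V1	_IOWR('D', 0, struct demo_arg_v1)
#define DEMO_IOC_V2	_IOWR('D', 0, struct demo_arg_v2)

/* Clear the size bits so every struct revision maps to one case label. */
#define UNSIZED(cmd)	((cmd) & ~((unsigned int)_IOC_SIZEMASK << _IOC_SIZESHIFT))

/*
 * Userspace stand-in for copy_struct_from_user() (an assumption of this
 * sketch): a shorter source is zero-extended, a longer source is accepted
 * only if the bytes beyond ksize are all zero.
 */
static int copy_struct_emulated(void *dst, size_t ksize,
				const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t n = usize < ksize ? usize : ksize;

	for (size_t i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}

/* The "kernel" side knows v2 and takes the caller's size from cmd itself. */
static int demo_ioctl(unsigned int cmd, const void *arg,
		      struct demo_arg_v2 *out)
{
	size_t usize = _IOC_SIZE(cmd);

	switch (UNSIZED(cmd)) {
	case UNSIZED(DEMO_IOC_V2):
		return copy_struct_emulated(out, sizeof(*out), arg, usize);
	default:
		return -EINVAL;
	}
}
```

A caller built against the v1 header passes DEMO_IOC_V1 and gets
fancy_new_field zero-filled; a caller with a larger, newer struct succeeds only
while the fields this side does not know about are zero, and gets -E2BIG
otherwise.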