Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp1889387ybk; Sun, 17 May 2020 03:53:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyVr3idwu7zM5ChAkZTJlgRkeSn+5lYEoMKyQ+UiXHwkbcA+Ff7vTYGdsAjNJFT9vY0r7iB X-Received: by 2002:a17:906:ece4:: with SMTP id qt4mr11026421ejb.162.1589712782068; Sun, 17 May 2020 03:53:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589712782; cv=none; d=google.com; s=arc-20160816; b=ZOAcAW6SJ0kxSSARxaCKUzMaArdDcHNKTnKFnOZWE9oQb6MeL8SDW8ZfrXt6zncqXW YZ3PiL6Mo7bsU+dg2nYW4zqlwAU2ATnmuTG7/zMd6dWRFnRTt6KfsVOKxNl/0cp7GfF0 u/S4F8PEqJCi5sWu/01FlbQQ0vLHKub2eQTV2gqgiJKgLauXzyfatHT3lMXrhAo9OrJa Nr4ObI7ndhi4KUkq4kSyPUV+zdNUKNs9mXWiAJyyAFdddS0EUc+o6KJeTIUtlySsthka oOui0QC53MvP9X5MSeu5mGj7E17f87eT0lRdCibT/55E6NxrsRe3S/ZzsxM/fh8LK+Xf QQCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=dfQoL55+kVZMiCtGrkwsF4TShmvX3iLP14tydIbQUMU=; b=Zy0Jy78wcoxkuDL2b93Xmm16l3JKBW9EwQK0sCPbosYeyATWy6TSpOzYCrQFmwfbWH nJMSm7fFLD6Y+IRbXelKdo+nwy3RVYBsAdJMrwCrXLQ/K5qxMrt37ZR8aYe7e+asP28q DCnXo/dOQIqYNmxfKnd/pscI/qifAztIuhRv4YBeWKkjhLSbflwbbm5Uy85mjmA7s9Ev XXGyHd7/MoTkTKohBj3T3UEs2IQ9I/pcNYcMoBIwGXVrZDjhkrX4RHx+5d1BoKbLLeh2 qpX8VhRuO7T3By6BM63WKcnCpTSBKagSq12kW/1eAxpLOudVHQiPdZpQiJwKhqtRYS2l 3byg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h4si4859865edn.445.2020.05.17.03.52.38; Sun, 17 May 2020 03:53:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727983AbgEQKrc (ORCPT + 99 others); Sun, 17 May 2020 06:47:32 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:56066 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727957AbgEQKra (ORCPT ); Sun, 17 May 2020 06:47:30 -0400 Received: from ip5f5af183.dynamic.kabel-deutschland.de ([95.90.241.131] helo=wittgenstein) by youngberry.canonical.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1jaGpB-0006im-8W; Sun, 17 May 2020 10:47:17 +0000 Date: Sun, 17 May 2020 12:47:01 +0200 From: Christian Brauner To: Kees Cook Cc: Sargun Dhillon , linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org, linux-api@vger.kernel.org, tycho@tycho.ws, cyphar@cyphar.com Subject: Re: [PATCH] seccomp: Add group_leader pid to seccomp_notif Message-ID: <20200517104701.bbn2d2rqaplwchdw@wittgenstein> References: <20200515234005.32370-1-sargun@sargun.me> <202005162344.74A02C2D@keescook> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <202005162344.74A02C2D@keescook> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, May 17, 2020 at 12:17:14AM -0700, Kees Cook wrote: > On Fri, May 15, 2020 at 04:40:05PM -0700, Sargun Dhillon wrote: > > This includes the thread group leader ID in the seccomp_notif. This is > > immediately useful for opening up a pidfd for the group leader, as > > pidfds only work on group leaders. > > > > Previously, it was considered to include an actual pidfd in the > > seccomp_notif structure[1], but it was suggested to avoid proliferating > > mechanisms to create pidfds[2]. > > > > [1]: https://lkml.org/lkml/2020/1/24/133 > > [2]: https://lkml.org/lkml/2020/5/15/481 > > nit: please use lore.kernel.org/lkml/ URLs > > > Suggested-by: Christian Brauner > > Signed-off-by: Sargun Dhillon > > --- > > include/uapi/linux/seccomp.h | 2 + > > kernel/seccomp.c | 1 + > > tools/testing/selftests/seccomp/seccomp_bpf.c | 50 +++++++++++++++++++ > > 3 files changed, 53 insertions(+) > > > > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h > > index c1735455bc53..f0c272ef0f1e 100644 > > --- a/include/uapi/linux/seccomp.h > > +++ b/include/uapi/linux/seccomp.h > > @@ -75,6 +75,8 @@ struct seccomp_notif { > > __u32 pid; > > __u32 flags; > > struct seccomp_data data; > > + __u32 tgid; > > + __u8 pad0[4]; > > }; > > I think we need to leave off padding and instead use __packed. If we > don't then userspace can't tell when "pad0" changes its "meaning" (i.e. > the size of seccomp_notif becomes 88 bytes with above -- either via > explicit padding like you've got or via implicit by the compiler. If > some other u32 gets added in the future, user space will still see "88" > as the size. > > So I *think* the right change here is: > > -}; > + __u32 tgid; > +} __packed; > > Though tgid may need to go above seccomp_data... for when it grows. > Agh... > > _However_, unfortunately, I appear to have no thought this through very > well, and there is actually no sanity-checking in the kernel for dealing > with an old userspace when sizes change. :( For example, if a userspace > doesn't check sizes and calls an ioctl, etc, the kernel will clobber the > user buffer if it's too small. I'd just argue that that's just userspace messing up. > > Even the SECCOMP_GET_NOTIF_SIZES command lacks a buffer size argument. > :( > > So: > > - should we just declare such userspace as "wrong"? I don't think > that'll work, especially since what if we ever change the size of > seccomp_data... that predated the ..._SIZES command. Yeah, that's nasty since the struct member in seccomp_notif would now clobber each other. > > - should we add a SECCOMP_SET_SIZES command to tell the kernel what > we're expecting? There's no "state" associated across seccomp(2) > calls, but maybe that doesn't matter because only user_notif writes > back to userspace. For the ioctl, the state could be part of the > private file data? Sending seccomp_data back to userspace only > happens here, and any changes in seccomp_data size will just be seen > as allowing a filter to query further into it. Sounds ok-ish in my opinion. > > - should GET_SIZES report "useful" size? (i.e. exclude padding?) Or... And that's more invasive but ultimately cleaner we v2 the whole thing so e.g. SECCOMP_IOCTL_NOTIF_RECV2, SECCOMP_IOCTL_NOTIF_SEND2, and embedd the size argument in the structs. Userspace sets the size argument, we use get_user() to get the size first and then copy_struct_from_user() to handle it cleanly based on that. A similar model as with sched (has other unrelated quirks because they messed up something too): static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr) { u32 size; int ret; /* Zero the full structure, so that a short copy will be nice: */ memset(attr, 0, sizeof(*attr)); ret = get_user(size, &uattr->size); if (ret) return ret; /* ABI compatibility quirk: */ if (!size) size = SCHED_ATTR_SIZE_VER0; if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE) goto err_size; ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size); if (ret) { if (ret == -E2BIG) goto err_size; return ret; } We're probably the biggest user of this right now and I'd be ok with that change. If it's a v2 than whatever. :) Christian