From: Daniel Colascione
Date: Sun, 18 Nov 2018 10:43:52 -0800
Subject: Re: [PATCH] proc: allow killing processes via file descriptors
To: Andy Lutomirski
Cc: Randy Dunlap, Christian Brauner, "Eric W. Biederman", LKML, "Serge E. Hallyn", Jann Horn, Andrew Morton, Oleg Nesterov, Aleksa Sarai, Al Viro, Linux FS Devel, Linux API, Tim Murray, Kees Cook, Jan Engelhardt
References: <20181118111751.6142-1-christian@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 18, 2018 at 10:28 AM, Andy Lutomirski wrote:
> On Sun, Nov 18, 2018 at 9:51 AM Daniel Colascione wrote:
>>
>> > I'm not entirely sure that ship has sailed. In the kernel, we already
>> > have a bit of a distinction between a pid (and tid, etc -- I'm
>> > referring to struct pid) and a task. If we make a new
>> > process-management API, we could put a distinction like this into the
>> > API.
>>
>> It would be a disaster to have different APIs give callers a different
>> idea of process identity over its lifetime. The precedent is
>> well-established that execve and setreuid do not change a process's
>> identity.
>> Invalidating some identifiers but not others in response to
>> supposedly-internal things a process might do under rare circumstances
>> is creating a bug machine.
>
> Here's my point: if we're really going to make a new API to manipulate
> processes by their fd, I think we should have at least a decent idea
> of how that API will get extended in the future. Right now, we have
> an extremely awkward situation where opening an fd in /proc requires
> certain capabilities or uids, and using those fds often also checks
> current's capabilities, and the target process may have changed its
> own security context, including gaining privilege via SUID, SGID, or
> LSM transition rules in the meantime. This has been a huge source of
> security bugs. It would be nice to have a model for future APIs that
> avoids these problems.
>
> And I didn't say in my proposal that a process's identity should
> fundamentally change when it calls execve(). I'm suggesting that
> certain operations that could cause a process to gain privilege or
> otherwise require greater permission to introspect (mainly execve)
> could be handled by invalidating the new process management fds.
> Sure, if init re-execs itself, it's still PID 1, but that doesn't
> necessarily mean that:
>
> fd = process_open_management_fd(1);
> [init reexecs]
> process_do_something(fd);
>
> needs to work.

PID 1 is a bad example here, because it doesn't get recycled. Other
PIDs do. The snippet you gave *does* need to work, in general, because
if exec invalidates the handle, and you need to reopen by PID to
re-establish your right to do something with the process, that process
may in fact have died between the invalidation and your reopen, and
your reopened FD may refer to some other random process.
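
The reopen race described above can be illustrated with a toy model. This
is only a sketch under stated assumptions: every name below (toy_table,
toy_handle, toy_spawn and so on) is hypothetical and invented for
illustration, none of it is a kernel interface, and the generation counter
merely stands in for "which incarnation currently owns this PID number".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT a kernel API: a small PID table in which each slot carries
 * a generation counter that is bumped whenever the PID number is reused. */
#define TOY_MAX_PID 8

struct toy_proc {
    bool alive;
    unsigned generation;   /* incarnation currently owning this PID number */
};

static struct toy_proc toy_table[TOY_MAX_PID];

/* A handle pins one specific incarnation, the way an fd would. */
struct toy_handle {
    int pid;
    unsigned generation;
};

/* Start a new process on the given PID number (a fresh incarnation). */
static void toy_spawn(int pid)
{
    assert(pid >= 0 && pid < TOY_MAX_PID);
    toy_table[pid].alive = true;
    toy_table[pid].generation++;
}

static void toy_exit(int pid)
{
    toy_table[pid].alive = false;
}

/* Open by PID number: binds to whatever incarnation owns the number now. */
static struct toy_handle toy_open(int pid)
{
    struct toy_handle h = { pid, toy_table[pid].generation };
    return h;
}

/* True only if the handle still names the process it was opened on. */
static bool toy_handle_valid(struct toy_handle h)
{
    return toy_table[h.pid].alive &&
           toy_table[h.pid].generation == h.generation;
}

/* Replays the problematic sequence: the caller's handle is invalidated
 * (assumed policy: by exec), the target then dies, its PID number is
 * recycled, and the caller reopens by number. Returns true if the reopened
 * handle names the original process. */
static bool toy_reopen_hits_original(void)
{
    toy_spawn(5);
    struct toy_handle original = toy_open(5);

    toy_exit(5);     /* target dies after the handle was invalidated */
    toy_spawn(5);    /* an unrelated process is given PID 5 */

    struct toy_handle reopened = toy_open(5);
    return reopened.generation == original.generation;
}
```

In this model toy_reopen_hits_original() returns false: the reopened handle
is bound to the recycled incarnation, which is the failure mode the
paragraph above describes. A handle that is kept open instead can at least
detect the change through toy_handle_valid().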
The only way around this problem is to have two separate FDs --- one to
represent process identity, which *must* be continuous across execve,
and the other to represent some specific capability, some ability to do
something to that process. It's reasonable to invalidate capability
after execve, but it's not reasonable to invalidate identity. In
concrete terms, I don't see a big advantage to this separation, and I
think a single identity FD combined with per-operation capability
checks is sufficient. And much simpler.

>> > setresuid() has no effect
>> > here -- if you have W access to the process and the process calls
>> > setresuid(), you still have W access.
>>
>> Now you've created a situation in which an operation that security
>> policy previously blocked now becomes possible, invalidating previous
>> designs based on the old security invariant. That's the definition of
>> introducing a security hole.
>
> I think you're overstating your case. To a pretty good approximation,
> setresuid() allows the caller to remove elements from the set {ruid,
> suid, euid}, unless the caller has CAP_SETUID. If you could ptrace a
> process before it calls setresuid(), you might as well be able to
> ptrace() it after, since you could have just ptraced it and made it
> call setresuid() while still ptracing it.

What about a child that execs a setuid binary?

> Similarly, it seems like
> it's probably safe to be able to open an fd that lets you watch the
> exit status of a process, have the process call setresuid(), and still
> see the exit status.

Is it? That's an open question.

>
> Regardless of how you feel about these issues, if you're going to add
> an API by which you open an fd, wait for a process to exit, and read
> the exit status, you need to define the conditions under which you may
> open the fd and under which you may read the exit status once you have
> the fd. There are probably multiple valid answers, but the question
> still needs to be answered.

Yes.
That's the point I made in that previous message of mine that I
referenced.

> My POLLERR hack, aside from being ugly,
> avoids this particular issue because it merely lets you wait for
> something you already could have observed using readdir().

Yes. I mentioned this same issue-punting as the motivation behind
exithand, initially, just reading EOF on exit.
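
The "single identity FD combined with per-operation capability checks"
model argued for in this message could be sketched roughly as follows.
This is an illustration only: all names (toy_task, toy_ident,
toy_may_signal, toy_exec_setuid) are hypothetical, and the permission rule
is a deliberate simplification chosen for the example, not the kernel's
actual signal-permission logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy credentials; the real kernel tracks far more than this. */
struct toy_cred {
    unsigned euid;
    bool cap_kill;         /* stands in for a capability such as CAP_KILL */
};

struct toy_task {
    struct toy_cred cred;
};

/* Identity handle: stays bound to the same task for the task's lifetime,
 * including across exec and credential changes. */
struct toy_ident {
    struct toy_task *task;
};

/* Per-operation check, made at use time against the target's CURRENT
 * credentials rather than the ones it had when the handle was opened.
 * Assumed, simplified rule: the caller may signal the target if it holds
 * the capability or its euid matches the target's euid. */
static bool toy_may_signal(const struct toy_cred *caller,
                           const struct toy_ident *id)
{
    assert(caller != NULL && id != NULL && id->task != NULL);
    return caller->cap_kill || caller->euid == id->task->cred.euid;
}

/* Models the target exec'ing a setuid binary: the task, and therefore any
 * identity handle bound to it, is unchanged; only the credentials move. */
static void toy_exec_setuid(struct toy_task *t, unsigned new_euid)
{
    t->cred.euid = new_euid;
}
```

With a caller at euid 1000 holding a handle to a task that also runs at
euid 1000, toy_may_signal() succeeds. After toy_exec_setuid(&task, 0) the
same handle still names the same task, but the check fails for the
unprivileged caller. Identity is preserved while the right to act is
re-evaluated per operation, which is one possible answer to the setuid
child question raised above.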