From: Andy Lutomirski
Date: Sun, 18 Nov 2018 10:28:08 -0800
Subject: Re: [PATCH] proc: allow killing processes via file descriptors
To: Daniel Colascione
Cc: Andrew Lutomirski, Randy Dunlap, Christian Brauner, "Eric W. Biederman", LKML, "Serge E. Hallyn", Jann Horn, Andrew Morton, Oleg Nesterov, Aleksa Sarai, Al Viro, Linux FS Devel, Linux API, Tim Murray, Kees Cook, Jan Engelhardt
References: <20181118111751.6142-1-christian@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 18, 2018 at 9:51 AM Daniel Colascione wrote:
>
> > I'm not entirely sure that ship has sailed. In the kernel, we already
> > have a bit of a distinction between a pid (and tid, etc -- I'm
> > referring to struct pid) and a task. If we make a new
> > process-management API, we could put a distinction like this into the
> > API.
>
> It would be a disaster to have different APIs give callers a different
> idea of process identity over its lifetime. The precedent is
> well-established that execve and setreuid do not change a process's
> identity. Invaliding some identifiers but not others in response to
> supposedly-internal things a process might do under rare circumstances
> is creating a bug machine..

Here's my point: if we're really going to make a new API to manipulate
processes by their fd, I think we should have at least a decent idea of
how that API will get extended in the future. Right now, we have an
extremely awkward situation where opening an fd in /proc requires
certain capabilities or uids, and using those fds often also checks
current's capabilities, and the target process may have changed its own
security context, including gaining privilege via SUID, SGID, or LSM
transition rules in the meantime. This has been a huge source of
security bugs.
It would be nice to have a model for future APIs that avoids these
problems.

And I didn't say in my proposal that a process's identity should
fundamentally change when it calls execve(). I'm suggesting that
certain operations that could cause a process to gain privilege or
otherwise require greater permission to introspect (mainly execve)
could be handled by invalidating the new process management fds. Sure,
if init re-execs itself, it's still PID 1, but that doesn't necessarily
mean that:

    fd = process_open_management_fd(1);
    [init reexecs]
    process_do_something(fd);

needs to work.

>
> > setresuid() has no effect
> > here -- if you have W access to the process and the process calls
> > setresuid(), you still have W access.
>
> Now you've created a situation in which an operation that security
> policy previously blocked now becomes possible, invaliding previous
> designs based on the old security invariant. That's the definition of
> introducing a security hole.

I think you're overstating your case. To a pretty good approximation,
setresuid() allows the caller to remove elements from the set {ruid,
suid, euid}, unless the caller has CAP_SETUID. If you could ptrace a
process before it calls setresuid(), you might as well be able to
ptrace() it after, since you could have just ptraced it and made it
call setresuid() while still ptracing it. Similarly, it seems like it's
probably safe to be able to open an fd that lets you watch the exit
status of a process, have the process call setresuid(), and still see
the exit status.

Regardless of how you feel about these issues, if you're going to add
an API by which you open an fd, wait for a process to exit, and read
the exit status, you need to define the conditions under which you may
open the fd and under which you may read the exit status once you have
the fd. There are probably multiple valid answers, but the question
still needs to be answered.
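
[Editorial illustration, not part of the original message.] The "remove elements from the set" approximation of setresuid() above can be modeled in a few lines. This is only a sketch under stated assumptions: it covers a caller without CAP_SETUID, ignores fsuid and capability side effects, and the names Creds, UNCHANGED and unprivileged_setresuid are hypothetical helpers invented here, not kernel or libc interfaces.

```python
from typing import NamedTuple

UNCHANGED = -1  # setresuid() treats -1 as "leave this ID as it is"


class Creds(NamedTuple):
    """Hypothetical container for the three user IDs of a process."""
    ruid: int
    euid: int
    suid: int


def unprivileged_setresuid(cur: Creds, ruid: int, euid: int, suid: int) -> Creds:
    """Simplified model of setresuid() for a caller WITHOUT CAP_SETUID.

    Each requested ID must be -1 or one of the caller's current
    ruid/euid/suid, so the resulting set of IDs is always a subset
    of the set the caller started with.
    """
    allowed = set(cur)
    new_ids = []
    for old, requested in zip(cur, (ruid, euid, suid)):
        if requested == UNCHANGED:
            new_ids.append(old)
        elif requested in allowed:
            new_ids.append(requested)
        else:
            # The real syscall fails with EPERM in this case.
            raise PermissionError(f"uid {requested} is not in {sorted(allowed)}")
    return Creds(*new_ids)
```

In this model the set of IDs after the call is always a subset of the set before it, which is the property the argument relies on: an observer who was allowed to act on the process before the call has not been handed access to any new identity afterwards.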
My POLLERR hack, aside from being ugly, avoids this particular issue because it merely lets you wait for something you already could have observed using readdir().
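
[Editorial illustration, not part of the original message.] What "could have observed using readdir()" amounts to can be sketched as a directory listing of /proc. This is a minimal sketch assuming a Linux-style procfs with one numeric directory per visible process; visible_pids and process_is_visible are hypothetical helper names, and the proc_root parameter exists only so the logic can be exercised against a scratch directory.

```python
import os
from typing import Set


def visible_pids(proc_root: str = "/proc") -> Set[int]:
    """Return the PIDs that appear as numeric entries under proc_root."""
    return {int(name) for name in os.listdir(proc_root) if name.isdecimal()}


def process_is_visible(pid: int, proc_root: str = "/proc") -> bool:
    """True if pid currently shows up in a listing of proc_root."""
    return pid in visible_pids(proc_root)
```

Anyone who can list /proc can already tell when an entry disappears, so a wait primitive that reports no more than that raises no new question about who is allowed to read it.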