From: Daniel Colascione
Date: Sun, 18 Nov 2018 09:24:36 -0800
Subject: Re: [PATCH] proc: allow killing processes via file descriptors
To: Andy Lutomirski
Cc: Randy Dunlap, Christian Brauner, "Eric W. Biederman", LKML,
 "Serge E. Hallyn", Jann Horn, Andrew Morton, Oleg Nesterov,
 Aleksa Sarai, Al Viro, Linux FS Devel, Linux API, Tim Murray,
 Kees Cook, Jan Engelhardt
References: <20181118111751.6142-1-christian@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 18, 2018 at 9:09 AM, Andy Lutomirski wrote:
> On Sun, Nov 18, 2018 at 8:49 AM Daniel Colascione wrote:
>>
>> On Sun, Nov 18, 2018 at 8:33 AM, Randy Dunlap wrote:
>> > On 11/18/18 8:17 AM, Andy Lutomirski wrote:
>> >> On Sun, Nov 18, 2018 at 7:53 AM Daniel Colascione wrote:
>> >>>
>> >>> On Sun, Nov 18, 2018 at 7:38 AM, Andy Lutomirski wrote:
>> >>>> I fully agree that a more comprehensive, less expensive API for
>> >>>> managing processes would be nice. But I also think that this patch
>> >>>> (using the directory fd and ioctl) is better from a security
>> >>>> perspective than using a new file in /proc.
>> >>>
>> >>> That's an assertion, not an argument.
>> >>> And I'm not opposed to an
>> >>> operation on the directory FD, now that it's clear Linus has banned
>> >>> "write(2)-as-a-command" APIs. I just insist that we implement the API
>> >>> with a system call instead of a less-reliable ioctl due to the
>> >>> inherent namespace collision issues in ioctl command names.
>> >>
>> >> Linus banned it because of bugs like the ones in the patch.
>> >>
>> >>>
>> >>>> I have an old patch to make proc directory fds pollable:
>> >>>>
>> >>>> https://lore.kernel.org/patchwork/patch/345098/
>> >>>>
>> >>>> That patch plus the one in this thread might make a nice addition to
>> >>>> the kernel even if we expect something much better to come along
>> >>>> later.
>> >>>
>> >>> I've already commented on that patch. You never addressed my technical
>> >>> objections. Why are you bringing up this patch again as if that
>> >>> discussion had never happened? To review, that patch has various race
>> >>> conditions
>> >>
>> >> I don't think I ever saw that review.
>> >>
>> >>> and even if it were technically correct, it'd be an abuse
>> >>> of directory objects (in what other circumstance do we poll
>> >>> directories?) and not logically generalizable to a model in which we
>> >>> expose process exit status via the exit-monitoring API.
>> >>
>> >> I agree it's weird. It might be better to have /proc/PID/exit_status
>> >> and make *that* pollable.
>> >>
>> >
>> > If there is a new exit_status file, it could even be more than
>> > 8 bits of exit status:
>> >
>> > See https://lore.kernel.org/lkml/alpine.LSU.2.20.1507091257010.9602@nerf40.vanv.qr/T/#u
>> > and http://austingroupbugs.net/view.php?id=594#c1317
>>
>> First of all, as I discussed in [1], we need to first figure out *who*
>> should have access to the process exit information. FreeBSD appears to
>> make it public without disaster, and if we make exit status public, a
>> lot of problems just disappear.
>
> I kind of want to go in the other direction of making a lot of process
> information (especially cmdline) less publicly accessible.

Okay. That has nothing to do with exit status. Please address the
points related to the API we're discussing and that I raised in the
other thread.

Assuming we don't broaden exit status readability (which would make a
lot of things simpler), the exit notification mechanism must work like
this: if you can see a process in /proc, you should be able to wait on
it. If you can learn that process's exit status through some other
means --- e.g., you're the process's parent, you can ptrace the
process, you have CAP_WHATEVER_IT_IS_ --- then you should be able to
learn the fate of the process. Otherwise you should just be able to
learn that the process exited.

> Windows has an easy time of it because

Windows has an easier time of it because it doesn't use an ad-hoc
ambient authority permission model. In Windows, if you can open a
handle to do something, that handle lets you do the thing. Period.
There's none of this "well, I opened this process FD, but since I
opened it, the process called setuid, so now I can't get its exit
status" nonsense. Privilege elevation is always accomplished via a
separate call to CreateProcessWithToken, which creates a *new* process
with the elevated privileges. An existing process can't suddenly and
magically become this special thing that you can't inspect, but that
has the same PID and identity as this other process that you used to
be able to inspect. The model is just better, because permission is
baked into the HANDLE.

Now, that ship has sailed. We're stuck with setreuid and exec. But
let's be clear about what's causing the complexity.
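
For illustration only, the exit-notification rule described earlier in
this message (anyone who can see the process in /proc may wait on it,
but only a party otherwise entitled to the exit status learns it) could
be modeled roughly as below. This is a sketch, not kernel code: the
names Observer, ExitEvent, may_wait, may_read_status and notify_exit
are hypothetical, and each boolean flag stands in for a real permission
check. It also assumes that /proc visibility is required before any
notification is delivered at all, which the message does not state
explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observer:
    """Hypothetical caller; each flag stands in for a real kernel check."""
    can_see_in_proc: bool         # target's /proc/PID entry is visible to the caller
    is_parent: bool = False       # caller is the target's parent
    can_ptrace: bool = False      # caller would pass a ptrace access check
    has_capability: bool = False  # placeholder for whichever capability applies


@dataclass(frozen=True)
class ExitEvent:
    exited: bool
    status: Optional[int]  # None means "exited, but the status is withheld"


def may_wait(obs: Observer) -> bool:
    # Rule 1: seeing the process in /proc is enough to wait on it.
    return obs.can_see_in_proc


def may_read_status(obs: Observer) -> bool:
    # Rule 2: the status goes only to callers who could learn it anyway.
    return obs.is_parent or obs.can_ptrace or obs.has_capability


def notify_exit(obs: Observer, real_status: int) -> Optional[ExitEvent]:
    """Return what this observer learns when the target exits."""
    if not may_wait(obs):
        return None  # assumption: no /proc visibility, no notification
    if may_read_status(obs):
        return ExitEvent(exited=True, status=real_status)
    return ExitEvent(exited=True, status=None)
```

Under this model an unrelated observer gets ExitEvent(exited=True,
status=None), while the parent of the same process gets the real
status. Making exit status public would collapse may_read_status to
"always true", which is the simplification mentioned above.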