From: Dmitry Safonov <0x7f454c46@gmail.com>
Date: Mon, 19 Nov 2018 16:13:30 +0000
Subject: Re: [PATCH] proc: allow killing processes via file descriptors
To: Andy Lutomirski
Cc: dancol@google.com, rdunlap@infradead.org, christian@brauner.io,
    "Eric W. Biederman", open list, Serge Hallyn, jannh@google.com,
    Andrew Morton, Oleg Nesterov, cyphar@cyphar.com, Al Viro,
    linux-fsdevel@vger.kernel.org, Linux API, timmurray@google.com,
    Kees Cook, jengelh@inai.de, Andrei Vagin
References: <20181118111751.6142-1-christian@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 18 Nov 2018 at 18:30, Andy Lutomirski wrote:
> Here's my point: if we're really going to make a new API to manipulate
> processes by their fd, I think we should have at least a decent idea
> of how that API will get extended in the future. Right now, we have
> an extremely awkward situation where opening an fd in /proc requires
> certain capabilities or uids, and using those fds often also checks
> current's capabilities, and the target process may have changed its
> own security context, including gaining privilege via SUID, SGID, or
> LSM transition rules in the mean time. This has been a huge source of
> security bugs. It would be nice to have a model for future APIs that
> avoids these problems.
>
> And I didn't say in my proposal that a process's identity should
> fundamentally change when it calls execve(). I'm suggesting that
> certain operations that could cause a process to gain privilege or
> otherwise require greater permission to introspect (mainly execve)
> could be handled by invalidating the new process management fds.
> Sure, if init re-execs itself, it's still PID 1, but that doesn't
> necessarily mean that:
>
> fd = process_open_management_fd(1);
> [init reexecs]
> process_do_something(fd);
>
> needs to work.
>
> >
> > > setresuid() has no effect
> > > here -- if you have W access to the process and the process calls
> > > setresuid(), you still have W access.
> >
> > Now you've created a situation in which an operation that security
> > policy previously blocked now becomes possible, invaliding previous
> > designs based on the old security invariant. That's the definition of
> > introducing a security hole.
>
> I think you're overstating your case. To a pretty good approximation,
> setresuid() allows the caller to remove elements from the set {ruid,
> suid, euid}, unless the caller has CAP_SETUID. If you could ptrace a
> process before it calls setresuid(), you might as well be able to
> ptrace() it after, since you could have just ptraced it and made it
> call setresuid() while still ptracing it. Similarly, it seems like
> it's probably safe to be able to open an fd that lets you watch the
> exit status of a process, have the process call setresuid(), and still
> see the exit status.
>
> Regardless of how you feel about these issues, if you're going to add
> an API by which you open an fd, wait for a process to exit, and read
> the exit status, you need to define the conditions under which you may
> open the fd and under which you may read the exit status once you have
> the fd. There are probably multiple valid answers, but the question
> still needs to be answered.
> My POLLERR hack, aside from being ugly,
> avoids this particular issue because it merely lets you wait for
> something you already could have observed using readdir().

I beg your pardon for hijacking the thread.

I wonder how fast it would be to hold a pid by keeping another
open()ed fd. You would then also need to read comm (or whatever you
use to filter whom to kill). It seems to me that procfs will be even
slower with this safe approach. But I may have misunderstood the idea;
my apologies if so.

So, I just wanted to gently remind everyone about procfs with a
netlink socket [1]. It seems to me that whenever you receive pid
information, you could request some unique 64(?)-bit number and kill
the process using it. If the uniqueness of a 64-bit number for
handling pids ever becomes a question, the netlink message could be
painlessly extended to 128 bits or whatever.

It could also provide the facility to atomically kill a process, say,
by name, by adding another field to the netlink message.

So, if it's time to add a new API for procfs, netlink may be more
desirable.

[1]: https://lwn.net/Articles/650243/

Thanks,
Dmitry
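
For reference, the two-open() pattern mentioned above (hold /proc/<pid>
with one fd, then read comm through it to decide whom to kill) can be
sketched roughly as below. This is only an illustration built from
existing calls (open(), openat(), read()); the helper name
read_comm_via_dirfd is made up for the example, and it is neither the
API from the patch nor the netlink interface from [1].

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open /proc/<pid> as a directory fd, then read
 * "comm" relative to that fd. Returns the number of bytes stored in
 * buf (trailing newline stripped), or -1 on error.
 */
ssize_t read_comm_via_dirfd(pid_t pid, char *buf, size_t len)
{
	char path[32];
	ssize_t n;
	int dirfd, fd;

	if (len == 0)
		return -1;

	snprintf(path, sizeof(path), "/proc/%ld", (long)pid);

	/* First open(): hold on to the /proc/<pid> directory. */
	dirfd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (dirfd < 0)
		return -1;

	/* Second open(): the filter step, reading comm via the held fd. */
	fd = openat(dirfd, "comm", O_RDONLY | O_CLOEXEC);
	if (fd < 0) {
		close(dirfd);
		return -1;
	}

	n = read(fd, buf, len - 1);
	close(fd);
	close(dirfd);
	if (n < 0)
		return -1;

	/* Strip the trailing newline and NUL-terminate. */
	if (n > 0 && buf[n - 1] == '\n')
		n--;
	buf[n] = '\0';
	return n;
}
```

A caller would do something like read_comm_via_dirfd(getpid(), buf,
sizeof(buf)) and compare buf against the name it is looking for. That
is two open() calls and a read() per candidate process, which is the
cost being questioned above. Note that the pid can still be recycled
before the first open() succeeds.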