From: ebiederm@xmission.com (Eric W. Biederman)
To: Jürg Billeter
Cc: Oleg Nesterov, Andrew Morton, Thomas Gleixner,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] prctl: add PR_[GS]ET_KILLABLE
Date: Fri, 03 Aug 2018 08:34:59 -0500
Message-ID: <87sh3vd14s.fsf@xmission.com>
In-Reply-To: <7f7c57230e0279f4599bf13ae1d1d449d76ac232.camel@bitron.ch>
    (Jürg Billeter's message of "Fri, 03 Aug 2018 12:15:16 +0200")
References: <20180730075241.24002-1-j@bitron.ch>
    <20180731070337.61004-1-j@bitron.ch>
    <20180731143949.GA1890@redhat.com>
    <20180801141914.GA21248@redhat.com>
    <7f7c57230e0279f4599bf13ae1d1d449d76ac232.camel@bitron.ch>

Jürg Billeter writes:

> On Wed, 2018-08-01 at 16:19 +0200, Oleg Nesterov wrote:
>> On 07/31, Jürg Billeter wrote:
>> >
>> > > Could you explain your use-case? Why a shell wants to use
>> > > CLONE_NEWPID?
>> >
>> > To guarantee that there won't be any runaway processes, i.e., ensure
>> > that no descendants (background helper daemons or misbehaving
>> > processes) survive when the child process is terminated.
>>
>> We already have PR_SET_CHILD_SUBREAPER.
>>
>> Perhaps we can finally add PR_KILL_MY_DESCENDANTS_ON_EXIT? This was already
>> discussed some time ago, but I can't find the previous discussion... Simple
>> to implement.
>
> This would definitely be an option. You mentioned it last October in
> the PR_SET_PDEATHSIG_PROC discussion¹.
> However, as PID namespaces
> already exist and appear to be a good fit for the most part, I think it
> makes sense to just add the missing pieces to PID namespaces instead of
> duplicating part of the PID namespace functionality.
>
> Also, based on Eric's comment in that other discussion about
> no_new_privs not being allowed to increase the attack surface,
> PR_KILL_MY_DESCENDANTS_ON_EXIT might require CAP_SYS_ADMIN as well (due
> to setuid children). In which case the only potential benefit would be
> that it still allows the child to kill arbitrary processes, as far as I
> can tell.

We don't require CAP_SYS_ADMIN if it is a session, so I think a
similar allowance can be made for PR_KILL_MY_DESCENDANTS_ON_EXIT.
There is a long-standing tradition of being able to kill your own
descendants in Linux. I don't think this allows anything that the
traditional session allowance for killing processes doesn't.

From the other direction I think we can just go ahead and fix handling
of the job control stop signals as well. As far as I understand it
there is a legitimate complaint that SIGTSTP, SIGTTIN and SIGTTOU do
not work on a pid namespace leader.

The current implementation actually overshoots. We only need to ignore
signals from the descendants in the pid namespace. Ideally signals
from other processes are treated like normal. We have only been able
to apply that ideal to SIGSTOP and SIGKILL, as we can handle them in
prepare_signal. Other signals can be blocked, which means the logic to
handle them needs to live in get_signal, where we may have no sender
information.

Signals with signal handlers we treat as normal. Signals whose default
action is to ignore the signal we treat as normal.

If a process is not in a context where job control has been set up,
then SIGTSTP, SIGTTIN and SIGTTOU are ignored. I believe a typical init
process lives in just such an environment. So I think we can safely
remove the special handling for the job control stops and not have
anyone care.
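
The cases above can be written down as a small decision function. This
is only a toy model of the rules as described in this mail, not the
kernel's code in prepare_signal or get_signal. The name init_outcome,
its parameters, and the returned labels are invented here for
illustration, and the set of default-ignore signals is a partial list.
It assumes a pid namespace init and a Unix-like Python signal module.

```python
import signal

# Signals whose default action is to ignore (partial list, for illustration).
DEFAULT_IGNORE = {signal.SIGCHLD, signal.SIGURG, signal.SIGWINCH}
# Signals that can be neither caught nor blocked.
KERNEL_ONLY = {signal.SIGKILL, signal.SIGSTOP}


def init_outcome(sig, has_handler, sender_in_namespace):
    """Model what happens to `sig` sent to a pid namespace init.

    Returns one of (hypothetical labels):
      "handler" - the installed handler runs
      "default" - the default action (terminate or stop) is taken
      "ignore"  - ignored, as for any process with a default-ignore signal
      "dropped" - discarded because the target is a namespace init
    """
    if sig in KERNEL_ONLY:
        # The sender is known at send time, so only signals coming from
        # inside the namespace need to be refused.
        return "dropped" if sender_in_namespace else "default"
    if has_handler:
        return "handler"
    if sig in DEFAULT_IGNORE:
        return "ignore"
    # Everything else may have been blocked and is sorted out at delivery
    # time, where the sender may be unknown, so it is dropped regardless
    # of who sent it. This is the overshoot.
    return "dropped"
```

With this model, init_outcome(signal.SIGTSTP, False, False) is
"dropped" even though the sender is outside the namespace, which is the
behavior being complained about.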
The rule is that the process group of the process must have a parent
in the same session, or the job control signals are ignored. A typical
init process calls setsid, which guarantees it has no parents in the
same session. So the default action of the job control stops will be
to ignore the signal. A process once a session leader will always be a
session leader, and will never have any parents in a different pgrp in
the same session. So I think this gives us the wiggle room needed to
just fix this behavior.

Let's see. For the signals SIGTSTP, SIGTTIN and SIGTTOU, if we are the
typical init process and we are a session leader, we simply don't care
who sends those signals; they will be ignored.

So I say we double check my assumption. Look at sysv init, busybox,
upstart, systemd, whatever Android uses, and the container runtimes'
lightweight inits. Document it in a change log and just remove the
special case.

If, except when handling job control signals is interesting, init
always winds up a signal group leader, I can't see the point in forcing
init to ignore the job control stop signals.

> ¹ https://lkml.org/lkml/2017/10/5/546

In the future please use message-id based links to email discussions.
That way people can look up the conversations in other email archives.

Eric
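
For reference, the process group rule mentioned above (the POSIX notion
of an orphaned process group) can be modeled over a made-up process
table. This is a sketch for checking the reasoning, not a way to query
a running system. The Proc tuple, the function name, and the pids in
the usage note are all hypothetical.

```python
from collections import namedtuple

# Minimal stand-in for a process table entry (hypothetical type).
Proc = namedtuple("Proc", "pid ppid pgid sid")


def is_orphaned_pgrp(procs, pgid):
    """Return True if process group `pgid` is orphaned.

    A group is not orphaned when at least one member has a parent that
    is in a different process group but in the same session. For an
    orphaned group, the default stop action of SIGTSTP, SIGTTIN and
    SIGTTOU is not taken.
    """
    by_pid = {p.pid: p for p in procs}
    for p in procs:
        if p.pgid != pgid:
            continue
        parent = by_pid.get(p.ppid)
        # A parent outside the table (e.g. outside the pid namespace)
        # cannot be in the same session.
        if parent is not None and parent.pgid != pgid and parent.sid == p.sid:
            return False
    return True
```

For example, with a table of Proc(1, 0, 1, 1) for an init that called
setsid, Proc(100, 1, 100, 100) for a shell that is its own session
leader, and Proc(101, 100, 101, 100) for a job started by that shell,
groups 1 and 100 are orphaned and group 101 is not.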