From: "Eric W. Biederman"
To: Linus Torvalds
Cc: Oleg Nesterov, Andrew Morton, linux-kernel@vger.kernel.org, Wen Yang, majiang, "Eric W. Biederman"
Subject: [PATCH 19/20] fork: Have new threads join on-going signal group stops
Date: Mon, 23 Jul 2018 22:24:18 -0500
Message-Id: <20180724032419.20231-19-ebiederm@xmission.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <87efft5ncd.fsf_-_@xmission.com>
References: <87efft5ncd.fsf_-_@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

There are only two signals that are delivered to every member of a
signal group: SIGSTOP and SIGKILL.  Signal delivery requires that every
signal appear to be delivered either before or after a clone syscall.
SIGKILL terminates the clone, so it does not need to be considered.
That leaves only SIGSTOP to consider when creating new threads.

Today, in the event of a group stop, TIF_SIGPENDING will get set and
the fork will restart, ensuring the fork syscall participates in the
group stop.

A fork (especially of a process with a lot of memory) is one of the
most expensive system calls, so we really only want to restart a fork
when necessary.

It is easy to check whether a SIGSTOP is ongoing and have the new
thread join it immediately after the clone completes, making it appear
that the clone completed just before the SIGSTOP.

The calculate_sigpending function will see the bits set in jobctl and
set TIF_SIGPENDING to ensure the new task takes the slow path to
userspace.

Signed-off-by: "Eric W. Biederman"
---
 include/linux/sched/signal.h |  2 ++
 kernel/fork.c                | 27 +++++++++++++++------------
 kernel/signal.c              | 14 ++++++++++++++
 3 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 7cabc0bc38f6..f3507bf165d0 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -385,6 +385,8 @@ static inline void ptrace_signal_wake_up(struct task_struct *t, bool resume)
 	signal_wake_up_state(t, resume ? __TASK_TRACED : 0);
 }
 
+void task_join_group_stop(struct task_struct *task);
+
 #ifdef TIF_RESTORE_SIGMASK
 /*
  * Legacy restore_sigmask accessors. These are inefficient on
diff --git a/kernel/fork.c b/kernel/fork.c
index e07281254552..6c358846a8b8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1934,18 +1934,20 @@ static __latent_entropy struct task_struct *copy_process(
 		goto bad_fork_cancel_cgroup;
 	}
 
-	/*
-	 * Process group and session signals need to be delivered to just the
-	 * parent before the fork or both the parent and the child after the
-	 * fork. Restart if a signal comes in before we add the new process to
-	 * it's process group.
-	 * A fatal signal pending means that current will exit, so the new
-	 * thread can't slip out of an OOM kill (or normal SIGKILL).
-	 */
-	recalc_sigpending();
-	if (signal_pending(current)) {
-		retval = -ERESTARTNOINTR;
-		goto bad_fork_cancel_cgroup;
+	if (!(clone_flags & CLONE_THREAD)) {
+		/*
+		 * Process group and session signals need to be delivered to just the
+		 * parent before the fork or both the parent and the child after the
+		 * fork. Restart if a signal comes in before we add the new process to
+		 * it's process group.
+		 * A fatal signal pending means that current will exit, so the new
+		 * thread can't slip out of an OOM kill (or normal SIGKILL).
+		 */
+		recalc_sigpending();
+		if (signal_pending(current)) {
+			retval = -ERESTARTNOINTR;
+			goto bad_fork_cancel_cgroup;
+		}
 	}
 
 
@@ -1986,6 +1988,7 @@ static __latent_entropy struct task_struct *copy_process(
 					  &p->group_leader->thread_group);
 			list_add_tail_rcu(&p->thread_node,
 					  &p->signal->thread_head);
+			task_join_group_stop(p);
 		}
 		attach_pid(p, PIDTYPE_PID);
 		calculate_sigpending(p);
diff --git a/kernel/signal.c b/kernel/signal.c
index f6687c7d7a8c..78e2d5d196f3 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -375,6 +375,20 @@ static bool task_participate_group_stop(struct task_struct *task)
 	return false;
 }
 
+void task_join_group_stop(struct task_struct *task)
+{
+	/* Have the new thread join an on-going signal group stop */
+	unsigned long jobctl = current->jobctl;
+	if (jobctl & JOBCTL_STOP_PENDING) {
+		struct signal_struct *sig = current->signal;
+		unsigned long signr = jobctl & JOBCTL_STOP_SIGMASK;
+		unsigned long gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME;
+		if (task_set_jobctl_pending(task, signr | gstop)) {
+			sig->group_stop_count++;
+		}
+	}
+}
+
 /*
  * allocate a new signal queue record
  * - this may be called without locks if and only if t == current, otherwise an
-- 
2.17.1
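
For readers who want to trace the jobctl bookkeeping outside the kernel, the following is a minimal userspace sketch of what task_join_group_stop() does. It is not kernel code. The model_* types and functions, the fatal_pending flag, MODEL_SIGSTOP, and the JOBCTL_* bit values are all assumptions made for illustration; only the control flow mirrors the hunk in kernel/signal.c above, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit layout (assumption); see the kernel headers for real values. */
#define JOBCTL_STOP_SIGMASK 0xffffUL      /* signr of the stop in progress */
#define JOBCTL_STOP_PENDING (1UL << 17)   /* task should stop for group stop */
#define JOBCTL_STOP_CONSUME (1UL << 18)   /* task counts toward group_stop_count */

#define MODEL_SIGSTOP 19UL                /* hypothetical signal number for the demo */

/* Hypothetical stand-ins for signal_struct and task_struct. */
struct model_signal {
	int group_stop_count;   /* threads that still have to stop */
};

struct model_task {
	unsigned long jobctl;
	bool fatal_pending;     /* stand-in for "SIGKILL pending or task exiting" */
	struct model_signal *signal;
};

/* Model of task_set_jobctl_pending(): refuses tasks that are already dying. */
static bool model_set_jobctl_pending(struct model_task *task, unsigned long mask)
{
	/* Only the stop-related bits are modelled here. */
	assert(!(mask & ~(JOBCTL_STOP_SIGMASK | JOBCTL_STOP_PENDING |
			  JOBCTL_STOP_CONSUME)));

	if (task->fatal_pending)
		return false;

	/* A new stop signal number replaces any previously recorded one. */
	if (mask & JOBCTL_STOP_SIGMASK)
		task->jobctl &= ~JOBCTL_STOP_SIGMASK;

	task->jobctl |= mask;
	return true;
}

/*
 * Model of task_join_group_stop(): if the creating thread ("parent", playing
 * the role of current) is part of a pending group stop, copy the stop signal
 * number to the new thread and count it as one more thread that must stop.
 */
static void model_join_group_stop(struct model_task *parent,
				  struct model_task *child)
{
	unsigned long jobctl = parent->jobctl;

	if (jobctl & JOBCTL_STOP_PENDING) {
		unsigned long signr = jobctl & JOBCTL_STOP_SIGMASK;
		unsigned long gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME;

		if (model_set_jobctl_pending(child, signr | gstop))
			parent->signal->group_stop_count++;
	}
}
```

Calling model_join_group_stop() with a parent whose jobctl has JOBCTL_STOP_PENDING set leaves the child with the same stop signal number plus the PENDING and CONSUME bits, and bumps group_stop_count by one. With no stop pending, or with a child that is already dying, nothing changes.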