Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp1533246ybi; Wed, 19 Jun 2019 22:57:12 -0700 (PDT) X-Google-Smtp-Source: APXvYqxrv7k8EjnUsBcxLMsH+O38HeRUR2h5OjhkjwQfGaIU8+r8hI8ODttA1mvvPOm1CBF0DlGf X-Received: by 2002:a63:d218:: with SMTP id a24mr11555626pgg.419.1561010232817; Wed, 19 Jun 2019 22:57:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1561010232; cv=none; d=google.com; s=arc-20160816; b=rNaIpj1iAM7sknNte9FH2iOQTYxrTPTi/1jlWF3ihEM9qC6itJXUHvOAC+xdweq3HL Lujg7R3ht6Smwq9ud5wK5uH3kE5LH/cYP+VBKkcibwm7Vza6eXdN3lND+RHgmpTVzmeS evLgkfnNQWmZPLXBUA0q/rqcbesP71a9tHA2AHAQzN9sSQ7vHqPpp7Ovl8QmO67KYRAD Fd4doZCc8/3pg3KHb+f4A2gRoFzZCw9Zqdyiix7w9SPSrWGfrwtLWSBgFYyrfZv5Jh0K 9P7DssIl701SwEDRqWH685jzRME1It4Z9ixaLL/2JH0wf6k4fmFKFJItrTCQRmErdBI9 BaCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from :dkim-signature; bh=K9pZXmjEarEwCyykMezDXwpehpmv4umtwvd9btCXgKc=; b=Lc5yUA4zU1TpZ8ePszlOzcCdULIjVdsAQnBcIlAkHZ9mNXvfFRReEiZdb5Pvb6t3BK nrR/thvcxAOG5jE9O5KNzSYDOKz6TFdTqA7JhqbSC/MRkCuL86L3VYIXv0fmoXmoSpB6 JGTrCSm/+LMb8Wdagqdzu5pv38s+vHmKbDe5MFRyrJshJTdWrSvgOeCcaVpsQVwUk2r6 d55asNuOaRcc9iGPaYn1pFzxD6OoJrnqivRoSQHY6EqN+3bAOvFLqnWMW50pc+LV82Np AIT2bCjPeQ0OMb1A7j0Nli0RoVKDpYhtk+dHy4GQe/TSC76F1m7R+1urDldjp/sn48hR nS9Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=XCVaRrbp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 207si19376459pfu.258.2019.06.19.22.56.56; Wed, 19 Jun 2019 22:57:12 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=XCVaRrbp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730815AbfFTFzg (ORCPT + 99 others); Thu, 20 Jun 2019 01:55:36 -0400 Received: from mail-pg1-f195.google.com ([209.85.215.195]:38839 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726082AbfFTFzg (ORCPT ); Thu, 20 Jun 2019 01:55:36 -0400 Received: by mail-pg1-f195.google.com with SMTP id v11so977088pgl.5 for ; Wed, 19 Jun 2019 22:55:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=K9pZXmjEarEwCyykMezDXwpehpmv4umtwvd9btCXgKc=; b=XCVaRrbpse0CSGD70QnaHV3nIfsX43qaQFSxrfnJW/VmR080ALVpX8dSSZASxz2lzr R100yUGLnMUR69TxsxEuO/YdkK1EoNyWdMbIA1NyYz/XZsCfcBs6Skb5n2cMnE8JqY4f /4FWSMQ1niGhs+4sXX5B++cu0bQoqURFh5bctYtIIDQCl68u0pFC5wNpqDjU5KH7UNIU YsEnNOIrZDaeCz/vLS0/wSj53HhulsyCiFljCWI4xgvqTRa17HnK+oyaPzOckmuwojEU Xm2lUMHgFK9vjYwO2QyJ0L04c4ePZe/bF8/mFVgfFaos8RtBC1HHk19D5le0FS3bLlon utIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=K9pZXmjEarEwCyykMezDXwpehpmv4umtwvd9btCXgKc=; b=jfXdy9GlaV2Fl/TLGsbHbH2jrk3CjBGRY8coX833kMDga+svu65X35yCfc6VKs2LNy Tpf4MmtcPjya67j+G8340Tz/wK/K0Lc8W5lX4V8+CtZsrxt/xAvbZsG1+krf+8FGl5rv UIwOuP0pyUnBSLGTrY+tP2e9OySCzYj4UCOz2SIxh/s3bwPE2w6pAnqistiX5Es25sgF VCNmlELiP4kh1Ide32VJrn+pycI51HHuAPss2zJ8oP+FeXZ0/amE0JVotaScdGyY28f2 H3Wza8crStRYZixdC7+9w6jkWS1pjcPVtvwNLO7rjqlyg6FZpKJ5IVCF3O03qkKsTifI lr3A== X-Gm-Message-State: APjAAAUjo+ZMgGqChd2aPjfPOhii7pNaPS/Gfb6Qv7VmmpKxHc7t12MK xC3Lb0cluAHd6qA/yVu2N0ls/mgWcpc= X-Received: by 2002:a65:63c6:: with SMTP id n6mr11208778pgv.370.1561010135545; Wed, 19 Jun 2019 22:55:35 -0700 (PDT) Received: from localhost.localdomain ([203.100.54.194]) by smtp.gmail.com with ESMTPSA id n17sm37308044pfq.182.2019.06.19.22.55.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Jun 2019 22:55:34 -0700 (PDT) From: Yafang Shao To: akpm@linux-foundation.org, penguin-kernel@I-love.SAKURA.ne.jp, dvyukov@google.com Cc: linux-kernel@vger.kernel.org, Yafang Shao Subject: [PATCH] hung_task: recover hung task warnings in next check interval Date: Thu, 20 Jun 2019 13:55:00 +0800 Message-Id: <1561010100-14080-1-git-send-email-laoar.shao@gmail.com> X-Mailer: git-send-email 1.8.3.1 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When sys_hung_task_warnings reaches 0, the hang task messages will not be reported any more. If the user want to get more hung task messages, he must reset kernel.hung_task_warnings to a postive integer or -1 with sysctl. This is not a good way for the user. We'd better reset hung task warnings in the kernel, and then the user don't need to pay attention to this value. With this patch, hung task warnings will be reset with sys_hung_task_warnings setting in evenry check interval. Another difference is if the user set kernel.hung_task_warnings with a new value, the new value will take effect in next check interval. For example, when the kernel is printing the hung task messages, the user can't set it to 0 to stop the printing, but I don't think this will happen in the real world. (If that happens, then sys_hung_task_warnings must be protected by a lock) Signed-off-by: Yafang Shao --- Documentation/sysctl/kernel.txt | 5 ++++- kernel/hung_task.c | 19 ++++++++++++------- 2 files changed, 16 insertions(+), 8 deletions(-) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index f0c86fb..350df41 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -377,6 +377,8 @@ This file shows up if CONFIG_DETECT_HUNG_TASK is enabled. 0 (default): means use hung_task_timeout_secs as checking interval. Possible values to set are in range {0..LONG_MAX/HZ}. +hung_task_check_interval_secs must not be set greater than +hung_task_timeout_secs. ============================================================== @@ -384,7 +386,8 @@ hung_task_warnings: The maximum number of warnings to report. During a check interval if a hung task is detected, this value is decreased by 1. -When this value reaches 0, no more warnings will be reported. +When this value reaches 0, no more warnings will be reported until +next check interval begins. This file shows up if CONFIG_DETECT_HUNG_TASK is enabled. -1: report an infinite number of warnings. diff --git a/kernel/hung_task.c b/kernel/hung_task.c index 14a625c..01e6c94 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -85,7 +85,8 @@ static int __init hung_task_panic_setup(char *str) .notifier_call = hung_task_panic, }; -static void check_hung_task(struct task_struct *t, unsigned long timeout) +static void check_hung_task(struct task_struct *t, unsigned long timeout, + int *warnings) { unsigned long switch_count = t->nvcsw + t->nivcsw; @@ -124,9 +125,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout) * Ok, the task did not get scheduled for more than 2 minutes, * complain: */ - if (sysctl_hung_task_warnings) { - if (sysctl_hung_task_warnings > 0) - sysctl_hung_task_warnings--; + if (*warnings) { + if (*warnings > 0) + (*warnings)--; pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n", t->comm, t->pid, (jiffies - t->last_switch_time) / HZ); pr_err(" %s %s %.*s\n", @@ -170,7 +171,8 @@ static bool rcu_lock_break(struct task_struct *g, struct task_struct *t) * a really long time (120 seconds). If that happens, print out * a warning. */ -static void check_hung_uninterruptible_tasks(unsigned long timeout) +static void check_hung_uninterruptible_tasks(unsigned long timeout, + int *warnings) { int max_count = sysctl_hung_task_check_count; unsigned long last_break = jiffies; @@ -195,7 +197,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout) } /* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */ if (t->state == TASK_UNINTERRUPTIBLE) - check_hung_task(t, timeout); + check_hung_task(t, timeout, warnings); } unlock: rcu_read_unlock(); @@ -271,6 +273,7 @@ static int hungtask_pm_notify(struct notifier_block *self, static int watchdog(void *dummy) { unsigned long hung_last_checked = jiffies; + int warnings; set_user_nice(current, 0); @@ -284,9 +287,11 @@ static int watchdog(void *dummy) interval = min_t(unsigned long, interval, timeout); t = hung_timeout_jiffies(hung_last_checked, interval); if (t <= 0) { + warnings = sysctl_hung_task_warnings; if (!atomic_xchg(&reset_hung_task, 0) && !hung_detector_suspended) - check_hung_uninterruptible_tasks(timeout); + check_hung_uninterruptible_tasks(timeout, + &warnings); hung_last_checked = jiffies; continue; } -- 1.8.3.1