Date: Mon, 30 Jul 2018 09:44:17 -0700
From: Eduardo Valentin
To: "Rafael J. Wysocki"
CC: Eduardo Valentin, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
    "H. Peter Anvin", Dou Liyang, Len Brown, "Rafael J. Wysocki",
    "mike.travis@hpe.com", Rajvi Jingar, Pavel Tatashin,
    Philippe Ombredanne, "Kate Stewart", Greg Kroah-Hartman,
    the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM
Subject: Re: [PATCH RESEND 1/1] x86: tsc: avoid system instability in hibernation
Message-ID: <20180730164417.GE15414@u40b0340c692b58f6553c.ant.amazon.com>
References: <20180726155656.14873-1-eduval@amazon.com>

On Mon, Jul 30, 2018 at 09:15:48AM +0200, Rafael J. Wysocki wrote:
> On Thu, Jul 26, 2018 at 5:56 PM, Eduardo Valentin wrote:
> > System instability are seen during resume from hibernation when system
> > is under heavy CPU load. This is due to the lack of update of sched
> > clock data,
>
> Isn't that the actual bug?
>
> > and the scheduler would then think that heavy CPU hog
> > tasks need more time in CPU, causing the system to freeze
> > during the unfreezing of tasks.
> > For example, threaded irqs,
> > and kernel processes servicing network interface may be delayed
> > for several tens of seconds, causing the system to be unreachable.
> >
> > Situation like this can be reported by using lockup detectors
> > such as workqueue lockup detectors:
> >
> > [root@ip-172-31-67-114 ec2-user]# echo disk > /sys/power/state
> >
> > Message from syslogd@ip-172-31-67-114 at May 7 18:23:21 ...
> > kernel:BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 57s!
> >
> > Message from syslogd@ip-172-31-67-114 at May 7 18:23:21 ...
> > kernel:BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 57s!
> >
> > Message from syslogd@ip-172-31-67-114 at May 7 18:23:21 ...
> > kernel:BUG: workqueue lockup - pool cpus=3 node=0 flags=0x1 nice=0 stuck for 57s!
> >
> > Message from syslogd@ip-172-31-67-114 at May 7 18:29:06 ...
> > kernel:BUG: workqueue lockup - pool cpus=3 node=0 flags=0x1 nice=0 stuck for 403s!
> >
> > The fix for this situation is to mark the sched clock as unstable
> > as early as possible in the resume path, leaving it unstable
> > for the duration of the resume process.
>
> I would rather call it a workaround.

ok.

>
> > This will force the
> > scheduler to attempt to align the sched clock across CPUs using
> > the delta with time of day, updating sched clock data. In a post
> > hibernation event, we can then mark the sched clock as stable
> > again, avoiding unnecessary syncs with time of day on systems
> > in which TSC is reliable.
> >
> > Cc: Thomas Gleixner
> > Cc: Ingo Molnar
> > Cc: "H. Peter Anvin"
> > Cc: Peter Zijlstra
> > Cc: Dou Liyang
> > Cc: Len Brown
> > Cc: "Rafael J. Wysocki"
> > Cc: Eduardo Valentin
> > Cc: "mike.travis@hpe.com"
> > Cc: Rajvi Jingar
> > Cc: Pavel Tatashin
> > Cc: Philippe Ombredanne
> > Cc: Kate Stewart
> > Cc: Greg Kroah-Hartman
> > Cc: x86@kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-pm@vger.kernel.org
> > Signed-off-by: Eduardo Valentin
> > ---
> >
> > Hey,
> >
> > No changes from first attempt, no pressure on resending. The RESEND
> > tag is just because I missed linux-pm in the first attempt.
> >
> > BR,
> >
> >  arch/x86/kernel/tsc.c       | 29 +++++++++++++++++++++++++++++
> >  include/linux/sched/clock.h |  5 +++++
> >  kernel/sched/clock.c        |  4 ++--
> >  3 files changed, 36 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index 8ea117f8142e..f197c9742fef 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -13,6 +13,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include
> >  #include
> > @@ -1377,3 +1378,31 @@ unsigned long calibrate_delay_is_known(void)
> >  	return 0;
> >  }
> >  #endif
> > +
> > +static int tsc_pm_notifier(struct notifier_block *notifier,
> > +			    unsigned long pm_event, void *unused)
> > +{
> > +	switch (pm_event) {
> > +	case PM_HIBERNATION_PREPARE:
> > +		clear_sched_clock_stable();
> > +		break;
>
> This is too early IMO. This happens before hibernation starts, even
> before the image is created.

Yeah, I think, as long as it is marked, it should be fine.

>
> > +	case PM_POST_HIBERNATION:
> > +		/* Set back to the default */
> > +		if (!check_tsc_unstable())
> > +			set_sched_clock_stable();
> > +		break;
> > +	}
> > +
> > +	return 0;
> > +};
>
> If anything like this is the way to go, which honestly I doubt, I
> would prefer it to be done in hibernate() in the !in_suspend case.
>

The problem is more in the unfreeze of tasks..

> But why does it only affect hibernation? Do we do something extra for
> system-wide suspend/resume that is not done for hibernation?

I don't think we do anything special in hibernation per se. The only thing
is that the unfreezing of tasks seems to get confused when CPU hog tasks
are present.

>

--
All the best,
Eduardo Valentin
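
Illustration, not part of the posted patch: the notifier logic quoted in the
diff can be modelled in plain userspace C to make the intended state
transitions explicit. This is only a sketch under assumptions. The enum, the
struct and tsc_pm_notifier_model() are hypothetical names invented here; they
stand in for the kernel's PM events, the sched clock stable flag and
check_tsc_unstable(), and nothing below calls real kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PM notifier events named in the patch.
 * The real constants come from kernel headers and are not reproduced here. */
enum pm_event_model {
	MODEL_HIBERNATION_PREPARE,
	MODEL_POST_HIBERNATION,
	MODEL_OTHER_EVENT,
};

/* Hypothetical model of the two pieces of state the notifier looks at. */
struct clock_model {
	bool sched_clock_stable; /* what the scheduler currently assumes */
	bool tsc_unstable;       /* what check_tsc_unstable() would report */
};

/* Mirrors the switch in the quoted tsc_pm_notifier():
 *  - before hibernation, mark the sched clock unstable;
 *  - after hibernation, restore "stable" only if the TSC itself is
 *    still considered reliable;
 *  - ignore every other event. */
static int tsc_pm_notifier_model(struct clock_model *c, enum pm_event_model ev)
{
	switch (ev) {
	case MODEL_HIBERNATION_PREPARE:
		c->sched_clock_stable = false;
		break;
	case MODEL_POST_HIBERNATION:
		if (!c->tsc_unstable)
			c->sched_clock_stable = true;
		break;
	default:
		break;
	}

	return 0;
}
```

In this model a machine with a reliable TSC goes stable -> unstable ->
stable across one hibernation cycle, while a machine whose TSC was already
flagged unstable stays unstable afterwards. It says nothing about whether
PM_HIBERNATION_PREPARE is the right point to hook, which is the open
question in the review comments above.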