From: Gaurav Kohli
To: tglx@linutronix.de, john.stultz@linaro.org, sboyd@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, Gaurav Kohli
Subject: [PATCH v2] timers: Clear must_forward_clk inside base lock
Date: Thu, 2 Aug 2018 10:45:03 +0530
Message-Id: <1533186903-28419-1-git-send-email-gkohli@codeaurora.org>
X-Mailer: git-send-email 1.9.1
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Timer wheel base->must_forward_clk indicates that the base clock might
be stale due to a long
idle sleep. The base clock is forwarded in the timer softirq or when a
timer is enqueued to a base which is idle.

While migrating timers from a remote CPU to the new base, which is idle,
the following race can happen:

CPU0                                    CPU1
run_timer_softirq                       timers_dead_cpu
                                        base = lock_timer_base(timer);
base->must_forward_clk = false
                                        if (base->must_forward_clk)
                                            forward(base); >> skip
                                        migrate_timer_list
                                          enqueue_timer(base, timer, idx);
                                          >> idx is calculated high due to
                                          >> stale base
                                        unlock_timer_base(timer);
base = lock_timer_base(timer);
forward(base);

The root cause is that base->must_forward_clk is cleared outside the
base->lock held region, so the remote queuing CPU observes it as
cleared, but the base clock is still stale. This can cause large
granularity values for timers, i.e. the accuracy of the expiry time
suffers.

Prevent this by clearing the flag with base->lock held, so that the
forwarding takes place before the cleared flag is observable by a
remote CPU.

Signed-off-by: Gaurav Kohli
---
Changes since v1:
- Updated comment as suggested by Thomas.

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index cc2d23e..8f61d45 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1657,6 +1657,22 @@ static inline void __run_timers(struct timer_base *base)

 	raw_spin_lock_irq(&base->lock);

+	/*
+	 * Timer wheel base must_forward_clk must be cleared before running
+	 * timers so that any timer functions that call mod_timer will not
+	 * try to forward the base. idle tracking / clock forwarding logic
+	 * is only used with BASE_STD timers.
+	 *
+	 * The must_forward_clk flag is cleared unconditionally also for
+	 * the deferrable base. The deferrable base is not affected by idle
+	 * tracking and never forwarded, so clearing the flag is a NOOP.
+	 *
+	 * The fact that the deferrable base is never forwarded can cause
+	 * large variations in granularity for deferrable timers, but they
+	 * can be deferred for long periods due to idle anyway.
+	 */
+	base->must_forward_clk = false;
+
 	while (time_after_eq(jiffies, base->clk)) {

 		levels = collect_expired_timers(base, heads);
@@ -1676,19 +1692,6 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h)
 {
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);

-	/*
-	 * must_forward_clk must be cleared before running timers so that any
-	 * timer functions that call mod_timer will not try to forward the
-	 * base. idle trcking / clock forwarding logic is only used with
-	 * BASE_STD timers.
-	 *
-	 * The deferrable base does not do idle tracking at all, so we do
-	 * not forward it. This can result in very large variations in
-	 * granularity for deferrable timers, but they can be deferred for
-	 * long periods due to idle.
-	 */
-	base->must_forward_clk = false;
-
 	__run_timers(base);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON))
 		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
-- 
1.9.1
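
Illustration only, not part of the patch: the effect of enqueueing against
a stale base clock can be sketched with a small userspace model. This is a
simplification under stated assumptions, not the kernel's calc_index(). The
MODEL_* constants and model_*() helpers are hypothetical names, the shift
and level-size values are assumed to match kernel/time/timer.c, and the
level count is picked arbitrarily.

```c
#include <assert.h>

/* Model constants; assumed to mirror the wheel layout in kernel/time/timer.c. */
#define MODEL_CLK_SHIFT	3
#define MODEL_LVL_BITS	6
#define MODEL_LVL_SIZE	(1UL << MODEL_LVL_BITS)
#define MODEL_DEPTH	8	/* hypothetical number of levels */

/* First delta (in jiffies) that no longer fits into level n - 1. */
static unsigned long model_lvl_start(unsigned int n)
{
	return (MODEL_LVL_SIZE - 1) << ((n - 1) * MODEL_CLK_SHIFT);
}

/* Pick the wheel level from the distance between expiry and base clock. */
static unsigned int model_level(unsigned long expires, unsigned long clk)
{
	unsigned long delta = expires - clk;
	unsigned int lvl = 0;

	while (lvl + 1 < MODEL_DEPTH && delta >= model_lvl_start(lvl + 1))
		lvl++;

	assert(lvl < MODEL_DEPTH);
	return lvl;
}

/* Granularity of the chosen level in jiffies; each level is 8x coarser. */
static unsigned long model_granularity(unsigned long expires, unsigned long clk)
{
	return 1UL << (model_level(expires, clk) * MODEL_CLK_SHIFT);
}
```

With jiffies at 100000 and a timer expiring 10 jiffies later,
model_granularity(100010, 100000) models a forwarded base and lands in
level 0 with a granularity of 1 jiffy. With a base clock left 30000 jiffies
behind, model_granularity(100010, 70000) lands in level 3 with a
granularity of 512 jiffies, which is the kind of accuracy loss the commit
message describes.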