From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, "Paul E. McKenney", Thomas Gleixner, Peter Zijlstra, Sebastian Sewior, Anna-Maria Gleixner
Subject: [PATCH 4.4 57/74] hrtimer: Reset hrtimer cpu base proper on CPU hotplug
Date: Mon, 29 Jan 2018 13:57:02 +0100
Message-Id: <20180129123850.144331216@linuxfoundation.org>
In-Reply-To: <20180129123847.507563674@linuxfoundation.org>
References: <20180129123847.507563674@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner

commit d5421ea43d30701e03cadc56a38854c36a8b4433 upstream.

The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents a long-delayed hrtimer interrupt from causing
continuous retriggering of interrupts that would keep the system from
making progress. If a hang is detected then the timer hardware is
programmed with a certain delay into the future and a flag is set in the
hrtimer cpu base which prevents newly enqueued timers from reprogramming
the timer hardware prior to the chosen delay. The subsequent hrtimer
interrupt after the delay clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU, which results in RCU stalls
and other malfunctions.

Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard-to-reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paper bag.

Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Sebastian Sewior
Cc: Anna-Maria Gleixner
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
Signed-off-by: Greg Kroah-Hartman

---
 kernel/time/hrtimer.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -669,7 +669,9 @@ static void hrtimer_reprogram(struct hrt
 static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
 {
 	base->expires_next.tv64 = KTIME_MAX;
+	base->hang_detected = 0;
 	base->hres_active = 0;
+	base->next_timer = NULL;
 }
 
 /*
@@ -1615,6 +1617,7 @@ static void init_hrtimers_cpu(int cpu)
 		timerqueue_init_head(&cpu_base->clock_base[i].active);
 	}
 
+	cpu_base->active_bases = 0;
 	cpu_base->cpu = cpu;
 	hrtimer_init_hres(cpu_base);
 }
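
For reviewers who want to see the failure sequence from the changelog in isolation, here is a rough userspace sketch of it. It is not kernel code: struct toy_cpu_base, the toy_* helpers, the hw_armed field and simulate_replug() are all hypothetical names invented for this illustration, and the model assumes only what the changelog states (a stale hang_detected flag blocks reprogramming, and only an hrtimer interrupt clears it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct hrtimer_cpu_base. */
struct toy_cpu_base {
	int64_t expires_next;	/* next programmed expiry; INT64_MAX means none */
	bool hang_detected;	/* set by the hang mitigation path */
	bool hres_active;
	bool hw_armed;		/* models whether the timer hardware is armed */
};

/* Models enqueueing a timer: reprogramming is refused while hang_detected is set. */
static bool toy_reprogram(struct toy_cpu_base *base, int64_t expires)
{
	assert(base != NULL);
	if (base->hang_detected)
		return false;
	if (expires < base->expires_next) {
		base->expires_next = expires;
		base->hw_armed = true;
	}
	return base->hw_armed;
}

/* Models one hrtimer interrupt; 'hang' selects the mitigation path. */
static void toy_interrupt(struct toy_cpu_base *base, bool hang)
{
	base->hang_detected = hang;	/* a normal interrupt clears the flag */
	base->hw_armed = hang;		/* on a hang, hardware is armed with a delay */
}

/* Models CPU unplug: the hardware is shut down, software state is left alone. */
static void toy_cpu_offline(struct toy_cpu_base *base)
{
	base->hw_armed = false;
}

/* Models hrtimer_init_hres() on plug-in, with or without the added reset. */
static void toy_init_hres(struct toy_cpu_base *base, bool apply_fix)
{
	base->expires_next = INT64_MAX;
	if (apply_fix)
		base->hang_detected = false;
	base->hres_active = false;
}

/* Returns true if a timer can arm the hardware after an unplug/replug cycle. */
bool simulate_replug(bool apply_fix)
{
	struct toy_cpu_base base = { 0 };

	toy_init_hres(&base, apply_fix);
	toy_interrupt(&base, true);	/* hang in the last interrupt before unplug */
	toy_cpu_offline(&base);
	toy_init_hres(&base, apply_fix);	/* CPU is plugged in again */
	return toy_reprogram(&base, 1000);
}
```

In this model simulate_replug(false) leaves the replugged CPU unable to arm its timer, which corresponds to the stall described above, while simulate_replug(true) lets the first enqueued timer program the hardware again.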