From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org,
	Nicholas Piggin, Pridhiviraj Paidipeddi, Shilpasri G Bhat,
	Viresh Kumar, Vaidyanathan Srinivasan, Michael Ellerman
Subject: [PATCH 4.16 096/113] cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt
Date: Mon, 30 Apr 2018 12:25:07 -0700
Message-Id: <20180430184019.264163678@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180430184015.043892819@linuxfoundation.org>
References: <20180430184015.043892819@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review

4.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shilpasri G Bhat

commit c0f7f5b6c69107ca92909512533e70258ee19188 upstream.

gpstate_timer_handler() uses synchronous smp_call to set the pstate
on the requested core. This causes the below hard lockup:

  smp_call_function_single+0x110/0x180 (unreliable)
  smp_call_function_any+0x180/0x250
  gpstate_timer_handler+0x1e8/0x580
  call_timer_fn+0x50/0x1c0
  expire_timers+0x138/0x1f0
  run_timer_softirq+0x1e8/0x270
  __do_softirq+0x158/0x3e4
  irq_exit+0xe8/0x120
  timer_interrupt+0x9c/0xe0
  decrementer_common+0x114/0x120
  -- interrupt: 901 at doorbell_global_ipi+0x34/0x50
  LR = arch_send_call_function_ipi_mask+0x120/0x130
  arch_send_call_function_ipi_mask+0x4c/0x130
  smp_call_function_many+0x340/0x450
  pmdp_invalidate+0x98/0xe0
  change_huge_pmd+0xe0/0x270
  change_protection_range+0xb88/0xe40
  mprotect_fixup+0x140/0x340
  SyS_mprotect+0x1b4/0x350
  system_call+0x58/0x6c

One way to avoid this is removing the smp-call. We can ensure that the
timer always runs on one of the policy-cpus. If the timer gets
migrated to a cpu outside the policy then re-queue it back on the
policy->cpus. This way we can get rid of the smp-call which was being
used to set the pstate on the policy->cpus.

Fixes: 7bc54b652f13 ("timers, cpufreq/powernv: Initialize the gpstate timer as pinned")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Nicholas Piggin
Reported-by: Pridhiviraj Paidipeddi
Signed-off-by: Shilpasri G Bhat
Acked-by: Nicholas Piggin
Acked-by: Viresh Kumar
Acked-by: Vaidyanathan Srinivasan
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

---
 drivers/cpufreq/powernv-cpufreq.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -679,6 +679,16 @@ void gpstate_timer_handler(struct timer_
 
 	if (!spin_trylock(&gpstates->gpstate_lock))
 		return;
+	/*
+	 * If the timer has migrated to the different cpu then bring
+	 * it back to one of the policy->cpus
+	 */
+	if (!cpumask_test_cpu(raw_smp_processor_id(), policy->cpus)) {
+		gpstates->timer.expires = jiffies + msecs_to_jiffies(1);
+		add_timer_on(&gpstates->timer, cpumask_first(policy->cpus));
+		spin_unlock(&gpstates->gpstate_lock);
+		return;
+	}
 
 	/*
 	 * If PMCR was last updated was using fast_swtich then
@@ -718,10 +728,8 @@ void gpstate_timer_handler(struct timer_
 	if (gpstate_idx != gpstates->last_lpstate_idx)
 		queue_gpstate_timer(gpstates);
 
+	set_pstate(&freq_data);
 	spin_unlock(&gpstates->gpstate_lock);
-
-	/* Timer may get migrated to a different cpu on cpu hot unplug */
-	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
 }
 
 /*
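
Illustrative note, not part of the patch: the control flow the changelog describes (bounce the timer back to a policy cpu when it fires elsewhere, otherwise set the pstate locally with no cross-cpu call) can be modelled in plain userspace C. This is only a sketch under stated assumptions. Every sim_* name below is hypothetical and does not exist in the kernel, the 64-bit mask stands in for policy->cpus, and "now + 1" stands in for jiffies + msecs_to_jiffies(1).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sim_policy {
	uint64_t cpus;		/* bit n set => cpu n belongs to the policy */
};

struct sim_timer {
	int queued_on;		/* cpu the timer was re-armed on, -1 if never */
	unsigned long expires;	/* expiry in simulated ticks */
};

enum sim_result { SIM_REQUEUED, SIM_PSTATE_SET };

/* Stand-in for cpumask_test_cpu(); out-of-range cpus are never members. */
static bool sim_cpu_in_policy(const struct sim_policy *p, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return false;
	return ((p->cpus >> cpu) & 1u) != 0;
}

/* Stand-in for cpumask_first(): lowest cpu in the mask, -1 if empty. */
static int sim_first_cpu(const struct sim_policy *p)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (sim_cpu_in_policy(p, cpu))
			return cpu;
	return -1;
}

/*
 * Model of the handler after the fix: if it runs on a cpu outside the
 * policy, re-arm the timer on a policy cpu and return; otherwise record
 * that the pstate is set on the local cpu, with no call to another cpu.
 */
static enum sim_result sim_timer_handler(const struct sim_policy *p,
					 struct sim_timer *t, int this_cpu,
					 unsigned long now, int *pstate_cpu)
{
	if (!sim_cpu_in_policy(p, this_cpu)) {
		t->expires = now + 1;
		t->queued_on = sim_first_cpu(p);
		return SIM_REQUEUED;
	}

	*pstate_cpu = this_cpu;
	return SIM_PSTATE_SET;
}
```

In this model, a handler invoked on cpu 5 with a policy of cpus 2-3 only re-arms the timer on cpu 2; the following expiry on a policy cpu performs the local update. The point of the design is that the handler never waits on another cpu from interrupt context, which is where the synchronous smp_call locked up.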