Subject: Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronus smp_call in timer interrupt
To: Nicholas Piggin
References: <1524544906-31512-1-git-send-email-shilpa.bhat@linux.vnet.ibm.com> <20180424160034.6e9d2274@roar.ozlabs.ibm.com>
Cc: rjw@rjwysocki.net, viresh.kumar@linaro.org, benh@kernel.crashing.org, mpe@ellerman.id.au, linux-pm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, ppaidipe@linux.vnet.ibm.com, svaidy@linux.vnet.ibm.com
From: Shilpasri G Bhat
Date: Tue, 24 Apr 2018 12:47:32 +0530
In-Reply-To: <20180424160034.6e9d2274@roar.ozlabs.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 04/24/2018 11:30 AM, Nicholas Piggin wrote:
> On Tue, 24 Apr 2018 10:11:46 +0530
> Shilpasri G Bhat wrote:
>
>> gpstate_timer_handler() uses synchronous smp_call to set the pstate
>> on the requested core. This causes the below hard lockup:
>>
>> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
>> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
>> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
>> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
>> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
>> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
>> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
>> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
>> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
>> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
>> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
>>     LR = arch_send_call_function_ipi_mask+0x120/0x130
>> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
>> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
>> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
>> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
>> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
>> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
>> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
>> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
>>
>> Fix this by using the asynchronous smp_call in the timer interrupt handler.
>> We don't have to wait in this handler until the pstates are changed on
>> the core. This change will not have any impact on the global pstate
>> ramp-down algorithm.
>>
>> Reported-by: Nicholas Piggin
>> Reported-by: Pridhiviraj Paidipeddi
>> Signed-off-by: Shilpasri G Bhat
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>> index 0591874..7e0c752 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>>  	spin_unlock(&gpstates->gpstate_lock);
>>
>>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
>> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
>> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>>  }
>>
>>  /*
>
> This can still deadlock because !wait case still ends up having to wait
> if another !wait smp_call_function caller had previously used the
> call single data for this cpu.
>
> If you go this way you would have to use smp_call_function_async, which
> is more work.
>
> As a rule it would be better to avoid smp_call_function entirely if
> possible. Can you ensure the timer is running on the right CPU? Use
> add_timer_on and try again if the timer is on the wrong CPU, perhaps?
>

Yeah, that is doable: we can check for the CPU and re-queue the timer. We
will only ramp down more slowly in that case, which does no harm.

(If the targeted core turns out to be offline, we will not queue the timer
again, as we would have already set the pstate to min in the cpu-down path.)

Thanks and Regards,
Shilpa

> Thanks,
> Nick
>
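
To make the re-queue idea above concrete, here is a minimal user-space sketch of the decision the timer handler would make. It is only a model under stated assumptions, not the driver change itself: `struct policy_sim`, `enum timer_action`, `first_online_policy_cpu()` and `timer_decide()` are all hypothetical names invented for illustration, and CPU masks are plain bitmasks rather than kernel cpumasks. In the real handler the "which CPU am I on" check and the re-arm would presumably map onto smp_processor_id(), cpumask_test_cpu() and add_timer_on(), but that mapping is an assumption here.

```c
#include <assert.h>

#define NR_CPUS 8  /* assumption: small fixed CPU count for the model */

/* Hypothetical stand-in for a cpufreq policy covering one core. */
struct policy_sim {
    unsigned int cpus_mask;    /* bit n set => CPU n belongs to the policy */
    unsigned int online_mask;  /* bit n set => CPU n is online */
    int pstate;                /* last pstate written by a local update */
};

enum timer_action {
    ACT_SET_LOCALLY,  /* running on a policy CPU: set the pstate directly */
    ACT_REQUEUE,      /* wrong CPU: re-arm the timer on a policy CPU */
    ACT_DROP          /* no online CPU left in the policy: do not re-queue */
};

/* Lowest-numbered online CPU in the policy, or -1 if there is none. */
static int first_online_policy_cpu(const struct policy_sim *p)
{
    unsigned int m = p->cpus_mask & p->online_mask;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (m & (1u << cpu))
            return cpu;
    return -1;
}

/*
 * Decide what the handler does when it fires on this_cpu.
 * No cross-CPU call is made: the pstate is only ever written
 * when the handler is already on a CPU of the target policy.
 */
static enum timer_action timer_decide(struct policy_sim *p, int this_cpu,
                                      int new_pstate, int *requeue_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NR_CPUS);
    *requeue_cpu = -1;

    if (p->cpus_mask & p->online_mask & (1u << this_cpu)) {
        p->pstate = new_pstate;  /* local update, nothing to wait for */
        return ACT_SET_LOCALLY;
    }

    int target = first_online_policy_cpu(p);
    if (target < 0)
        return ACT_DROP;         /* core offline: cpu-down path set min */

    *requeue_cpu = target;       /* try again on the right CPU */
    return ACT_REQUEUE;
}
```

The cost of the ACT_REQUEUE branch is one extra timer period before the pstate is lowered, which matches the "ramp down more slowly" trade-off mentioned in the reply.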