Date: Wed, 25 Apr 2018 15:44:47 +0530
From: Viresh Kumar
To: Shilpasri G Bhat
Cc: rjw@rjwysocki.net, npiggin@gmail.com, linux-pm@vger.kernel.org,
    ppaidipe@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, svaidy@linux.vnet.ibm.com,
    stable@vger.kernel.org
Subject: Re: [PATCH V2] cpufreq: powernv: Fix the hardlockup by synchronus smp_call in timer interrupt
Message-ID: <20180425101447.eynrhpyait2emzoa@vireshk-i7>
References: <1524646968-526-1-git-send-email-shilpa.bhat@linux.vnet.ibm.com> <20180425091701.kxxslgalw3sdpiym@vireshk-i7>

On 25-04-18, 15:32, Shilpasri G Bhat wrote:
> Hi,
>
> On 04/25/2018 02:47 PM, Viresh Kumar wrote:
> > On 25-04-18, 14:32, Shilpasri G Bhat wrote:
> >> gpstate_timer_handler() uses synchronous smp_call to set the pstate
> >> on the requested core.
> >> This causes the below hard lockup:
> >>
> >> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
> >> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
> >> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
> >> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
> >> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
> >> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
> >> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
> >> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
> >> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> >> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
> >> -- interrupt: 901 at doorbell_global_ipi+0x34/0x50
> >>    LR = arch_send_call_function_ipi_mask+0x120/0x130
> >> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130
> >> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
> >> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
> >> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
> >> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
> >> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
> >> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
> >> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
> >>
> >> One way to avoid this is removing the smp-call. We can ensure that the timer
> >> always runs on one of the policy-cpus. If the timer gets migrated to a
> >> cpu outside the policy then re-queue it back on the policy->cpus. This way
> >> we can get rid of the smp-call which was being used to set the pstate
> >> on the policy->cpus.
> >>
> >> Fixes: 7bc54b652f13 (timers, cpufreq/powernv: Initialize the gpstate timer as pinned)
> >> Cc: [4.8+]
> >> Reported-by: Nicholas Piggin
> >> Reported-by: Pridhiviraj Paidipeddi
> >> Signed-off-by: Shilpasri G Bhat
> >> ---
> >> Changes from V1:
> >> - Remove smp_call in the pstate handler.
> >>
> >>  drivers/cpufreq/powernv-cpufreq.c | 23 ++++++++++++++++++++---
> >>  1 file changed, 20 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> >> index 71f8682..dc8ffb5 100644
> >> --- a/drivers/cpufreq/powernv-cpufreq.c
> >> +++ b/drivers/cpufreq/powernv-cpufreq.c
> >> @@ -679,6 +679,25 @@ void gpstate_timer_handler(struct timer_list *t)
> >>
> >>  	if (!spin_trylock(&gpstates->gpstate_lock))
> >>  		return;
> >> +	/*
> >> +	 * If the timer has migrated to the different cpu then bring
> >> +	 * it back to one of the policy->cpus
> >> +	 */
> >> +	if (!cpumask_test_cpu(raw_smp_processor_id(), policy->cpus)) {
> >> +		/*
> >> +		 * Timer should be deleted if policy is inactive.
> >> +		 * If policy is active then re-queue on one of the
> >> +		 * policy->cpus.
> >> +		 */
> >
> > This looks racy. Shouldn't you guarantee that the timer is already
> > removed in a synchronous way before de-activating the policy ?
> >
>
> The timer is deleted in driver->stop_cpu(). So we ensure to remove the timer
> before de-activating the policy.
>
> >
> >> +		if (!cpumask_empty(policy->cpus)) {
>
> So are you suggesting to remove ^^ the check for active policy here?
> (I put that as a safety check.)

Either you are sure or you are not, and you don't need a safety check
if you are sure :)

--
viresh