Date: Fri, 18 Oct 2019 12:41:43 +0100
From: Mark Rutland
To: Yunfeng Ye
Cc: catalin.marinas@arm.com, will@kernel.org, kstewart@linuxfoundation.org,
	sudeep.holla@arm.com, gregkh@linuxfoundation.org,
	lorenzo.pieralisi@arm.com, tglx@linutronix.de, David.Laight@ACULAB.COM,
	ard.biesheuvel@linaro.org, hushiyuan@huawei.com, linfeilong@huawei.com,
	wuyun.wu@huawei.com, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH V3] arm64: psci: Reduce waiting time for cpu_psci_cpu_kill()
Message-ID: <20191018114143.GE27759@lakrids.cambridge.arm.com>
In-Reply-To: <433980c7-f246-f741-f00c-fce103a60af7@huawei.com>

On Fri, Oct 18, 2019 at 07:24:14PM +0800, Yunfeng Ye wrote:
> In a case like suspend-to-disk, a large number of CPU cores need to be
> shut down. At present, the CPU hotplug operation is serialised, and the
> CPU cores can only be shut down one by one. In this process, if PSCI
> affinity_info() does not return LEVEL_OFF quickly, cpu_psci_cpu_kill()
> needs to wait for 10ms. If hundreds of CPU cores need to be shut down,
> it will take a long time.

Do we have an idea of roughly how long a CPU _usually_ takes to
transition state? i.e. are we _just_ missing the transition the first
time we call AFFINITY_INFO?

> Normally, there is no need to wait 10ms in cpu_psci_cpu_kill(). So
> change the wait interval from 10ms to at most 1ms, and use
> usleep_range() instead of msleep() for more accurate scheduling.
>
> In addition, reducing the wait interval increases the message output,
> so remove the "Retry ..." message and instead include the number of
> polls in the success message.
>
> Signed-off-by: Yunfeng Ye
> ---
> v2 -> v3:
>  - update the comment
>  - remove the busy-wait logic, modify the loop logic and output message
>
> v1 -> v2:
>  - use usleep_range() instead of udelay() after waiting for a while
>
>  arch/arm64/kernel/psci.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
> index c9f72b2665f1..00b8c0825a08 100644
> --- a/arch/arm64/kernel/psci.c
> +++ b/arch/arm64/kernel/psci.c
> @@ -91,15 +91,14 @@ static int cpu_psci_cpu_kill(unsigned int cpu)
>  	 * while it is dying. So, try again a few times.
>  	 */
>
> -	for (i = 0; i < 10; i++) {
> +	for (i = 0; i < 100; i++) {
>  		err = psci_ops.affinity_info(cpu_logical_map(cpu), 0);
>  		if (err == PSCI_0_2_AFFINITY_LEVEL_OFF) {
> -			pr_info("CPU%d killed.\n", cpu);
> +			pr_info("CPU%d killed by waiting %d loops.\n", cpu, i);

Could we please make that:

	pr_info("CPU%d killed (polled %d times)\n", cpu, i + 1);

>  			return 0;
>  		}
>
> -		msleep(10);
> -		pr_info("Retrying again to check for CPU kill\n");
> +		usleep_range(100, 1000);

Hmm, so now we'll wait somewhere between 10ms and 100ms before giving up
on a CPU, depending on how long we actually sleep for each iteration of
the loop. That should be called out in the commit message.

That could matter for kdump when you have a large number of CPUs, as in
the worst case for 256 CPUs we've gone from ~2.6s to ~26s (100 polls x
100us-1000us is ~10-100ms per CPU, times 256 CPUs). But tbh in that case
I'm not sure I care that much...

In the majority of cases I'd hope AFFINITY_INFO would return OFF after
an iteration or two.

Thanks,
Mark.
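
P.S. For reference, here is roughly how cpu_psci_cpu_kill() would read
with this patch plus the message tweak suggested above. This is only a
sketch stitched together from the diff context; the lines outside the
hunk (the affinity_info check and the trailing pr_warn/-ETIMEDOUT path)
are assumed unchanged from the current arch/arm64/kernel/psci.c:

	static int cpu_psci_cpu_kill(unsigned int cpu)
	{
		int err, i;

		if (!psci_ops.affinity_info)
			return 0;
		/*
		 * cpu_kill could race with cpu_die and we can
		 * potentially end up declaring this cpu undead
		 * while it is dying. So, try again a few times.
		 */

		/* Poll AFFINITY_INFO up to 100 times, ~100us-1ms apart. */
		for (i = 0; i < 100; i++) {
			err = psci_ops.affinity_info(cpu_logical_map(cpu), 0);
			if (err == PSCI_0_2_AFFINITY_LEVEL_OFF) {
				pr_info("CPU%d killed (polled %d times)\n",
					cpu, i + 1);
				return 0;
			}

			usleep_range(100, 1000);
		}

		pr_warn("CPU%d may not have shut down cleanly (AFFINITY_INFO reports %d)\n",
			cpu, err);
		return -ETIMEDOUT;
	}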