Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp6609641yba; Wed, 1 May 2019 16:13:56 -0700 (PDT) X-Google-Smtp-Source: APXvYqwvG1tySIQ4cPNke+VpsrwwPU5YXrvwndGgBn9jsBiTIg7Xy+U8S+4lM6RWGwQUjdv4o9fb X-Received: by 2002:aa7:8208:: with SMTP id k8mr582530pfi.69.1556752436059; Wed, 01 May 2019 16:13:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556752436; cv=none; d=google.com; s=arc-20160816; b=Vbz00kRcuT0Zve0WmmMfNHkRWRFGBOrOz7xm/9M1RC7OCzSyn1vWKRaJnANtausGu6 Gtewq1OZkj48gH71I3LZL5sLCuwp8Fpsf99QFf1TGeicvmjsASiwqEJAQHVdCCI+n19s LPxIFCl60hx3kzNANIoq6osnA7ipit2zC0rikhRu4vcK3J0kb53Qrdrm2N+xIiVB8OJB AfCjLsuLzYX+zx3r01T6ttuoIyi4/+ro0Czi0STIjm/rfIXwI3yFsrqPT/q7TTaPtADJ 4W2qzuAy3nNYgJx6bq6fWN/qnTyzy0i4h3FwPV/jrVERFeaVL0COHeQpe4UBCIB308o6 jM8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:message-id :user-agent:mime-version:in-reply-to:references:cc:to:subject:from :date:dkim-signature; bh=xkHY0Vk0CxJ06DIPyqIQG5n7Y+l7wmOyvhLV4dHGd9M=; b=UL+YmitZUr8mJ6mQLhA+n8vrdnLIgrVtT1fnKN3eGU5lhAZIDAUDALdPs+aoHDEkab 0xNyIMU8ZgxAIo8l3Hz93hgtlmPh4oXBJpZe+BWi4eBHlFbRGMB6d3Tv1e8kKa0TeqIA D9nZ5jNAJt++htZ45qLh3RksLsZ7SA7TwkhPt/po2dH00HHD+PkHLjqmvBkoYQx5sF5t s+jJtZT/ASvKmm6u9d4bskcpg7Xgd5tiXp3gd/pn1oHnLugi3Oh2Wziw4BUvcGEcCIgI UdZaEEMhJHeqiCxnK9FWHFI3lWP6vBl2biDU+UPSuAyqezx41lJEcui9T3VofkuBeQeq KmcQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=C9jW4+Wf; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y10si40844221pga.96.2019.05.01.16.13.38; Wed, 01 May 2019 16:13:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=C9jW4+Wf; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726173AbfEAXMo (ORCPT + 99 others); Wed, 1 May 2019 19:12:44 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:44978 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726125AbfEAXMo (ORCPT ); Wed, 1 May 2019 19:12:44 -0400 Received: by mail-pl1-f193.google.com with SMTP id l2so101509plt.11 for ; Wed, 01 May 2019 16:12:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:subject:to:cc:references:in-reply-to:mime-version :user-agent:message-id:content-transfer-encoding; bh=xkHY0Vk0CxJ06DIPyqIQG5n7Y+l7wmOyvhLV4dHGd9M=; b=C9jW4+Wf+dG9vgT4hlyOesCrC6DT31u4n5QFfZ4h83VmJwVBANCoa2/4T7aNfba6O/ Knev24dh6CyFTdVrpgU/LF1LvjWfogUqgxcQpxw0jJknCi09EFFhBTFVnDBhZ1IIUjjP SWfdvb4gNYZInR0Poe2Cc0xZY0C+qsmoyhn/oMeSlpZ9MEor77ihhqx2bbsWLDFChWCi 6n8XsSL6uEnRD7wSNTbC62rw8zEdy/XYk0Ri+YLGy3jMA1vuZ8JV7tMEUOkBTg8oqI2e 0c4Cu4BdaxiN5JFG+g5yZcwZThco7XvxoXDbdR0NICQ1jIJ6yCiNagCiXWDb1oKz0UEe bNNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:subject:to:cc:references:in-reply-to :mime-version:user-agent:message-id:content-transfer-encoding; bh=xkHY0Vk0CxJ06DIPyqIQG5n7Y+l7wmOyvhLV4dHGd9M=; b=HP2eYlteogQ5pz+IcR9NloftxP6DfhIpzjpPTl4JJ3cr6YTbIfcM4Vcu+MXNhHzFe8 WTTwCZXiuRfAAZJ++EPb8lRNynWIvGxz43ik44E7VYOP4iYfM9LSPIyVR0nlz++I2NIl s0B/7ZcnRcwIWWeq2hgm+2sW41YTQlG1Slb0x8OaHv/iOpnI7UZzIjJkn5HYehI0JIuu CZWOw6vAB3qQJKtjE8FJLeHe4V4XZSlNi+9PJdm+v/OmzNbexHIVX4LhE7gz8kYfpz+j Ara94J16CcWxplytrkDr4RBINSuZidgK8UPjy64+WQBzKb/d37Y4UGxumyCDRIhRLQUi l27w== X-Gm-Message-State: APjAAAXJcO/3BpS0pcEoyiWp3GhUHDQdFfEoEsEWhDZ6uNqF77sqYgQY HJSCgvP9QY+LOGNlDHXbRpTEHRvM X-Received: by 2002:a17:902:302:: with SMTP id 2mr275822pld.232.1556752363524; Wed, 01 May 2019 16:12:43 -0700 (PDT) Received: from localhost ([61.68.7.233]) by smtp.gmail.com with ESMTPSA id i10sm93944058pfj.7.2019.05.01.16.12.40 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 01 May 2019 16:12:42 -0700 (PDT) Date: Thu, 02 May 2019 09:12:34 +1000 From: Nicholas Piggin Subject: Re: [PATCH v4] powerpc/pseries: Remove limit in wait for dying CPU To: Thiago Jung Bauermann , Nathan Lynch Cc: Gautham R Shenoy , linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Michael Bringmann , Vaidyanathan Srinivasan , Tyrel Datwyler References: <20190423223914.3882-1-bauerman@linux.ibm.com> <877ebbsb8u.fsf@linux.ibm.com> <87v9yve02x.fsf@morokweng.localdomain> <8736lyrzmh.fsf@linux.ibm.com> In-Reply-To: <8736lyrzmh.fsf@linux.ibm.com> MIME-Version: 1.0 User-Agent: astroid/0.14.0 (https://github.com/astroidmail/astroid) Message-Id: <1556752043.jyg2z3kgaw.astroid@bobo.none> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Nathan Lynch's on May 2, 2019 12:57 am: > Hi Thiago, >=20 > Thiago Jung Bauermann writes: >> Nathan Lynch writes: >>> Thiago Jung Bauermann writes: >>>> + while (true) { >>>> cpu_status =3D smp_query_cpu_stopped(pcpu); >>>> if (cpu_status =3D=3D QCSS_STOPPED || >>>> cpu_status =3D=3D QCSS_HARDWARE_ERROR) >>>> break; >>>> - cpu_relax(); >>>> + udelay(100); >>>> } >>>> } >>> >>> I agree with looping indefinitely but doesn't it need a cond_resched() >>> or similar check? >> >> If there's no kernel or hypervisor bug, it shouldn't take more than a >> few tens of ms for this loop to complete (Gautham measured a maximum of >> 10 ms on a POWER9 with an earlier version of this patch). >=20 > 10ms is twice the default scheduler quantum... >=20 >=20 >> In case of bugs related to CPU hotplug (either in the kernel or the >> hypervisor), I was hoping that the resulting lockup warnings would be a >> good indicator that something is wrong. :-) >=20 > Not convinced we should assume something is wrong if it takes a few > dozen ms to complete the operation. Right, and if there is no kernel or hypervisor bug then it will stop eventually :) > AFAIK we don't have any guarantees > about the maximum latency of stop-self, and it can be affected by other > activity in the system, whether we're in shared processor mode, etc. Not > to mention smp_query_cpu_stopped has to acquire the global RTAS lock and > be serialized with other tasks calling into RTAS. So I am concerned > about generating spurious warnings here. Agreed. >=20 > If for whatever reason the operation is taking too long, drmgr or > whichever application is initiating the change will appear to stop > making progress. It's not too hard to find out what's going on with > facilities like perf or /proc/pid/stack. >=20 >=20 >> Though perhaps adding a cond_resched() every 10 ms or so, with a >> WARN_ON() if it loops for more than 50 ms would be better. >=20 > A warning doesn't seem appropriate to me, and cond_resched should be > invoked in each iteration. Or just msleep(1) in each iteration would be > fine, I think. >=20 > But I'd like to bring in some more context -- here is the body of > pseries_cpu_die: >=20 > static void pseries_cpu_die(unsigned int cpu) > { > int tries; > int cpu_status =3D 1; > unsigned int pcpu =3D get_hard_smp_processor_id(cpu); >=20 > if (get_preferred_offline_state(cpu) =3D=3D CPU_STATE_INACTIVE) { > cpu_status =3D 1; > for (tries =3D 0; tries < 5000; tries++) { > if (get_cpu_current_state(cpu) =3D=3D CPU_STATE_INACTIVE) { > cpu_status =3D 0; > break; > } > msleep(1); > } > } else if (get_preferred_offline_state(cpu) =3D=3D CPU_STATE_OFFLINE) { >=20 > for (tries =3D 0; tries < 25; tries++) { > cpu_status =3D smp_query_cpu_stopped(pcpu); > if (cpu_status =3D=3D QCSS_STOPPED || > cpu_status =3D=3D QCSS_HARDWARE_ERROR) > break; > cpu_relax(); > } > } >=20 > This patch alters the behavior of the second loop (the CPU_STATE_OFFLINE > branch). The CPU_STATE_INACTIVE branch is used when the offline behavior > is to use H_CEDE instead of stop-self, correct? >=20 > And isn't entering H_CEDE expected to be quite a bit faster than > stop-self? If so, why does that path get five whole seconds[*] while > we're bikeshedding about tens of milliseconds for stop-self? :-) >=20 > [*] And should it be made to retry indefinitely as well? I think so. Thanks, Nick =