Date: Wed, 11 Dec 2019 16:06:58 +0530
From: Viresh Kumar
To: "Rafael J. Wysocki"
Cc: Peter Zijlstra, Linux PM, LKML, Anson Huang, Peng Fan, "Rafael J. Wysocki"
Subject: Re: [PATCH] cpufreq: Avoid leaving stale IRQ work items during CPU offline
Message-ID: <20191211103658.54pqb4jch3gxvzsv@vireshk-i7>
References: <2691942.bH9KnLg61H@kreacher>
In-Reply-To: <2691942.bH9KnLg61H@kreacher>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11-12-19, 11:28, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki
>
> The scheduler code calling cpufreq_update_util() may run during CPU
> offline on the target CPU after the IRQ work lists have been flushed
> for it, so the target CPU should be prevented from running code that
> may queue up an IRQ work item on it at that point.
>
> Unfortunately, that may not be the case if dvfs_possible_from_any_cpu
> is set for at least one cpufreq policy in the system, because that
> allows the CPU going offline to run the utilization update callback
> of the cpufreq governor on behalf of another (online) CPU in some
> cases.
>
> If that happens, the cpufreq governor callback may queue up an IRQ
> work on the CPU running it, which is going offline, and the IRQ work
> will not be flushed after that point. Moreover, that IRQ work cannot
> be flushed until the "offlining" CPU goes back online, so if any
> other CPU calls irq_work_sync() to wait for the completion of that
> IRQ work, it will have to wait until the "offlining" CPU is back
> online and that may not happen forever. In particular, a system-wide
> deadlock may occur during CPU online as a result of that.
>
> The failing scenario is as follows. CPU0 is the boot CPU, so it
> creates a cpufreq policy and becomes the "leader" of it
> (policy->cpu). It cannot go offline, because it is the boot CPU.
> Next, other CPUs join the cpufreq policy as they go online and they
> leave it when they go offline. The last CPU to go offline, say CPU3,
> may queue up an IRQ work while running the governor callback on
> behalf of CPU0 after leaving the cpufreq policy because of the
> dvfs_possible_from_any_cpu effect described above. Then, CPU0 is
> the only online CPU in the system and the stale IRQ work is still
> queued on CPU3. When, say, CPU1 goes back online, it will run
> irq_work_sync() to wait for that IRQ work to complete and so it
> will wait for CPU3 to go back online (which may never happen even
> in principle), but (worse yet) CPU0 is waiting for CPU1 at that
> point too and a system-wide deadlock occurs.
>
> To address this problem notice that CPUs which cannot run cpufreq
> utilization update code for themselves (for example, because they
> have left the cpufreq policies that they belonged to), should also
> be prevented from running that code on behalf of the other CPUs that
> belong to a cpufreq policy with dvfs_possible_from_any_cpu set and so
> in that case the cpufreq_update_util_data pointer of the CPU running
> the code must not be NULL as well as for the CPU which is the target
> of the cpufreq utilization update in progress.
>
> Accordingly, change cpufreq_this_cpu_can_update() into a regular
> function in kernel/sched/cpufreq.c (instead of a static inline in a
> header file) and make it check the cpufreq_update_util_data pointer
> of the local CPU if dvfs_possible_from_any_cpu is set for the target
> cpufreq policy.
>
> Also update the schedutil governor to do the
> cpufreq_this_cpu_can_update() check in the non-fast-switch
> case too to avoid the stale IRQ work issues.
>
> Fixes: 99d14d0e16fa ("cpufreq: Process remote callbacks from any CPU if the platform permits")
> Link: https://lore.kernel.org/linux-pm/20191121093557.bycvdo4xyinbc5cb@vireshk-i7/
> Reported-by: Anson Huang
> Cc: 4.14+ # 4.14+
> Signed-off-by: Rafael J. Wysocki

Acked-by: Viresh Kumar

-- 
viresh