From: Shaokun Zhang
CC: Guo Yang, "Rafael J. Wysocki", Daniel Lezcano, Shaokun Zhang
Subject: [PATCH] cpuidle: menu: Fix long delay issue when tick stopped
Date: Mon, 17 Jan 2022 16:16:15 +0800
Message-ID: <20220117081615.45449-1-zhangshaokun@hisilicon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Guo Yang

Network latency measured with qperf on an arm server was consistently
high. The cause is that the CPU enters a deep power-down idle state
(similar to Intel C6) and cannot move to a shallower one.

The intervals seen by get_typical_interval() are much smaller than
predicted_ns in menu_select(), yet the selected state is always the
deepest one, which causes the long network delay. Each time the CPU
receives a network interrupt, it wakes up and handles the IRQ. It then
enters menu_select() again, but because tick_nohz_tick_stopped() is
true, predicted_ns takes the large data->next_timer_ns value. The CPU
therefore cannot select a shallow state until the data->next_timer_ns
timer expires.

The following output was captured while the issue was occurring:

[ 37.082861] intervals = 36us
[ 37.082875] intervals = 15us
[ 37.082888] intervals = 22us
[ 37.082902] intervals = 35us
[ 37.082915] intervals = 34us
[ 37.082929] intervals = 39us
[ 37.082942] intervals = 39us
[ 37.082956] intervals = 35us
[ 37.082970] target_residency_ns = 10000, predicted_ns = 35832710
[ 37.082998] target_residency_ns = 600000, predicted_ns = 35832710
[ 37.083037] intervals = 36us
[ 37.083050] intervals = 15us
[ 37.083064] intervals = 22us
[ 37.083077] intervals = 35us
[ 37.083091] intervals = 34us
[ 37.083104] intervals = 39us
[ 37.083118] intervals = 39us
[ 37.083131] intervals = 35us
[ 37.083145] target_residency_ns = 10000, predicted_ns = 35657420
[ 37.083174] target_residency_ns = 600000, predicted_ns = 35657420
[ 37.083212] intervals = 36us
[ 37.083225] intervals = 15us
[ 37.083239] intervals = 22us
[ 37.083253] intervals = 35us
[ 37.083266] intervals = 34us
[ 37.083279] intervals = 39us
[ 37.083293] intervals = 39us
[ 37.083307] intervals = 35us
[ 37.083320] target_residency_ns = 10000, predicted_ns = 35482140
[ 37.083349] target_residency_ns = 600000, predicted_ns = 35482140

Check that the wakeup came from the idle tick (data->tick_wakeup)
before changing predicted_ns.

Cc: "Rafael J. Wysocki"
Cc: Daniel Lezcano
Signed-off-by: Guo Yang
Signed-off-by: Shaokun Zhang
---
 drivers/cpuidle/governors/menu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index c492268..3f03843 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -313,7 +313,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 				get_typical_interval(data, predicted_us)) *
 				NSEC_PER_USEC;
 
-	if (tick_nohz_tick_stopped()) {
+	if (tick_nohz_tick_stopped() && data->tick_wakeup) {
 		/*
 		 * If the tick is already stopped, the cost of possible short
 		 * idle duration misprediction is much higher, because the CPU
-- 
1.8.3.1
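
For readers who want to follow the effect of the one-line change, here is a rough user-space sketch of the prediction step. It is not the kernel code. The names sketch_input, sketch_predicted_ns(), sketch_pick_state() and SKETCH_TICK_NSEC are hypothetical, the comparison against the tick period is an assumption about the code that follows the changed line, and the state choice looks only at target residency (exit latency is ignored). The interval, next-timer and residency values used in the note after the code are taken from the log above.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL
/* Assumed tick period, for illustration only (4 ms, as with HZ=250). */
#define SKETCH_TICK_NSEC 4000000ULL

/* Hypothetical, simplified view of what the governor knows at select time. */
struct sketch_input {
	uint64_t typical_interval_us; /* recent wakeup interval, e.g. 35 us */
	uint64_t next_timer_ns;       /* time until the next timer event */
	bool tick_stopped;            /* models tick_nohz_tick_stopped() */
	bool tick_wakeup;             /* models data->tick_wakeup */
};

/*
 * Return the predicted idle duration in ns.
 * patched == false models the original condition (tick stopped only),
 * patched == true models the condition proposed in this patch.
 */
uint64_t sketch_predicted_ns(const struct sketch_input *in, bool patched)
{
	uint64_t predicted_ns = in->typical_interval_us * NSEC_PER_USEC;
	bool override = in->tick_stopped && (!patched || in->tick_wakeup);

	/* Assumption: a short prediction is replaced by the next timer. */
	if (override && predicted_ns < SKETCH_TICK_NSEC)
		predicted_ns = in->next_timer_ns;

	return predicted_ns;
}

/*
 * Pick the deepest state whose target residency fits the prediction.
 * States are assumed to be ordered from shallow to deep; index 0 is
 * returned when no state fits.
 */
int sketch_pick_state(uint64_t predicted_ns,
		      const uint64_t *target_residency_ns, int nr_states)
{
	int idx = 0;

	for (int i = 0; i < nr_states; i++) {
		if (target_residency_ns[i] <= predicted_ns)
			idx = i;
	}
	return idx;
}
```

With a 35 us interval, next_timer_ns = 35832710, the tick stopped, tick_wakeup false, and the two residencies from the log (10000 and 600000), the unpatched condition yields predicted_ns = 35832710 and picks the 600000 ns state, while the patched condition keeps predicted_ns at 35000 and picks the 10000 ns state.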