Subject: Re: [PATCH] cpuidle: menu: Fix long delay issue when tick stopped
To: "Rafael J. Wysocki"
CC: Linux PM, Linux Kernel Mailing List, Guo Yang, Daniel Lezcano
References: <20220117081615.45449-1-zhangshaokun@hisilicon.com>
From: Shaokun Zhang
Date: Thu, 27 Jan 2022 09:43:56 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Rafael,

Apologies for the late reply.

On 2022/1/21 2:55, Rafael J. Wysocki wrote:
> On Mon, Jan 17, 2022 at 9:16 AM Shaokun Zhang wrote:
>> ......
>> [   37.083307] intervals = 35us
>> [   37.083320] target_residency_ns = 10000, predicted_ns = 35482140
>> [   37.083349] target_residency_ns = 600000, predicted_ns = 35482140
>>
>> Add idle tick wakeup judge before change predicted_ns.
>>
>> Cc: "Rafael J. Wysocki"
>> Cc: Daniel Lezcano
>> Signed-off-by: Guo Yang
>> Signed-off-by: Shaokun Zhang
>> ---
>>  drivers/cpuidle/governors/menu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
>> index c492268..3f03843 100644
>> --- a/drivers/cpuidle/governors/menu.c
>> +++ b/drivers/cpuidle/governors/menu.c
>> @@ -313,7 +313,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>>  				get_typical_interval(data, predicted_us)) *
>>  				NSEC_PER_USEC;
>>
>> -	if (tick_nohz_tick_stopped()) {
>> +	if (tick_nohz_tick_stopped() && data->tick_wakeup) {
>
> data->tick_wakeup is only true if tick_nohz_idle_got_tick() has
> returned true, but I'm not sure how this can happen after stopping the
> tick.

To debug this, a call trace print was added as follows:

		if (predicted_us < TICK_USEC)
			predicted_us = ktime_to_us(delta_next);
		printk("predicted_us = %uus\n", predicted_us);
		dump_stack();	// add call trace print
	}

When the issue occurred, the CPU was being woken up by network interrupts:

[ 1048.130033] intervals = 1us
[ 1048.130034] intervals = 1us
[ 1048.130035] intervals = 1us
[ 1048.130036] intervals = 1us
[ 1048.130037] intervals = 1us
[ 1048.130038] intervals = 1us
[ 1048.130039] intervals = 1us
[ 1048.130040] intervals = 1us
[ 1048.130041] predicted_us = 484143us
[ 1048.130043] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23
[ 1048.130044] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017
[ 1048.130045] Call Trace:
[ 1048.130048]  dump_stack+0x5a/0x73
[ 1048.130052]  menu_select+0x3b0/0x6c0
[ 1048.130058]  do_idle+0x1b4/0x290
[ 1048.130063]  cpu_startup_entry+0x19/0x20
[ 1048.130067]  start_secondary+0x155/0x1b0
[ 1048.130070]  secondary_startup_64+0xa4/0xb0
[ 1048.130078] intervals = 1us
[ 1048.130079] intervals = 1us
[ 1048.130080] intervals = 1us
[ 1048.130081] intervals = 1us
[ 1048.130081] intervals = 1us
[ 1048.130082] intervals = 1us
[ 1048.130083] intervals = 1us
[ 1048.130084] intervals = 1us
[ 1048.130085] predicted_us = 484097us
[ 1048.130087] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23
[ 1048.130088] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017
[ 1048.130089] Call Trace:
[ 1048.130093]  dump_stack+0x5a/0x73
[ 1048.130097]  menu_select+0x3b0/0x6c0
[ 1048.130102]  do_idle+0x1b4/0x290
[ 1048.130107]  cpu_startup_entry+0x19/0x20
[ 1048.130112]  start_secondary+0x155/0x1b0
[ 1048.130115]  secondary_startup_64+0xa4/0xb0
[ 1048.130123] intervals = 1us
[ 1048.130123] intervals = 1us
[ 1048.130124] intervals = 1us
[ 1048.130125] intervals = 1us
[ 1048.130126] intervals = 1us
[ 1048.130127] intervals = 1us
[ 1048.130128] intervals = 1us
[ 1048.130129] intervals = 1us
[ 1048.130130] predicted_us = 484053us
[ 1048.130132] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 5.3.0-rc6 #23
[ 1048.130133] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 0.39 12/01/2017
[ 1048.130134] Call Trace:
[ 1048.130137]  dump_stack+0x5a/0x73
[ 1048.130141]  menu_select+0x3b0/0x6c0
[ 1048.130147]  do_idle+0x1b4/0x290
[ 1048.130152]  cpu_startup_entry+0x19/0x20
[ 1048.130156]  start_secondary+0x155/0x1b0
[ 1048.130159]  secondary_startup_64+0xa4/0xb0

>
> IOW, it looks like the change simply makes the condition be always false.
>

Agreed. Any suggestions are welcome, and we can try them.

Thanks,
Shaokun

>>  		/*
>>  		 * If the tick is already stopped, the cost of possible short
>>  		 * idle duration misprediction is much higher, because the CPU
>> --
>> 1.8.3.1
>>
> .
>
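
For reference, the branch discussed in this thread can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the governor code: struct model_state, MODEL_TICK_NSEC (a 4 ms tick, i.e. HZ=250, which the thread does not state) and all function names are hypothetical. It also reads the first quoted log as a typical interval of 35 us with 35482140 ns until the next timer event. If tick_wakeup can never be true once the tick is stopped, as Rafael suspects, the patched variant never takes the override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick period (HZ=250); the thread does not state the real value. */
#define MODEL_TICK_NSEC 4000000ULL

/* Hypothetical, simplified stand-in for the governor's inputs. */
struct model_state {
	bool tick_stopped;      /* models tick_nohz_tick_stopped() */
	bool tick_wakeup;       /* models data->tick_wakeup */
	uint64_t typical_ns;    /* prediction derived from recent intervals */
	uint64_t next_timer_ns; /* time until the closest timer event */
};

/* Unpatched check: tick stopped and a sub-tick prediction -> use next timer. */
uint64_t model_predict_current(const struct model_state *s)
{
	uint64_t predicted_ns = s->typical_ns;

	if (s->tick_stopped && predicted_ns < MODEL_TICK_NSEC)
		predicted_ns = s->next_timer_ns;
	return predicted_ns;
}

/* Patched check: the override additionally requires tick_wakeup. */
uint64_t model_predict_patched(const struct model_state *s)
{
	uint64_t predicted_ns = s->typical_ns;

	if (s->tick_stopped && s->tick_wakeup &&
	    predicted_ns < MODEL_TICK_NSEC)
		predicted_ns = s->next_timer_ns;
	return predicted_ns;
}

/* Usage example with the values read from the first quoted log. */
void model_check_logged_case(void)
{
	struct model_state s = {
		.tick_stopped = true,
		.tick_wakeup = false,
		.typical_ns = 35000,        /* "intervals = 35us" */
		.next_timer_ns = 35482140,  /* logged predicted_ns */
	};

	/* Unpatched: the short prediction is replaced by the timer distance. */
	assert(model_predict_current(&s) == 35482140ULL);
	/* Patched with tick_wakeup == false: the short prediction is kept. */
	assert(model_predict_patched(&s) == 35000ULL);
}
```

In this model the unpatched check yields 35482140 ns, which is larger than both target_residency_ns values in the quoted log, while the patched check keeps 35000 ns whenever tick_wakeup is false.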