Received: by 10.213.65.68 with SMTP id h4csp198637imn; Wed, 28 Mar 2018 01:40:33 -0700 (PDT) X-Google-Smtp-Source: AIpwx4/AQU9wa2UoHx/F1yaPxWUif5IlUv6qQ47Fykx1L1AxzF3bgHH7qsi0W2a8lWqbGY6hIuTy X-Received: by 2002:a17:902:6b49:: with SMTP id g9-v6mr2860891plt.75.1522226433023; Wed, 28 Mar 2018 01:40:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522226432; cv=none; d=google.com; s=arc-20160816; b=gNXpUijL7AkIYyLbl5fXzqPnZhj+lIY/+VSoKYMUVi+AugpMbRL2vknNRK4Ulxuv/T nxv1A01aU93s0H0wCh0vjCw2zJAErSZRRwz7PJegpS7oW7rkYGvNAtqGeh8Sc20kZQip aXZm8Z+fUKP2ImABAkwbMPpcgjKjj6j6aXBFEkjxo0CqmvJwD4eGO1rkACP52r37laey X/9m5hnzyp140h/xT2g9f2+Se4XvFUiYrsGBTtpV4t9qiF4IQYqPvz+tqwBCLeZQZdiE 2H4Q3MCm5MGYCQHh80tmQeoMGlIHubWNQLdo+igcN2rLtwoUDWf3diNvkLlh+FAlaLUt wxfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=u58Mr+OVFv9ojiG9LFviysbnuYR2VJC/75eK57acOZU=; b=FklI8E/dQqDPkMmXwSBQhQmGzDf7xjqtxfF9MlxgjhM8I3jKr6PbkO+BdxtTTqR7ex 3Pu6rnyCC72JM0wD/0czLJaKjVNBGwJT2pbGINmNxGjhlMnMvcLogEmiyLPktrC1zxcj kMMpAyswK1dVY6N7jccFEJwbJGubJwHIqx3BJWFFT8VoSMWPr8nuOBiaNAESOLhmHz4C 2bjYxxysSi2gWC4YArEp2gt4TMVK5TqAqi08ADr2LgXbpDvjsEpVNRPX2PsU11QF1PYf QIluQXru4x83igbjGzd1glJMRXNE8sZh0j6WjOvJcdICfAGYyCXLaZiBUpXoqvEQqx01 N3nA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j3-v6si3176336pld.137.2018.03.28.01.40.18; Wed, 28 Mar 2018 01:40:32 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752675AbeC1IjV (ORCPT + 99 others); Wed, 28 Mar 2018 04:39:21 -0400 Received: from mailout5.zih.tu-dresden.de ([141.30.67.74]:46598 "EHLO mailout5.zih.tu-dresden.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751513AbeC1IjS (ORCPT ); Wed, 28 Mar 2018 04:39:18 -0400 Received: from [172.26.34.104] (helo=msx.tu-dresden.de) by mailout5.zih.tu-dresden.de with esmtps (TLSv1.2:AES256-SHA:256) (Exim 4.84_2) (envelope-from ) id 1f16bo-0004st-RA; Wed, 28 Mar 2018 10:39:05 +0200 Received: from [141.30.69.3] (141.30.69.3) by MSX-L104.msx.ad.zih.tu-dresden.de (172.26.34.104) with Microsoft SMTP Server (TLS) id 15.0.1365.1; Wed, 28 Mar 2018 10:38:51 +0200 Subject: Re: [RFT][PATCH v7 6/8] sched: idle: Select idle state before stopping the tick To: "Rafael J. Wysocki" CC: Peter Zijlstra , Linux PM , Frederic Weisbecker , "Thomas Gleixner" , Paul McKenney , Doug Smythies , Rik van Riel , "Aubrey Li" , Mike Galbraith , LKML References: <2390019.oHdSGtR3EE@aspire.rjw.lan> <2249320.0Z4q8AXauv@aspire.rjw.lan> <6462e44a-e207-6b97-22bf-ad4aed69afc2@tu-dresden.de> <4198010.6ArFqS34NK@aspire.rjw.lan> From: Thomas Ilsche Message-ID: Date: Wed, 28 Mar 2018 10:38:50 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-ClientProxiedBy: MSX-L106.msx.ad.zih.tu-dresden.de (172.26.34.106) To MSX-L104.msx.ad.zih.tu-dresden.de (172.26.34.104) X-PMWin-Version: 4.0.3, Antivirus-Engine: 3.70.2, Antivirus-Data: 5.49 X-TUD-Virus-Scanned: mailout5.zih.tu-dresden.de Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018-03-28 10:13, Rafael J. Wysocki wrote: > On Wed, Mar 28, 2018 at 12:10 AM, Rafael J. Wysocki wrote: >> On Tuesday, March 27, 2018 11:50:02 PM CEST Thomas Ilsche wrote: >>> On 2018-03-20 16:45, Rafael J. Wysocki wrote: >>>> From: Rafael J. Wysocki >>>> >>>> In order to address the issue with short idle duration predictions >>>> by the idle governor after the tick has been stopped, reorder the >>>> code in cpuidle_idle_call() so that the governor idle state selection >>>> runs before tick_nohz_idle_go_idle() and use the "nohz" hint returned >>>> by cpuidle_select() to decide whether or not to stop the tick. >>>> >>>> This isn't straightforward, because menu_select() invokes >>>> tick_nohz_get_sleep_length() to get the time to the next timer >>>> event and the number returned by the latter comes from >>>> __tick_nohz_idle_enter(). Fortunately, however, it is possible >>>> to compute that number without actually stopping the tick and with >>>> the help of the existing code. >>> >>> I think something is wrong with the new tick_nohz_get_sleep_length. >>> It seems to return a value that is too large, ignoring immanent >>> non-sched timer. >> >> That's a very useful hint, let me have a look. >> >>> I tested idle-loop-v7.3. It looks very similar to my previous results >>> on the first idle-loop-git-version [1]. Idle and traditional synthetic >>> powernightmares are mostly good. >> >> OK >> >>> But it selects too deep C-states for short idle periods, which is bad >>> for power consumption [2]. >> >> That still needs to be improved, then. >> >>> I tracked this down with additional tests using >>> __attribute__((optimize("O0"))) menu_select >>> and perf probe. With this the behavior seems slightly different, but it >>> shows that data->next_timer_us is: >>> v4.16-rc6: the expected ~500 us [3] >>> idle-loop-v7.3: many milliseconds to minutes [4]. >>> This leads to the governor to wrongly selecting C6. >>> >>> Checking with 372be9e and 6ea0577, I can confirm that the change is >>> introduced by this patch. >> >> Yes, that's where the most intrusive reordering happens. > > Overall, this is an interesting conundrum, because the case in > question is when the tick should never be stopped at all during the > workload and the code's behavior in that case should not change, so > the change was not intentional. > > Now, from walking through the code, as long as can_stop_idle_tick() > returns 'true' all should be fine or at least I don't see why there is > any difference in behavior in that case. > > However, if can_stop_idle_tick() returns 'false' (for example, because > need_resched() returns 'true' when it is evaluated), the behavior *is* > different in a couple of ways. I sort of know how that can be > addressed, but I'd like to reproduce your results here. > > Are you still using the same workload as before to trigger this behavior? > Yes, the exact code I use is as follows $ gcc poller.c -O3 -fopenmp -o poller_omp $ GOMP_CPU_AFFINITY=0-35 ./poller_omp 500 #include #include #include int main(int argc, char *argv[]) { int sleep_us = 10000; if (argc == 2) { sleep_us = atoi(argv[1]); } #pragma omp parallel { while (1) { usleep(sleep_us); } } }