Received: by 10.213.65.68 with SMTP id h4csp298773imn; Wed, 28 Mar 2018 03:57:07 -0700 (PDT) X-Google-Smtp-Source: AIpwx4+rEslW993poxaifna3QN9HUQc8X5mG79gecPuGe3CXBCSfoIERGyQFiuZFTCChE+0r2ryY X-Received: by 2002:a17:902:86:: with SMTP id a6-v6mr3379448pla.298.1522234627017; Wed, 28 Mar 2018 03:57:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522234626; cv=none; d=google.com; s=arc-20160816; b=CXaPWvgcoslOw/DOJVEJ0xK02Ip4bKruGhAdFD9ZibzgUfQL7D61oZL2k6seeKiicy ep85I7wbnlUvJzs/yCUmQXz9WHTrP3lutJ2v7/nGEOvgojdLv1myTMh/w0CE1mtxcdjG jL+y7KcwLE/zlx5UTJZWGVd/2z2V/ro0THy7dyDALToY1JF4JWjzoe4ws1CoPJrH+tYb Ea4/lB5Nhsbmge7X32va3QhaC5wd6GO/UrPV9sqNHcMe89sm0jXIo3/Rux7ldVyJXETu Hh4f2ZH8hlZQ9Dp+kkAaALwE8ny5tnauS+wCSfijnMWg7egRwyMU2C+FDqpPaDaLKx3y hHFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=4gwYsSF7mDuIh3gRgzOid5x1yHtwL2tGPDVKMh7XYHE=; b=sLuPjAqtQtNWxjKPNMvwjdx0DWMtr9CbAfoVkENPTzBergEI9y9BVcppa1RJV2ZdRK w9513YdnDZbx0AseR0jCDM42mwfEXUodXkPPsfzE60yvtsnDI8Ki3WAGiq+NSGxm/NV+ ItUvq0LIirvzKxJU1x7zFpDgZoMvdnTzxQiCBN0v3vArTfWtWWMLswdDQHJpp8oBu5Ny y3hzpEvIga4ncB4yaOizwGvRQs1qsJLM42kjiQr3ddrVigz83KULpNWI825QCvaxaEjB 5eXhPsVAu8s3cG7ZL4dMzxxEjhUZ7Norocj1HQjBbpFhDxa5B0xRweUFHRMWd2pBfP5u D1HQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=uZ8IoNLx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h8si2326050pgr.370.2018.03.28.03.56.52; Wed, 28 Mar 2018 03:57:06 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=uZ8IoNLx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752562AbeC1KhK (ORCPT + 99 others); Wed, 28 Mar 2018 06:37:10 -0400 Received: from mail-ot0-f194.google.com ([74.125.82.194]:33275 "EHLO mail-ot0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752125AbeC1KhH (ORCPT ); Wed, 28 Mar 2018 06:37:07 -0400 Received: by mail-ot0-f194.google.com with SMTP id 23-v6so2123648otj.0; Wed, 28 Mar 2018 03:37:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=4gwYsSF7mDuIh3gRgzOid5x1yHtwL2tGPDVKMh7XYHE=; b=uZ8IoNLxGaK5E8U5kL/WMTFJiToIcZhwvepwq5lmqpDUTo3PwhxBdYCskIDXvGOE15 mnhpGFhCYZZxkgJwEETM4ZO3XxttXYO6dtNEcQBzIQnBEu8IOFTGa4OD7U+MibIFWJtp lHaKkfUtGmwgeeRrQ4zkNB6oIXCffPnmGtF5uBMSmpZf+xUhgB0QmrxjWQP+sHrRWigl CLfRmBU2coBgj2HvQnEyLwdTXCilFYhvsDVHrRhqx4shtSVhJOTYQxJuIeBiOEm6FDEW wLQ4aHGPJ1BMwAVnokdgFyb91qs84ZA1e6eT03Jmuz3XI1hrbSKyn4WGRY7f3VrVpRV3 QtVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=4gwYsSF7mDuIh3gRgzOid5x1yHtwL2tGPDVKMh7XYHE=; b=B4DA/Xh2qB7IKI3vEG+Otnr19sfbVsskMHB/j1DYRqgKLGw7j8R0nBm7di9YWBtp6b 8SgY0U0LtgUxc4pGePx4sv6rnU+CBMhFfKa182J95oiWWZzvxa86fmL75615mDm6K1Mh 4qNvAfiqaQOU6D4LQRXWlKUhSTyXoN0MP6uUjddYy020Z9tdkrCqA7Z8wxLZfomPI7Uh fEx2Dpge9d2qI0NPXki/fEFTYn1cfmwUMbCkql9xl3ZVNB4kzIQHFf+iogEIpg/5wIC0 eoaQNmOohzn8CKk5WlCogwE45IqUC7GRqJ3lj4xmIXDy46SEn5rC8Pv+KwTPzREkCEWs WWIA== X-Gm-Message-State: AElRT7GwZUKB4BFL0nrPrXXqFuYfqG/0e7YXVNACld7VRbEwFh3YD2eR RFE95zYoaLHIOyjUYL3osZnuUl0hYKYOuFrl1fZJ+Q== X-Received: by 2002:a9d:4811:: with SMTP id c17-v6mr1696887otf.291.1522233426978; Wed, 28 Mar 2018 03:37:06 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:9f7:0:0:0:0:0 with HTTP; Wed, 28 Mar 2018 03:37:06 -0700 (PDT) In-Reply-To: References: <2390019.oHdSGtR3EE@aspire.rjw.lan> <2249320.0Z4q8AXauv@aspire.rjw.lan> <6462e44a-e207-6b97-22bf-ad4aed69afc2@tu-dresden.de> <4198010.6ArFqS34NK@aspire.rjw.lan> From: "Rafael J. Wysocki" Date: Wed, 28 Mar 2018 12:37:06 +0200 X-Google-Sender-Auth: g481QCn5NOH5bhH2rO9MOS6FZYw Message-ID: Subject: Re: [RFT][PATCH v7 6/8] sched: idle: Select idle state before stopping the tick To: Thomas Ilsche Cc: "Rafael J. Wysocki" , Peter Zijlstra , Linux PM , Frederic Weisbecker , Thomas Gleixner , Paul McKenney , Doug Smythies , Rik van Riel , Aubrey Li , Mike Galbraith , LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 28, 2018 at 10:38 AM, Thomas Ilsche wrote: > On 2018-03-28 10:13, Rafael J. Wysocki wrote: >> >> On Wed, Mar 28, 2018 at 12:10 AM, Rafael J. Wysocki >> wrote: >>> >>> On Tuesday, March 27, 2018 11:50:02 PM CEST Thomas Ilsche wrote: >>>> >>>> On 2018-03-20 16:45, Rafael J. Wysocki wrote: >>>>> >>>>> From: Rafael J. Wysocki >>>>> >>>>> In order to address the issue with short idle duration predictions >>>>> by the idle governor after the tick has been stopped, reorder the >>>>> code in cpuidle_idle_call() so that the governor idle state selection >>>>> runs before tick_nohz_idle_go_idle() and use the "nohz" hint returned >>>>> by cpuidle_select() to decide whether or not to stop the tick. >>>>> >>>>> This isn't straightforward, because menu_select() invokes >>>>> tick_nohz_get_sleep_length() to get the time to the next timer >>>>> event and the number returned by the latter comes from >>>>> __tick_nohz_idle_enter(). Fortunately, however, it is possible >>>>> to compute that number without actually stopping the tick and with >>>>> the help of the existing code. >>>> >>>> >>>> I think something is wrong with the new tick_nohz_get_sleep_length. >>>> It seems to return a value that is too large, ignoring immanent >>>> non-sched timer. >>> >>> >>> That's a very useful hint, let me have a look. >>> >>>> I tested idle-loop-v7.3. It looks very similar to my previous results >>>> on the first idle-loop-git-version [1]. Idle and traditional synthetic >>>> powernightmares are mostly good. >>> >>> >>> OK >>> >>>> But it selects too deep C-states for short idle periods, which is bad >>>> for power consumption [2]. >>> >>> >>> That still needs to be improved, then. >>> >>>> I tracked this down with additional tests using >>>> __attribute__((optimize("O0"))) menu_select >>>> and perf probe. With this the behavior seems slightly different, but it >>>> shows that data->next_timer_us is: >>>> v4.16-rc6: the expected ~500 us [3] >>>> idle-loop-v7.3: many milliseconds to minutes [4]. >>>> This leads to the governor to wrongly selecting C6. >>>> >>>> Checking with 372be9e and 6ea0577, I can confirm that the change is >>>> introduced by this patch. >>> >>> >>> Yes, that's where the most intrusive reordering happens. >> >> >> Overall, this is an interesting conundrum, because the case in >> question is when the tick should never be stopped at all during the >> workload and the code's behavior in that case should not change, so >> the change was not intentional. >> >> Now, from walking through the code, as long as can_stop_idle_tick() >> returns 'true' all should be fine or at least I don't see why there is >> any difference in behavior in that case. >> >> However, if can_stop_idle_tick() returns 'false' (for example, because >> need_resched() returns 'true' when it is evaluated), the behavior *is* >> different in a couple of ways. I sort of know how that can be >> addressed, but I'd like to reproduce your results here. >> >> Are you still using the same workload as before to trigger this behavior? >> > > Yes, the exact code I use is as follows > > $ gcc poller.c -O3 -fopenmp -o poller_omp > $ GOMP_CPU_AFFINITY=0-35 ./poller_omp 500 > > #include > #include > #include > > int main(int argc, char *argv[]) > { > int sleep_us = 10000; > if (argc == 2) { > sleep_us = atoi(argv[1]); > } > > #pragma omp parallel > { > while (1) { > usleep(sleep_us); > } > } > } So I do $ for cpu in 0 1 2 3; do taskset -c $cpu sh -c 'while true; do usleep 500; done' & done which is a shell kind of imitation of the above and I cannot see this issue at all. I count the number of times data->next_timer_us in menu_select() is greater than TICK_USEC and while this "workload" is running, that number is exactly 0. I'll try with a C program still.