Received: by 10.223.185.116 with SMTP id b49csp1019280wrg; Wed, 21 Feb 2018 10:40:10 -0800 (PST) X-Google-Smtp-Source: AH8x225lvylm4er/2c72o2D1KDgwr1jcCi37LFCce7s3pCt2seIX6qjwbk+UPX4AOzrDTxodH4Ko X-Received: by 10.98.201.194 with SMTP id l63mr4182339pfk.126.1519238410342; Wed, 21 Feb 2018 10:40:10 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1519238410; cv=none; d=google.com; s=arc-20160816; b=SJXosyqp8hoaOHAnaPXq1Ye/0PTT7AlaPoGn5OLAC8Y4xgVvWEcvUAFTk1pdAqjDeP 9h/QfvkMeVXYvxznl+TEVadK4Qf4lC/qyZhj11O4IwzQZRD+1VqHcbL/Gmjn5afGtv6D uYvT6TVq9i9xLGfUgEBPGC5+9odgWx1eSv26vQ0YVQEfGg3QX1WAK0QiOKLkgckgnJO3 yJTsnLVb/qw1PfLa+ggMKrAZFXVYcCBL7KtrD4bCwY9jnOOtZhJLspovSPAXq0YJMP2n WDAGmmb8Y+RQcUIQJemJAjhXpB67DY5Uo4ZoCAjWGmEap2VXQi+MEX7AdbkIcoJPlz5s Lbiw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=GN5MW4JfzPtsDUI+x3V2P3hecTtaRkaBXt/Jn9zFUYA=; b=YLg4BVELR9lCdwd5kWWXwaankzzrjs3Z8wcYMCVk8ERUV14k9U0f8AwqZR4bzqWcgc 5CsqEDZ0XWTbFp5Egur+62pvCJ/D73nb8HG/PbHHnQu34qwJ7E3AhNg7WVz22/6JAIkg So9jDLyPuK5Q9ZhYWeuz7jcs0yTc2cI+tYSCQ1m+McCv3q1VOy3aNBhU+9u5i/VGxXIS LuFUrR66idHXqIRSBXWzH2kBTGD2CZ2gHyAyQgU8UqXNj/GLvL3pngPQdP8EqMUnPey4 KQa25sFzmV/pJqQoitOhNJOegWafhcqRYE1mA3/CtUe+92Ew+hZxfA4ffPSdTUKJdhYN ZYjA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r69si6571808pgr.678.2018.02.21.10.39.55; Wed, 21 Feb 2018 10:40:10 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965762AbeBUNN5 (ORCPT + 99 others); Wed, 21 Feb 2018 08:13:57 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:54502 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752681AbeBUNNy (ORCPT ); Wed, 21 Feb 2018 08:13:54 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 70EB61435; Wed, 21 Feb 2018 05:13:54 -0800 (PST) Received: from [10.1.206.74] (e113632-lin.cambridge.arm.com [10.1.206.74]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 34C5E3F318; Wed, 21 Feb 2018 05:13:53 -0800 (PST) Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed To: Vincent Guittot Cc: Peter Zijlstra , Ingo Molnar , linux-kernel , Morten Rasmussen , Brendan Jackman , Dietmar Eggemann References: <1518622006-16089-1-git-send-email-vincent.guittot@linaro.org> <1518622006-16089-2-git-send-email-vincent.guittot@linaro.org> <44a7d9dc-f6f3-e003-44d6-b0c4aa7dc046@arm.com> From: Valentin Schneider Message-ID: Date: Wed, 21 Feb 2018 13:13:52 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/16/2018 01:44 PM, Vincent Guittot wrote: > On 16 February 2018 at 13:13, Valentin Schneider > wrote: >> On 02/14/2018 03:26 PM, Vincent Guittot wrote: >>> Stopped the periodic update of blocked load when all idle CPUs have fully >>> decayed. We introduce a new nohz.has_blocked that reflect if some idle >>> CPUs has blocked load that have to be periodiccally updated. nohz.has_blocked >>> is set everytime that a Idle CPU can have blocked load and it is then clear >>> when no more blocked load has been detected during an update. We don't need >>> atomic operation but only to make cure of the right ordering when updating >>> nohz.idle_cpus_mask and nohz.has_blocked. >>> >>> Suggested-by: Peter Zijlstra (Intel) >>> Signed-off-by: Vincent Guittot >>> --- >>> kernel/sched/fair.c | 122 ++++++++++++++++++++++++++++++++++++++++++--------- >>> kernel/sched/sched.h | 1 + >>> 2 files changed, 102 insertions(+), 21 deletions(-) >>> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>> index 7af1fa9..5a6835e 100644 >>> --- a/kernel/sched/fair.c >>> +++ b/kernel/sched/fair.c >>> >>> [...] >>> @@ -9383,11 +9452,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) >>> * work being done for other cpus. Next load >>> * balancing owner will pick it up. >>> */ >>> - if (need_resched()) >>> - break; >>> + if (need_resched()) { >>> + has_blocked_load = true; >>> + goto abort; >>> + } >>> >>> rq = cpu_rq(balance_cpu); >>> >> >> I'd say it's safe to do the following here. The flag is raised in >> nohz_balance_enter_idle() before the smp_mb(), so we won't skip a CPU >> that just got added to nohz.idle_cpus_mask. > > rq->has_blocked_load will be set before the barrier only if > nohz_tick_stopped is not already set, > Otherwise, we skip cpumask update and the barrier in nohz_balance_enter_idle > Right, forgot about that bit. I think it's still fine because: - nohz_balance_enter_idle() can't be called before the last running task is dequeued, which requires the rq lock. - update_blocked_averages takes the rq lock and clears rq->has_blocked_load with the lock held. So even though we could have some very unlikely scenario where a CPU quickly goes out/in of idle after nohz.idle_cpus_mask has been read, the blocked load itself is safe so rq->has_blocked_load will end up being set correctly. (Took me a while to see it that way) BTW, with the current set on Peter's sched/testing, update_nohz_stats() is called here, which doesn't do the update if !rq->has_blocked_load (Although that check is done without lock/barrier, so maybe we could not see a CPU that just went idle ?) I have one more question on that bit: has_blocked_load |= update_nohz_stats(rq, true); /* * If time for next balance is due, * do the balance. */ if (time_after_eq(jiffies, rq->next_balance)) { struct rq_flags rf; rq_lock_irqsave(rq, &rf); update_rq_clock(rq); cpu_load_update_idle(rq); rq_unlock_irqrestore(rq, &rf); if (flags & NOHZ_BALANCE_KICK) rebalance_domains(rq, CPU_IDLE); } if (time_after(next_balance, rq->next_balance)) { next_balance = rq->next_balance; update_next_balance = 1; } Now that I think about it, shouldn't we always have a 'continue' after the blocked load update if (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK ? AFAICT we don't want to push the next_balance forward, only the next_blocked. That would also take care of not doing the load balance. >> >> /* >> * This cpu doesn't have any remaining blocked load, skip it. >> * It's sane to do this because this flag is raised in >> * nohz_balance_enter_idle() >> */ >> if ((flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK && >> !rq->has_blocked_load) >> continue; >> >>> + update_blocked_averages(rq->cpu); >>> + has_blocked_load |= rq->has_blocked_load; >>> +