Received: by 10.223.185.116 with SMTP id b49csp1046766wrg; Fri, 16 Feb 2018 11:25:46 -0800 (PST) X-Google-Smtp-Source: AH8x227zi1dnRiLPJXGGYHZWj6ep1g/TGzycmqLvX/VW6h7/SogQb1RYtdi0/OQk4xsxSizre9KW X-Received: by 2002:a17:902:7c0a:: with SMTP id x10-v6mr6823621pll.59.1518809146227; Fri, 16 Feb 2018 11:25:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518809146; cv=none; d=google.com; s=arc-20160816; b=NpR2bET3bd0/tNElb54Jxa5aJimbwxEpNbmh8vhBtvn25xYLxxsJPSUcqUnv4oducF xzcELeWFt6CiByAfdeh7TFlkumf1d5uOLxMI5Q5p4Ry0BqRykUlwq4PHffCnYviskN6s P52ES7Im6VqSA+SN4D/u0v3XaBo+1hyfMWFccV6vpNBucxDCTUnvzZLoFCK9uTgVAqef 0PVfgBITooXNLCInDGqX7u6U0faGVgarHtAaCDeoCrsCj3Q16s2C+gqSEs08z+n3kYLB Jmuk2MRI4Bq+iTSMD5XDgcuewhbjYzMV22ctiU3sCPLcv89vxQnTl6bcMsD7AqRpV5kJ z+7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=aZRVvSkMoseC5LTOD30U+mibyCM5Uz2zV3fSfRsQXdM=; b=bLQSiLP/6ZZ6mdBln9SnlBy9qZN3vFJZRRm6MboWxJ/1jcl69UWiFgAUmJGpGDDB5Z NIFqZ2gFy44YjFH8EUO0a3Uk5/qQ4GQyHn2bu6x33eZhXQUqrD/LINwckuUEhdiP0Tyg s/frIvc8NCm+VYgvQuBb2ohsZw7bMNxds5zUVKiCwbHwFmpge9rmy1lwIwN7oHieW2ia hC0UUv3G/PgmBo+lB+oPoKVXBpciyAN+WHIKf6DDRfkr8Qrz3DgWzqpgOGa6zYu0Nwan JeMZmgtFt9tNxW80Fdb3Ot9efBPQ+JQPLwiIPRawMH/PiC6IkUOj2NB8twJVwbWW8XOg WGhw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c129si978487pfa.362.2018.02.16.11.25.32; Fri, 16 Feb 2018 11:25:46 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751413AbeBPTXw (ORCPT + 99 others); Fri, 16 Feb 2018 14:23:52 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:43812 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750873AbeBPTXs (ORCPT ); Fri, 16 Feb 2018 14:23:48 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A331980D; Fri, 16 Feb 2018 11:23:47 -0800 (PST) Received: from [10.1.206.74] (e113632-lin.cambridge.arm.com [10.1.206.74]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 6EE8F3F24D; Fri, 16 Feb 2018 11:23:46 -0800 (PST) Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed To: Vincent Guittot Cc: Peter Zijlstra , Ingo Molnar , linux-kernel , Morten Rasmussen , Brendan Jackman , Dietmar Eggemann References: <1518622006-16089-1-git-send-email-vincent.guittot@linaro.org> <1518622006-16089-2-git-send-email-vincent.guittot@linaro.org> From: Valentin Schneider Message-ID: Date: Fri, 16 Feb 2018 19:23:45 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 02/16/2018 05:02 PM, Vincent Guittot wrote: > On 16 February 2018 at 13:53, Valentin Schneider > wrote: >> On 02/14/2018 03:26 PM, Vincent Guittot wrote: >>> Stopped the periodic update of blocked load when all idle CPUs have fully >>> decayed. We introduce a new nohz.has_blocked that reflect if some idle >>> CPUs has blocked load that have to be periodiccally updated. nohz.has_blocked >>> is set everytime that a Idle CPU can have blocked load and it is then clear >>> when no more blocked load has been detected during an update. We don't need >>> atomic operation but only to make cure of the right ordering when updating >>> nohz.idle_cpus_mask and nohz.has_blocked. >>> >>> Suggested-by: Peter Zijlstra (Intel) >>> Signed-off-by: Vincent Guittot >>> --- >>> kernel/sched/fair.c | 122 ++++++++++++++++++++++++++++++++++++++++++--------- >>> kernel/sched/sched.h | 1 + >>> 2 files changed, 102 insertions(+), 21 deletions(-) >>> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>> index 7af1fa9..5a6835e 100644 >>> --- a/kernel/sched/fair.c >>> +++ b/kernel/sched/fair.c >>> >>> [...] >>> >>> -static void update_nohz_stats(struct rq *rq) >>> +static bool update_nohz_stats(struct rq *rq) >>> { >>> #ifdef CONFIG_NO_HZ_COMMON >>> unsigned int cpu = rq->cpu; >>> >>> + if (!rq->has_blocked_load) >>> + return false; >>> + >>> if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask)) >>> - return; >>> + return false; >>> >>> if (!time_after(jiffies, rq->last_blocked_load_update_tick)) >>> - return; >>> + return true; >>> >>> update_blocked_averages(cpu); >>> + >>> + return rq->has_blocked_load; >>> +#else >>> + return false; >>> #endif >>> } >>> >> >> (Wrongly thought that this bit was in a different patch, comment should have >> been squashed in previous reply...) >> >> I've been thinking about this, and it's a messy one if we want to >> skip CPUs in idle_balance() / clear the nohz.has_blocked_flag. >> >> AFAICT, the rq->has_blocked_load flag can be wrongly cleared: if we're >> calling update_nohz_stats() for CPU A, but CPU A got out/in of >> idle really quickly in that same timeframe, I'm not sure you can guarantee >> the clearing of rq->has_blocked_load done in update_blocked_averages() will >> always end up in memory before the setting of the flag in >> nohz_balance_enter_idle(). > > Not sure it's a problem in this case because the clear done in > update_blocked_averages() only happens if there is no load on the rq > and new load can't be added in the mean time > You're right, and that's why there's that comment: >> /* >> * Can be set safely without rq->lock held >> * If a clear happens, it will have evaluated last additions because >> * rq->lock is held during the check and the clear >> */ >> rq->has_blocked_load = 1; Even though it's clearly written there my brain wouldn't process the fact that the flag is cleared with the rq lock held. So yeah, we can't wrongly clear rq->has_blocked_load. The only mishap that can happen is that it is re-raised even though we just went though an update_nohz_stats(), which would lead to a useless stats update in the future, but that's not as bad.