Date: Wed, 11 Apr 2018 18:00:00 +0200
From: Peter Zijlstra
To: Vincent Guittot
Cc: Patrick Bellasi, linux-kernel, "open list:THERMAL", Ingo Molnar, "Rafael J . Wysocki", Viresh Kumar, Juri Lelli, Joel Fernandes, Steve Muckle, Dietmar Eggemann, Morten Rasmussen
Subject: Re: [PATCH] sched/fair: schedutil: update only with all info available
Message-ID: <20180411160000.GO4082@hirez.programming.kicks-ass.net>
References: <20180406172835.20078-1-patrick.bellasi@arm.com> <20180410110412.GG14248@e110439-lin> <20180411151450.GK4043@hirez.programming.kicks-ass.net> <20180411153710.GN4082@hirez.programming.kicks-ass.net>

On Wed, Apr 11, 2018 at 05:41:24PM +0200, Vincent Guittot wrote:
> Yes. and to be honest I don't have any clues of the root cause :-(
> Heiner mentioned that it's much better in latest linux-next but I
> haven't seen any changes related to the code of those patches

Yeah, it's a bit of a puzzle. Now you touch nohz, and the patches in
next that are most likely to have affected this are rjw's
cpuidle-vs-nohz patches. The common denominator being nohz.

Now I think rjw's patches will ensure we enter nohz _less_, they avoid
stopping the tick when we expect to go idle for a short period only.

So if your patch makes nohz go wobbly, going nohz less will make that
better.

Of course, I've no actual clue as to what that patch (it's the last one
in the series, right?:

  31e77c93e432 ("sched/fair: Update blocked load when newly idle")

) does that is so offensive to that one machine. You never did manage to
reproduce, right?

Could it be that for some reason the nohz balancer now takes a very long
time to run?

Could something like the following happen (and this is really flaky
thinking here):

  last CPU goes idle, we enter idle_balance(), that kicks ilb, ilb runs,
  which somehow again triggers idle_balance and around we go?

I'm not immediately seeing how that could happen, but if we do something
daft like that we can tie up the CPU for a while, mostly with IRQs
disabled, and that would be visible as that latency he sees.
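
The cycle speculated about above can be made concrete with a toy model. This is only an illustrative sketch in Python, not kernel code: `ToyCpu`, `toy_idle_balance`, the `ilb_retriggers` flag and the `budget` cap are all hypothetical names invented for the illustration, and nothing here claims that the real idle_balance() or ilb paths behave this way.

```python
from dataclasses import dataclass


@dataclass
class ToyCpu:
    """Counters for the toy model; not a kernel structure."""
    idle_balance_calls: int = 0
    ilb_runs: int = 0


def toy_idle_balance(cpu: ToyCpu, ilb_retriggers: bool, budget: int) -> ToyCpu:
    """Walk the hypothesized chain: idle_balance -> kick ilb -> ilb runs.

    ilb_retriggers models the speculated bad case in which finishing the
    ilb pass triggers idle_balance again. budget is an artificial cap so
    the toy always terminates.
    """
    while budget > 0:
        budget -= 1
        cpu.idle_balance_calls += 1  # last CPU goes idle, idle_balance runs
        cpu.ilb_runs += 1            # idle_balance kicks the ilb, the ilb runs
        if not ilb_retriggers:
            break                    # expected case: the ilb pass just ends
        # speculated case: the ilb pass leads back into idle_balance
    return cpu


if __name__ == "__main__":
    print(toy_idle_balance(ToyCpu(), ilb_retriggers=False, budget=5))
    print(toy_idle_balance(ToyCpu(), ilb_retriggers=True, budget=5))
```

In the expected case the chain runs once and stops. In the speculated case it keeps going until the artificial cap is hit, which is the toy analogue of tying up the CPU for a while.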