Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp3183109ybc; Thu, 21 Nov 2019 04:40:41 -0800 (PST) X-Google-Smtp-Source: APXvYqx+0seWfiGZHmT/PseGZtv8e8tWCVCDPfa8kv5hZvfnSxv1tclGon+kMdzox8XfUle/LG+7 X-Received: by 2002:a05:600c:2201:: with SMTP id z1mr4742137wml.68.1574340041336; Thu, 21 Nov 2019 04:40:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1574340041; cv=none; d=google.com; s=arc-20160816; b=OmB2JaF751vJXbCnS7gSODWn81uNqq4JLy2+i73pI1D2pQJABXaCPPsHrUS9QL3eqr 4Kn6FEsCXCDKaDrjBD2jVGPxuOXsEEs2hgD8uiyjoV7ZvYNDj7oajwVIKeeBFWh7WkSM AFjv6RKE8eUVrb3BXnooA+2tO0c11Q7a75L5neg/Aozu0G6uv5CJHUzQqa2k6vBTMCg3 L70E7QgjENvaQjLWh87hzGpdromyuZQEfGTOsLOMACS0gg1nbrUJKa4N1RrFA/gmtyfX Un5q279tgCWNI4nBs5DOwY0zSSSgOaYZKxB77Kjen1BiYRZAbfKC6OUXvIH0X6ZoHlp9 OGqw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=TXLU5lx1Bkxo3ywrxVI3JlMFn+Xwo+ncCfBlxFtyhyU=; b=RYTudvUKwz4+lYv5AsToULqkJQZkdPpFfUMsfJ/3iGnvT4zfE3174ii2p3aYqmZQeT Ofaitf4/h6saUDLkoxnHMi2GEk/BEgJ7BMCJgHA56NORmuYwYxQaLTmDUEKv1JBvUQG6 raVgrw7VbUyDTXl5NHKsaF0gMtwEpbDyUz4dVcIFnPMM/+S11UJkJYwtZDr4cK2+BsiY kaNyrKJDIszGIgPLrdGv2iQCqUMnBF7evZtGGEIj3tTQ7BeYxgiNOtSY6SEhFMlz8eJg UG5YcRmNeEDYwAXa9hCaLpECFKbfWqz7nZs49xFSSnTUG1AqB1khDn98vjeMaOwToDcu KTqQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=H7Gffehm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d36si1931922ede.44.2019.11.21.04.40.16; Thu, 21 Nov 2019 04:40:41 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=H7Gffehm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726664AbfKUMiI (ORCPT + 99 others); Thu, 21 Nov 2019 07:38:08 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:37460 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726342AbfKUMiH (ORCPT ); Thu, 21 Nov 2019 07:38:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=TXLU5lx1Bkxo3ywrxVI3JlMFn+Xwo+ncCfBlxFtyhyU=; b=H7GffehmG0T3CLTlYNJs8Zerl vBvHd+/Y9VHhblQSJK0+3y3VxQVbjsnySzbA2VXp7pls9EPyj8FQBjeCXYuwkpXlvTB7x3+9A9jgQ BTDz5w/5gaSpAHTz/+X0lkryVpqUplpR88jRos7wX7OmOdqmS9yrtno9n3t7/7oqI9EZX1xgJZnde 0urUc2hGCUFeToxnX/uCsrvABdvS9XC2EBLnKjHdqoa/njUrpXQfMj5fLoFa0x2Ux8uKXloDxAttu 9xNfk5LADrUoH9iufgkhoGWFyIR7tESo14eqUiOCqjtIbU0/4lJIkkETR9Y5mfLEG/d8sDELZQXdn ips9IxP7g==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1iXlin-0005qq-Qq; Thu, 21 Nov 2019 12:38:05 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 55B7E300565; Thu, 21 Nov 2019 13:36:53 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 33847201DD6AF; Thu, 21 Nov 2019 13:38:04 +0100 (CET) Date: Thu, 21 Nov 2019 13:38:04 +0100 From: Peter Zijlstra To: YT Chang Cc: Matthias Brugger , wsd_upstream@mediatek.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, Vincent Guittot Subject: Re: [PATCH 1/1] sched: cfs_rq h_load might not update due to irq disable Message-ID: <20191121123804.GR4097@hirez.programming.kicks-ass.net> References: <1574325009-10846-1-git-send-email-yt.chang@mediatek.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1574325009-10846-1-git-send-email-yt.chang@mediatek.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 21, 2019 at 04:30:09PM +0800, YT Chang wrote: > Syndrome: > > Two CPUs might do idle balance in the same time. > One CPU does idle balance and pulls some tasks. > However before pick next task, ALL task are pulled back to other CPU. > That results in infinite loop in both CPUs. Can you easily reproduce this? > ========================================= > code flow: > > in pick_next_task_fair() > > again: > > if nr_running == 0 > goto idle > pick next task > return > > idle: > idle_balance > /* pull some tasks from other CPU, > * However other CPU are also do idle balance, > * and pull back these task */ > > go to again > > ========================================= > The result to pull ALL tasks back when the task_h_load > is incorrect and too low. Clearly you're not running a PREEMPT kernel, otherwise the break in detach_tasks() would've saved you, right? > static unsigned long task_h_load(struct task_struct *p) > { > struct cfs_rq *cfs_rq = task_cfs_rq(p); > > update_cfs_rq_h_load(cfs_rq); > return div64_ul(p->se.avg.load_avg_contrib * cfs_rq->h_load, > cfs_rq->runnable_load_avg + 1); > } > > The cfs_rq->h_load is incorrect and might too small. > The original idea of cfs_rq::last_h_load_update will not > update cfs_rq::h_load more than once a jiffies. > When the Two CPUs pull each other in the pick_next_task_fair, > the irq disabled and result in jiffie not update. > (Other CPUs wait for runqueue lock locked by the two CPUs. > So, ALL CPUs are irq disabled.) This cannot be true; because the loop drops rq->lock, so other CPUs should have an opportunity to acquire the lock and make progress. > Solution: > cfs_rq h_load might not update due to irq disable > use sched_clock instead jiffies > > Signed-off-by: YT Chang > --- > kernel/sched/fair.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 83ab35e..231c53f 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -7578,9 +7578,11 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq) > { > struct rq *rq = rq_of(cfs_rq); > struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)]; > - unsigned long now = jiffies; > + u64 now = sched_clock_cpu(cpu_of(rq)); > unsigned long load; > > + now = now * HZ >> 30; > + > if (cfs_rq->last_h_load_update == now) > return; > This is disguisting and wrong. That is not the correct relation between sched_clock() and jiffies.