From: Dietmar Eggemann
To: Peter Zijlstra, Ingo Molnar, Juri Lelli
Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef, linux-kernel@vger.kernel.org
Subject: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
Date: Fri, 26 Jul 2019 09:27:52 +0100
Message-Id: <20190726082756.5525-2-dietmar.eggemann@arm.com>
In-Reply-To: <20190726082756.5525-1-dietmar.eggemann@arm.com>
References: <20190726082756.5525-1-dietmar.eggemann@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

push_dl_task() always calls deactivate_task() with flags=0 which sets
p->on_rq=TASK_ON_RQ_MIGRATING.
push_dl_task()->deactivate_task()->dequeue_task()->dequeue_task_dl()
calls sub_[running/rq]_bw() since p->on_rq=TASK_ON_RQ_MIGRATING.
So sub_[running/rq]_bw() in push_dl_task() is double-accounting for that
task.

The same is true for add_[rq/running]_bw() and activate_task() on the
destination (later) CPU.
push_dl_task()->activate_task()->enqueue_task()->enqueue_task_dl()
calls add_[rq/running]_bw() again since p->on_rq is still set to
TASK_ON_RQ_MIGRATING.
So the add_[rq/running]_bw() in enqueue_task_dl() is double-accounting
for that task.

Fix this by removing the rq/running bw accounting in push_dl_task().

Trace (CONFIG_SCHED_DEBUG=y) before the fix on a 6 CPUs system with 6
DL (12000, 100000, 100000) tasks showing the issue:

[   48.147868] dl_rq->running_bw > old
[   48.147886] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:98
...
[   48.274832]  inactive_task_timer+0x468/0x4e8
[   48.279057]  __hrtimer_run_queues+0x10c/0x3b8
[   48.283364]  hrtimer_interrupt+0xd4/0x250
[   48.287330]  tick_handle_oneshot_broadcast+0x198/0x1d0
...
[   48.360057] dl_rq->running_bw > dl_rq->this_bw
[   48.360065] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:86
...
[   48.488294]  task_contending+0x1a0/0x208
[   48.492172]  enqueue_task_dl+0x3b8/0x970
[   48.496050]  activate_task+0x70/0xd0
[   48.499584]  ttwu_do_activate+0x50/0x78
[   48.503375]  try_to_wake_up+0x270/0x7a0
[   48.507167]  wake_up_process+0x14/0x20
[   48.510873]  hrtimer_wakeup+0x1c/0x30
...
[   50.062867] dl_rq->this_bw > old
[   50.062885] WARNING: CPU: 1 PID: 2048 at kernel/sched/deadline.c:122
...
[   50.190520]  dequeue_task_dl+0x1e4/0x1f8
[   50.194400]  __sched_setscheduler+0x1d0/0x860
[   50.198707]  _sched_setscheduler+0x74/0x98
[   50.202757]  do_sched_setscheduler+0xa8/0x110
[   50.207065]  __arm64_sys_sched_setscheduler+0x1c/0x30

Signed-off-by: Dietmar Eggemann
---
 kernel/sched/deadline.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index de2bd006fe93..d1aeada374e1 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
 	}
 
 	deactivate_task(rq, next_task, 0);
-	sub_running_bw(&next_task->dl, &rq->dl);
-	sub_rq_bw(&next_task->dl, &rq->dl);
 	set_task_cpu(next_task, later_rq->cpu);
-	add_rq_bw(&next_task->dl, &later_rq->dl);
 
 	/*
 	 * Update the later_rq clock here, because the clock is used
 	 * by the cpufreq_update_util() inside __add_running_bw().
 	 */
 	update_rq_clock(later_rq);
-	add_running_bw(&next_task->dl, &later_rq->dl);
 	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
 	ret = 1;
 
-- 
2.17.1
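
The double accounting described in the changelog can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_dl_rq and all toy_* helpers are hypothetical stand-ins, not the kernel's struct dl_rq or its add/sub helpers, and the bandwidth values are arbitrary units. It assumes, as the changelog states, that dequeue and enqueue of a migrating task already adjust both counters.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-runqueue DL bandwidth counters. */
struct toy_dl_rq {
	long long running_bw;
	long long this_bw;
};

/* Models the accounting done on enqueue of a migrating task. */
static void toy_enqueue_migrating(struct toy_dl_rq *rq, long long bw)
{
	rq->this_bw += bw;
	rq->running_bw += bw;
}

/* Models the accounting done on dequeue of a migrating task. */
static void toy_dequeue_migrating(struct toy_dl_rq *rq, long long bw)
{
	rq->this_bw -= bw;
	rq->running_bw -= bw;
}

/* Push path before the fix: explicit accounting on top of dequeue/enqueue. */
static void toy_push_before_fix(struct toy_dl_rq *src, struct toy_dl_rq *dst,
				long long bw)
{
	toy_dequeue_migrating(src, bw);	/* already subtracts bw */
	src->running_bw -= bw;		/* extra subtraction */
	src->this_bw -= bw;		/* extra subtraction */
	dst->this_bw += bw;		/* extra addition */
	dst->running_bw += bw;		/* extra addition */
	toy_enqueue_migrating(dst, bw);	/* adds bw again */
}

/* Push path after the fix: dequeue/enqueue do all the accounting. */
static void toy_push_after_fix(struct toy_dl_rq *src, struct toy_dl_rq *dst,
			       long long bw)
{
	toy_dequeue_migrating(src, bw);
	toy_enqueue_migrating(dst, bw);
}

/* Total bandwidth over both runqueues must be unchanged by a push. */
static void toy_selfcheck(void)
{
	struct toy_dl_rq src = { 20, 20 }, dst = { 0, 0 };

	toy_push_after_fix(&src, &dst, 10);
	assert(src.running_bw + dst.running_bw == 20);
	assert(src.this_bw + dst.this_bw == 20);
}
```

With two tasks of bandwidth 10 on the source runqueue, pushing one of them with toy_push_before_fix() leaves the source at 0 and the destination at 20, while toy_push_after_fix() leaves 10 on each, which is the expected result.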