From: Li Bin
Subject: [PATCH v2 2/2] sched/deadline.c: pick and check task if double_lock_balance() unlock the rq
Date: Thu, 12 Apr 2018 20:33:04 +0800
Message-ID: <1523536384-26781-3-git-send-email-huawei.libin@huawei.com>
In-Reply-To: <1523536384-26781-1-git-send-email-huawei.libin@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

push_dl_task() picks the first pushable task and finds an eligible
later_rq, then calls double_lock_balance(rq, later_rq).
So if double_lock_balance() unlocks the rq (that is, when
double_lock_balance() returns 1), we have to check whether this task is
still on the rq.

The problem is that the existing check conditions are not sufficient:

    if (unlikely(task_rq(task) != rq ||
                 !cpumask_test_cpu(later_rq->cpu, &task->cpus_allowed) ||
                 task_running(rq, task) ||
                 !dl_task(task) ||
                 !task_on_rq_queued(task))) {

Consider the following sequence:

    cpu2                              cpu1                       cpu0
    push_dl_task(rq1)
      pick task_A on rq1
      find rq0
        double_lock_balance(rq1, rq0)
          unlock(rq1)
                                      rq1 __schedule
                                        pick task_A run
                                      task_A sleep (dequeued)
          lock(rq0)
          lock(rq1)
        do_above_check(task_A)
          task_rq(task_A) == rq1
          cpus_allowed unchanged
          task_running == false
          dl_task(task_A) == true
                                                                 try_to_wake_up(task_A)
                                                                   select_cpu = cpu3
                                                                   enqueue(rq3, task_A)
                                                                   task_A->on_rq = 1
          task_on_rq_queued(task_A)
        above_check passed, return rq0
        ...
        migrate task_A from rq1 to rq0

So these checks on task_A cannot guarantee that task_A is still on rq1,
even though we hold rq1->lock. This patch re-picks the first pushable
task to make sure the task is still on the rq.
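
The interleaving above can be replayed step by step in a small userspace
sketch. This is only an illustration under stated assumptions: toy_task,
toy_rq, toy_result and toy_replay() are hypothetical names, not kernel
code, the three CPUs are modelled as sequential steps, and the affinity
test is left out because the scenario leaves cpus_allowed unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_rq;

struct toy_task {
	struct toy_rq *rq;	/* plays the role of task_rq(p) */
	bool on_rq;		/* plays the role of task_on_rq_queued(p) */
	bool running;		/* plays the role of task_running(rq, p) */
	bool is_dl;		/* plays the role of dl_task(p) */
};

struct toy_rq {
	/* Stands in for the leftmost node of the pushable tree. */
	struct toy_task *first_pushable;
};

struct toy_result {
	bool field_check_passed;	/* old style: per-task fields only */
	bool repick_check_passed;	/* new style: re-pick and compare */
};

/* Replays the changelog's interleaving one step at a time. */
static struct toy_result toy_replay(void)
{
	struct toy_rq rq1 = { NULL }, rq3 = { NULL };
	struct toy_task a = { &rq1, true, false, true };
	struct toy_result res;
	bool early;

	/* cpu2: pick task_A on rq1. */
	rq1.first_pushable = &a;
	assert(rq1.first_pushable == &a);

	/* cpu2: double_lock_balance() drops rq1->lock here. */

	/* cpu1: __schedule() picks task_A, so it leaves the pushable set. */
	rq1.first_pushable = NULL;
	a.running = true;

	/* cpu1: task_A sleeps and is dequeued. */
	a.running = false;
	a.on_rq = false;

	/* cpu2: both locks are held again; the first tests are evaluated. */
	early = (a.rq == &rq1) && !a.running && a.is_dl;

	/*
	 * cpu0: the wakeup enqueues task_A on rq3. It is modelled here as
	 * not being serialized by rq1->lock, following the diagram.
	 */
	a.rq = &rq3;
	a.on_rq = true;

	/* cpu2: the last test now sees on_rq set, so the check passes. */
	res.field_check_passed = early && a.on_rq;

	/* Re-picking from rq1 no longer yields task_A. */
	res.repick_check_passed = (rq1.first_pushable == &a);

	return res;
}
```

Calling toy_replay() from a small main() and printing the two fields
should show the field-based check passing while the re-pick comparison
fails, which is the case the patch handles by bailing out.
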

Signed-off-by: Li Bin
Acked-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
---
 kernel/sched/deadline.c | 55 +++++++++++++++++++++++++++----------------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 9df0978..8e0f6a4 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1936,6 +1936,26 @@ static int find_later_rq(struct task_struct *task)
 	return -1;
 }
 
+static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
+{
+	struct task_struct *p;
+
+	if (!has_pushable_dl_tasks(rq))
+		return NULL;
+
+	p = rb_entry(rq->dl.pushable_dl_tasks_root.rb_leftmost,
+		     struct task_struct, pushable_dl_tasks);
+
+	BUG_ON(rq->cpu != task_cpu(p));
+	BUG_ON(task_current(rq, p));
+	BUG_ON(p->nr_cpus_allowed <= 1);
+
+	BUG_ON(!task_on_rq_queued(p));
+	BUG_ON(!dl_task(p));
+
+	return p;
+}
+
 /* Locks the rq it finds */
 static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 {
@@ -1965,11 +1985,16 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 
 		/* Retry if something changed. */
 		if (double_lock_balance(rq, later_rq)) {
-			if (unlikely(task_rq(task) != rq ||
-				     !cpumask_test_cpu(later_rq->cpu, &task->cpus_allowed) ||
-				     task_running(rq, task) ||
-				     !dl_task(task) ||
-				     !task_on_rq_queued(task))) {
+			struct task_struct *next_task;
+			/*
+			 * We had to unlock the run queue. In
+			 * the mean time, task could have
+			 * migrated already or had its affinity changed.
+			 * Also make sure that it wasn't scheduled on its rq.
+			 */
+			next_task = pick_next_pushable_dl_task(rq);
+			if (unlikely(next_task != task ||
+				     !cpumask_test_cpu(later_rq->cpu, &task->cpus_allowed))) {
 				double_unlock_balance(rq, later_rq);
 				later_rq = NULL;
 				break;
@@ -1994,26 +2019,6 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
 	return later_rq;
 }
 
-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
-{
-	struct task_struct *p;
-
-	if (!has_pushable_dl_tasks(rq))
-		return NULL;
-
-	p = rb_entry(rq->dl.pushable_dl_tasks_root.rb_leftmost,
-		     struct task_struct, pushable_dl_tasks);
-
-	BUG_ON(rq->cpu != task_cpu(p));
-	BUG_ON(task_current(rq, p));
-	BUG_ON(p->nr_cpus_allowed <= 1);
-
-	BUG_ON(!task_on_rq_queued(p));
-	BUG_ON(!dl_task(p));
-
-	return p;
-}
-
 /*
  * See if the non running -deadline tasks on this rq
  * can be sent to some other CPU where they can preempt
-- 
1.7.12.4