Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934537AbdIYLyL (ORCPT ); Mon, 25 Sep 2017 07:54:11 -0400
Received: from szxga04-in.huawei.com ([45.249.212.190]:6996 "EHLO
	szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933057AbdIYLyK (ORCPT );
	Mon, 25 Sep 2017 07:54:10 -0400
Message-ID: <59C8EE29.4070904@huawei.com>
Date: Mon, 25 Sep 2017 19:53:13 +0800
From: zhouchengming
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: Zhou Chengming , , , ,
CC:
Subject: Re: [PATCH] sched/rt.c: pick and check task if double_lock_balance() unlock the rq
References: <1505112709-102019-1-git-send-email-zhouchengming1@huawei.com>
In-Reply-To: <1505112709-102019-1-git-send-email-zhouchengming1@huawei.com>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.177.236.183]
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A090205.59C8EE38.0030,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 165376affd61c9d1c1eec407f1340da5
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4035
Lines: 129

ping...

Or is this not a real problem?

Thanks.

On 2017/9/11 14:51, Zhou Chengming wrote:
> push_rt_task() picks the first pushable task and finds an eligible
> lowest_rq, then calls double_lock_balance(rq, lowest_rq). So if
> double_lock_balance() unlocks the rq (when double_lock_balance() returns 1),
> we have to check if this task is still on the rq.
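
For illustration, the need to re-validate after the lock is dropped can be
replayed in a single-threaded, user-space toy model. This is only a sketch
under stated assumptions: toy_task, toy_rq and the helper names are
hypothetical stand-ins, not kernel structures or APIs, and the timeline is a
simplified interleaving in which a field-by-field check reads the task's
fields at different moments. It contrasts that check with asking whether the
task is still the first pushable entry on the original rq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the kernel's task_struct / rq. */
struct toy_task {
	int  cpu;      /* CPU the task belongs to, loosely like task_cpu() */
	bool queued;   /* loosely like task_on_rq_queued() */
	bool running;  /* loosely like task_running() */
};

struct toy_rq {
	int cpu;
	struct toy_task *first_pushable; /* head of the pushable list, or NULL */
};

/* Replays one possible interleaving in a single thread (hypothetical helper). */
static void replay_race(bool *field_checks_pass, bool *repick_pass)
{
	struct toy_task a = { .cpu = 1, .queued = true, .running = false };
	struct toy_rq rq1 = { .cpu = 1, .first_pushable = &a };

	/* The pusher picks task A from rq1, then rq1's lock is dropped. */
	struct toy_task *picked = rq1.first_pushable;
	assert(picked == &a);

	/* Meanwhile A runs on rq1 and then sleeps: it is dequeued from rq1. */
	a.queued = false;
	rq1.first_pushable = NULL;

	/* Locks are retaken; the early field checks still read old values. */
	bool early = (picked->cpu == rq1.cpu) && !picked->running;

	/* A concurrent wakeup enqueues A on a different runqueue (cpu 3). */
	a.cpu = 3;
	a.queued = true;

	/* The last field check now observes queued == true. */
	bool late = picked->queued;

	*field_checks_pass = early && late;

	/* Re-pick under rq1's lock: is A still rq1's first pushable task? */
	*repick_pass = (rq1.first_pushable == picked);
}

/* True if the field-by-field check wrongly accepts the stale task. */
bool field_checks_accept_stale_task(void)
{
	bool fields, repick;

	replay_race(&fields, &repick);
	return fields;
}

/* True if the re-pick comparison wrongly accepts the stale task. */
bool repick_accepts_stale_task(void)
{
	bool fields, repick;

	replay_race(&fields, &repick);
	return repick;
}
```

In this toy replay the field-by-field check accepts the task although it is
no longer on rq1, while the re-pick comparison rejects it. Real kernel code
would of course run these steps on separate CPUs under the rq locks.
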
>
> The problem is that the check conditions are not sufficient:
>
> 	if (unlikely(task_rq(task) != rq ||
> 		     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_allowed) ||
> 		     task_running(rq, task) ||
> 		     !rt_task(task) ||
> 		     !task_on_rq_queued(task))) {
>
> cpu2    cpu1    cpu0
>
> push_rt_task(rq1)
> pick task_A on rq1
> find rq0
> double_lock_balance(rq1, rq0)
> unlock(rq1)
>
> rq1 __schedule
> pick task_A run
> task_A sleep (dequeued)
>
> lock(rq0)
> lock(rq1)
> do_above_check(task_A)
> task_rq(task_A) == rq1
> cpus_allowed unchanged
> task_running == false
> rt_task(task_A) == true
>
> try_to_wake_up(task_A)
> select_cpu = cpu3
> enqueue(rq3, task_A)
> task_A->on_rq = 1
>
> task_on_rq_queued(task_A)
> above_check passed, return rq0
> ...
> migrate task_A from rq1 to rq0
>
> So we can't rely on these checks to make sure task_A is still on
> rq1, even though we hold rq1->lock. This patch re-picks the first
> pushable task to be sure the task is still on the rq.
>
> Signed-off-by: Zhou Chengming
> ---
>  kernel/sched/rt.c | 49 +++++++++++++++++++++++--------------------------
>  1 file changed, 23 insertions(+), 26 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 45caf93..787b721 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1703,6 +1703,26 @@ static int find_lowest_rq(struct task_struct *task)
>  	return -1;
>  }
>
> +static struct task_struct *pick_next_pushable_task(struct rq *rq)
> +{
> +	struct task_struct *p;
> +
> +	if (!has_pushable_tasks(rq))
> +		return NULL;
> +
> +	p = plist_first_entry(&rq->rt.pushable_tasks,
> +			      struct task_struct, pushable_tasks);
> +
> +	BUG_ON(rq->cpu != task_cpu(p));
> +	BUG_ON(task_current(rq, p));
> +	BUG_ON(p->nr_cpus_allowed <= 1);
> +
> +	BUG_ON(!task_on_rq_queued(p));
> +	BUG_ON(!rt_task(p));
> +
> +	return p;
> +}
> +
>  /* Will lock the rq it finds */
>  static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  {
> @@ -1734,13 +1754,10 @@ static struct rq
*find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  			 * We had to unlock the run queue. In
>  			 * the mean time, task could have
>  			 * migrated already or had its affinity changed.
> -			 * Also make sure that it wasn't scheduled on its rq.
>  			 */
> -			if (unlikely(task_rq(task) != rq ||
> -				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_allowed) ||
> -				     task_running(rq, task) ||
> -				     !rt_task(task) ||
> -				     !task_on_rq_queued(task))) {
> +			struct task_struct *next_task = pick_next_pushable_task(rq);
> +			if (unlikely(next_task != task ||
> +				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_allowed))) {
>
>  				double_unlock_balance(rq, lowest_rq);
>  				lowest_rq = NULL;
> @@ -1760,26 +1777,6 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>  	return lowest_rq;
>  }
>
> -static struct task_struct *pick_next_pushable_task(struct rq *rq)
> -{
> -	struct task_struct *p;
> -
> -	if (!has_pushable_tasks(rq))
> -		return NULL;
> -
> -	p = plist_first_entry(&rq->rt.pushable_tasks,
> -			      struct task_struct, pushable_tasks);
> -
> -	BUG_ON(rq->cpu != task_cpu(p));
> -	BUG_ON(task_current(rq, p));
> -	BUG_ON(p->nr_cpus_allowed <= 1);
> -
> -	BUG_ON(!task_on_rq_queued(p));
> -	BUG_ON(!rt_task(p));
> -
> -	return p;
> -}
> -
>  /*
>   * If the current CPU has more than one RT task, see if the non
>   * running task can migrate over to a CPU that is running a task