Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966951AbdIZBYL (ORCPT ); Mon, 25 Sep 2017 21:24:11 -0400 Received: from szxga04-in.huawei.com ([45.249.212.190]:6997 "EHLO szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935285AbdIZBYJ (ORCPT ); Mon, 25 Sep 2017 21:24:09 -0400 Message-ID: <59C9AC08.9000302@huawei.com> Date: Tue, 26 Sep 2017 09:23:20 +0800 From: zhouchengming User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 MIME-Version: 1.0 To: Steven Rostedt CC: , , , Subject: Re: [PATCH] sched/rt.c: pick and check task if double_lock_balance() unlock the rq References: <1505112709-102019-1-git-send-email-zhouchengming1@huawei.com> <20170925154057.191e3fd1@vmware.local.home> In-Reply-To: <20170925154057.191e3fd1@vmware.local.home> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [10.177.236.183] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A090206.59C9AC16.00D8,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 165376affd61c9d1c1eec407f1340da5 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1964 Lines: 65 On 2017/9/26 3:40, Steven Rostedt wrote: > On Mon, 11 Sep 2017 14:51:49 +0800 > Zhou Chengming wrote: > >> push_rt_task() pick the first pushable task and find an eligible >> lowest_rq, then double_lock_balance(rq, lowest_rq). So if >> double_lock_balance() unlock the rq (when double_lock_balance() return 1), >> we have to check if this task is still on the rq. >> >> The problem is that the check conditions are not sufficient: >> >> if (unlikely(task_rq(task) != rq || >> !cpumask_test_cpu(lowest_rq->cpu,&task->cpus_allowed) || >> task_running(rq, task) || >> !rt_task(task) || >> !task_on_rq_queued(task))) { >> >> cpu2 cpu1 cpu0 >> push_rt_task(rq1) >> pick task_A on rq1 >> find rq0 >> double_lock_balance(rq1, rq0) >> unlock(rq1) >> rq1 __schedule >> pick task_A run >> task_A sleep (dequeued) >> lock(rq0) >> lock(rq1) >> do_above_check(task_A) >> task_rq(task_A) == rq1 >> cpus_allowed unchanged >> task_running == false >> rt_task(task_A) == true >> try_to_wake_up(task_A) >> select_cpu = cpu3 >> enqueue(rq3, task_A) > How can this happen? The try_to_wake_up(task_A) needs to grab the rq > that task A is on, and we have that rq lock. > > /me confused. > > -- Steve Thanks for the reply! After the task_A sleep on cpu1, the try_to_wake_up(task_A) on cpu0 select a different cpu3, so it will grab the rq3 lock, not the rq1 lock. Thanks. > >> task_A->on_rq = 1 >> task_on_rq_queued(task_A) >> above_check passed, return rq0 >> ... >> migrate task_A from rq1 to rq0 >> >> So we can't rely on these checks of task_A to make sure the task_A is >> still on the rq1, even though we hold the rq1->lock. This patch will >> repick the first pushable task to be sure the task is still on the rq. >> >> Signed-off-by: Zhou Chengming >> > . >