Date: Mon, 25 Sep 2017 15:40:57 -0400
From: Steven Rostedt
To: Zhou Chengming
Subject: Re: [PATCH] sched/rt.c: pick and check task if double_lock_balance() unlock the rq
Message-ID: <20170925154057.191e3fd1@vmware.local.home>
In-Reply-To: <1505112709-102019-1-git-send-email-zhouchengming1@huawei.com>

On Mon, 11 Sep 2017 14:51:49 +0800
Zhou Chengming wrote:

> push_rt_task() pick the first pushable task and find an eligible
> lowest_rq, then double_lock_balance(rq, lowest_rq). So if
> double_lock_balance() unlock the rq (when double_lock_balance() return 1),
> we have to check if this task is still on the rq.
>
> The problem is that the check conditions are not sufficient:
>
> if (unlikely(task_rq(task) != rq ||
>              !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_allowed) ||
>              task_running(rq, task) ||
>              !rt_task(task) ||
>              !task_on_rq_queued(task))) {
>
>   cpu2                            cpu1                        cpu0
>   push_rt_task(rq1)
>     pick task_A on rq1
>     find rq0
>       double_lock_balance(rq1, rq0)
>         unlock(rq1)
>                                   rq1 __schedule
>                                     pick task_A run
>                                     task_A sleep (dequeued)
>         lock(rq0)
>         lock(rq1)
>     do_above_check(task_A)
>       task_rq(task_A) == rq1
>       cpus_allowed unchanged
>       task_running == false
>       rt_task(task_A) == true
>                                                               try_to_wake_up(task_A)
>                                                                 select_cpu = cpu3
>                                                                 enqueue(rq3, task_A)

How can this happen? The try_to_wake_up(task_A) needs to grab the rq that
task A is on, and we have that rq lock.

/me confused.

-- Steve

>                                                                 task_A->on_rq = 1
>       task_on_rq_queued(task_A)
>     above_check passed, return rq0
>     ...
>   migrate task_A from rq1 to rq0
>
> So we can't rely on these checks of task_A to make sure the task_A is
> still on the rq1, even though we hold the rq1->lock. This patch will
> repick the first pushable task to be sure the task is still on the rq.
>
> Signed-off-by: Zhou Chengming
>
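
For readers following the thread, the "repick the first pushable task" idea from the quoted changelog can be modelled in user space. This is only a sketch under stated assumptions: toy_rq, toy_task and every toy_* helper are hypothetical stand-ins, not kernel APIs, and no locking is modelled at all. It says nothing about whether the race in the diagram can really occur; it only shows what "pick again and compare" means once the lock has been retaken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PUSHABLE 4

/* Hypothetical toy task; not the kernel's struct task_struct. */
struct toy_task {
	int id;
};

/* Hypothetical toy run queue holding an ordered list of pushable tasks. */
struct toy_rq {
	struct toy_task *pushable[TOY_MAX_PUSHABLE];
	size_t nr_pushable;
};

/* Return the first pushable task on rq, or NULL if there is none. */
struct toy_task *toy_pick_pushable(struct toy_rq *rq)
{
	return rq->nr_pushable ? rq->pushable[0] : NULL;
}

/* Append t to the tail of rq's pushable list. */
void toy_enqueue(struct toy_rq *rq, struct toy_task *t)
{
	assert(rq->nr_pushable < TOY_MAX_PUSHABLE);
	rq->pushable[rq->nr_pushable++] = t;
}

/* Remove t from rq's pushable list if it is present. */
void toy_dequeue(struct toy_rq *rq, struct toy_task *t)
{
	for (size_t i = 0; i < rq->nr_pushable; i++) {
		if (rq->pushable[i] != t)
			continue;
		/* Close the gap left by the removed entry. */
		for (size_t j = i + 1; j < rq->nr_pushable; j++)
			rq->pushable[j - 1] = rq->pushable[j];
		rq->nr_pushable--;
		break;
	}
}

/*
 * Re-pick check: after the lock has been dropped and retaken, pick again
 * and accept the earlier choice only if the same task comes back.
 */
bool toy_repick_ok(struct toy_rq *rq, struct toy_task *picked)
{
	return toy_pick_pushable(rq) == picked;
}
```

In this toy model, a task that was dequeued from one queue and enqueued on another while the picker was not looking no longer passes toy_repick_ok() on the original queue, because the comparison is against the queue's current contents rather than against fields of the task.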