Message-ID: <54522CE2.6090806@arm.com>
Date: Thu, 30 Oct 2014 12:19:46 +0000
From: Juri Lelli
To: Kirill Tkhai, linux-kernel@vger.kernel.org
CC: Peter Zijlstra, Juri Lelli, Ingo Molnar, Kirill Tkhai
Subject: Re: [PATCH v3] sched/dl: Implement cancel_dl_timer() to use in switched_from_dl()
In-Reply-To: <1414420852.19914.186.camel@tkhai>

Hi Kirill,

On 27/10/14 14:40, Kirill Tkhai wrote:
>
> Currently used hrtimer_try_to_cancel() is racy:
>
> raw_spin_lock(&rq->lock)
> ...                             dl_task_timer               raw_spin_lock(&rq->lock)
> ...                               raw_spin_lock(&rq->lock)  ...
>   switched_from_dl()            ...                         ...
>     hrtimer_try_to_cancel()     ...                         ...
>   switched_to_fair()            ...                         ...
> ...                             ...                         ...
> ...                             ...                         ...
> raw_spin_unlock(&rq->lock)      ...                         (acquired)
> ...                             ...                         ...
> ...                             ...                         ...
> do_exit()                       ...                         ...
>   schedule()                    ...                         ...
>     raw_spin_lock(&rq->lock)    ...                         raw_spin_unlock(&rq->lock)
>     ...                         ...                         ...
>     raw_spin_unlock(&rq->lock)  ...                         raw_spin_lock(&rq->lock)
>     ...                         ...                         (acquired)
>     put_task_struct()           ...                         ...
>       free_task_struct()        ...                         ...
>     ...                         ...                         raw_spin_unlock(&rq->lock)
>     ...                         (acquired)                  ...
>     ...                         ...                         ...
>     ...                         (use after free)            ...
>
>
> So, let's implement a 100% guaranteed way to cancel the timer and let's
> be sure we are safe even in very unlikely situations.
>
> rq unlocking does not limit the area of switched_from_dl() use, because
> this has already been possible in pull_dl_task() below.
>
> Let's consider the safety of this unlocking. The new code in the patch
> runs only when hrtimer_try_to_cancel() fails. This means the callback
> is running. In this case hrtimer_cancel() just waits until the
> callback is finished. Two cases are possible:
>
> 1) Since we are in switched_from_dl(), the new class is not dl_sched_class
> and the new prio is not less than MAX_DL_PRIO. So, the callback returns
> early; it's right after the !dl_task() check. After that hrtimer_cancel()
> returns too.
>
> The above is:
>
> raw_spin_lock(rq->lock);        ...
> ...                             dl_task_timer()
> ...                               raw_spin_lock(rq->lock);
>   switched_from_dl()              ...
>     hrtimer_try_to_cancel()       ...
>   raw_spin_unlock(rq->lock);      ...
>   hrtimer_cancel()                ...
>   ...                             raw_spin_unlock(rq->lock);
>   ...                             return HRTIMER_NORESTART;
>   ...                           ...
>   raw_spin_lock(rq->lock);      ...
>
> 2) But the below is also possible:
>
>                                 dl_task_timer()
>                                   raw_spin_lock(rq->lock);
>                                   ...
>                                   raw_spin_unlock(rq->lock);
> raw_spin_lock(rq->lock);          ...
>   switched_from_dl()              ...
>     hrtimer_try_to_cancel()       ...
>   ...                             return HRTIMER_NORESTART;
>   raw_spin_unlock(rq->lock);    ...
>   hrtimer_cancel();             ...
>   raw_spin_lock(rq->lock);      ...
>
> In this case hrtimer_cancel() returns immediately. A very unlikely case,
> just to mention.
>
>
> Nobody can manipulate the task, because check_class_changed() is
> always called with pi_lock locked. Nobody can force the task to
> participate in (concurrent) priority inheritance schemes (for the same
> reason).
>
> All concurrent task operations require pi_lock, which is held by us.
> No deadlocks with dl_task_timer() are possible, because it returns
> right after the !dl_task() check (it does nothing).
>
> If we receive a new dl_task during the time of unlocked rq, we just
> don't have to do pull_dl_task() in switched_from_dl() further.
>
> Signed-off-by: Kirill Tkhai

So, it passed simple tests. I guess it is ok :).

Acked-by: Juri Lelli

Thanks,

- Juri

> ---
>  kernel/sched/deadline.c | 34 +++++++++++++++++++++++++++-------
>  1 file changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 256e577..9435e05 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -555,11 +555,6 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
>  {
>  	struct hrtimer *timer = &dl_se->dl_timer;
>
> -	if (hrtimer_active(timer)) {
> -		hrtimer_try_to_cancel(timer);
> -		return;
> -	}
> -
>  	hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>  	timer->function = dl_task_timer;
>  }
> @@ -1567,10 +1562,35 @@ void init_sched_dl_class(void)
>
>  #endif /* CONFIG_SMP */
>
> +/*
> + * Ensure p's dl_timer is cancelled. May drop rq->lock for a while.
> + */
> +static void cancel_dl_timer(struct rq *rq, struct task_struct *p)
> +{
> +	struct hrtimer *dl_timer = &p->dl.dl_timer;
> +
> +	/* Nobody will change task's class if pi_lock is held */
> +	lockdep_assert_held(&p->pi_lock);
> +
> +	if (hrtimer_active(dl_timer)) {
> +		int ret = hrtimer_try_to_cancel(dl_timer);
> +
> +		if (unlikely(ret == -1)) {
> +			/*
> +			 * Note, p may migrate OR new deadline tasks
> +			 * may appear in rq when we are unlocking it.
> +			 * A caller of us must be fine with that.
> +			 */
> +			raw_spin_unlock(&rq->lock);
> +			hrtimer_cancel(dl_timer);
> +			raw_spin_lock(&rq->lock);
> +		}
> +	}
> +}
> +
>  static void switched_from_dl(struct rq *rq, struct task_struct *p)
>  {
> -	if (hrtimer_active(&p->dl.dl_timer) && !dl_policy(p->policy))
> -		hrtimer_try_to_cancel(&p->dl.dl_timer);
> +	cancel_dl_timer(rq, p);
>
>  	__dl_clear_params(p);
>
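
The unlock/cancel/relock step in cancel_dl_timer() can be tried out in
userspace. What follows is only a rough analogy under stated assumptions,
not kernel code: a pthread mutex stands in for rq->lock, a thread stands in
for the dl_task_timer() callback, and every sketch_* name, both structs and
run_demo() are hypothetical helpers invented for this illustration.

```c
/* Userspace analogy only: pthreads stand in for rq->lock and the hrtimer. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct sketch_timer {                 /* hypothetical stand-in for struct hrtimer */
	pthread_mutex_t state_lock;   /* protects 'running' */
	pthread_cond_t finished;      /* signalled when the callback returns */
	bool running;                 /* callback is currently executing */
};

struct sketch_rq {                    /* hypothetical stand-in for struct rq */
	pthread_mutex_t lock;         /* plays the role of rq->lock */
	bool task_is_dl;              /* what dl_task(p) would report */
	int replenished;              /* counts callbacks that did real work */
	struct sketch_timer timer;
};

/* Modelled on hrtimer_try_to_cancel(): -1 means the callback is running. */
static int sketch_try_to_cancel(struct sketch_timer *t)
{
	pthread_mutex_lock(&t->state_lock);
	int ret = t->running ? -1 : 0;
	pthread_mutex_unlock(&t->state_lock);
	return ret;
}

/* Modelled on hrtimer_cancel(): block until the callback has returned. */
static void sketch_cancel(struct sketch_timer *t)
{
	pthread_mutex_lock(&t->state_lock);
	while (t->running)
		pthread_cond_wait(&t->finished, &t->state_lock);
	pthread_mutex_unlock(&t->state_lock);
}

/* Callback: needs the rq lock, does nothing if the task left the class. */
static void *sketch_timer_fn(void *arg)
{
	struct sketch_rq *rq = arg;

	pthread_mutex_lock(&rq->lock);
	if (rq->task_is_dl)           /* the early "!dl_task()" exit otherwise */
		rq->replenished++;
	pthread_mutex_unlock(&rq->lock);

	pthread_mutex_lock(&rq->timer.state_lock);
	rq->timer.running = false;
	pthread_cond_broadcast(&rq->timer.finished);
	pthread_mutex_unlock(&rq->timer.state_lock);
	return NULL;
}

/* Caller holds rq->lock; it may be dropped and re-taken, as in the patch. */
static void sketch_cancel_dl_timer(struct sketch_rq *rq)
{
	if (sketch_try_to_cancel(&rq->timer) == -1) {
		pthread_mutex_unlock(&rq->lock);
		sketch_cancel(&rq->timer);
		pthread_mutex_lock(&rq->lock);
	}
}

/* Returns 0 if the callback was waited for and did no work, nonzero otherwise. */
static int run_demo(void)
{
	struct sketch_rq rq = { .task_is_dl = false, .replenished = 0 };
	pthread_t cb;

	pthread_mutex_init(&rq.lock, NULL);
	pthread_mutex_init(&rq.timer.state_lock, NULL);
	pthread_cond_init(&rq.timer.finished, NULL);

	pthread_mutex_lock(&rq.lock);
	rq.timer.running = true;      /* the timer has fired; callback in flight */
	if (pthread_create(&cb, NULL, sketch_timer_fn, &rq) != 0) {
		pthread_mutex_unlock(&rq.lock);
		return -1;
	}

	sketch_cancel_dl_timer(&rq);  /* drops and re-takes rq.lock */
	int still_running = sketch_try_to_cancel(&rq.timer);
	int work = rq.replenished;
	pthread_mutex_unlock(&rq.lock);

	pthread_join(cb, NULL);
	return (still_running != 0 || work != 0) ? 1 : 0;
}
```

Calling run_demo() from a main() should return 0: the callback is waited
for, and it does no work because the task has already left the class. If
sketch_cancel_dl_timer() called sketch_cancel() without dropping the lock
first, this demo would deadlock, because the callback needs that lock
before it can return.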