Date: Tue, 5 Apr 2016 17:00:36 +0200
From: Peter Zijlstra <peterz@infradead.org>
To: Luca Abeni <luca.abeni@unitn.it>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
        Juri Lelli <juri.lelli@arm.com>
Subject: Re: [RFC v2 3/7] Improve the tracking of active utilisation
Message-ID: <20160405150036.GA3430@twins.programming.kicks-ass.net>
References: <1459523553-29089-1-git-send-email-luca.abeni@unitn.it>
 <1459523553-29089-4-git-send-email-luca.abeni@unitn.it>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1459523553-29089-4-git-send-email-luca.abeni@unitn.it>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org

On Fri, Apr 01, 2016 at 05:12:29PM +0200, Luca Abeni wrote:
> +static void task_go_inactive(struct task_struct *p)
> +{
> +	struct sched_dl_entity *dl_se = &p->dl;
> +	struct hrtimer *timer = &dl_se->inactive_timer;
> +	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> +	struct rq *rq = rq_of_dl_rq(dl_rq);
> +	ktime_t now, act;
> +	s64 delta;
> +	u64 zerolag_time;
> +
> +	WARN_ON(dl_se->dl_runtime == 0);
> +
> +	/* If the inactive timer is already armed, return immediately */
> +	if (hrtimer_active(&dl_se->inactive_timer))
> +		return;

So while we start the timer on the local cpu, we don't migrate the timer
when we migrate the task, so the callback can happen on a remote cpu,
right?

Therefore the timer function might still be running, having only just
done task_rq_unlock(), which is what allowed our cpu to acquire rq->lock
and get here.

In that case the hrtimer_active() check above is true and we return, but
effectively the inactive timer will not run 'again'.
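
To make the interleaving concrete, here is a minimal userspace sketch. It
is a toy model and not kernel code: every toy_* name is made up for
illustration. The only kernel behaviour it relies on is that
hrtimer_active() reports true both while a timer is queued and while its
callback is still running.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical names): states hrtimer_active() cannot tell apart. */
enum timer_state { TIMER_IDLE, TIMER_QUEUED, TIMER_CALLBACK_RUNNING };

struct toy_timer {
	enum timer_state state;
	int armed_count;	/* how many times the timer was armed */
};

/* Like hrtimer_active(): true while queued or while the callback runs. */
static bool toy_timer_active(const struct toy_timer *t)
{
	return t->state != TIMER_IDLE;
}

/* Mirrors the early return in task_go_inactive(); true if the timer was armed. */
static bool toy_go_inactive(struct toy_timer *t)
{
	if (toy_timer_active(t))
		return false;
	t->state = TIMER_QUEUED;
	t->armed_count++;
	return true;
}

/* Remote cpu: the callback has dropped rq->lock but has not returned yet. */
static void toy_callback_unlocked(struct toy_timer *t)
{
	t->state = TIMER_CALLBACK_RUNNING;
}

/* Remote cpu: the callback returns without re-arming anything. */
static void toy_callback_return(struct toy_timer *t)
{
	t->state = TIMER_IDLE;
}

/* The interleaving described above; returns how often the timer got armed. */
static int toy_race_armed_count(void)
{
	struct toy_timer t = { TIMER_IDLE, 0 };

	toy_go_inactive(&t);		/* task blocks: timer armed once */
	toy_callback_unlocked(&t);	/* remote callback dropped rq->lock */
	toy_go_inactive(&t);		/* task blocks again: early return */
	toy_callback_return(&t);	/* callback finishes, nothing queued */

	assert(!toy_timer_active(&t));
	return t.armed_count;
}
```

In this model the second toy_go_inactive() call returns without arming
anything, so once the callback returns no timer is pending for the
second block.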

> +
> +
> +	/*
> +	 * We want the timer to fire at the "0 lag time", but considering
> +	 * that it is actually coming from rq->clock and not from
> +	 * hrtimer's time base reading.
> +	 */
> +	zerolag_time = dl_se->deadline -
> +		 div64_long((dl_se->runtime * dl_se->dl_period),
> +			dl_se->dl_runtime);
> +
> +	act = ns_to_ktime(zerolag_time);
> +	now = hrtimer_cb_get_time(timer);
> +	delta = ktime_to_ns(now) - rq_clock(rq);
> +	act = ktime_add_ns(act, delta);
> +
> +	/*
> +	 * If the "0-lag time" already passed, decrease the active
> +	 * utilization now, instead of starting a timer
> +	 */
> +	if (ktime_us_delta(act, now) < 0) {
> +		sub_running_bw(dl_se, dl_rq);
> +		if (!dl_task(p))
> +			__dl_clear_params(p);
> +
> +		return;
> +	}
> +
> +	get_task_struct(p);
> +	hrtimer_start(timer, act, HRTIMER_MODE_ABS);
> +}
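
For reference, the "0-lag time" arithmetic in the function quoted above
can be checked in userspace. This is only a sketch: the toy_* helpers and
the plain integer types are assumptions standing in for ktime_t,
rq_clock() and div64_long(), and all values are in nanoseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * zerolag = deadline - runtime * dl_period / dl_runtime
 * 'runtime' is the remaining budget and may be negative after an overrun,
 * which pushes the 0-lag time past the deadline.
 */
static uint64_t toy_zerolag_ns(uint64_t deadline, int64_t runtime,
			       uint64_t dl_period, uint64_t dl_runtime)
{
	int64_t offset;

	assert(dl_runtime != 0);
	offset = runtime * (int64_t)dl_period / (int64_t)dl_runtime;
	return deadline - (uint64_t)offset;
}

/* Shift an rq->clock based time onto the hrtimer time base. */
static int64_t toy_timer_expiry_ns(uint64_t zerolag, int64_t hrtimer_now,
				   uint64_t rq_clock)
{
	int64_t delta = hrtimer_now - (int64_t)rq_clock;

	return (int64_t)zerolag + delta;
}

/* Same test as ktime_us_delta(act, now) < 0: compare at usec granularity. */
static bool toy_zerolag_passed(int64_t act, int64_t hrtimer_now)
{
	return (act - hrtimer_now) / 1000 < 0;
}
```

With a 100ms deadline, 2ms of budget left and a 10ms/100ms reservation,
the 0-lag time comes out at 80ms on the rq clock. Whether the timer is
armed or the bandwidth is dropped immediately then depends only on where
the rq clock currently is relative to that point.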


> @@ -1071,6 +1164,23 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
>  	}
>  	rcu_read_unlock();
>  
> +	if (rq != cpu_rq(cpu)) {

I don't think this is right; you want:

	if (task_cpu(p) != cpu) {

because @cpu does not need to be task_cpu(p).

> +		int migrate_active;
> +
> +		raw_spin_lock(&rq->lock);

Which then also means @rq is 'wrong', so you'll have to add:

		rq = task_rq(p);

before this.

> +		migrate_active = hrtimer_active(&p->dl.inactive_timer);
> +		if (migrate_active)
> +			sub_running_bw(&p->dl, &rq->dl);
> +		raw_spin_unlock(&rq->lock);

At this point task_rq() is still the above rq, so if the inactive timer
hits here it will lock this rq and subtract the running bw here _again_,
right?
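
A toy userspace model of that window, to show the accounting going
wrong. None of this is kernel code: the toy_* types and helpers are
hypothetical, locking is left out, and the timer callback is assumed to
subtract from whatever rq the task currently points at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_rq {
	int64_t running_bw;
};

struct toy_task {
	int64_t dl_bw;
	struct toy_rq *rq;	/* stands in for task_rq(p) */
	bool timer_pending;	/* stands in for an armed inactive timer */
};

static void toy_sub_running_bw(struct toy_task *p, struct toy_rq *rq)
{
	rq->running_bw -= p->dl_bw;
}

static void toy_add_running_bw(struct toy_task *p, struct toy_rq *rq)
{
	rq->running_bw += p->dl_bw;
}

/* Timer callback: subtracts from the rq the task still points at. */
static void toy_inactive_timer(struct toy_task *p)
{
	if (!p->timer_pending)
		return;
	p->timer_pending = false;
	toy_sub_running_bw(p, p->rq);
}

/* First half of the quoted hunk: subtract on the old rq, then "unlock". */
static bool toy_migrate_out(struct toy_task *p)
{
	bool migrate_active = p->timer_pending;

	if (migrate_active)
		toy_sub_running_bw(p, p->rq);
	return migrate_active;
}

/* Second half of the quoted hunk: add on the destination rq. */
static void toy_migrate_in(struct toy_task *p, struct toy_rq *dst,
			   bool migrate_active)
{
	if (migrate_active)
		toy_add_running_bw(p, dst);
}
```

If toy_inactive_timer() runs between toy_migrate_out() and
toy_migrate_in(), the old rq loses the task's bandwidth twice and the
new rq gains bandwidth for a task whose timer has already fired.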

> +		if (migrate_active) {
> +			rq = cpu_rq(cpu);
> +			raw_spin_lock(&rq->lock);
> +			add_running_bw(&p->dl, &rq->dl);
> +			raw_spin_unlock(&rq->lock);
> +		}
> +	}