Date: Mon, 8 Jun 2015 00:33:17 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: umgwanakikbuti@gmail.com, mingo@elte.hu, ktkhai@parallels.com,
        rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com,
        pang.xunlei@linaro.org, wanpeng.li@linux.intel.com,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 08/14] hrtimer: Allow hrtimer::function() to free the
	timer
Message-ID: <20150607223317.GA5193@redhat.com>
References: <20150605084836.364306429@infradead.org> <20150605085205.723058588@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150605085205.723058588@infradead.org>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2783
Lines: 85

Not sure I read this patch correctly, it doesn't apply to Linus's tree.

And I simply can not understand the complication in hrtimer_active(),
please help!

On 06/05, Peter Zijlstra wrote:
>
> +bool hrtimer_active(const struct hrtimer *timer)
> +{
> +	struct hrtimer_cpu_base *cpu_base;
> +	unsigned int seq;
> +	bool active;
> +
> +	do {
> +		active = false;
> +		cpu_base = READ_ONCE(timer->base->cpu_base);
> +		seq = raw_read_seqcount(&cpu_base->seq);
> +
> +		if (timer->state != HRTIMER_STATE_INACTIVE ||
> +		    cpu_base->running == timer)
> +			active = true;

Why we can't simply return true in this case?

Unless you lock this timer, hrtimer_active() is inherently racy anyway.
Granted, it must not wrongly return False if the timer is pending or
running.

But "false positive" does not differ from the case when (say) the
running timer->function() finishes right after hrtimer_active() returns
True.

> +	} while (read_seqcount_retry(&cpu_base->seq, seq) ||
> +		 cpu_base != READ_ONCE(timer->base->cpu_base));

Why do we need to re-check >cpu_base?

I think we can ignore migrate_hrtimer_list(), it doesn't clear ->state.

Otherwise the timer can change its ->base only if it is not running and
inactive, and again I think we should only eliminate the false negative
return.

And I think there is a problem. Consider a timer TIMER which always
rearms itself using some "default" timeout.

In this case __hrtimer_start_range_ns(&TIMER, ...) must preserve
hrtimer_active(&TIMER) == T. By definition, and currently this is
true.

After this patch this is no longer true (afaics). If the timer is
pending but not running, __hrtimer_start_range_ns()->remove_hrtimer()
will clear ENQUEUED first, then set it again in enqueue_hrtimer().

This means that hrtimer_active() returns false in between. And note
that it doesn't matter if the timer changes its ->base or not, so
that 2nd cpu_base above can't help.

I think that __hrtimer_start_range_ns() should preserve ENQUEUED
like migrate_hrtimer_list() should do (see the previous email).


Finally. Suppose that timer->function() returns HRTIMER_RESTART
and hrtimer_active() is called right after __run_hrtimer() sets
cpu_base->running = NULL. I can't understand why hrtimer_active()
can't miss ENQUEUED in this case. We have wmb() in between, yes,
but then hrtimer_active() should do something like

	active = cpu_base->running == timer;
	if (!active) {
		rmb();
		active = state != HRTIMER_STATE_INACTIVE;
	}

No?

But I am already sleeping and probably totally confused.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/