2015-02-02 17:53:49

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 3/4] sched: Pull preemption disablement to __schedule() caller

On Wed, Jan 28, 2015 at 04:50:44PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 28, 2015 at 01:24:11AM +0100, Frederic Weisbecker wrote:
> > +++ b/kernel/sched/core.c
> > @@ -2760,7 +2760,6 @@ static void __sched __schedule(void)
> > struct rq *rq;
> > int cpu;
> >
> > - preempt_disable();
>
> Implies barrier();
>
> > cpu = smp_processor_id();
> > rq = cpu_rq(cpu);
> > rcu_note_context_switch();
> > @@ -2822,8 +2821,6 @@ static void __sched __schedule(void)
> > raw_spin_unlock_irq(&rq->lock);
> >
> > post_schedule(rq);
>
> implies barrier();
>
> > -
> > - sched_preempt_enable_no_resched();
> > }
> >
> > static inline void sched_submit_work(struct task_struct *tsk)
>
> > @@ -2883,9 +2882,9 @@ void __sched schedule_preempt_disabled(void)
> > static void preempt_schedule_common(void)
> > {
> > do {
> > - preempt_count_add(PREEMPT_ACTIVE);
> > + preempt_count_add(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
>
> Does _NOT_ imply barrier();
>
> > __schedule();
> > - preempt_count_sub(PREEMPT_ACTIVE);
>
> idem.

It looks like preempt_count_add/inc() mostly imply entering a context that we want
to be seen right away (thus want barrier() after) and preempt_count_sub/dec() mostly
want previous work to be visible before re-enabling interrupt, preemption, etc...
(thus want barrier() before).

So maybe these functions (the non-underscored ones) should imply a barrier() rather
than their callers (preempt_disable() and others). Inline functions instead of macros
would do the trick (if the headers hell let us do that).

Note the underscored implementations are all inline currently so this happens to
work by chance for direct calls to preempt_count_add/sub() outside preempt_disable().
If the non-underscored caller is turned into inline too I don't expect performance issues.

What do you think, does it make sense?


>
> > + preempt_count_sub(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
> >
> > /*
> > * Check again in case we missed a preemption opportunity


2015-02-03 10:53:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/4] sched: Pull preemption disablement to __schedule() caller

On Mon, Feb 02, 2015 at 06:53:45PM +0100, Frederic Weisbecker wrote:
> It looks like preempt_count_add/inc() mostly imply entering a context that we want
> to be seen right away (thus want barrier() after) and preempt_count_sub/dec() mostly
> want previous work to be visible before re-enabling interrupt, preemption, etc...
> (thus want barrier() before).
>
> So maybe these functions (the non-underscored ones) should imply a barrier() rather
> than their callers (preempt_disable() and others). Inline functions instead of macros
> would do the trick (if the headers hell let us do that).
>
> Note the underscored implementations are all inline currently so this happens to
> work by chance for direct calls to preempt_count_add/sub() outside preempt_disable().
> If the non-underscored caller is turned into inline too I don't expect performance issues.
>
> What do you think, does it make sense?

AFAIK inline does _not_ guarantee a compiler barrier, only an actual
function call does.

When inlining the compiler creates visibility into the 'call' and can
avoid the constraint -- teh interweb seems to agree and also pointed out
that 'pure' function calls, even when actual function calls, can avoid
being a compiler barrier.

The below blog seems to do a fair job of explaining things; in
particular the 'implied compiler barriers' section is relevant here:

http://preshing.com/20120625/memory-ordering-at-compile-time/

As it stands the difference between the non underscore and the
underscore version is debug/tracing muck. The underscore ops are the raw
operations without fancy bits on.

I think I would prefer keeping it that way; this means that
preempt_count_$op() is a pure op and when we want to build stuff with it
like preempt_{en,dis}able() they add the extra semantics on top.

In any case; if we make __schedule() noinline (I think that might make
sense) that function call would itself imply the compiler barrier and
something like:

__preempt_count_add(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
__schedule();
__preempt_count_sub(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);

Would actually be safe/correct.

As it stands I think __schedule() would fail the GCC inline static
criteria for being too large, but you never know, noinline guarantees it
will not.

2015-02-04 17:32:07

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 3/4] sched: Pull preemption disablement to __schedule() caller

On Tue, Feb 03, 2015 at 11:53:03AM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2015 at 06:53:45PM +0100, Frederic Weisbecker wrote:
> > It looks like preempt_count_add/inc() mostly imply entering a context that we want
> > to be seen right away (thus want barrier() after) and preempt_count_sub/dec() mostly
> > want previous work to be visible before re-enabling interrupt, preemption, etc...
> > (thus want barrier() before).
> >
> > So maybe these functions (the non-underscored ones) should imply a barrier() rather
> > than their callers (preempt_disable() and others). Inline functions instead of macros
> > would do the trick (if the headers hell let us do that).
> >
> > Note the underscored implementations are all inline currently so this happens to
> > work by chance for direct calls to preempt_count_add/sub() outside preempt_disable().
> > If the non-underscored caller is turned into inline too I don't expect performance issues.
> >
> > What do you think, does it make sense?
>
> AFAIK inline does _not_ guarantee a compiler barrier, only an actual
> function call does.
>
> When inlining the compiler creates visibility into the 'call' and can
> avoid the constraint -- teh interweb seems to agree and also pointed out
> that 'pure' function calls, even when actual function calls, can avoid
> being a compiler barrier.
>
> The below blog seems to do a fair job of explaining things; in
> particular the 'implied compiler barriers' section is relevant here:
>
> http://preshing.com/20120625/memory-ordering-at-compile-time/

Ok, ok then.

> As it stands the difference between the non underscore and the
> underscore version is debug/tracing muck. The underscore ops are the raw
> operations without fancy bits on.
>
> I think I would prefer keeping it that way; this means that
> preempt_count_$op() is a pure op and when we want to build stuff with it
> like preempt_{en,dis}able() they add the extra semantics on top.
>
> In any case; if we make __schedule() noinline (I think that might make
> sense) that function call would itself imply the compiler barrier and
> something like:
>
> __preempt_count_add(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
> __schedule();
> __preempt_count_sub(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
>
> Would actually be safe/correct.
>
> As it stands I think __schedule() would fail the GCC inline static
> criteria for being too large, but you never know, noinline guarantees it
> will not.

Right, although relying only on __schedule() as a function call is perhaps
error-prone in case we add things in preempt_schedule*() APIs later, before
the call to __schedule(), that need the preempt count to be visible.

I can create preempt_active_enter() / preempt_active_exit() that take care
of the preempt op and the barrier() for example.

2015-02-04 17:48:54

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/4] sched: Pull preemption disablement to __schedule() caller

On Wed, Feb 04, 2015 at 06:31:57PM +0100, Frederic Weisbecker wrote:

> > In any case; if we make __schedule() noinline (I think that might make
> > sense) that function call would itself imply the compiler barrier and
> > something like:
> >
> > __preempt_count_add(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
> > __schedule();
> > __preempt_count_sub(PREEMPT_ACTIVE + PREEMPT_CHECK_OFFSET);
> >
> > Would actually be safe/correct.
> >
> > As it stands I think __schedule() would fail the GCC inline static
> > criteria for being too large, but you never know, noinline guarantees it
> > will not.
>
> Right, although relying only on __schedule() as a function call is perhaps
> error-prone in case we add things in preempt_schedule*() APIs later, before
> the call to __schedule(), that need the preempt count to be visible.
>
> I can create preempt_active_enter() / preempt_active_exit() that take care
> of the preempt op and the barrier() for example.

Sure, like that exception_enter() in preempt_schedule_context() for
instance?