2020-10-19 11:42:51

by Christian Eggers

[permalink] [raw]
Subject: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):

-rc8-rt13 works fine
-rc8-rt14 doesn't compile (due to CONFIG_FRACE, already fixed in -rt16)
-rt15 dito.
-rt16 compiles, but doesn't boot (no console output at all)

After reverting (on -rt16)

de1c0755e6f9 ("tracing: fix compile failure on RT with PREEMPT_RT off")
30763ce6c15d ("sched: Add new migrate_disable() implementation")

the system boots fine again.

Tracking the problem down showed that calls to wait_for_completion_timeout()
(e.g. during imx_rngc_probe) will never return. The IRQ routine which should
fire the completion is not executed, and the call doesn't return after the
timeout. The IRQ flag on the ARM is not set before entering
wait_for_completion_timeout(), so CPU interrupts seem to be on.

When building with CONFIG_SMP, the system boots fine.

Any hints?

Best regards
Christian




Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
>

> Any hints?

Thank you for the report. The reason is the migrate_disable()
implementation for !SMP.

> Best regards
> Christian

Sebastian

2020-10-20 11:46:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On Tue, Oct 20, 2020 at 01:38:28PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-20 13:30:09 [+0200], Peter Zijlstra wrote:
> > On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > > I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> > > >
> > > …
> > > > Any hints?
> > >
> > > Thank you for the report. The reason is the migrate_disable()
> > > implementation for !SMP.
> >
> > This should fix things I suppose. I'll fold it in.
>
> It will. It will also break lazy-preemption. Each time a sleeping lock
> is acquired there is also migrate_disable() and the migrate-disable
> counter is != 0 (even for UP). The result is that a wake up for a
> SCHED_OTHER task with mg counter != 0 will not lead to context switch
> (same like preemption counter != 0). The difference is that a wake up
> for a RT task ignores this counter and perform a context switch anyway.

Right, but this patch set doesn't include the lazy preemption stuff, and
given the 'fun' Valentin and me are still having with it, I'd like to
keep it like that.

But yes, that might warrant a slightly less NOP implementation.

Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On 2020-10-20 13:41:37 [+0200], Peter Zijlstra wrote:
> Right, but this patch set doesn't include the lazy preemption stuff, and
> given the 'fun' Valentin and me are still having with it, I'd like to
> keep it like that.
>
> But yes, that might warrant a slightly less NOP implementation.

Uh. Looking at the actual implementation we don't look at the mg-counter
but have preempt_lazy_disable() for that.
Let me sync your bits then.
Thanks.

Sebastian

2020-10-20 12:40:27

by Christian Eggers

[permalink] [raw]
Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On Tuesday, 20 October 2020, 13:30:09 CEST, Peter Zijlstra wrote:
> On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > I have problems with the latest 5.9-rt releases on i.MX6ULL (!
CONFIG_SMP):
> > …
> >
> > > Any hints?
> >
> > Thank you for the report. The reason is the migrate_disable()
> > implementation for !SMP.
>
> This should fix things I suppose. I'll fold it in.
>
> ---
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -378,7 +378,12 @@ static inline void preempt_notifier_init
> extern void migrate_disable(void);
> extern void migrate_enable(void);
>
> -#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
> +#elif defined(CONFIG_PREEMPT_RT)
> +
> +static inline void migrate_disable(void) { }
> +static inline void migrate_enable(void { }
closing bracket missing

> +
> +#else /* !CONFIG_PREEMPT_RT */
>
> /**
> * migrate_disable - Prevent migration of the current task

I didn't understand much of you discussion with Sebastian,
but my system is able to boot now.

# uname -r
5.9.0-rt16+

Best regards
Christian



2020-10-20 23:52:59

by Peter Zijlstra

[permalink] [raw]
Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> >
> …
> > Any hints?
>
> Thank you for the report. The reason is the migrate_disable()
> implementation for !SMP.

This should fix things I suppose. I'll fold it in.

---
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -378,7 +378,12 @@ static inline void preempt_notifier_init
extern void migrate_disable(void);
extern void migrate_enable(void);

-#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
+#elif defined(CONFIG_PREEMPT_RT)
+
+static inline void migrate_disable(void) { }
+static inline void migrate_enable(void { }
+
+#else /* !CONFIG_PREEMPT_RT */

/**
* migrate_disable - Prevent migration of the current task

Subject: Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

On 2020-10-20 13:30:09 [+0200], Peter Zijlstra wrote:
> On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> > >
> > …
> > > Any hints?
> >
> > Thank you for the report. The reason is the migrate_disable()
> > implementation for !SMP.
>
> This should fix things I suppose. I'll fold it in.

It will. It will also break lazy-preemption. Each time a sleeping lock
is acquired there is also migrate_disable() and the migrate-disable
counter is != 0 (even for UP). The result is that a wake up for a
SCHED_OTHER task with mg counter != 0 will not lead to context switch
(same like preemption counter != 0). The difference is that a wake up
for a RT task ignores this counter and perform a context switch anyway.

That way we have RT wake ups on time but avoid stumbling from one lock
to another.

> ---
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -378,7 +378,12 @@ static inline void preempt_notifier_init
> extern void migrate_disable(void);
> extern void migrate_enable(void);
>
> -#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
> +#elif defined(CONFIG_PREEMPT_RT)
> +
> +static inline void migrate_disable(void) { }
> +static inline void migrate_enable(void { }
> +
> +#else /* !CONFIG_PREEMPT_RT */
>
> /**
> * migrate_disable - Prevent migration of the current task

Sebastian