Hi Rasmus,
Kernel 5.6-rc1 silently fails on boot.
I bisected the problem to commit b6231ea2b3c6 ("soc: fsl: qe: drop
broken lazy call of cpm_muram_init()")
I get a bad_page_fault() for an access at address 8 in
cpm_muram_alloc_common(), called from cpm_uart_console_setup() via
cpm_uart_allocbuf()
Reverting the guilty commit on top of 5.6-rc1 is not trivial.
In your commit text you explain that cpm_muram_init() is called via
subsys_initcall. But console init is done before that, so it cannot work.
Do you have a fix for that ?
Thanks
Christophe
NB: Next time, can you please copy powerpc mailing list when changing
such core parts of powerpc CPUs ?
Le 12/02/2020 à 15:24, Christophe Leroy a écrit :
> Hi Rasmus,
>
[...]
>
> NB: Next time, can you please copy powerpc mailing list when changing
> such core parts of powerpc CPUs ?
Apologise for that comment, in fact I was part of the recipients so it
didn't land in my linuxppc mailbox.
Seems like I missed that series.
On 02/12/2020 02:24 PM, Christophe Leroy wrote:
> Hi Rasmus,
>
> Kernel 5.6-rc1 silently fails on boot.
>
> I bisected the problem to commit b6231ea2b3c6 ("soc: fsl: qe: drop
> broken lazy call of cpm_muram_init()")
>
> I get a bad_page_fault() for an access at address 8 in
> cpm_muram_alloc_common(), called from cpm_uart_console_setup() via
> cpm_uart_allocbuf()
>
> Reverting the guilty commit on top of 5.6-rc1 is not trivial.
>
> In your commit text you explain that cpm_muram_init() is called via
> subsys_initcall. But console init is done before that, so it cannot work.
>
> Do you have a fix for that ?
>
The following patch allows powerpc 8xx to boot again. Don't know if
that's the good place and way to do the fix though.
---
diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
index 4cabded8390b..341d682ec6eb 100644
--- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
@@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
console *co, char *options)
clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR | SCC_GSMRL_ENT);
}
+ cpm_muram_init();
ret = cpm_uart_allocbuf(pinfo, 1);
if (ret)
---
Christophe
On 02/12/2020 22:50 PM, Christophe Leroy wrote:
> -----Original Message-----
> From: Christophe Leroy <[email protected]>
> Sent: 2020年2月12日 22:50
> To: Rasmus Villemoes <[email protected]>; Leo Li
> <[email protected]>; Qiang Zhao <[email protected]>; Greg
> Kroah-Hartman <[email protected]>
> Cc: Scott Wood <[email protected]>; [email protected]; LKML
> <[email protected]>; [email protected]
> Subject: Re: [Regression 5.6-rc1][Bisected b6231ea2b3c6] Powerpc 8xx doesn't
> boot anymore
>
> ---
> diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> index 4cabded8390b..341d682ec6eb 100644
> --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> @@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
> console *co, char *options)
> clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR |
> SCC_GSMRL_ENT);
> }
>
> + cpm_muram_init();
> ret = cpm_uart_allocbuf(pinfo, 1);
>
> if (ret)
>
How about the patch like below? Just a draft.
diff --git a/drivers/soc/fsl/qe/qe.c b/drivers/soc/fsl/qe/qe.c
index 96c2057..c5c2464 100644
--- a/drivers/soc/fsl/qe/qe.c
+++ b/drivers/soc/fsl/qe/qe.c
@@ -29,6 +29,8 @@
#include <soc/fsl/qe/immap_qe.h>
#include <soc/fsl/qe/qe.h>
+static int qe_inited;
+
static void qe_snums_init(void);
static int qe_sdma_init(void);
@@ -637,15 +639,19 @@ unsigned int qe_get_num_of_snums(void)
}
EXPORT_SYMBOL(qe_get_num_of_snums);
-static int __init qe_init(void)
+int __init qe_init(void)
{
struct device_node *np;
+ if(qe_inited)
+ return 0;
+
np = of_find_compatible_node(NULL, NULL, "fsl,qe");
if (!np)
return -ENODEV;
qe_reset();
of_node_put(np);
+ qe_inited = 1
return 0;
}
subsys_initcall(qe_init);
diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
index 19d5a4c..cbf2c32 100644
--- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
@@ -1373,6 +1373,7 @@ static struct console cpm_scc_uart_console = {
static int __init cpm_uart_console_init(void)
{
+ qe_init();
register_console(&cpm_scc_uart_console);
return 0;
}
diff --git a/include/soc/fsl/qe/qe.h b/include/soc/fsl/qe/qe.h
index e282ac0..531ba05 100644
--- a/include/soc/fsl/qe/qe.h
+++ b/include/soc/fsl/qe/qe.h
@@ -88,6 +88,7 @@ static inline bool qe_clock_is_brg(enum qe_clock clk)
extern spinlock_t cmxgcr_lock;
+int __init qe_init(void);
/* Export QE common operations */
#ifdef CONFIG_QUICC_ENGINE
extern void qe_reset(void);
Best Regards
Qiang Zhao
Le 13/02/2020 à 04:35, Qiang Zhao a écrit :
> On 02/12/2020 22:50 PM, Christophe Leroy wrote:
>> -----Original Message-----
>> From: Christophe Leroy <[email protected]>
>> Sent: 2020年2月12日 22:50
>> To: Rasmus Villemoes <[email protected]>; Leo Li
>> <[email protected]>; Qiang Zhao <[email protected]>; Greg
>> Kroah-Hartman <[email protected]>
>> Cc: Scott Wood <[email protected]>; [email protected]; LKML
>> <[email protected]>; [email protected]
>> Subject: Re: [Regression 5.6-rc1][Bisected b6231ea2b3c6] Powerpc 8xx doesn't
>> boot anymore
>>
>> ---
>> diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> index 4cabded8390b..341d682ec6eb 100644
>> --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> @@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
>> console *co, char *options)
>> clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR |
>> SCC_GSMRL_ENT);
>> }
>>
>> + cpm_muram_init();
>> ret = cpm_uart_allocbuf(pinfo, 1);
>>
>> if (ret)
>>
> How about the patch like below? Just a draft.
Yes, I see the idea. I think we could go for something like that.
But in the powerpc 8xx case, we are talking about cpm_init(), not qe_init().
And maybe the return code should be checked, because if it's not 0,
cpm_muram_init() won't have been called.
Thanks,
Christophe
On 02/13/2020 14:17 PM, Christophe Leroy wrote:
> -----Original Message-----
> From: Christophe Leroy <[email protected]>
> Sent: 2020年2月13日 14:17
> To: Qiang Zhao <[email protected]>; Rasmus Villemoes
> <[email protected]>; Leo Li <[email protected]>; Greg
> Kroah-Hartman <[email protected]>
> Cc: Scott Wood <[email protected]>; [email protected]; LKML
> <[email protected]>; [email protected]
> Subject: Re: [Regression 5.6-rc1][Bisected b6231ea2b3c6] Powerpc 8xx doesn't
> boot anymore
>
>
>
> Le 13/02/2020 à 04:35, Qiang Zhao a écrit :
> > On 02/12/2020 22:50 PM, Christophe Leroy wrote:
> >> -----Original Message-----
> >> From: Christophe Leroy <[email protected]>
> >> Sent: 2020年2月12日 22:50
> >> To: Rasmus Villemoes <[email protected]>; Leo Li
> >> <[email protected]>; Qiang Zhao <[email protected]>; Greg
> >> Kroah-Hartman <[email protected]>
> >> Cc: Scott Wood <[email protected]>; [email protected];
> >> LKML <[email protected]>;
> >> [email protected]
> >> Subject: Re: [Regression 5.6-rc1][Bisected b6231ea2b3c6] Powerpc 8xx
> >> doesn't boot anymore
> >>
> >> ---
> >> diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> >> b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> >> index 4cabded8390b..341d682ec6eb 100644
> >> --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> >> +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> >> @@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
> >> console *co, char *options)
> >> clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR |
> >> SCC_GSMRL_ENT);
> >> }
> >>
> >> + cpm_muram_init();
> >> ret = cpm_uart_allocbuf(pinfo, 1);
> >>
> >> if (ret)
> >>
> > How about the patch like below? Just a draft.
>
> Yes, I see the idea. I think we could go for something like that.
> But in the powerpc 8xx case, we are talking about cpm_init(), not qe_init().
Yes, cpm_init.
Best Regards
Qiang Zhao
On 12/02/2020 15.24, Christophe Leroy wrote:
> Hi Rasmus,
>
> Kernel 5.6-rc1 silently fails on boot.
>
> I bisected the problem to commit b6231ea2b3c6 ("soc: fsl: qe: drop
> broken lazy call of cpm_muram_init()")
>
> I get a bad_page_fault() for an access at address 8 in
> cpm_muram_alloc_common(), called from cpm_uart_console_setup() via
> cpm_uart_allocbuf()
Sorry about that. But I'm afraid I don't see what I could have done
differently - the patch series, including b6231ea2b3c6, has been in
-next since 20191210, both you and ppc-dev were cc'ed on the entire
series (last revision sent November 28). And I've been dogfooding the
patches on both arm- and ppc-derived boards ever since (but obviously
only for a few cpus).
> Reverting the guilty commit on top of 5.6-rc1 is not trivial.
>
> In your commit text you explain that cpm_muram_init() is called via
> subsys_initcall. But console init is done before that, so it cannot work.
No, but neither did the code I removed seem to work - how does doing
spin_lock_init on a held spinlock, and then unlocking it, work? Is
everything-spinlock always a no-op in your configuration? And even so,
I'd think a GFP_KERNEL allocation under spin_lock_irqsave() would
trigger some splat somewhere?
Please note I'm not claiming my patch is not at fault, it clearly is, I
just want to try to understand how I could have been wrong about the
"nobody can have been relying on it" part.
> Do you have a fix for that ?
Not right now, but I'll have a look. It's true that the patch probably
doesn't revert cleanly, but it shouldn't be hard to add back those few
lines in the appropriate spot, with a big fat comment that this does
something very fishy (at least as a temporary measure if we don't find a
proper solution soonish).
Rasmus
On 12/02/2020 15.50, Christophe Leroy wrote:
>
>
> On 02/12/2020 02:24 PM, Christophe Leroy wrote:
>> In your commit text you explain that cpm_muram_init() is called via
>> subsys_initcall. But console init is done before that, so it cannot work.
>>
>> Do you have a fix for that ?
>>
>
> The following patch allows powerpc 8xx to boot again. Don't know if
> that's the good place and way to do the fix though.
>
> ---
> diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> index 4cabded8390b..341d682ec6eb 100644
> --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
> @@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
> console *co, char *options)
> clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR | SCC_GSMRL_ENT);
> }
>
> + cpm_muram_init();
> ret = cpm_uart_allocbuf(pinfo, 1);
>
> if (ret)
Hmm, that seems to be a somewhat random place, making it hard to see
that it is indeed early enough. Would it work to put it inside the
console_initcall that registers the cpm console? I.e.
static int __init cpm_uart_console_init(void)
{
+ cpm_muram_init();
register_console(&cpm_scc_uart_console);
return 0;
}
console_initcall(cpm_uart_console_init);
Rasmus
On 02/13/2020 07:45 AM, Rasmus Villemoes wrote:
> On 12/02/2020 15.24, Christophe Leroy wrote:
>> Hi Rasmus,
>>
>> Kernel 5.6-rc1 silently fails on boot.
>>
>> I bisected the problem to commit b6231ea2b3c6 ("soc: fsl: qe: drop
>> broken lazy call of cpm_muram_init()")
>>
>> I get a bad_page_fault() for an access at address 8 in
>> cpm_muram_alloc_common(), called from cpm_uart_console_setup() via
>> cpm_uart_allocbuf()
>
> Sorry about that. But I'm afraid I don't see what I could have done
> differently - the patch series, including b6231ea2b3c6, has been in
> -next since 20191210, both you and ppc-dev were cc'ed on the entire
> series (last revision sent November 28). And I've been dogfooding the
> patches on both arm- and ppc-derived boards ever since (but obviously
> only for a few cpus).
Yes, this patch series should have ringed a bell in my head, looks like
I'm the one who introduced this 4 years ago through commit 4d486e008379
("soc/fsl/qe: fix Oops on CPM1 (and likely CPM2)")
But I had completely forgotten that patch until I did some git blame
this morning on this lazy call.
>
>> Reverting the guilty commit on top of 5.6-rc1 is not trivial.
>>
>> In your commit text you explain that cpm_muram_init() is called via
>> subsys_initcall. But console init is done before that, so it cannot work.
>
> No, but neither did the code I removed seem to work - how does doing
> spin_lock_init on a held spinlock, and then unlocking it, work? Is
> everything-spinlock always a no-op in your configuration? And even so,
> I'd think a GFP_KERNEL allocation under spin_lock_irqsave() would
> trigger some splat somewhere?
>
> Please note I'm not claiming my patch is not at fault, it clearly is, I
> just want to try to understand how I could have been wrong about the
> "nobody can have been relying on it" part.
>
It seems spin_lock_init() does just nothing.
spin_lock_irqsave() just disable IRQs and increases preempt_count.
spin_lock_irqrestore() restore IRQ state, decreace preempt_count and
call preempt_schedule if preempt_count reaches 0.
Maybe with some debugging options like DEBUG_ATOMIC_SLEEP could detect it ?
>> Do you have a fix for that ?
>
> Not right now, but I'll have a look. It's true that the patch probably
> doesn't revert cleanly, but it shouldn't be hard to add back those few
> lines in the appropriate spot, with a big fat comment that this does
> something very fishy (at least as a temporary measure if we don't find a
> proper solution soonish).
>
Thanks
Christophe
Le 13/02/2020 à 10:37, Rasmus Villemoes a écrit :
> On 12/02/2020 15.50, Christophe Leroy wrote:
>>
>>
>> On 02/12/2020 02:24 PM, Christophe Leroy wrote:
>
>>> In your commit text you explain that cpm_muram_init() is called via
>>> subsys_initcall. But console init is done before that, so it cannot work.
>>>
>>> Do you have a fix for that ?
>>>
>>
>> The following patch allows powerpc 8xx to boot again. Don't know if
>> that's the good place and way to do the fix though.
>>
>> ---
>> diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> index 4cabded8390b..341d682ec6eb 100644
>> --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
>> @@ -1351,6 +1351,7 @@ static int __init cpm_uart_console_setup(struct
>> console *co, char *options)
>> clrbits32(&pinfo->sccp->scc_gsmrl, SCC_GSMRL_ENR | SCC_GSMRL_ENT);
>> }
>>
>> + cpm_muram_init();
>> ret = cpm_uart_allocbuf(pinfo, 1);
>>
>> if (ret)
>
> Hmm, that seems to be a somewhat random place, making it hard to see
> that it is indeed early enough. Would it work to put it inside the
> console_initcall that registers the cpm console? I.e.
>
> static int __init cpm_uart_console_init(void)
> {
> + cpm_muram_init();
> register_console(&cpm_scc_uart_console);
> return 0;
> }
>
> console_initcall(cpm_uart_console_init);
>
Yes that works too.
Thanks
Christophe