At boot, we've got a stack trace that looks something like this
(exynos5 as example):
* exynos5_map_io
* s3c_init_cpu
* exynos_init_io
* exynos5_dt_map_io
* paging_init
* setup_arch
When paging_init() runs we'll lose any early MMU mappings that we
might have had to allow us access to S3C_VA_UART. We won't add those
mappings back in until after the SoC-specific map_io() function is
called. However, we print the CPU ID _right before_ we call the
SoC-specific function. Oops.
Things happen to work all right most of the time because the mapping
is sticking around in our TLB. ...but if we get really unlucky (like
me!) or we put an explicit flush_tlb_all() at the start of
exynos_init_io(), then things go boom.
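
The failure mode above can be illustrated with a toy user-space model in C.
This is only a sketch, not kernel code: struct toy_mmu and the toy_* helpers
are hypothetical names invented here, and the model assumes a single virtual
page (standing in for S3C_VA_UART) and a single TLB entry.

```c
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one page, one TLB entry. */
struct toy_mmu {
	bool pt_mapped;	/* mapping present in the current page tables */
	bool tlb_valid;	/* translation still cached in the TLB */
};

/*
 * An access succeeds on a TLB hit, or on a page-table walk that finds
 * the mapping (which also refills the TLB). Otherwise it faults.
 */
bool toy_access(struct toy_mmu *m)
{
	if (m->tlb_valid)
		return true;
	if (m->pt_mapped) {
		m->tlb_valid = true;
		return true;
	}
	return false;
}

/* New page tables without the mapping; the cached TLB entry is untouched. */
void toy_switch_page_tables(struct toy_mmu *m)
{
	m->pt_mapped = false;
}

/* Drop the cached translation, as an explicit TLB flush would. */
void toy_flush_tlb(struct toy_mmu *m)
{
	m->tlb_valid = false;
}

/* Add the mapping back to the page tables, as map_io() eventually does. */
void toy_map_io(struct toy_mmu *m)
{
	m->pt_mapped = true;
}
```

In this model, an access made after toy_switch_page_tables() still succeeds
as long as the TLB entry survives, faults once toy_flush_tlb() has run, and
works again after toy_map_io().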
This patch moves the problematic printk() till after the cpu->map_io()
call. It also switches it over to pr_info(). This patch _doesn't_
remove the questionable printks in the panic case, since we might get
lucky and the TLB might still let us print. This patch also adds a
few warnings to help others avoid similar headaches.
Signed-off-by: Doug Anderson <[email protected]>
---
arch/arm/mach-exynos/common.c | 7 +++++++
arch/arm/plat-samsung/init.c | 8 +++++---
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-exynos/common.c b/arch/arm/mach-exynos/common.c
index 027c9e7..8b51b0d 100644
--- a/arch/arm/mach-exynos/common.c
+++ b/arch/arm/mach-exynos/common.c
@@ -386,6 +386,13 @@ int __init exynos_fdt_map_chipid(unsigned long node, const char *uname,
void __init exynos_init_io(struct map_desc *mach_desc, int size)
{
+ /*
+ * WARNING: use of printk in this function or its children can be
+ * deadly. We've switched over to new page tables but haven't yet
+ * added S3C_VA_UART into the mapping. You might get lucky and see a
+ * printout work, but if you call flush_tlb_all() it will fail reliably.
+ */
+
#ifdef CONFIG_OF
if (initial_boot_params)
of_scan_flat_dt(exynos_fdt_map_chipid, NULL);
diff --git a/arch/arm/plat-samsung/init.c b/arch/arm/plat-samsung/init.c
index 79d10fc..494cfbb 100644
--- a/arch/arm/plat-samsung/init.c
+++ b/arch/arm/plat-samsung/init.c
@@ -49,18 +49,20 @@ void __init s3c_init_cpu(unsigned long idcode,
cpu = s3c_lookup_cpu(idcode, cputab, cputab_size);
if (cpu == NULL) {
+ /* Questionable printk; S3C_VA_UART not mapped yet! */
printk(KERN_ERR "Unknown CPU type 0x%08lx\n", idcode);
panic("Unknown S3C24XX CPU");
}
-
- printk("CPU %s (id 0x%08lx)\n", cpu->name, idcode);
-
if (cpu->map_io == NULL || cpu->init == NULL) {
+ /* Questionable printk; S3C_VA_UART not mapped yet! */
printk(KERN_ERR "CPU %s support not enabled\n", cpu->name);
panic("Unsupported Samsung CPU");
}
cpu->map_io();
+
+ /* IMPORTANT: call this after cpu->map_io() so we can print reliably */
+ pr_info("CPU %s (id 0x%08lx)\n", cpu->name, idcode);
}
/* s3c24xx_init_clocks
--
1.8.3
Hi,
On Tue, Jun 04, 2013 at 06:58:59PM -0700, Doug Anderson wrote:
> At boot, we've got a stack trace that looks something like this
> (exynos5 as example):
> * exynos5_map_io
> * s3c_init_cpu
> * exynos_init_io
> * exynos5_dt_map_io
> * paging_init
> * setup_arch
>
> When paging_init() runs we'll lose any early MMU mappings that we
> might have had to allow us access to S3C_VA_UART. We won't add those
> mappings back in until after the SoC-specific map_io() function is
> called. However, we print the CPU ID _right before_ we call the
> SoC-specific function. Oops.
>
>
> Things happen to work all right most of the time because the mapping
> is sticking around in our TLB. ...but if we get really unlucky (like
> me!) or we put an explicit flush_tlb_all() at the start of
> exynos_init_io(), then things go boom.
>
> This patch moves the problematic printk() till after the cpu->map_io()
> call. It also switches it over to pr_info(). This patch _doesn't_
> remove the questionable printks in the panic case, since we might get
> lucky and the TLB might still let us print. This patch also adds a
> few warnings to help others avoid similar headaches.
This seems to be caused by not calling iotable_init() in exynos_init_io()
when a device tree is passed into the kernel, thus not setting up the
mapping for the UART in that case.
I think the solution is instead to map the uart earlier. The window of
exposure is still there, but much smaller (and similar to how it always
has been).
In current upstream, if there is no map_io mach_desc entry at all,
debug_ll_io_init() will be called on all platforms. Seems appropriate
to call that explicitly before of_scan_flat_dt() in exynos_init_io()
in this case.
Or am I missing something?
-Olof
Olof,
On Tue, Jun 4, 2013 at 8:15 PM, Olof Johansson <[email protected]> wrote:
> This seems to be caused by not calling iotable_init() in exynos_init_io()
> when a device tree is passed into the kernel, thus not setting up the
> mapping for the UART in that case.
>
> I think the solution is instead to map the uart earlier. The window of
> exposure is still there, but much smaller (and similar to how it always
> has been).
>
> In current upstream, if there is no map_io mach_desc entry at all,
> debug_ll_io_init() will be called on all platforms. Seems appropriate
> to call that explicitly before of_scan_flat_dt() in exynos_init_io()
> in this case.
Oh. Ummm, right. Yes, debug_ll_io_init() is exactly what's needed
here instead of all the complexity of what I proposed. New patch
coming shortly. Thanks! :)
-Doug
If the early MMU mapping of the UART happens to get booted out of the
TLB between the start of paging_init() and when we finally re-add the
UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
we've got early_printk enabled. Avoid this hang by calling
debug_ll_io_init() early.
Without this patch, you can reliably reproduce a hang when early
printk is enabled by adding flush_tlb_all() at the start of
exynos_init_io(). After this patch the hang goes away.
Signed-off-by: Doug Anderson <[email protected]>
---
Changes in v2:
- Use debug_ll_io_init() instead of reordering printks and adding
warnings. Thanks Olof!
arch/arm/mach-exynos/common.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm/mach-exynos/common.c b/arch/arm/mach-exynos/common.c
index 027c9e7..f7e504b 100644
--- a/arch/arm/mach-exynos/common.c
+++ b/arch/arm/mach-exynos/common.c
@@ -386,6 +386,8 @@ int __init exynos_fdt_map_chipid(unsigned long node, const char *uname,
void __init exynos_init_io(struct map_desc *mach_desc, int size)
{
+ debug_ll_io_init();
+
#ifdef CONFIG_OF
if (initial_boot_params)
of_scan_flat_dt(exynos_fdt_map_chipid, NULL);
--
1.8.3
On Wed, Jun 5, 2013 at 1:56 PM, Doug Anderson <[email protected]> wrote:
> If the early MMU mapping of the UART happens to get booted out of the
> TLB between the start of paging_init() and when we finally re-add the
> UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
> we've got early_printk enabled. Avoid this hang by calling
> debug_ll_io_init() early.
>
> Without this patch, you can reliably reproduce a hang when early
> printk is enabled by adding flush_tlb_all() at the start of
> exynos_init_io(). After this patch the hang goes away.
>
> Signed-off-by: Doug Anderson <[email protected]>
Acked-by: Olof Johansson <[email protected]>
Kukjin, this seems appropriate for 3.10. Do you have other fixes
queued or should I apply this directly?
-Olof
Olof Johansson wrote:
>
> On Wed, Jun 5, 2013 at 1:56 PM, Doug Anderson <[email protected]>
> wrote:
> > If the early MMU mapping of the UART happens to get booted out of the
> > TLB between the start of paging_init() and when we finally re-add the
> > UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
> > we've got early_printk enabled. Avoid this hang by calling
> > debug_ll_io_init() early.
> >
> > Without this patch, you can reliably reproduce a hang when early
> > printk is enabled by adding flush_tlb_all() at the start of
> > exynos_init_io(). After this patch the hang goes away.
> >
> > Signed-off-by: Doug Anderson <[email protected]>
>
> Acked-by: Olof Johansson <[email protected]>
>
>
> Kukjin, this seems appropriate for 3.10. Do you have other fixes
> queued or should I apply this directly?
>
If I remember correctly, nothing in my tree for 3.10.
So please go ahead with my ack if you want,
Acked-by: Kukjin Kim <[email protected]>
Thanks.
- Kukjin
On Thu, Jun 06, 2013 at 08:32:04AM +0900, Kukjin Kim wrote:
> Olof Johansson wrote:
> >
> > On Wed, Jun 5, 2013 at 1:56 PM, Doug Anderson <[email protected]>
> > wrote:
> > > If the early MMU mapping of the UART happens to get booted out of the
> > > TLB between the start of paging_init() and when we finally re-add the
> > > UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
> > > we've got early_printk enabled. Avoid this hang by calling
> > > debug_ll_io_init() early.
> > >
> > > Without this patch, you can reliably reproduce a hang when early
> > > printk is enabled by adding flush_tlb_all() at the start of
> > > exynos_init_io(). After this patch the hang goes away.
> > >
> > > Signed-off-by: Doug Anderson <[email protected]>
> >
> > Acked-by: Olof Johansson <[email protected]>
> >
> >
> > Kukjin, this seems appropriate for 3.10. Do you have other fixes
> > queued or should I apply this directly?
> >
> If I remember correctly, nothing in my tree for 3.10.
>
> So please go ahead with my ack if you want,
>
> Acked-by: Kukjin Kim <[email protected]>
>
Applied for 3.10 (together with the uncompress.h fix from Tushar)
-Olof