2015-05-05 11:13:08

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 1/6] dump_stack: Support adding to the dump stack arch description

Arch code can set a "dump stack arch description string" which is
displayed with oops output to describe the hardware platform.

It is useful to initialise this as early as possible, so that an early
oops will have the hardware description.

However in practice we discover the hardware platform in stages, so it
would be useful to be able to incrementally fill in the hardware
description as we discover it.

This patch adds that ability, by creating dump_stack_add_arch_desc().

If there is no existing string it behaves exactly like
dump_stack_set_arch_desc(). However if there is an existing string it
appends to it, with a leading space.

This makes it easy to call it multiple times from different parts of the
code and get a reasonable looking result.

Signed-off-by: Michael Ellerman <[email protected]>
---
include/linux/printk.h | 5 +++++
kernel/printk/printk.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)


If people are happy with this, ideally we'd take this via the powerpc tree with
the rest of the series.

There also might be a better way to implement it, if so I'm all ears, but I
think what I have here is at least correct.

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 9b30871c9149..7539fd417be0 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -165,6 +165,7 @@ u32 log_buf_len_get(void);
void log_buf_kexec_setup(void);
void __init setup_log_buf(int early);
void dump_stack_set_arch_desc(const char *fmt, ...);
+void dump_stack_add_arch_desc(const char *fmt, ...);
void dump_stack_print_info(const char *log_lvl);
void show_regs_print_info(const char *log_lvl);
#else
@@ -219,6 +220,10 @@ static inline void dump_stack_set_arch_desc(const char *fmt, ...)
{
}

+static inline void dump_stack_add_arch_desc(const char *fmt, ...)
+{
+}
+
static inline void dump_stack_print_info(const char *log_lvl)
{
}
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index c099b082cd02..11d7d587e252 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3028,6 +3028,52 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
}

/**
+ * dump_stack_add_arch_desc - add arch-specific info to show with task dumps
+ * @fmt: printf-style format string
+ * @...: arguments for the format string
+ *
+ * See dump_stack_set_arch_desc() for why you'd want to use this.
+ *
+ * This version adds to any existing string already created with either
+ * dump_stack_set_arch_desc() or dump_stack_add_arch_desc(). If there is an
+ * existing string a space will be prepended to the passed string.
+ */
+void __init dump_stack_add_arch_desc(const char *fmt, ...)
+{
+ va_list args;
+ int pos, len;
+ char *p;
+
+ /*
+ * If there's an existing string we snprintf() past the end of it, and
+ * then turn the terminating NULL of the existing string into a space
+ * to create one string separated by a space.
+ *
+ * If there's no existing string we just snprintf() to the buffer, like
+ * dump_stack_set_arch_desc(), but without calling it because we'd need
+ * a varargs version.
+ */
+
+ len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));
+ pos = len;
+
+ if (len)
+ pos++;
+
+ if (pos >= sizeof(dump_stack_arch_desc_str))
+ return; /* Ran out of space */
+
+ p = &dump_stack_arch_desc_str[pos];
+
+ va_start(args, fmt);
+ vsnprintf(p, sizeof(dump_stack_arch_desc_str) - pos, fmt, args);
+ va_end(args);
+
+ if (len)
+ dump_stack_arch_desc_str[len] = ' ';
+}
+
+/**
* dump_stack_print_info - print generic debug info for dump_stack()
* @log_lvl: log level
*
--
2.1.0


2015-05-05 11:12:58

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 2/6] powerpc: Add cpu name to dump stack arch description

As soon as we know the name of the cpu we're on, add it to the dump
stack arch description, which is printed in case of an oops.

Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/kernel/cputable.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 60262fdf35ba..cf5e0c9b80cb 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -15,6 +15,7 @@
#include <linux/threads.h>
#include <linux/init.h>
#include <linux/export.h>
+#include <linux/printk.h>

#include <asm/oprofile_impl.h>
#include <asm/cputable.h>
@@ -2174,6 +2175,8 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
}
#endif /* CONFIG_PPC64 || CONFIG_BOOKE */

+ dump_stack_add_arch_desc(t->cpu_name);
+
return t;
}

--
2.1.0

2015-05-05 11:13:48

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 3/6] powerpc: Add device-tree model to dump stack arch description

As soon as we know the model of the machine we're on, add it to the dump
stack arch description, which is printed in case of an oops.

Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/kernel/prom.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 308c5e15676b..a8ac55f7bd7b 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -34,6 +34,7 @@
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/libfdt.h>
+#include <linux/printk.h>

#include <asm/prom.h>
#include <asm/rtas.h>
@@ -642,6 +643,23 @@ static void __init early_reserve_mem(void)
#endif
}

+static int __init
+early_init_dt_scan_model(unsigned long node, const char *uname,
+ int depth, void *data)
+{
+ const char *prop;
+
+ if (depth != 0)
+ return 0;
+
+ prop = of_get_flat_dt_prop(node, "model", NULL);
+ if (prop)
+ dump_stack_add_arch_desc(prop);
+
+ /* break now */
+ return 1;
+}
+
void __init early_init_devtree(void *params)
{
phys_addr_t limit;
@@ -652,6 +670,8 @@ void __init early_init_devtree(void *params)
if (!early_init_dt_verify(params))
panic("BUG: Failed verifying flat device tree, bad version?");

+ of_scan_flat_dt(early_init_dt_scan_model, NULL);
+
#ifdef CONFIG_PPC_RTAS
/* Some machines might need RTAS info for debugging, grab it now. */
of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
--
2.1.0

2015-05-05 11:13:40

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 4/6] powerpc: Add ppc_md.name to dump stack arch description

As soon as we know the name of the machine description we're using, add
it to the dump stack arch description, which is printed in case of an
oops.

Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/kernel/setup-common.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 44c8d03558ac..922115cd4714 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -21,6 +21,7 @@
#include <linux/delay.h>
#include <linux/initrd.h>
#include <linux/platform_device.h>
+#include <linux/printk.h>
#include <linux/seq_file.h>
#include <linux/ioport.h>
#include <linux/console.h>
@@ -605,6 +606,8 @@ void probe_machine(void)
for (;;);
}

+ dump_stack_add_arch_desc(ppc_md.name);
+
printk(KERN_INFO "Using %s machine description\n", ppc_md.name);
}

--
2.1.0

2015-05-05 11:13:34

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 5/6] powerpc/powernv: Add opal details to dump stack arch description

Once we have unflattened the device tree we can easily grab these opal
version details and add them to dump stack arch description, which is
printed in case of an oops.

Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/platforms/powernv/setup.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 16fdcb23f4c3..9250552d4b42 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -62,8 +62,34 @@ static void __init pnv_setup_arch(void)
/* XXX PMCS */
}

+static void __init pnv_add_dump_stack_arch_desc(void)
+{
+ struct device_node *dn;
+ const char *version;
+
+ dn = of_find_node_by_path("/ibm,opal/firmware");
+ if (!dn)
+ return;
+
+ version = of_get_property(dn, "version", NULL);
+ if (version)
+ dump_stack_add_arch_desc(version);
+
+ version = of_get_property(dn, "git-id", NULL);
+ if (version)
+ dump_stack_add_arch_desc(version);
+
+ version = of_get_property(dn, "mi-version", NULL);
+ if (version)
+ dump_stack_add_arch_desc(version);
+
+ of_node_put(dn);
+}
+
static void __init pnv_init_early(void)
{
+ pnv_add_dump_stack_arch_desc();
+
/*
* Initialize the LPC bus now so that legacy serial
* ports can be found on it
--
2.1.0

2015-05-05 11:13:21

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH 6/6] powerpc/pseries: Add firmware details to dump stack arch description

Once we have unflattened the device tree we can easily grab these
firmware version details and add them to dump stack arch description,
which is printed in case of an oops.

Currently /hypervisor only exists on KVM, so if we don't find that look
for something that suggests we're on phyp and if so that's probably a
good guess. The actual content of the ibm,fw-net-version seems to be
a full path so is too long to add to the description.

Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/platforms/pseries/setup.c | 35 ++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index df6a7041922b..e89282c4f799 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -680,6 +680,39 @@ static void pSeries_cmo_feature_init(void)
pr_debug(" <- fw_cmo_feature_init()\n");
}

+static void __init pseries_add_dump_stack_arch_desc(void)
+{
+ struct device_node *dn;
+ const char *prop;
+
+ dn = of_find_node_by_path("/openprom");
+ if (dn) {
+ prop = of_get_property(dn, "model", NULL);
+ if (prop)
+ dump_stack_add_arch_desc("prom:%s", prop);
+
+ of_node_put(dn);
+ }
+
+ dn = of_find_node_by_path("/hypervisor");
+ if (dn) {
+ prop = of_get_property(dn, "compatible", NULL);
+ if (prop)
+ dump_stack_add_arch_desc("hv:%s", prop);
+
+ of_node_put(dn);
+ } else {
+ dn = of_find_node_by_path("/");
+ if (dn) {
+ prop = of_get_property(dn, "ibm,fw-net-version", NULL);
+ if (prop)
+ dump_stack_add_arch_desc("hv:phyp");
+
+ of_node_put(dn);
+ }
+ }
+}
+
/*
* Early initialization. Relocation is on but do not reference unbolted pages
*/
@@ -687,6 +720,8 @@ static void __init pSeries_init_early(void)
{
pr_debug(" -> pSeries_init_early()\n");

+ pseries_add_dump_stack_arch_desc();
+
#ifdef CONFIG_HVC_CONSOLE
if (firmware_has_feature(FW_FEATURE_LPAR))
hvc_vio_init_early();
--
2.1.0

2015-05-05 21:16:53

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/6] dump_stack: Support adding to the dump stack arch description

On Tue, 5 May 2015 21:12:12 +1000 Michael Ellerman <[email protected]> wrote:

> Arch code can set a "dump stack arch description string" which is
> displayed with oops output to describe the hardware platform.
>
> It is useful to initialise this as early as possible, so that an early
> oops will have the hardware description.
>
> However in practice we discover the hardware platform in stages, so it
> would be useful to be able to incrementally fill in the hardware
> description as we discover it.
>
> This patch adds that ability, by creating dump_stack_add_arch_desc().
>
> If there is no existing string it behaves exactly like
> dump_stack_set_arch_desc(). However if there is an existing string it
> appends to it, with a leading space.
>
> This makes it easy to call it multiple times from different parts of the
> code and get a reasonable looking result.

Some example output in the changelog would be useful, to help people
understand the value. In particular, is there any convention for how
these fields should be presented? "name:value name:value", etc? Or it
is just put random stuff in there, hopefully with self-evident
meanings.

We're going to blow out the 128 byte dump_stack_arch_desc_str[] pretty
quickly. Is dynamic allocation a possibility?

> /**
> + * dump_stack_add_arch_desc - add arch-specific info to show with task dumps
> + * @fmt: printf-style format string
> + * @...: arguments for the format string
> + *
> + * See dump_stack_set_arch_desc() for why you'd want to use this.
> + *
> + * This version adds to any existing string already created with either
> + * dump_stack_set_arch_desc() or dump_stack_add_arch_desc(). If there is an
> + * existing string a space will be prepended to the passed string.
> + */
> +void __init dump_stack_add_arch_desc(const char *fmt, ...)
> +{
> + va_list args;
> + int pos, len;
> + char *p;
> +
> + /*
> + * If there's an existing string we snprintf() past the end of it, and
> + * then turn the terminating NULL of the existing string into a space
> + * to create one string separated by a space.
> + *
> + * If there's no existing string we just snprintf() to the buffer, like
> + * dump_stack_set_arch_desc(), but without calling it because we'd need
> + * a varargs version.
> + */
> +
> + len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));
> + pos = len;
> +
> + if (len)
> + pos++;
> +
> + if (pos >= sizeof(dump_stack_arch_desc_str))
> + return; /* Ran out of space */
> +
> + p = &dump_stack_arch_desc_str[pos];
> +
> + va_start(args, fmt);
> + vsnprintf(p, sizeof(dump_stack_arch_desc_str) - pos, fmt, args);
> + va_end(args);

This code is almost race-free. A (documented) smp_wmb() in here would
make that 100%?

> + if (len)
> + dump_stack_arch_desc_str[len] = ' ';
> +}
> +

2015-05-06 00:13:35

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 1/6] dump_stack: Support adding to the dump stack arch description

On Tue, 2015-05-05 at 14:16 -0700, Andrew Morton wrote:
> On Tue, 5 May 2015 21:12:12 +1000 Michael Ellerman <[email protected]> wrote:
>
> > Arch code can set a "dump stack arch description string" which is
> > displayed with oops output to describe the hardware platform.
> >
> > It is useful to initialise this as early as possible, so that an early
> > oops will have the hardware description.
> >
> > However in practice we discover the hardware platform in stages, so it
> > would be useful to be able to incrementally fill in the hardware
> > description as we discover it.
> >
> > This patch adds that ability, by creating dump_stack_add_arch_desc().
> >
> > If there is no existing string it behaves exactly like
> > dump_stack_set_arch_desc(). However if there is an existing string it
> > appends to it, with a leading space.
> >
> > This makes it easy to call it multiple times from different parts of the
> > code and get a reasonable looking result.
>
> Some example output in the changelog would be useful, to help people
> understand the value.

OK, an example from a power system is:

Hardware name: POWER8E (raw) 8286-42A PowerNV skiboot-5.0 MI SV830_027 SV810_085 SV830_027
^ ^ ^ ^
| | | firmware version
cpu name | platform name
device tree model

Those are all added in separate parts of the code.

> In particular, is there any convention for how these fields should be
> presented? "name:value name:value", etc? Or it is just put random stuff in
> there, hopefully with self-evident meanings.

I was thinking there was no convention and it's basically up to arch/platform
authors to put what they want in there. I see it as a human readable bag of
stuff that might be helpful in diagnosing an oops, with the emphasis being on
having more information rather than precisely specified fields & values.

Having said that I did use name:value in one of my patches, just to
differentiate the various version numbers. So maybe I should do that for all
the fields. Though then we have the problem that the values contain spaces ...

> We're going to blow out the 128 byte dump_stack_arch_desc_str[] pretty
> quickly. Is dynamic allocation a possibility?

Are we? I think that's enough for us on most powerpc systems, and ideally it's
not too long or it will wrap when printed to the console.

I'd be inclined to avoid dynamic allocation until we absolutely need it, if
anything we could just bump it to 256.

> > /**
> > + * dump_stack_add_arch_desc - add arch-specific info to show with task dumps
> > + * @fmt: printf-style format string
> > + * @...: arguments for the format string
> > + *
> > + * See dump_stack_set_arch_desc() for why you'd want to use this.
> > + *
> > + * This version adds to any existing string already created with either
> > + * dump_stack_set_arch_desc() or dump_stack_add_arch_desc(). If there is an
> > + * existing string a space will be prepended to the passed string.
> > + */
> > +void __init dump_stack_add_arch_desc(const char *fmt, ...)
> > +{
> > + va_list args;
> > + int pos, len;
> > + char *p;
> > +
> > + /*
> > + * If there's an existing string we snprintf() past the end of it, and
> > + * then turn the terminating NULL of the existing string into a space
> > + * to create one string separated by a space.
> > + *
> > + * If there's no existing string we just snprintf() to the buffer, like
> > + * dump_stack_set_arch_desc(), but without calling it because we'd need
> > + * a varargs version.
> > + */
> > +
> > + len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));
> > + pos = len;
> > +
> > + if (len)
> > + pos++;
> > +
> > + if (pos >= sizeof(dump_stack_arch_desc_str))
> > + return; /* Ran out of space */
> > +
> > + p = &dump_stack_arch_desc_str[pos];
> > +
> > + va_start(args, fmt);
> > + vsnprintf(p, sizeof(dump_stack_arch_desc_str) - pos, fmt, args);
> > + va_end(args);
>
> This code is almost race-free. A (documented) smp_wmb() in here would
> make that 100%?

Yeah good idea.

I'll send a v2 with that change and an updated changelog.

cheers


2015-05-06 23:50:15

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 2/6] powerpc: Add cpu name to dump stack arch description

On Tue, 2015-05-05 at 21:12 +1000, Michael Ellerman wrote:
> As soon as we know the name of the cpu we're on, add it to the dump
> stack arch description, which is printed in case of an oops.
>
> Signed-off-by: Michael Ellerman <[email protected]>
> ---
> arch/powerpc/kernel/cputable.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index 60262fdf35ba..cf5e0c9b80cb 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -15,6 +15,7 @@
> #include <linux/threads.h>
> #include <linux/init.h>
> #include <linux/export.h>
> +#include <linux/printk.h>
>
> #include <asm/oprofile_impl.h>
> #include <asm/cputable.h>
> @@ -2174,6 +2175,8 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
> }
> #endif /* CONFIG_PPC64 || CONFIG_BOOKE */
>
> + dump_stack_add_arch_desc(t->cpu_name);
> +

Can we make this the PVR instead if the name? It gives us more fidelity
on what the hardware revision is.

Mikey

> return t;
> }
>

2015-05-08 06:59:31

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 1/6] dump_stack: Support adding to the dump stack arch description

On Tue, 2015-05-05 at 14:16 -0700, Andrew Morton wrote:
> On Tue, 5 May 2015 21:12:12 +1000 Michael Ellerman <[email protected]> wrote:
>
> > Arch code can set a "dump stack arch description string" which is
> > displayed with oops output to describe the hardware platform.
> > +
> > + len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));
> > + pos = len;
> > +
> > + if (len)
> > + pos++;
> > +
> > + if (pos >= sizeof(dump_stack_arch_desc_str))
> > + return; /* Ran out of space */
> > +
> > + p = &dump_stack_arch_desc_str[pos];
> > +
> > + va_start(args, fmt);
> > + vsnprintf(p, sizeof(dump_stack_arch_desc_str) - pos, fmt, args);
> > + va_end(args);
>
> This code is almost race-free. A (documented) smp_wmb() in here would
> make that 100%?
>
> > + if (len)
> > + dump_stack_arch_desc_str[len] = ' ';
> > +}

On second thoughts I don't think it would.

It would order the stores in vsnprintf() vs the store of the space. The idea
being you never see a partially printed string. But for that to actually work
you need a barrier on the read side, and where do you put it?

The cpu printing the buffer could speculate the load of the tail of the buffer,
seeing something half printed from vsnprintf(), and then load the head of the
buffer and see the space, unless you order those loads.

So I don't think we can prevent a crashing cpu seeing a semi-printed buffer
without a lock, and we don't want to add a lock.

The other issue would be that a reader could miss the trailing NULL from the
vsnprintf() but see the space, meaning it would wander off the end of the
buffer. But the buffer's in BSS to start with, and we're careful not to print
off the end of it, so it should always be NULL terminated.

cheers