2009-11-12 17:19:40

by Mike Travis

[permalink] [raw]
Subject: [PATCH 0/7] Limit console output by suppressing repetitious messages


Second version...

--


2009-11-12 20:48:45

by David Rientjes

[permalink] [raw]
Subject: [patch] x86: reduce srat verbosity in the kernel log

It's possible to reduce the number of SRAT messages emitted to the kernel
log by printing each valid pxm once and then creating bitmaps to represent
the apic ids that map to the same node.

This reduces lines such as

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1

to

SRAT: PXM 0 -> APIC {0-1} -> Node 0
SRAT: PXM 1 -> APIC {2-3} -> Node 1

The buffer used to store the apic id list is 128 characters in length.
If that is too small to represent all the apic id ranges that are bound
to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
manually increased for such configurations.

Acked-by: Mike Travis <[email protected]>
Signed-off-by: David Rientjes <[email protected]>
---
arch/x86/mm/srat_64.c | 41 +++++++++++++++++++++++++++++++++++++----
drivers/acpi/numa.c | 5 +++++
include/linux/acpi.h | 3 ++-
3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;

+static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
+#define APICID_LIST_LEN (128)
+
static __init int setup_node(int pxm)
{
return acpi_map_pxm_to_node(pxm);
@@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
}

/* Callback for Proximity Domain -> LAPIC mapping */
@@ -170,8 +171,40 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+}
+
+void __init acpi_numa_print_srat_mapping(void)
+{
+ char apicid_list[APICID_LIST_LEN];
+ int i, j;
+
+ for (i = 0; i < MAX_PXM_DOMAINS; i++) {
+ int len;
+ int nid;
+
+ nid = pxm_to_node(i);
+ if (nid == NUMA_NO_NODE)
+ continue;
+
+ bitmap_zero(apicid_map, MAX_LOCAL_APIC);
+ for (j = 0; j < MAX_LOCAL_APIC; j++)
+ if (apicid_to_node[j] == nid)
+ set_bit(j, apicid_map);
+
+ if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
+ continue;
+
+ /*
+ * If the bitmap cannot be listed in a buffer of length
+ * APICID_LIST_LEN, then it is suffixed with "...".
+ */
+ len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
+ apicid_map, MAX_LOCAL_APIC);
+ pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
+ i, apicid_list,
+ (len == APICID_LIST_LEN - 1) ? "..." : "",
+ nid);
+ }
}

#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
handler, max_entries);
}

+void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
+{
+}
+
int __init acpi_numa_init(void)
{
/* SRAT: Static Resource Affinity Table */
@@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory_affinity,
NR_NODE_MEMBLKS);
+ acpi_numa_print_srat_mapping();
}

/* SLIT: System Locality Information Table */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);

-/* the following four functions are architecture-dependent */
+/* the following six functions are architecture-dependent */
void acpi_numa_slit_init (struct acpi_table_slit *slit);
void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
void acpi_numa_arch_fixup(void);
+void acpi_numa_print_srat_mapping(void);

#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */

2009-11-12 22:16:58

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 0/7] Limit console output by suppressing repetitious messages

On Thu, Nov 12, 2009 at 9:19 AM, Mike Travis <[email protected]> wrote:
>
> Second version...

what is the boot time that you can spare your this patchset?

YH

2009-11-13 09:54:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log


* David Rientjes <[email protected]> wrote:

> It's possible to reduce the number of SRAT messages emitted to the kernel
> log by printing each valid pxm once and then creating bitmaps to represent
> the apic ids that map to the same node.
>
> This reduces lines such as
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
>
> to
>
> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>
> The buffer used to store the apic id list is 128 characters in length.
> If that is too small to represent all the apic id ranges that are bound
> to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> manually increased for such configurations.
>
> Acked-by: Mike Travis <[email protected]>
> Signed-off-by: David Rientjes <[email protected]>
> ---
> arch/x86/mm/srat_64.c | 41 +++++++++++++++++++++++++++++++++++++----
> drivers/acpi/numa.c | 5 +++++
> include/linux/acpi.h | 3 ++-
> 3 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
> static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
> static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
>
> +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> +#define APICID_LIST_LEN (128)
> +
> static __init int setup_node(int pxm)
> {
> return acpi_map_pxm_to_node(pxm);
> @@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> }
>
> /* Callback for Proximity Domain -> LAPIC mapping */
> @@ -170,8 +171,40 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> +}
> +
> +void __init acpi_numa_print_srat_mapping(void)
> +{
> + char apicid_list[APICID_LIST_LEN];
> + int i, j;
> +
> + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> + int len;
> + int nid;
> +
> + nid = pxm_to_node(i);
> + if (nid == NUMA_NO_NODE)
> + continue;
> +
> + bitmap_zero(apicid_map, MAX_LOCAL_APIC);
> + for (j = 0; j < MAX_LOCAL_APIC; j++)
> + if (apicid_to_node[j] == nid)
> + set_bit(j, apicid_map);
> +
> + if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
> + continue;
> +
> + /*
> + * If the bitmap cannot be listed in a buffer of length
> + * APICID_LIST_LEN, then it is suffixed with "...".
> + */
> + len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
> + apicid_map, MAX_LOCAL_APIC);
> + pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
> + i, apicid_list,
> + (len == APICID_LIST_LEN - 1) ? "..." : "",
> + nid);
> + }
> }

No. As i suggested many times before, just get rid of the printouts or
make them boot-debug-flag dependent.

Ingo

2009-11-13 10:02:27

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log

On Fri, 13 Nov 2009, Ingo Molnar wrote:

> > It's possible to reduce the number of SRAT messages emitted to the kernel
> > log by printing each valid pxm once and then creating bitmaps to represent
> > the apic ids that map to the same node.
> >
> > This reduces lines such as
> >
> > SRAT: PXM 0 -> APIC 0 -> Node 0
> > SRAT: PXM 0 -> APIC 1 -> Node 0
> > SRAT: PXM 1 -> APIC 2 -> Node 1
> > SRAT: PXM 1 -> APIC 3 -> Node 1
> >
> > to
> >
> > SRAT: PXM 0 -> APIC {0-1} -> Node 0
> > SRAT: PXM 1 -> APIC {2-3} -> Node 1
> >
> > The buffer used to store the apic id list is 128 characters in length.
> > If that is too small to represent all the apic id ranges that are bound
> > to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> > manually increased for such configurations.
> >
> > Acked-by: Mike Travis <[email protected]>
> > Signed-off-by: David Rientjes <[email protected]>
> > ---
> > arch/x86/mm/srat_64.c | 41 +++++++++++++++++++++++++++++++++++++----
> > drivers/acpi/numa.c | 5 +++++
> > include/linux/acpi.h | 3 ++-
> > 3 files changed, 44 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> > --- a/arch/x86/mm/srat_64.c
> > +++ b/arch/x86/mm/srat_64.c
> > @@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
> > static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
> > static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
> >
> > +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> > +#define APICID_LIST_LEN (128)
> > +
> > static __init int setup_node(int pxm)
> > {
> > return acpi_map_pxm_to_node(pxm);
> > @@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
> > apicid_to_node[apic_id] = node;
> > node_set(node, cpu_nodes_parsed);
> > acpi_numa = 1;
> > - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> > - pxm, apic_id, node);
> > }
> >
> > /* Callback for Proximity Domain -> LAPIC mapping */
> > @@ -170,8 +171,40 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> > apicid_to_node[apic_id] = node;
> > node_set(node, cpu_nodes_parsed);
> > acpi_numa = 1;
> > - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> > - pxm, apic_id, node);
> > +}
> > +
> > +void __init acpi_numa_print_srat_mapping(void)
> > +{
> > + char apicid_list[APICID_LIST_LEN];
> > + int i, j;
> > +
> > + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> > + int len;
> > + int nid;
> > +
> > + nid = pxm_to_node(i);
> > + if (nid == NUMA_NO_NODE)
> > + continue;
> > +
> > + bitmap_zero(apicid_map, MAX_LOCAL_APIC);
> > + for (j = 0; j < MAX_LOCAL_APIC; j++)
> > + if (apicid_to_node[j] == nid)
> > + set_bit(j, apicid_map);
> > +
> > + if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
> > + continue;
> > +
> > + /*
> > + * If the bitmap cannot be listed in a buffer of length
> > + * APICID_LIST_LEN, then it is suffixed with "...".
> > + */
> > + len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
> > + apicid_map, MAX_LOCAL_APIC);
> > + pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
> > + i, apicid_list,
> > + (len == APICID_LIST_LEN - 1) ? "..." : "",
> > + nid);
> > + }
> > }
>
> No. As i suggested many times before, just get rid of the printouts or
> make them boot-debug-flag dependent.
>

Hmm, so even if these were dependent on a kernel parameter, do you think
this patch would still be needed exactly as it is written? In other
words, do you believe that Mark's system should really emit 1272 lines for
this data instead of the 40 with this patch?

I'm all for making this dependent on a kernel parameter, but it's a
seperate change than what this patch is addressing. We need the apic id
mappings because they aren't exposed to userspace in any other way,
they need to exported somehow and this is a clear improvement to reduce
verbosity in the kernel log.

So I really don't know what you're insisting here, you want this
compression to be accompanied by another patch which makes the call to
acpi_numa_print_srat_mapping() depend on a kernel parameter? Or are you
saying we shouldn't compress it at all and Mike should just deal with 1272
lines instead of 40? Or are you saying we should export this static
information via sysfs?

2009-11-13 10:14:26

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log


* David Rientjes <[email protected]> wrote:

> On Fri, 13 Nov 2009, Ingo Molnar wrote:
>
> > > It's possible to reduce the number of SRAT messages emitted to the kernel
> > > log by printing each valid pxm once and then creating bitmaps to represent
> > > the apic ids that map to the same node.
> > >
> > > This reduces lines such as
> > >
> > > SRAT: PXM 0 -> APIC 0 -> Node 0
> > > SRAT: PXM 0 -> APIC 1 -> Node 0
> > > SRAT: PXM 1 -> APIC 2 -> Node 1
> > > SRAT: PXM 1 -> APIC 3 -> Node 1
> > >
> > > to
> > >
> > > SRAT: PXM 0 -> APIC {0-1} -> Node 0
> > > SRAT: PXM 1 -> APIC {2-3} -> Node 1
> > >
> > > The buffer used to store the apic id list is 128 characters in length.
> > > If that is too small to represent all the apic id ranges that are bound
> > > to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> > > manually increased for such configurations.
> > >
> > > Acked-by: Mike Travis <[email protected]>
> > > Signed-off-by: David Rientjes <[email protected]>
> > > ---
> > > arch/x86/mm/srat_64.c | 41 +++++++++++++++++++++++++++++++++++++----
> > > drivers/acpi/numa.c | 5 +++++
> > > include/linux/acpi.h | 3 ++-
> > > 3 files changed, 44 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> > > --- a/arch/x86/mm/srat_64.c
> > > +++ b/arch/x86/mm/srat_64.c
> > > @@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
> > > static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
> > > static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
> > >
> > > +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> > > +#define APICID_LIST_LEN (128)
> > > +
> > > static __init int setup_node(int pxm)
> > > {
> > > return acpi_map_pxm_to_node(pxm);
> > > @@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
> > > apicid_to_node[apic_id] = node;
> > > node_set(node, cpu_nodes_parsed);
> > > acpi_numa = 1;
> > > - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> > > - pxm, apic_id, node);
> > > }
> > >
> > > /* Callback for Proximity Domain -> LAPIC mapping */
> > > @@ -170,8 +171,40 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> > > apicid_to_node[apic_id] = node;
> > > node_set(node, cpu_nodes_parsed);
> > > acpi_numa = 1;
> > > - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> > > - pxm, apic_id, node);
> > > +}
> > > +
> > > +void __init acpi_numa_print_srat_mapping(void)
> > > +{
> > > + char apicid_list[APICID_LIST_LEN];
> > > + int i, j;
> > > +
> > > + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> > > + int len;
> > > + int nid;
> > > +
> > > + nid = pxm_to_node(i);
> > > + if (nid == NUMA_NO_NODE)
> > > + continue;
> > > +
> > > + bitmap_zero(apicid_map, MAX_LOCAL_APIC);
> > > + for (j = 0; j < MAX_LOCAL_APIC; j++)
> > > + if (apicid_to_node[j] == nid)
> > > + set_bit(j, apicid_map);
> > > +
> > > + if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
> > > + continue;
> > > +
> > > + /*
> > > + * If the bitmap cannot be listed in a buffer of length
> > > + * APICID_LIST_LEN, then it is suffixed with "...".
> > > + */
> > > + len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
> > > + apicid_map, MAX_LOCAL_APIC);
> > > + pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
> > > + i, apicid_list,
> > > + (len == APICID_LIST_LEN - 1) ? "..." : "",
> > > + nid);
> > > + }
> > > }
> >
> > No. As i suggested many times before, just get rid of the printouts or
> > make them boot-debug-flag dependent.
> >
>
> Hmm, so even if these were dependent on a kernel parameter, do you think
> this patch would still be needed exactly as it is written? In other
> words, do you believe that Mark's system should really emit 1272 lines for
> this data instead of the 40 with this patch?
>
> I'm all for making this dependent on a kernel parameter, but it's a
> seperate change than what this patch is addressing. We need the apic
> id mappings because they aren't exposed to userspace in any other way,
> they need to exported somehow and this is a clear improvement to
> reduce verbosity in the kernel log.
>
> So I really don't know what you're insisting here, you want this
> compression to be accompanied by another patch which makes the call to
> acpi_numa_print_srat_mapping() depend on a kernel parameter? Or are
> you saying we shouldn't compress it at all and Mike should just deal
> with 1272 lines instead of 40? Or are you saying we should export
> this static information via sysfs?

There's two problems outlined in this discussion:

A) too verbose bootup that is annoying with 64 CPUs and a show-stopper
with 4096 CPUs.

B) the ad-hoc nature of our topology enumeration. Some of it is in
/sys, some of it is in printk logs. None really works well and
there's no structure in it.

The simplest solution for (A) is what i suggested a few mails ago: dont
print the information by default, but allow (for trouble-shooting)
purposes for it to be printed when a boot option is passed.

Problem (B), topology info enumeration of a successful bootup is a
different matter. It should be exposed to user-space via proper /sys
abstractions, not via ad-hoc printks. There's ongoing work in that area,
from Andreas Hermann, with patches posted. hpa expressed the view there
that topology structure should be expressed via a nice vfs abstraction -
i share that opinion.

Ingo

2009-11-13 10:29:57

by David Rientjes

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log

On Fri, 13 Nov 2009, Ingo Molnar wrote:

> There's two problems outlined in this discussion:
>
> A) too verbose bootup that is annoying with 64 CPUs and a show-stopper
> with 4096 CPUs.
>
> B) the ad-hoc nature of our topology enumeration. Some of it is in
> /sys, some of it is in printk logs. None really works well and
> there's no structure in it.
>
> The simplest solution for (A) is what i suggested a few mails ago: dont
> print the information by default, but allow (for trouble-shooting)
> purposes for it to be printed when a boot option is passed.
>

Sigh, and even if that were done with a subsequent patch, you would still
want to reduce the debugging output from 1272 lines to 40, just like my
patch does without losing any information. It's insane to emit 1272
lines even if they are emitted only for a certain kernel parameter. I'm
sure we agree on that.

> Problem (B), topology info enumeration of a successful bootup is a
> different matter. It should be exposed to user-space via proper /sys
> abstractions, not via ad-hoc printks. There's ongoing work in that area,
> from Andreas Hermann, with patches posted. hpa expressed the view there
> that topology structure should be expressed via a nice vfs abstraction -
> i share that opinion.
>

Ingo, what do you want?

Your first criticism was that it should be limited only to a kernel
parameter but now it seems like you're insisting that the printk's get
removed completely and its exported via userspace. Then what is the
kernel parameter that you suggested for?

I'll leave the discussion with saying that if we still want to emit this
information with a parameter, that you'll still need to merge my patch at
some point to reduce the 1272 lines on Mark's system to 40. I'm unsure
why that isn't just merged right now since it's a clear improvement over
the current behavior, but I'm not going to beat a dead horse.

Thanks.

2009-11-13 10:58:34

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log


* David Rientjes <[email protected]> wrote:

> On Fri, 13 Nov 2009, Ingo Molnar wrote:
>
> > There's two problems outlined in this discussion:
> >
> > A) too verbose bootup that is annoying with 64 CPUs and a show-stopper
> > with 4096 CPUs.
> >
> > B) the ad-hoc nature of our topology enumeration. Some of it is in
> > /sys, some of it is in printk logs. None really works well and
> > there's no structure in it.
> >
> > The simplest solution for (A) is what i suggested a few mails ago: dont
> > print the information by default, but allow (for trouble-shooting)
> > purposes for it to be printed when a boot option is passed.
> >
>
> Sigh, and even if that were done with a subsequent patch, you would
> still want to reduce the debugging output from 1272 lines to 40, just
> like my patch does without losing any information. It's insane to
> emit 1272 lines even if they are emitted only for a certain kernel
> parameter. I'm sure we agree on that.

For _debug_ output, we want it pretty simple. 1272 lines is nothing if
it's only done for debugging/trouble-shooting.

Furthermore, if this _only_ gets used in the debug case, we want it
simple for the pure reason that it's uncommon code. It might work now,
but it might regress later without anyone noticing it - up to the point
when someone might need it when it breaks in the worst possible moment.

We've been through this many times before and that's the core principle
of all debug printout code: keep it simple.

> > Problem (B), topology info enumeration of a successful bootup is a
> > different matter. It should be exposed to user-space via proper /sys
> > abstractions, not via ad-hoc printks. There's ongoing work in that
> > area, from Andreas Hermann, with patches posted. hpa expressed the
> > view there that topology structure should be expressed via a nice
> > vfs abstraction - i share that opinion.
>
> Ingo, what do you want?

My goal is to accept good patches and to reject patches that are bad or
not good enough.

> Your first criticism was that it should be limited only to a kernel
> parameter but now it seems like you're insisting that the printk's get
> removed completely and its exported via userspace. Then what is the
> kernel parameter that you suggested for?

Please read my mail. There's three usecases:

- default bootup. We want no messages in the log.

- troubleshooting bootup with a boot flag specified. (an existing
example is apic=verbose) We want simple messages in that case and
obvious logic. (verbosity is not an overriding issue - we are
troubleshooting)

- Successful bootup and we want to retrieve topology information for
the system. Using the boot logs for it is not the right channel -
/sys is.

Your patch is not a 'good enough' solution to either of these usecases,
because:

- In the default case it prints 40 lines more than it should.

- In the troubleshooting case it provides the information but it is
not a simple mechanism. (anything with state and a bitmap in it is
out pretty much - stick to simple printks, those are in the least
danger of regressing down the road. It's also less bloated in terms
of code. )

- For extracing topology information kernel log messages is not
something that tools can rely on very well.

Thanks,

Ingo

2009-11-20 18:37:14

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log

Hi!

> There's two problems outlined in this discussion:
>
> A) too verbose bootup that is annoying with 64 CPUs and a show-stopper
> with 4096 CPUs.
>
> B) the ad-hoc nature of our topology enumeration. Some of it is in
> /sys, some of it is in printk logs. None really works well and
> there's no structure in it.
>
> The simplest solution for (A) is what i suggested a few mails ago: dont
> print the information by default, but allow (for trouble-shooting)
> purposes for it to be printed when a boot option is passed.

Well, yes, it is *simplest*, but is it best? Example printing 'cpus
0-31 stepping 3, cpus 32-63 stepping 4' seem rather convincing --such
info should be there by default and not after speciial option...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-11-20 18:58:42

by Mike Travis

[permalink] [raw]
Subject: Re: [patch] x86: reduce srat verbosity in the kernel log



Pavel Machek wrote:
> Hi!
>
>> There's two problems outlined in this discussion:
>>
>> A) too verbose bootup that is annoying with 64 CPUs and a show-stopper
>> with 4096 CPUs.
>>
>> B) the ad-hoc nature of our topology enumeration. Some of it is in
>> /sys, some of it is in printk logs. None really works well and
>> there's no structure in it.
>>
>> The simplest solution for (A) is what i suggested a few mails ago: dont
>> print the information by default, but allow (for trouble-shooting)
>> purposes for it to be printed when a boot option is passed.
>
> Well, yes, it is *simplest*, but is it best? Example printing 'cpus
> 0-31 stepping 3, cpus 32-63 stepping 4' seem rather convincing --such
> info should be there by default and not after speciial option...
> Pavel

I think that Ingo had a point. If it's printing a summary after the
all the cpu's boot, then it should also be available online, which in
this case it's in /proc/cpuinfo.

Ingo - was there anything left open on my last submission that I have
to deal with, to gain acceptance?

Thanks,
Mike