The output of MCE banks ownership information on boot tend to be long on
new processor which has many banks:
CPU 1 MCA banks SHD:0 SHD:1 CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:7 SHD:8 SHD:9 SHD:12 SHD:13 SHD:14 SHD:15 SHD:16 SHD:17 SHD:18 SHD:19 SHD:20 SHD:21
This patch changes this message to shorter format, in lower message level
(KERN_INFO -> KERN_DEBUG), and replace "MCA" by "MCE":
CPU 1 MCE banks map: ssCC PCss ssPP ssss ssss ss
where: s: shared, C: checked by CMCI, P: checked by poll.
(NOTE: This patch still use old format on the output when ownership is
updated, e.g. a cpu take over the ownership from hot-removed cpu.
CPU 1 MCE banks map updated: CMCI:6 CMCI:7 CMCI:10 CMCI:11
CPU 1 MCE banks map: ssCC PCCC ssPP ssCC ssss ss
)
Signed-off-by: Hidetoshi Seto <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce_intel.c | 29 +++++++++++++++++++++++------
1 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 7c78563..2b4c78b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -64,12 +64,25 @@ static void intel_threshold_interrupt(void)
mce_notify_irq();
}
+static void print_banks_map(int banks)
+{
+ int i;
+
+ pr_debug("CPU %d MCE banks map:", smp_processor_id());
+ for (i = 0; i < banks; i++) {
+ pr_cont("%s%s", (i % 4) ? "" : " ",
+ test_bit(i, __get_cpu_var(mce_banks_owned)) ? "C" :
+ test_bit(i, __get_cpu_var(mce_poll_banks)) ? "P" : "s");
+ }
+ pr_cont("\n");
+}
+
static void print_update(char *type, int *hdr, int num)
{
if (*hdr == 0)
- printk(KERN_INFO "CPU %d MCA banks", smp_processor_id());
+ pr_debug("CPU %d MCE banks map updated:", smp_processor_id());
*hdr = 1;
- printk(KERN_CONT " %s:%d", type, num);
+ pr_cont(" %s:%d", type, num);
}
/*
@@ -85,6 +98,7 @@ static void cmci_discover(int banks, int boot)
int i;
spin_lock_irqsave(&cmci_discover_lock, flags);
+
for (i = 0; i < banks; i++) {
u64 val;
@@ -95,7 +109,7 @@ static void cmci_discover(int banks, int boot)
/* Already owned by someone else? */
if (val & CMCI_EN) {
- if (test_and_clear_bit(i, owned) || boot)
+ if (test_and_clear_bit(i, owned) && !boot)
print_update("SHD", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
continue;
@@ -107,16 +121,19 @@ static void cmci_discover(int banks, int boot)
/* Did the enable bit stick? -- the bank supports CMCI */
if (val & CMCI_EN) {
- if (!test_and_set_bit(i, owned) || boot)
+ if (!test_and_set_bit(i, owned) && !boot)
print_update("CMCI", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
} else {
WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
}
}
- spin_unlock_irqrestore(&cmci_discover_lock, flags);
if (hdr)
- printk(KERN_CONT "\n");
+ pr_cont("\n");
+ if (boot || hdr)
+ print_banks_map(banks);
+
+ spin_unlock_irqrestore(&cmci_discover_lock, flags);
}
/*
--
1.6.5.2
Seems OK, but I think it would be even more useful to find a way to
print fewer lines of output; with CPUs that will be released shortly, a
system with 64 or even 128 logical CPUs will not be will not be that
exotic, and producing 128 lines of kernel log output for debugging
information that is rarely used and where the same info can be expressed
in 2 or 3 lines is silly-looking (and very annoying on a 57600 bps
serial console!).
- Roland
Roland Dreier wrote:
> Seems OK, but I think it would be even more useful to find a way to
> print fewer lines of output; with CPUs that will be released shortly, a
> system with 64 or even 128 logical CPUs will not be will not be that
> exotic, and producing 128 lines of kernel log output for debugging
> information that is rarely used and where the same info can be expressed
> in 2 or 3 lines is silly-looking (and very annoying on a 57600 bps
> serial console!).
Thanks for your review!
I think we could some effort like this for other messages during
CPU initialization.
For example I googled a full dmesg of recent hardware:
http://www.gossamer-threads.com/lists/linux/kernel/1134265
It shows that the lines like:
:
Booting processor 1 APIC 0x2 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5344.67 BogoMIPS (lpj=2672337)
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 256K
CPU: L3 cache: 8192K
mce: CPU supports 9 MCE banks
CPU1: Thermal monitoring enabled (TM1)
CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz stepping 04
Skipping synchronization checks as TSC is reliable.
:
are that printed for every cpu.
We already eliminated "mce: CPU supports X MCE banks" in this repeat
and now going to compress "CPU X MCA banks ..." line.
I suppose:
- Cache information can be compressed too, could be in one line.
- Usually model name (and also cache size) will be same on all cpu.
I can understand that it is better to avoid printing same lines
again and again. But there are more redundant messages...
Maybe there would be more desirable ways, but I think that
"compress messages shorter to bear heavy repeating" will be
a good way at this time.
Thanks,
H.Seto
* Hidetoshi Seto <[email protected]> wrote:
> Maybe there would be more desirable ways, but I think that "compress
> messages shorter to bear heavy repeating" will be a good way at this
> time.
We really want to only print out relevant information. 128 lines of
identical output is not relevant. (or 2x 64 lines, or 4096 lines of
identical output)
So we only want to print MCE setup messages on the boot CPU. That gives
us 90% of the benefits already: we see the rough structure of the
hardware, and if the bootup has a problem with MCE initialization we get
relevant printouts that helps debugging.
Now, it's certainly true that with things like MCE bank sharing the MCE
setup output from different CPUs might not be identical all the time -
but the information is represented in other (topology) info anyway. (and
if not it wasnt all that important to begin with)
For non-boot CPUs we can perhaps add a mce=verbose (default-disabled)
mode of bootup that allows all CPUs to be printed - should there be any
problem with MCE details only visible on non-boot CPUs. (unlikely)
Ingo
Ingo Molnar wrote:
> * Hidetoshi Seto <[email protected]> wrote:
>> Maybe there would be more desirable ways, but I think that "compress
>> messages shorter to bear heavy repeating" will be a good way at this
>> time.
>
> We really want to only print out relevant information. 128 lines of
> identical output is not relevant. (or 2x 64 lines, or 4096 lines of
> identical output)
>
> So we only want to print MCE setup messages on the boot CPU. That gives
> us 90% of the benefits already: we see the rough structure of the
> hardware, and if the bootup has a problem with MCE initialization we get
> relevant printouts that helps debugging.
>
> Now, it's certainly true that with things like MCE bank sharing the MCE
> setup output from different CPUs might not be identical all the time -
> but the information is represented in other (topology) info anyway. (and
> if not it wasnt all that important to begin with)
>
> For non-boot CPUs we can perhaps add a mce=verbose (default-disabled)
> mode of bootup that allows all CPUs to be printed - should there be any
> problem with MCE details only visible on non-boot CPUs. (unlikely)
How about having a kind of "boot=quiet_ap" which suppress boot-time messages
for all non-boot CPUs, rather than "mce=verbose" which only suppress one
(or, with thermal message, two) line? I don't not understand why this problem
would be solved only by removing the lines from mce subsystem.
I have no problem on my box with 16 CPUs, but I would not think so if there
were 128 or 4096. Then I'd like to see in short:
:
2048 CPUs available, 4096 CPUs total
Booting 2047 processors ........<snip>..... OK.
Total of 2048 processors activated (XXXXXXXXXXX.XX BogoMIPS).
:
Anyway this will be a quite trivial problem for desktop/notebook users.
Thanks,
H.Seto
* Hidetoshi Seto <[email protected]> wrote:
> Ingo Molnar wrote:
> > * Hidetoshi Seto <[email protected]> wrote:
> >> Maybe there would be more desirable ways, but I think that "compress
> >> messages shorter to bear heavy repeating" will be a good way at this
> >> time.
> >
> > We really want to only print out relevant information. 128 lines of
> > identical output is not relevant. (or 2x 64 lines, or 4096 lines of
> > identical output)
> >
> > So we only want to print MCE setup messages on the boot CPU. That gives
> > us 90% of the benefits already: we see the rough structure of the
> > hardware, and if the bootup has a problem with MCE initialization we get
> > relevant printouts that helps debugging.
> >
> > Now, it's certainly true that with things like MCE bank sharing the MCE
> > setup output from different CPUs might not be identical all the time -
> > but the information is represented in other (topology) info anyway. (and
> > if not it wasnt all that important to begin with)
> >
> > For non-boot CPUs we can perhaps add a mce=verbose (default-disabled)
> > mode of bootup that allows all CPUs to be printed - should there be any
> > problem with MCE details only visible on non-boot CPUs. (unlikely)
>
> How about having a kind of "boot=quiet_ap" which suppress boot-time
> messages for all non-boot CPUs, rather than "mce=verbose" which only
> suppress one (or, with thermal message, two) line? I don't not
> understand why this problem would be solved only by removing the lines
> from mce subsystem.
The idea is to suppress bootup messages by default - and to allow opt-in
verbose printing, in the (rare) case of having to debug bootup problems
that involve APs.
Ingo
I don't know if this applies here, but one thing I've been doing is
to mark the msg level as DEBUG if system_state == SYSTEM_BOOTING
(which is where all the noise comes from), and INFO if the system
is running. This way hotplug events are logged on the console.
Hidetoshi Seto wrote:
> The output of MCE banks ownership information on boot tend to be long on
> new processor which has many banks:
>
> CPU 1 MCA banks SHD:0 SHD:1 CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:7 SHD:8 SHD:9 SHD:12 SHD:13 SHD:14 SHD:15 SHD:16 SHD:17 SHD:18 SHD:19 SHD:20 SHD:21
>
> This patch changes this message to shorter format, in lower message level
> (KERN_INFO -> KERN_DEBUG), and replace "MCA" by "MCE":
>
> CPU 1 MCE banks map: ssCC PCss ssPP ssss ssss ss
>
> where: s: shared, C: checked by CMCI, P: checked by poll.
>
> (NOTE: This patch still use old format on the output when ownership is
> updated, e.g. a cpu take over the ownership from hot-removed cpu.
>
> CPU 1 MCE banks map updated: CMCI:6 CMCI:7 CMCI:10 CMCI:11
> CPU 1 MCE banks map: ssCC PCCC ssPP ssCC ssss ss
> )
>
> Signed-off-by: Hidetoshi Seto <[email protected]>
> ---
> arch/x86/kernel/cpu/mcheck/mce_intel.c | 29 +++++++++++++++++++++++------
> 1 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> index 7c78563..2b4c78b 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -64,12 +64,25 @@ static void intel_threshold_interrupt(void)
> mce_notify_irq();
> }
>
> +static void print_banks_map(int banks)
> +{
> + int i;
> +
> + pr_debug("CPU %d MCE banks map:", smp_processor_id());
> + for (i = 0; i < banks; i++) {
> + pr_cont("%s%s", (i % 4) ? "" : " ",
> + test_bit(i, __get_cpu_var(mce_banks_owned)) ? "C" :
> + test_bit(i, __get_cpu_var(mce_poll_banks)) ? "P" : "s");
> + }
> + pr_cont("\n");
> +}
> +
> static void print_update(char *type, int *hdr, int num)
> {
> if (*hdr == 0)
> - printk(KERN_INFO "CPU %d MCA banks", smp_processor_id());
> + pr_debug("CPU %d MCE banks map updated:", smp_processor_id());
> *hdr = 1;
> - printk(KERN_CONT " %s:%d", type, num);
> + pr_cont(" %s:%d", type, num);
> }
>
> /*
> @@ -85,6 +98,7 @@ static void cmci_discover(int banks, int boot)
> int i;
>
> spin_lock_irqsave(&cmci_discover_lock, flags);
> +
> for (i = 0; i < banks; i++) {
> u64 val;
>
> @@ -95,7 +109,7 @@ static void cmci_discover(int banks, int boot)
>
> /* Already owned by someone else? */
> if (val & CMCI_EN) {
> - if (test_and_clear_bit(i, owned) || boot)
> + if (test_and_clear_bit(i, owned) && !boot)
> print_update("SHD", &hdr, i);
> __clear_bit(i, __get_cpu_var(mce_poll_banks));
> continue;
> @@ -107,16 +121,19 @@ static void cmci_discover(int banks, int boot)
>
> /* Did the enable bit stick? -- the bank supports CMCI */
> if (val & CMCI_EN) {
> - if (!test_and_set_bit(i, owned) || boot)
> + if (!test_and_set_bit(i, owned) && !boot)
> print_update("CMCI", &hdr, i);
> __clear_bit(i, __get_cpu_var(mce_poll_banks));
> } else {
> WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
> }
> }
> - spin_unlock_irqrestore(&cmci_discover_lock, flags);
> if (hdr)
> - printk(KERN_CONT "\n");
> + pr_cont("\n");
> + if (boot || hdr)
> + print_banks_map(banks);
> +
> + spin_unlock_irqrestore(&cmci_discover_lock, flags);
> }
>
> /*
I have a patch coming that summarizes the Booting processor msgs and
the Processor information messages:
The first increases the start count by a power of 2 so 4096 cpus
would only need 12 lines.
[ 25.388280] Booting Processors 1-7,320-327, Nodes 0-0
[ 26.064742] Booting Processors 8-15,328-335, Nodes 1-1
[ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
[ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
[ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
[ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
[ 50.917504] Booting Processors 256-319,576-639, Nodes 32-63
(the last line has a slight error in that actually only nodes 32-39
were starting. Btw, the processor numbers are correct, according
to the Intel spec.)
and
[ 103.860206] Summary Processor information for CPUS: 0-639
[ 103.864032] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
[ 103.873403] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 103.877885] CPU: L2 cache: 256K
[ 103.880032] CPU: L3 cache: 24576K
[ 103.884032] MIN 4266.68 BogoMIPS (lpj=8533378)
[ 103.888032] MAX 4267.70 BogoMIPS (lpj=8535412)
[ 103.896248] Total of 640 processors activated (2731043.30 BogoMIPS).
This summary looks for different cpu types, steppings and cache sizes.
Hidetoshi Seto wrote:
> Roland Dreier wrote:
>> Seems OK, but I think it would be even more useful to find a way to
>> print fewer lines of output; with CPUs that will be released shortly, a
>> system with 64 or even 128 logical CPUs will not be will not be that
>> exotic, and producing 128 lines of kernel log output for debugging
>> information that is rarely used and where the same info can be expressed
>> in 2 or 3 lines is silly-looking (and very annoying on a 57600 bps
>> serial console!).
>
> Thanks for your review!
>
> I think we could some effort like this for other messages during
> CPU initialization.
>
> For example I googled a full dmesg of recent hardware:
> http://www.gossamer-threads.com/lists/linux/kernel/1134265
> It shows that the lines like:
> :
> Booting processor 1 APIC 0x2 ip 0x6000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5344.67 BogoMIPS (lpj=2672337)
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 256K
> CPU: L3 cache: 8192K
> mce: CPU supports 9 MCE banks
> CPU1: Thermal monitoring enabled (TM1)
> CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
> x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
> CPU1: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz stepping 04
> Skipping synchronization checks as TSC is reliable.
> :
> are that printed for every cpu.
>
> We already eliminated "mce: CPU supports X MCE banks" in this repeat
> and now going to compress "CPU X MCA banks ..." line.
>
> I suppose:
> - Cache information can be compressed too, could be in one line.
> - Usually model name (and also cache size) will be same on all cpu.
>
> I can understand that it is better to avoid printing same lines
> again and again. But there are more redundant messages...
>
> Maybe there would be more desirable ways, but I think that
> "compress messages shorter to bear heavy repeating" will be
> a good way at this time.
>
>
> Thanks,
> H.Seto