2006-11-15 21:32:54

by Stephane Eranian

Subject: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

Hello,

Here is a small patch that adds two cpufeature bits to represent
Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
(BTS) features. Those features can be found on Intel P4 and Core 2
processors among others and can be used by perfmon.

changelog:
- add X86_FEATURE_PEBS and X86_FEATURE_BTS
- add feature detection code

signed-off-by: stephane eranian <[email protected]>

diff --exclude=.git -urNp linux-2.6.orig/arch/i386/kernel/cpu/intel.c linux-2.6.base/arch/i386/kernel/cpu/intel.c
--- linux-2.6.orig/arch/i386/kernel/cpu/intel.c 2006-10-17 05:33:35.000000000 -0700
+++ linux-2.6.base/arch/i386/kernel/cpu/intel.c 2006-11-15 08:12:22.000000000 -0800
@@ -97,7 +97,7 @@ static int __cpuinit num_cpu_cores(struc

static void __cpuinit init_intel(struct cpuinfo_x86 *c)
{
- unsigned int l2 = 0;
+ unsigned int l1, l2 = 0;
char *p = NULL;

#ifdef CONFIG_X86_F00F_BUG
@@ -195,6 +195,14 @@ static void __cpuinit init_intel(struct
if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
(c->x86 == 0x6 && c->x86_model >= 0x0e))
set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
+
+ if (cpu_has_ds) {
+ rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
+ if (!(l1 & (1<<11)))
+ set_bit(X86_FEATURE_BTS, c->x86_capability);
+ if (!(l1 & (1<<12)))
+ set_bit(X86_FEATURE_PEBS, c->x86_capability);
+ }
}

diff --exclude=.git -urNp linux-2.6.orig/include/asm-i386/cpufeature.h linux-2.6.base/include/asm-i386/cpufeature.h
--- linux-2.6.orig/include/asm-i386/cpufeature.h 2006-10-17 05:33:40.000000000 -0700
+++ linux-2.6.base/include/asm-i386/cpufeature.h 2006-11-15 08:13:37.000000000 -0800
@@ -73,6 +73,8 @@
#define X86_FEATURE_UP (3*32+ 9) /* smp kernel running on up */
#define X86_FEATURE_FXSAVE_LEAK (3*32+10) /* FXSAVE leaks FOP/FIP/FOP */
#define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */
+#define X86_FEATURE_PEBS (3*32+12) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS (3*32+13) /* Branch Trace Store */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -134,6 +136,9 @@
#define cpu_has_phe_enabled boot_cpu_has(X86_FEATURE_PHE_EN)
#define cpu_has_pmm boot_cpu_has(X86_FEATURE_PMM)
#define cpu_has_pmm_enabled boot_cpu_has(X86_FEATURE_PMM_EN)
+#define cpu_has_ds boot_cpu_has(X86_FEATURE_DTES)
+#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
+#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)

#endif /* __ASM_I386_CPUFEATURE_H */
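
As a reading aid, the detection test in the patch above can be restated as a
small user-space C sketch. It assumes, as the patch does, that bits 11 and 12
of the low word of MSR_IA32_MISC_ENABLE mean "BTS unavailable" and "PEBS
unavailable". The names ds_features and decode_ds_features are hypothetical
and not part of the kernel; the MSR value is passed in as a plain integer
because user space cannot execute rdmsr directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, taken from the (1<<11) and (1<<12) tests in the patch. */
#define MISC_ENABLE_BTS_UNAVAIL  (1u << 11)
#define MISC_ENABLE_PEBS_UNAVAIL (1u << 12)

static_assert(MISC_ENABLE_BTS_UNAVAIL == 0x800u, "bit 11");
static_assert(MISC_ENABLE_PEBS_UNAVAIL == 0x1000u, "bit 12");

/* Hypothetical result type: which Debug Store features are usable. */
struct ds_features {
    bool bts;
    bool pebs;
};

/*
 * has_ds         - CPUID reports the Debug Store feature (cpu_has_ds in the patch)
 * misc_enable_lo - low 32 bits of MSR_IA32_MISC_ENABLE (l1 in the patch)
 *
 * The MSR bits are "unavailable" flags, so a feature is present when its
 * bit is clear. Without DS, neither feature is reported.
 */
static struct ds_features decode_ds_features(bool has_ds, uint32_t misc_enable_lo)
{
    struct ds_features f = { false, false };

    if (!has_ds)
        return f;

    f.bts  = !(misc_enable_lo & MISC_ENABLE_BTS_UNAVAIL);
    f.pebs = !(misc_enable_lo & MISC_ENABLE_PEBS_UNAVAIL);
    return f;
}
```

The inverted sense of the two bits is the only subtle part: a value of 0 in
both positions means both features are available.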


2006-11-15 21:35:14

by Stephane Eranian

Subject: [PATCH] x86-64 add Intel PEBS and BTS cpufeature bits and detection

Hello,

Here is a small patch that adds two cpufeature bits to represent
Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
(BTS) features. Those features can be found on Intel P4 and Core 2
processors among others and can be used by perfmon. This patch is
for x86-64.

changelog:
- add X86_FEATURE_PEBS and X86_FEATURE_BTS
- add feature detection code

signed-off-by: stephane eranian <[email protected]>

diff --exclude=.git -urNp linux-2.6.orig/include/asm-x86_64/cpufeature.h linux-2.6.base/include/asm-x86_64/cpufeature.h
--- linux-2.6.orig/include/asm-x86_64/cpufeature.h 2006-10-17 05:33:56.000000000 -0700
+++ linux-2.6.base/include/asm-x86_64/cpufeature.h 2006-11-15 13:08:25.000000000 -0800
@@ -68,6 +68,8 @@
#define X86_FEATURE_FXSAVE_LEAK (3*32+7) /* FIP/FOP/FDP leaks through FXSAVE */
#define X86_FEATURE_UP (3*32+8) /* SMP kernel running on UP */
#define X86_FEATURE_ARCH_PERFMON (3*32+9) /* Intel Architectural PerfMon */
+#define X86_FEATURE_PEBS (3*32+10) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS (3*32+11) /* Branch Trace Store */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -112,5 +114,8 @@
#define cpu_has_cyrix_arr 0
#define cpu_has_centaur_mcr 0
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLSH)
+#define cpu_has_ds boot_cpu_has(X86_FEATURE_DTES)
+#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
+#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)

#endif /* __ASM_X8664_CPUFEATURE_H */
diff --exclude=.git -urNp linux-2.6.orig/arch/x86_64/kernel/setup.c linux-2.6.base/arch/x86_64/kernel/setup.c
--- linux-2.6.orig/arch/x86_64/kernel/setup.c 2006-10-17 05:33:35.000000000 -0700
+++ linux-2.6.base/arch/x86_64/kernel/setup.c 2006-11-15 08:21:12.000000000 -0800
@@ -835,6 +835,15 @@ static void __cpuinit init_intel(struct
set_bit(X86_FEATURE_ARCH_PERFMON, &c->x86_capability);
}

+ if (cpu_has_ds) {
+ unsigned int l1, l2;
+ rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
+ if (!(l1 & (1<<11)))
+ set_bit(X86_FEATURE_BTS, c->x86_capability);
+ if (!(l1 & (1<<12)))
+ set_bit(X86_FEATURE_PEBS, c->x86_capability);
+ }
+
n = c->extended_cpuid_level;
if (n >= 0x80000008) {
unsigned eax = cpuid_eax(0x80000008);

2006-11-16 00:24:53

by Andrew Morton

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

On Wed, 15 Nov 2006 13:32:41 -0800
Stephane Eranian <[email protected]> wrote:

> Here is a small patch that adds two cpufeature bits to represent
> Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
> (BTS) features. Those features can be found on Intel P4 and Core 2
> processors among others and can be used by perfmon.
>

Andi has already merged a different version of this. If it needs
updating, please review his tree (most recent -mm will suit) and
send any needed updates.

2006-11-16 14:21:14

by Stephane Eranian

Subject: [PATCH] x86-64 add Intel BTS cpufeature bit and detection (take 2)

Andi,

Here is a small patch for x86-64 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.

The patch is relative to 2.6.19-rc5-git7 + x86_64-2.6.19-rc5-git7-061116-2

changelog:
- add X86_FEATURE_BTS
- add Branch Trace Store detection

signed-off-by: stephane eranian <[email protected]>

diff --exclude=.git -urNp linux-2.6.19-rc5-git7-ak.orig/arch/x86_64/kernel/setup.c linux-2.6.19-rc5-git7-ak/arch/x86_64/kernel/setup.c
--- linux-2.6.19-rc5-git7-ak.orig/arch/x86_64/kernel/setup.c 2006-11-16 05:15:39.000000000 -0800
+++ linux-2.6.19-rc5-git7-ak/arch/x86_64/kernel/setup.c 2006-11-16 05:46:51.000000000 -0800
@@ -838,6 +838,8 @@ static void __cpuinit init_intel(struct
if (cpu_has_ds) {
unsigned int l1, l2;
rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
+ if (!(l1 & (1<<11)))
+ set_bit(X86_FEATURE_BTS, c->x86_capability);
if (!(l1 & (1<<12)))
set_bit(X86_FEATURE_PEBS, c->x86_capability);
}
diff --exclude=.git -urNp linux-2.6.19-rc5-git7-ak.orig/include/asm-x86_64/cpufeature.h linux-2.6.19-rc5-git7-ak/include/asm-x86_64/cpufeature.h
--- linux-2.6.19-rc5-git7-ak.orig/include/asm-x86_64/cpufeature.h 2006-11-16 05:15:39.000000000 -0800
+++ linux-2.6.19-rc5-git7-ak/include/asm-x86_64/cpufeature.h 2006-11-16 05:47:52.000000000 -0800
@@ -69,6 +69,7 @@
#define X86_FEATURE_UP (3*32+8) /* SMP kernel running on UP */
#define X86_FEATURE_ARCH_PERFMON (3*32+9) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS (3*32+10) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS (3*32+11) /* Branch Trace Store */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -115,5 +116,6 @@
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLSH)
#define cpu_has_ds boot_cpu_has(X86_FEATURE_DS)
#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
+#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)

#endif /* __ASM_X8664_CPUFEATURE_H */

2006-11-16 14:23:38

by Andi Kleen

Subject: Re: [PATCH] x86-64 add Intel BTS cpufeature bit and detection (take 2)

On Thursday 16 November 2006 15:20, Stephane Eranian wrote:
> Andi,
>
> Here is a small patch for x86-64 which adds a cpufeature flag and
> detection code for Intel's Branch Trace Store (BTS) feature. This
> feature can be found on Intel P4 and Core 2 processors among others.
> It can also be used by perfmon.

Added, thanks.

-Andi

2006-11-16 14:22:45

by Stephane Eranian

Subject: [PATCH] i386 add Intel BTS cpufeature bit and detection (take 2)

Andi,

Here is a small patch for i386 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.

The patch is relative to 2.6.19-rc5-git7 + x86_64-2.6.19-rc5-git7-061116-2

changelog:
- add X86_FEATURE_BTS
- add Branch Trace Store detection

signed-off-by: stephane eranian <[email protected]>

diff --exclude=.git -urNp linux-2.6.19-rc5-git7-ak.orig/arch/i386/kernel/cpu/intel.c linux-2.6.19-rc5-git7-ak/arch/i386/kernel/cpu/intel.c
--- linux-2.6.19-rc5-git7-ak.orig/arch/i386/kernel/cpu/intel.c 2006-11-16 05:15:39.000000000 -0800
+++ linux-2.6.19-rc5-git7-ak/arch/i386/kernel/cpu/intel.c 2006-11-16 05:49:11.000000000 -0800
@@ -199,6 +199,8 @@ static void __cpuinit init_intel(struct
if (cpu_has_ds) {
unsigned int l1;
rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
+ if (!(l1 & (1<<11)))
+ set_bit(X86_FEATURE_BTS, c->x86_capability);
if (!(l1 & (1<<12)))
set_bit(X86_FEATURE_PEBS, c->x86_capability);
}
diff --exclude=.git -urNp linux-2.6.19-rc5-git7-ak.orig/include/asm-i386/cpufeature.h linux-2.6.19-rc5-git7-ak/include/asm-i386/cpufeature.h
--- linux-2.6.19-rc5-git7-ak.orig/include/asm-i386/cpufeature.h 2006-11-16 05:15:39.000000000 -0800
+++ linux-2.6.19-rc5-git7-ak/include/asm-i386/cpufeature.h 2006-11-16 05:48:52.000000000 -0800
@@ -74,6 +74,7 @@
#define X86_FEATURE_FXSAVE_LEAK (3*32+10) /* FXSAVE leaks FOP/FIP/FOP */
#define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS (3*32+12) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS (3*32+13) /* Branch Trace Store */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -138,6 +139,7 @@
#define cpu_has_ds boot_cpu_has(X86_FEATURE_DS)
#define cpu_has_pebs boot_cpu_has(X86_FEATURE_PEBS)
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLSH)
+#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)

#endif /* __ASM_I386_CPUFEATURE_H */

2006-11-17 01:22:33

by Jeremy Fitzhardinge

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

Stephane Eranian wrote:
> Here is a small patch that adds two cpufeature bits to represent
> Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
> (BTS) features. Those features can be found on Intel P4 and Core 2
> processors among others and can be used by perfmon.
>

I've been thinking it would be useful for kernel debugging if kernel
oops messages could use the branch history to show the last few jumps on
processors which support it. It would help a lot with the "oh, an oops
with eip==esp==0" type crashes, which are otherwise pretty unhelpful.

Do you think that would be easy/possible to support? Would it interfere
with other uses of these features?

J

2006-11-17 04:29:27

by Andi Kleen

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

On Friday 17 November 2006 02:34, Jeremy Fitzhardinge wrote:
> Stephane Eranian wrote:
> > Here is a small patch that adds two cpufeature bits to represent
> > Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
> > (BTS) features. Those features can be found on Intel P4 and Core 2
> > processors among others and can be used by perfmon.
> >
>
> I've been thinking it would be useful for kernel debugging if kernel
> oops messages could use the branch history to show the last few jumps on
> processors which support it. It would help a lot with the "oh, an oops
> with eip==esp==0" type crashes, which are otherwise pretty unhelpful.

I have had private patches for that myself, using the MSRs on AMD
and Intel.

The problem is that you have to insert hooks early into the exception
handlers to read the branch history MSRs, and that gets fairly ugly
and a little slow and we can't really enable it by default.

But using BTS with a long in memory buffer would be fine. It would
just be slower so it couldn't be enabled by default. But as a debugging
feature it would be nice.

-Andi

2006-11-17 07:58:20

by Stephane Eranian

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

Jeremy,

On Fri, Nov 17, 2006 at 05:29:02AM +0100, Andi Kleen wrote:
> On Friday 17 November 2006 02:34, Jeremy Fitzhardinge wrote:
> > Stephane Eranian wrote:
> > > Here is a small patch that adds two cpufeature bits to represent
> > > Intel's Precise-Event-Based Sampling (PEBS) and Branch Trace Store
> > > (BTS) features. Those features can be found on Intel P4 and Core 2
> > > processors among others and can be used by perfmon.
> > >
> >
> > I've been thinking it would be useful for kernel debugging if kernel
> > oops messages could use the branch history to show the last few jumps on
> > processors which support it. It would help a lot with the "oh, an oops
> > with eip==esp==0" type crashes, which are otherwise pretty unhelpful.
>
> I have had private patches for that myself, using the MSRs on AMD
> and Intel.
>
> The problem is that you have to insert hooks early into the exception
> handlers to read the branch history MSRs, and that gets fairly ugly
> and a little slow and we can't really enable it by default.
>
There are two ways of capturing branches on the Intel processors (I have not
looked at AMD): Last Branch Record (LBR) and Branch Trace Store (BTS). The former
stores from/to information into MSRs and is very small (4 branches). The latter can
be as big as you want. On recent processors LBR and BTS can be constrained by priv level.
The issue is that they capture ALL taken branches, not just function calls and returns.

> But using BTS with a long in memory buffer would be fine. It would
> just be slower so it couldn't be enabled by default. But as a debugging
> feature it would be nice.
>

Yes, I think it could be used for debugging. You'd need to reserve a few pages
and initialize a single MSR (IA32_DEBUGCTL).
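
To give a rough idea of what "a few pages" buys, here is a small sizing
sketch. It assumes the 64-bit BTS record layout described in the Intel
manuals, namely three 64-bit fields (branch source, branch target, flags);
the struct and function names are hypothetical and the layout should be
checked against the manual for the target processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 64-bit BTS record: source, target, and a flags word. */
struct bts_record64 {
    uint64_t from;   /* address of the branch instruction */
    uint64_t to;     /* address of the branch target */
    uint64_t flags;  /* e.g. branch-predicted information */
};

static_assert(sizeof(struct bts_record64) == 24, "three 64-bit fields");

/* Number of whole branch records that fit in a buffer of 'pages' pages. */
static size_t bts_capacity(size_t pages, size_t page_size)
{
    return (pages * page_size) / sizeof(struct bts_record64);
}
```

Under these assumptions, four 4096-byte pages hold 682 taken branches, compared
with the 4 or 16 entries available through LBR.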

--
-Stephane

2006-11-17 09:22:27

by Andi Kleen

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

> The former
> stores from/to information into MSRs and is very small (4 branches).

P4 since Prescott has 16

> On recent processors LBR and BTS can be constrained by priv level.

Doesn't help for kernel debugging.

-Andi

2006-11-17 12:29:48

by Stephane Eranian

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

On Fri, Nov 17, 2006 at 10:22:20AM +0100, Andi Kleen wrote:
> > The former
> > stores from/to information into MSRs and is very small (4 branches).
>
> P4 since Prescott has 16
>
Yes. I was talking about Core 2

> > On recent processors LBR and BTS can be constrained by priv level.
>
> Doesn't help for kernel debugging.
>
Well, if you set it for kernel level only, you do not capture user level
branches. This may happen if you crash soon after you've entered the kernel
and you have a small buffer.

--
-Stephane

2006-11-17 22:45:31

by Jeremy Fitzhardinge

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

Andi Kleen wrote:
> I have had private patches for that myself, using the MSRs on AMD
> and Intel.
>

Would they be something that could be cleaned up into something
mergeable? It would be nice to have something that could be left
enabled all the time, but an option would at least make the
functionality available.

J

2006-11-18 08:24:15

by Andi Kleen

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

On Friday 17 November 2006 23:57, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > I have had private patches for that myself, using the MSRs on AMD
> > and Intel.
> >
>
> Would they be something that could be cleaned up into something
> mergeable?

Hmm maybe.

> It would be nice to have something that could be left
> enabled all the time,

That would add considerable latency to all exceptions.

And unfortunately we take more than 16 jumps before
we figure out an oops, so all the previous data would be gone
if it was only done in the error path.

If the CPU had a "disable LBR on exceptions" bit that would
work. Unfortunately it hasn't.

> but an option would at least make the
> functionality available.

If you have a debugging option you might as well enable the memory
based branch tracer, which gives a much larger picture
(but has somewhat more overhead too; I don't know how much
the difference is). Someone would just need to write a driver for it.
I think that would be more useful long term.

Here's the old MSR patch (for 64bit P4) for reference. I had
an AMD patch too, but I can't find it right now and on P4
it works much better anyways because it has 16 LBRs instead of 4.

-Andi

Dump last branch information on oopses

Signed-off-by: Andi Kleen <[email protected]>

Index: linux/arch/x86_64/kernel/entry.S
===================================================================
--- linux.orig/arch/x86_64/kernel/entry.S
+++ linux/arch/x86_64/kernel/entry.S
@@ -692,8 +692,38 @@ END(spurious_interrupt)
/*
* Exception entry points.
*/
+
+ .macro savemsr msr,var
+ movl $\msr,%ecx
+ rdmsr
+ movl %eax,\var
+ movl %edx,\var+4
+ .endm
+
+ .macro SAVELBR
+ cmpl $0,netburst
+ jz 1f
+ push %rax
+ push %rdx
+ push %rcx
+ savemsr 0x1da,lbr_tos
+ savemsr 0x1d7,ler_from
+ savemsr 0x1d8,ler_to
+ .set cnt,0
+ .rept 16
+ savemsr 0x680+cnt,lbr_from+cnt*8
+ savemsr 0x6c0+cnt,lbr_to+cnt*8
+ .set cnt,cnt+1
+ .endr
+ pop %rcx
+ pop %rdx
+ pop %rax
+1:
+ .endm
+
.macro zeroentry sym
INTR_FRAME
+ SAVELBR
pushq $0 /* push error code/oldrax */
CFI_ADJUST_CFA_OFFSET 8
pushq %rax /* push real oldrax to the rdi slot */
@@ -705,6 +735,7 @@ END(spurious_interrupt)

.macro errorentry sym
XCPT_FRAME
+ SAVELBR
pushq %rax
CFI_ADJUST_CFA_OFFSET 8
leaq \sym(%rip),%rax
@@ -715,6 +746,7 @@ END(spurious_interrupt)
/* error code is on the stack already */
/* handle NMI like exceptions that can happen everywhere */
.macro paranoidentry sym, ist=0, irqtrace=1
+ SAVELBR
SAVE_ALL
cld
movl $1,%ebx
Index: linux/arch/x86_64/kernel/setup.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup.c
+++ linux/arch/x86_64/kernel/setup.c
@@ -822,10 +822,13 @@ static void srat_detect_node(void)
#endif
}

+int netburst;
+
static void __cpuinit init_intel(struct cpuinfo_x86 *c)
{
/* Cache sizes */
unsigned n;
+ unsigned long val;

init_intel_cacheinfo(c);
if (c->cpuid_level > 9 ) {
@@ -867,6 +870,12 @@ static void __cpuinit init_intel(struct
c->x86_max_cores = intel_num_cpu_cores(c);

srat_detect_node();
+
+ if (c->x86 == 15) {
+ rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
+ wrmsrl(MSR_IA32_DEBUGCTLMSR, val | 1);
+ netburst = 1;
+ }
}

static void __cpuinit get_cpu_vendor(struct cpuinfo_x86 *c)
Index: linux/arch/x86_64/kernel/process.c
===================================================================
--- linux.orig/arch/x86_64/kernel/process.c
+++ linux/arch/x86_64/kernel/process.c
@@ -36,6 +36,7 @@
#include <linux/random.h>
#include <linux/notifier.h>
#include <linux/kprobes.h>
+#include <linux/kallsyms.h>

#include <asm/uaccess.h>
#include <asm/pgtable.h>
@@ -278,6 +279,9 @@ static int __init idle_setup (char *str)

__setup("idle=", idle_setup);

+unsigned long lbr_tos, lbr_from[16], lbr_to[16], ler_from, ler_to;
+extern int netburst;
+
/* Prints also some state that isn't saved in the pt_regs */
void __show_regs(struct pt_regs * regs)
{
@@ -326,6 +330,18 @@ void __show_regs(struct pt_regs * regs)
fs,fsindex,gs,gsindex,shadowgs);
printk("CS: %04x DS: %04x ES: %04x CR0: %016lx\n", cs, ds, es, cr0);
printk("CR2: %016lx CR3: %016lx CR4: %016lx\n", cr2, cr3, cr4);
+
+ if (netburst) {
+ unsigned i;
+ printk("LBR: TOS %lx", lbr_tos);
+ print_symbol(" LER_FROM %s", ler_from);
+ print_symbol(" LER_TO %s\n", ler_to);
+ for (i = 0; i < 16; i++) {
+ printk(" [%u]", i);
+ print_symbol(" FROM %s", lbr_from[i]);
+ print_symbol(" TO %s\n", lbr_to[i]);
+ }
+ }
}

void show_regs(struct pt_regs *regs)
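
The dump in the patch above prints the 16 entries in register order together
with the raw TOS value, so the reader has to work out which entry is the
newest. A minimal sketch of that reordering follows. It assumes that TOS
indexes the most recently written entry and that the LBR stack wraps modulo
its size; the function name lbr_newest_first is hypothetical.

```c
#include <assert.h>

/*
 * Fill order[0..n-1] with LBR stack indices, most recent branch first.
 *
 * tos - index of the most recent entry (assumed meaning of the TOS MSR)
 * n   - number of LBR entries (16 on the P4 discussed here, 4 on Core 2)
 */
static void lbr_newest_first(unsigned tos, unsigned n, unsigned order[])
{
    assert(n > 0 && tos < n);

    for (unsigned k = 0; k < n; k++) {
        /* Step backwards from TOS, wrapping around the circular stack. */
        order[k] = (tos + n - k) % n;
    }
}
```

With tos = 2 and n = 4 this yields the order 2, 1, 0, 3, so entry 3 is the
oldest surviving branch.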

2006-11-23 06:04:54

by Keith Owens

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

Andi Kleen (on Sat, 18 Nov 2006 09:24:01 +0100) wrote:
>On Friday 17 November 2006 23:57, Jeremy Fitzhardinge wrote:
>> Andi Kleen wrote:
>> > I have had private patches for that myself, using the MSRs on AMD
>> > and Intel.
>> >
>>
>> Would they be something that could be cleaned up into something
>> mergeable?
>
>Hmm maybe.
>
>> It would be nice to have something that could be left
>> enabled all the time,
>
>That would add considerable latency to all exceptions.
>
>And unfortunately we take more than 16 jumps before
>we figure out an oops, so all the previous data would be gone
>if it was only done in the error path.
>
>If the CPU had a "disable LBR on exceptions" bit that would
>work. Unfortunately it hasn't.

LBR is mainly useful on wild branches to random addresses. As such,
you really only need to disable LBR in the page fault handler. For a
long time, KDB had LBR support with this replacement for the i386 page
fault handler.

#if defined(CONFIG_KDB)
ENTRY(page_fault_mca)
pushl %ecx
pushl %edx
pushl %eax
movl $473,%ecx /* 473 == 0x1d9, the DEBUGCTL MSR */
rdmsr
andl $0xfffffffe,%eax /* Disable last branch recording */
wrmsr
popl %eax
popl %edx
popl %ecx
pushl $do_page_fault
jmp error_code
#endif

Nobody was using the LBR feature of KDB so I removed it in 2.6.17.
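
For readers less used to the raw numbers: MSR 473 is 0x1d9, the same DEBUGCTL
register that Andi's patch writes, and bit 0 is the LBR enable bit that his
patch sets with "val | 1" and the handler above clears with
"andl $0xfffffffe". A small C restatement of the two operations, with
hypothetical helper names and the MSR value passed in as a plain integer:

```c
#include <assert.h>
#include <stdint.h>

#define MSR_DEBUGCTL  0x1d9u       /* decimal 473, as used in the handler above */
#define DEBUGCTL_LBR  (1ull << 0)  /* last branch recording enable bit */

static_assert(MSR_DEBUGCTL == 473u, "same register in hex and decimal");

/* Value to write back in order to start recording branches. */
static uint64_t debugctl_enable_lbr(uint64_t val)
{
    return val | DEBUGCTL_LBR;
}

/* Value to write back in order to freeze the LBR stack; other bits are kept. */
static uint64_t debugctl_disable_lbr(uint64_t val)
{
    return val & ~DEBUGCTL_LBR;
}
```

Freezing the stack early in the fault path is what preserves the branches
leading up to the fault, since the handler itself would otherwise overwrite
them.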

2006-11-23 08:52:59

by Andi Kleen

Subject: Re: [PATCH] i386 add Intel PEBS and BTS cpufeature bits and detection

> LBR is mainly useful on wild branches to random addresses. As such,

The page fault handler is one of the most performance critical
exceptions. And then on x86-64 a lot of stray pointers actually result in a
#GP, not a page fault.

-Andi