2015-12-10 07:12:07

by Jeff Merkey

[permalink] [raw]
Subject: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registered

If an int1 hardware breakpoint exception is triggered, but no perf bp
pevent block was registered from arch_install_hw_breakpoint, the
system will hard hang with the CPU stuck constantly re-interrupting at
the same execution address because the resume flag never gets set, and
the NOTIFY_DONE state prevents other int1 handlers, including the
default handler in do_debug, from running to handle the condition.
Can be reproduced by writing a program that sets an execute breakpoint
at schedule() without calling arch_install_hw_breakpoint.

The proposed fix checks the dr7 register and sets the resume flag in
pt->regs if it determines an executed breakpoint was triggered just in
case the check lower down fails. I have seen this bug and its a bug.

Signed-off-by: [email protected]
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad..6effcae 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -475,6 +475,14 @@ static int hw_breakpoint_handler(struct die_args *args)
for (i = 0; i < HBP_NUM; ++i) {
if (likely(!(dr6 & (DR_TRAP0 << i))))
continue;
+ /*
+ * Set up resume flag to avoid breakpoint recursion when
+ * returning back to origin in the event an int1
+ * exception is triggered and no event handler
+ * is present.
+ */
+ if ((dr7 & (3 << ((i * 4) + 16))) == 0)
+ args->regs->flags |= X86_EFLAGS_RF;

/*
* The counter may be concurrently released but that can only


2015-12-10 18:56:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registered

Jeff,

On Thu, 10 Dec 2015, Jeff Merkey wrote:

> If an int1 hardware breakpoint exception is triggered, but no perf bp
> pevent block was registered from arch_install_hw_breakpoint, the
> system will hard hang with the CPU stuck constantly re-interrupting at
> the same execution address because the resume flag never gets set, and
> the NOTIFY_DONE state prevents other int1 handlers, including the
> default handler in do_debug, from running to handle the condition.
> Can be reproduced by writing a program that sets an execute breakpoint
> at schedule() without calling arch_install_hw_breakpoint.
>
> The proposed fix checks the dr7 register and sets the resume flag in
> pt->regs if it determines an executed breakpoint was triggered just in
> case the check lower down fails. I have seen this bug and its a bug.

> Signed-off-by: [email protected]
> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> index 50a3fad..6effcae 100644
> --- a/arch/x86/kernel/hw_breakpoint.c
> +++ b/arch/x86/kernel/hw_breakpoint.c
> @@ -475,6 +475,14 @@ static int hw_breakpoint_handler(struct die_args *args)
> for (i = 0; i < HBP_NUM; ++i) {
> if (likely(!(dr6 & (DR_TRAP0 << i))))
> continue;
> + /*
> + * Set up resume flag to avoid breakpoint recursion when
> + * returning back to origin in the event an int1
> + * exception is triggered and no event handler
> + * is present.
> + */
> + if ((dr7 & (3 << ((i * 4) + 16))) == 0)

We have proper defines for all of this. See __encode_dr7().

> + args->regs->flags |= X86_EFLAGS_RF;

If there is a break point installed, then we do the same thing after
calling perf_bp_event() again.

Thanks,

tglx

2015-12-10 19:09:42

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registered

On Thu, Dec 10, 2015 at 10:55 AM, Thomas Gleixner <[email protected]> wrote:
> Jeff,
>
> On Thu, 10 Dec 2015, Jeff Merkey wrote:
>
>> If an int1 hardware breakpoint exception is triggered, but no perf bp
>> pevent block was registered from arch_install_hw_breakpoint, the
>> system will hard hang with the CPU stuck constantly re-interrupting at
>> the same execution address because the resume flag never gets set, and
>> the NOTIFY_DONE state prevents other int1 handlers, including the
>> default handler in do_debug, from running to handle the condition.
>> Can be reproduced by writing a program that sets an execute breakpoint
>> at schedule() without calling arch_install_hw_breakpoint.
>>
>> The proposed fix checks the dr7 register and sets the resume flag in
>> pt->regs if it determines an executed breakpoint was triggered just in
>> case the check lower down fails. I have seen this bug and its a bug.
>
>> Signed-off-by: [email protected]
>> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
>> index 50a3fad..6effcae 100644
>> --- a/arch/x86/kernel/hw_breakpoint.c
>> +++ b/arch/x86/kernel/hw_breakpoint.c
>> @@ -475,6 +475,14 @@ static int hw_breakpoint_handler(struct die_args *args)
>> for (i = 0; i < HBP_NUM; ++i) {
>> if (likely(!(dr6 & (DR_TRAP0 << i))))
>> continue;
>> + /*
>> + * Set up resume flag to avoid breakpoint recursion when
>> + * returning back to origin in the event an int1
>> + * exception is triggered and no event handler
>> + * is present.
>> + */
>> + if ((dr7 & (3 << ((i * 4) + 16))) == 0)
>
> We have proper defines for all of this. See __encode_dr7().
>
>> + args->regs->flags |= X86_EFLAGS_RF;
>
> If there is a break point installed, then we do the same thing after
> calling perf_bp_event() again.

On brief inspection, this smells like a microcode bug. Can you send
/proc/cpuinfo output?

For example, this CPU and microcode combination is known bad:

processor : 7
vendor_id : AuthenticAMD
cpu family : 21
model : 2
model name : AMD Opteron(tm) Processor 3380
stepping : 0
microcode : 0x6000832

If this is the issue, I'm not sure we want to be in the business of
working around localized microcode bugs and, if we do, then I think we
should explicitly detect the bug and log about it.

--Andy

2015-12-10 19:20:33

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registered

On Thu, Dec 10, 2015 at 11:09:21AM -0800, Andy Lutomirski wrote:
> On Thu, Dec 10, 2015 at 10:55 AM, Thomas Gleixner <[email protected]> wrote:
> > Jeff,
> >
> > On Thu, 10 Dec 2015, Jeff Merkey wrote:
> >
> >> If an int1 hardware breakpoint exception is triggered, but no perf bp
> >> pevent block was registered from arch_install_hw_breakpoint, the
> >> system will hard hang with the CPU stuck constantly re-interrupting at
> >> the same execution address because the resume flag never gets set, and
> >> the NOTIFY_DONE state prevents other int1 handlers, including the
> >> default handler in do_debug, from running to handle the condition.
> >> Can be reproduced by writing a program that sets an execute breakpoint
> >> at schedule() without calling arch_install_hw_breakpoint.
> >>
> >> The proposed fix checks the dr7 register and sets the resume flag in
> >> pt->regs if it determines an executed breakpoint was triggered just in
> >> case the check lower down fails. I have seen this bug and its a bug.
> >
> >> Signed-off-by: [email protected]
> >> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> >> index 50a3fad..6effcae 100644
> >> --- a/arch/x86/kernel/hw_breakpoint.c
> >> +++ b/arch/x86/kernel/hw_breakpoint.c
> >> @@ -475,6 +475,14 @@ static int hw_breakpoint_handler(struct die_args *args)
> >> for (i = 0; i < HBP_NUM; ++i) {
> >> if (likely(!(dr6 & (DR_TRAP0 << i))))
> >> continue;
> >> + /*
> >> + * Set up resume flag to avoid breakpoint recursion when
> >> + * returning back to origin in the event an int1
> >> + * exception is triggered and no event handler
> >> + * is present.
> >> + */
> >> + if ((dr7 & (3 << ((i * 4) + 16))) == 0)
> >
> > We have proper defines for all of this. See __encode_dr7().
> >
> >> + args->regs->flags |= X86_EFLAGS_RF;
> >
> > If there is a break point installed, then we do the same thing after
> > calling perf_bp_event() again.
>
> On brief inspection, this smells like a microcode bug. Can you send
> /proc/cpuinfo output?
>
> For example, this CPU and microcode combination is known bad:
>
> processor : 7
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 2
> model name : AMD Opteron(tm) Processor 3380
> stepping : 0
> microcode : 0x6000832
>
> If this is the issue, I'm not sure we want to be in the business of
> working around localized microcode bugs and, if we do, then I think we
> should explicitly detect the bug and log about it.

seems like the issue we hit some time ago:
http://marc.info/?l=linux-kernel&m=143976421117070&w=2

jirka

2015-12-10 19:25:39

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registered

On Thu, Dec 10, 2015 at 08:20:24PM +0100, Jiri Olsa wrote:
> seems like the issue we hit some time ago:
> http://marc.info/?l=linux-kernel&m=143976421117070&w=2

Time to ping people again. Looks like the previous thread stalled...

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2015-12-10 20:51:46

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On Thu, 10 Dec 2015, Andy Lutomirski wrote:
> On brief inspection, this smells like a microcode bug. Can you send
> /proc/cpuinfo output?
>
> If this is the issue, I'm not sure we want to be in the business of
> working around localized microcode bugs and, if we do, then I think we
> should explicitly detect the bug and log about it.

I think we should handle such stuff gracefully. Yes, we should log it
and we also should check what the contents of the debug registers are.

If dr7 has a break point enabled w/o perf having one installed then we
know that someone did a horrible hackery ....

Thanks,

tglx

2015-12-10 21:09:39

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On Thu, Dec 10, 2015 at 12:49 PM, Thomas Gleixner <[email protected]> wrote:
> On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>> On brief inspection, this smells like a microcode bug. Can you send
>> /proc/cpuinfo output?
>>
>> If this is the issue, I'm not sure we want to be in the business of
>> working around localized microcode bugs and, if we do, then I think we
>> should explicitly detect the bug and log about it.
>
> I think we should handle such stuff gracefully. Yes, we should log it
> and we also should check what the contents of the debug registers are.
>
> If dr7 has a break point enabled w/o perf having one installed then we
> know that someone did a horrible hackery ....

I mis-read this. I don't think this is the microcode bug.

Do we know what the actual problem is? Jeff, how did you trigger this?

If it's lazy DR switching (which I haven't looked at the details of),
then it seems that this could just be triggered by some unfortunate
combination of perf config and context switching). But, if so, then I
think that the proposed fix is wrong -- shouldn't we fix dr7 rather
than fudging RF to work around this instance of the problem? After
all, if we're hitting this condition in a tight userspace loop, we're
going to destroy performance unless we fix dr7.

--Andy

2015-12-10 21:11:58

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

There's no output to send when the bug occurs, the system locks up.
Here is the CPU info in any event for the affected system.

processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9850 Quad-Core Processor
stepping : 3
microcode : 0x1000095
cpu MHz : 1250.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
hw_pstate npt lbrv svm_lock vmmcall
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs
bogomips : 5022.82
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 1
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9850 Quad-Core Processor
stepping : 3
microcode : 0x1000095
cpu MHz : 1250.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
hw_pstate npt lbrv svm_lock vmmcall
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs
bogomips : 5022.82
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 2
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9850 Quad-Core Processor
stepping : 3
microcode : 0x1000095
cpu MHz : 1250.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
hw_pstate npt lbrv svm_lock vmmcall
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs
bogomips : 5022.82
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor : 3
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9850 Quad-Core Processor
stepping : 3
microcode : 0x1000095
cpu MHz : 1250.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
hw_pstate npt lbrv svm_lock vmmcall
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs
bogomips : 5022.82
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate



On 12/10/15, Thomas Gleixner <[email protected]> wrote:
> On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>> On brief inspection, this smells like a microcode bug. Can you send
>> /proc/cpuinfo output?
>>
>> If this is the issue, I'm not sure we want to be in the business of
>> working around localized microcode bugs and, if we do, then I think we
>> should explicitly detect the bug and log about it.
>
> I think we should handle such stuff gracefully. Yes, we should log it
> and we also should check what the contents of the debug registers are.
>
> If dr7 has a break point enabled w/o perf having one installed then we
> know that someone did a horrible hackery ....
>
> Thanks,
>
> tglx
>
>
>

2015-12-10 21:16:27

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On Thu, 10 Dec 2015, Andy Lutomirski wrote:
> On Thu, Dec 10, 2015 at 12:49 PM, Thomas Gleixner <[email protected]> wrote:
> > On Thu, 10 Dec 2015, Andy Lutomirski wrote:
> >> On brief inspection, this smells like a microcode bug. Can you send
> >> /proc/cpuinfo output?
> >>
> >> If this is the issue, I'm not sure we want to be in the business of
> >> working around localized microcode bugs and, if we do, then I think we
> >> should explicitly detect the bug and log about it.
> >
> > I think we should handle such stuff gracefully. Yes, we should log it
> > and we also should check what the contents of the debug registers are.
> >
> > If dr7 has a break point enabled w/o perf having one installed then we
> > know that someone did a horrible hackery ....
>
> I mis-read this. I don't think this is the microcode bug.
>
> Do we know what the actual problem is? Jeff, how did you trigger this?
>
> If it's lazy DR switching (which I haven't looked at the details of),
> then it seems that this could just be triggered by some unfortunate
> combination of perf config and context switching). But, if so, then I
> think that the proposed fix is wrong -- shouldn't we fix dr7 rather
> than fudging RF to work around this instance of the problem? After
> all, if we're hitting this condition in a tight userspace loop, we're
> going to destroy performance unless we fix dr7.

Right, if dr7 contains a valid breakpoint and perf does not have one
installed then either that lazy DR stuff is flaky or someone else
fiddled with dr7. In both cases we want to fix dr7 and yell about it.

If we neither have a breakpoint nor dr7 has one enabled then its
something spurious and we certainly want to set RF so the machine can
make progress.

Thanks,

tglx

2015-12-10 21:16:43

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
set an execute breakpoint without going through
arch_install_hw_breakpoint. When the breakpoint fires, the system
crashes and hangs on the processor stuck in an endless loop inside the
int1 handler in hw_breakpoint.c --



static int hw_breakpoint_handler(struct die_args *args)
{
int i, cpu, rc = NOTIFY_STOP;
struct perf_event *bp;
unsigned long dr7, dr6;
unsigned long *dr6_p;

/* The DR6 value is pointed by args->err */
dr6_p = (unsigned long *)ERR_PTR(args->err);
dr6 = *dr6_p;

/* If it's a single step, TRAP bits are random */
if (dr6 & DR_STEP)
return NOTIFY_DONE;

/* Do an early return if no trap bits are set in DR6 */
if ((dr6 & DR_TRAP_BITS) == 0)
return NOTIFY_DONE;

get_debugreg(dr7, 7);
/* Disable breakpoints during exception handling */
set_debugreg(0UL, 7);
/*
* Assert that local interrupts are disabled
* Reset the DRn bits in the virtualized register value.
* The ptrace trigger routine will add in whatever is needed.
*/
current->thread.debugreg6 &= ~DR_TRAP_BITS;
cpu = get_cpu();

/* Handle all the breakpoints that were triggered */
for (i = 0; i < HBP_NUM; ++i) {
if (likely(!(dr6 & (DR_TRAP0 << i))))
continue;

/*
* The counter may be concurrently released but that can only
* occur from a call_rcu() path. We can then safely fetch
* the breakpoint, use its callback, touch its counter
* while we are in an rcu_read_lock() path.
*/
rcu_read_lock();

bp = per_cpu(bp_per_reg[i], cpu);
/*
* Reset the 'i'th TRAP bit in dr6 to denote completion of
* exception handling
*/
(*dr6_p) &= ~(DR_TRAP0 << i);
/*
* bp can be NULL due to lazy debug register switching
* or due to concurrent perf counter removing.
*/
if (!bp) {
rcu_read_unlock();
break;
}

perf_bp_event(bp, args->regs);

/*
* Set up resume flag to avoid breakpoint recursion when
* returning back to origin.
*/
if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE)
args->regs->flags |= X86_EFLAGS_RF;

rcu_read_unlock();
}
/*
* Further processing in do_debug() is needed for a) user-space
* breakpoints (to generate signals) and b) when the system has
* taken exception due to multiple causes
*/
if ((current->thread.debugreg6 & DR_TRAP_BITS) ||
(dr6 & (~DR_TRAP_BITS)))
rc = NOTIFY_DONE;

set_debugreg(dr7, 7);
put_cpu();

return rc;
}


The check for en execute breakpoint to set the resume flag relies on
whether someone has registered a pevent bp event and should rely on
what the hell is in the dr7 register, otherwise, the resume flag never
gets set and the system just keeps getting the same int1 breakpoint
firing off from the hardware over and over again at the same address
and it result in a hard hang.


On 12/10/15, Andy Lutomirski <[email protected]> wrote:
> On Thu, Dec 10, 2015 at 12:49 PM, Thomas Gleixner <[email protected]>
> wrote:
>> On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>>> On brief inspection, this smells like a microcode bug. Can you send
>>> /proc/cpuinfo output?
>>>
>>> If this is the issue, I'm not sure we want to be in the business of
>>> working around localized microcode bugs and, if we do, then I think we
>>> should explicitly detect the bug and log about it.
>>
>> I think we should handle such stuff gracefully. Yes, we should log it
>> and we also should check what the contents of the debug registers are.
>>
>> If dr7 has a break point enabled w/o perf having one installed then we
>> know that someone did a horrible hackery ....
>
> I mis-read this. I don't think this is the microcode bug.
>
> Do we know what the actual problem is? Jeff, how did you trigger this?
>
> If it's lazy DR switching (which I haven't looked at the details of),
> then it seems that this could just be triggered by some unfortunate
> combination of perf config and context switching). But, if so, then I
> think that the proposed fix is wrong -- shouldn't we fix dr7 rather
> than fudging RF to work around this instance of the problem? After
> all, if we're hitting this condition in a tight userspace loop, we're
> going to destroy performance unless we fix dr7.
>
> --Andy
>

2015-12-10 21:26:50

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

This could be a problem down the road for KDB and KGDB as well since
they both directly fiddle with DR7 directly and by-pass the perf
layer for late breakpoints. At present they make certain to have a
perf bp pevent registered.

On 12/10/15, Thomas Gleixner <[email protected]> wrote:
> On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>> On Thu, Dec 10, 2015 at 12:49 PM, Thomas Gleixner <[email protected]>
>> wrote:
>> > On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>> >> On brief inspection, this smells like a microcode bug. Can you send
>> >> /proc/cpuinfo output?
>> >>
>> >> If this is the issue, I'm not sure we want to be in the business of
>> >> working around localized microcode bugs and, if we do, then I think we
>> >> should explicitly detect the bug and log about it.
>> >
>> > I think we should handle such stuff gracefully. Yes, we should log it
>> > and we also should check what the contents of the debug registers are.
>> >
>> > If dr7 has a break point enabled w/o perf having one installed then we
>> > know that someone did a horrible hackery ....
>>
>> I mis-read this. I don't think this is the microcode bug.
>>
>> Do we know what the actual problem is? Jeff, how did you trigger this?
>>
>> If it's lazy DR switching (which I haven't looked at the details of),
>> then it seems that this could just be triggered by some unfortunate
>> combination of perf config and context switching). But, if so, then I
>> think that the proposed fix is wrong -- shouldn't we fix dr7 rather
>> than fudging RF to work around this instance of the problem? After
>> all, if we're hitting this condition in a tight userspace loop, we're
>> going to destroy performance unless we fix dr7.
>
> Right, if dr7 contains a valid breakpoint and perf does not have one
> installed then either that lazy DR stuff is flaky or someone else
> fiddled with dr7. In both cases we want to fix dr7 and yell about it.
>
> If we neither have a breakpoint nor dr7 has one enabled then its
> something spurious and we certainly want to set RF so the machine can
> make progress.
>
> Thanks,
>
> tglx
>

2015-12-10 22:26:45

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

I'll submit a second patch later this evening with a graceful message
when this happens. I need to look over this code path -- its in a
very mature section of code and detecting this condition (bp not being
present) may break other code paths and subsystems ifs not done very
carefully. The current dr7 check works and I have tested this code,
but does not print a message.



On 12/10/15, Thomas Gleixner <[email protected]> wrote:
> On Thu, 10 Dec 2015, Andy Lutomirski wrote:
>> On brief inspection, this smells like a microcode bug. Can you send
>> /proc/cpuinfo output?
>>
>> If this is the issue, I'm not sure we want to be in the business of
>> working around localized microcode bugs and, if we do, then I think we
>> should explicitly detect the bug and log about it.
>
> I think we should handle such stuff gracefully. Yes, we should log it
> and we also should check what the contents of the debug registers are.
>
> If dr7 has a break point enabled w/o perf having one installed then we
> know that someone did a horrible hackery ....
>
> Thanks,
>
> tglx
>
>
>

2015-12-11 08:05:33

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] Fix int1 recursion when no perf_bp_event is registeredy

I have completed testing of the following patch, added the fix and log
a message when the error occurs. I also reviewed the code paths that
touch this section of code. The lazy debug register behaviors pass
state back and forth with this subsystem through the thread.debugred6
virtualized register value to determine whether or not another
notify_die handler has been registered. The case that causes the
lockup is when NOTIFY_STOP is set which halts calls to other handlers
and also bypasses the checks at the bottom of the do_debug() function
in traps.c and the bp event handler is NULL. In this case, the int1
exception has no code path to set the resume flag absent this patch.

There are some complex behaviors in this code path that require
careful review. I used the following code to test this change to the
kernel compiled as a module. It's a pretty robust test since it sets
breakpoints at interrupt contexts and process context. Unloading the
module clears the breakpoints. I can send this test module as a
separate patch if requested to verify.

Test Module to test this patch:

#include <linux/module.h>
#include <linux/string.h>
#include <linux/time.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/blkdev.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <linux/in.h>
#include <linux/inet.h>
#include <linux/version.h>
#include <linux/random.h>
#include <linux/ctype.h>
#include <linux/freezer.h>
#include <asm/atomic.h>
#include <net/sock.h>
#include <linux/delay.h>
#include <net/checksum.h>

#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/string.h>
#include <asm/uaccess.h>
#include <linux/namei.h>
#include <linux/statfs.h>
#include <linux/backing-dev.h>
#include "linux/kallsyms.h"

#define BREAK_EXECUTE 0
#define BREAK_WRITE 1
#define BREAK_IOPORT 2
#define BREAK_READWRITE 3
#define ONE_BYTE_FIELD 0
#define TWO_BYTE_FIELD 1
#define EIGHT_BYTE_FIELD 2
#define FOUR_BYTE_FIELD 3

/* DR7 Register */

#define L0_BIT 0x00000001
#define G0_BIT 0x00000002
#define L1_BIT 0x00000004
#define G1_BIT 0x00000008
#define L2_BIT 0x00000010
#define G2_BIT 0x00000020
#define L3_BIT 0x00000040
#define G3_BIT 0x00000080
#define LEXACT 0x00000100
#define GEXACT 0x00000200
#define GDETECT 0x00002000
#define DR7DEF 0x00000400

static int __init init_leaf(void)
{
unsigned long dr7 = 0;

printk("0: %lX\n", kallsyms_lookup_name("reschedule_interrupt"));
printk("1: %lX\n", kallsyms_lookup_name("apic_timer_interrupt"));
printk("2: %lX\n", kallsyms_lookup_name("schedule"));
printk("3: %lX\n", kallsyms_lookup_name("schedule"));

set_debugreg(kallsyms_lookup_name("reschedule_interrupt"), 0);
set_debugreg(kallsyms_lookup_name("apic_timer_interrupt"), 1);
set_debugreg(kallsyms_lookup_name("schedule"), 2);
set_debugreg(kallsyms_lookup_name("schedule"), 3);

dr7 &= 0xFFF0FFFF;
dr7 |= G0_BIT;
dr7 |= ((BREAK_EXECUTE << ((0 * 4) + 16)) | (ONE_BYTE_FIELD << ((0
* 4) + 18)));
dr7 &= 0xFF0FFFFF;
dr7 |= G1_BIT;
dr7 |= ((BREAK_EXECUTE << ((1 * 4) + 16)) | (ONE_BYTE_FIELD << ((1
* 4) + 18)));
dr7 &= 0xF0FFFFFF;
dr7 |= G2_BIT;
dr7 |= ((BREAK_EXECUTE << ((2 * 4) + 16)) | (ONE_BYTE_FIELD << ((2
* 4) + 18)));
dr7 &= 0x0FFFFFFF;
dr7 |= G3_BIT;
dr7 |= ((BREAK_READWRITE << ((3 * 4) + 16)) | (ONE_BYTE_FIELD <<
((3 * 4) + 18)));

set_debugreg(dr7, 7);
return 0;
}

static void __exit cleanup_leaf(void)
{

set_debugreg(0L, 7);
return;
}

module_init(init_leaf);
module_exit(cleanup_leaf);
MODULE_LICENSE("GPL");

The log output when the breakpoints are triggered on a 4 core system
is as expected. I did not put code in that will clear the breakpoints
automatically when this condition is detected. This is an item to
discuss.

[ 852.228301] hw_breakpoint_handler: 8816 callbacks suppressed
[ 852.228314] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA
[ 852.228336] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.228366] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.228883] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA
[ 852.228887] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.228973] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.229250] INFO: spurious INT1 exception dr6: 0x2 dr7: 0x300004AA
[ 852.229274] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.229546] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA
[ 852.230260] INFO: spurious INT1 exception dr6: 0x2 dr7: 0x300004AA
[ 857.285756] hw_breakpoint_handler: 3381 callbacks suppressed
[ 857.285764] INFO: spurious INT1 exception dr6: 0x1 dr7: 0x300004AA
[ 857.285780] INFO: spurious INT1 exception dr6: 0x4 dr7: 0x300004AA

Open Items:

Should we also disable these breakpoints by changing DR7 if they are
spurious or just throw up a lot of messages?

Jeff

Signed-off-by: Jeff V. Merkey <[email protected]>
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad..1613a61 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -444,7 +444,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
static int hw_breakpoint_handler(struct die_args *args)
{
int i, cpu, rc = NOTIFY_STOP;
- struct perf_event *bp;
+ struct perf_event *bp = NULL;
unsigned long dr7, dr6;
unsigned long *dr6_p;

@@ -475,6 +475,13 @@ static int hw_breakpoint_handler(struct die_args *args)
for (i = 0; i < HBP_NUM; ++i) {
if (likely(!(dr6 & (DR_TRAP0 << i))))
continue;
+ /*
+ * check if we got an execute breakpoint
+ * from the dr7 register. if we did, set
+ * the resume flag to avoid int1 recursion.
+ */
+ if ((dr7 & (3 << ((i * 4) + 16))) == 0)
+ args->regs->flags |= X86_EFLAGS_RF;

/*
* The counter may be concurrently released but that can only
@@ -503,7 +510,9 @@ static int hw_breakpoint_handler(struct die_args *args)

/*
* Set up resume flag to avoid breakpoint recursion when
- * returning back to origin.
+ * returning back to origin. Perform the check
+ * twice in case the event handler altered the
+ * system flags.
*/
if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE)
args->regs->flags |= X86_EFLAGS_RF;
@@ -519,6 +528,18 @@ static int hw_breakpoint_handler(struct die_args *args)
(dr6 & (~DR_TRAP_BITS)))
rc = NOTIFY_DONE;

+ /*
+ * if we are about to signal to
+ * do_debug() to stop further processing
+ * and we have not ascertained the source
+ * of the breakpoint, log it as spurious.
+ */
+ if (rc == NOTIFY_STOP && !bp) {
+ printk_ratelimited(KERN_INFO
+ "INFO: spurious INT1 exception dr6: 0x%lX dr7: 0x%lX\n",
+ dr6, dr7);
+ }
+
set_debugreg(dr7, 7);
put_cpu();

2015-12-11 19:04:17

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] Fix int1 recursion when no perf_bp_event is registeredy

One thing I noticed in this section of code:

/* Handle all the breakpoints that were triggered */
for (i = 0; i < HBP_NUM; ++i) {
if (likely(!(dr6 & (DR_TRAP0 << i))))
continue;

... snip ...

if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE)
args->regs->flags |= X86_EFLAGS_RF;

rcu_read_unlock();
}

Whoever wrote this loop did not seem to understand have observed the
code path in action on intel hardware. There is NEVER a case I have
seen when the hardware sends multiple breakpoint statuses through dr6,
they are sent one at a time. So the rolling check through all the
status bits is pointless since only one breakpoint will be reported by
the hardware at a time. This is not to say that someone in the future
might may change it, but these interrupts are delivered one by one in
order and if there are duplicates (like a read/write breakpoint set at
the same address as an execute breakpoint.

Does anyone know why it was coded this way because its flat wrong.

Jeff

2015-12-13 23:11:39

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] Fix int1 recursion when no perf_bp_event is registeredy

Is there anything else I need to do on this patch, is it acceptable,
do you want a v3 that disables the errant breakpoints? Problem with
that approach is if in fact there is a "lazy" breakpoint we might end
up clearing it being set if it fires off too soon.

So this is the first time I have submitted a patch through this new
process so what happens now?

2015-12-14 08:09:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy


A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

* Jeff Merkey <[email protected]> wrote:

> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
> set an execute breakpoint without going through
> arch_install_hw_breakpoint. When the breakpoint fires, the system
> crashes and hangs on the processor stuck in an endless loop inside the
> int1 handler in hw_breakpoint.c --

What is still not clear to me, can you trigger the hang not via some special
kernel driver that goes outside regular APIs and messes with the state of the
debug registers, but via the proper access methods, i.e. various user-space ABIs?

Thanks,

Ingo

2015-12-14 08:13:13

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On 12/14/15, Ingo Molnar <[email protected]> wrote:
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> * Jeff Merkey <[email protected]> wrote:
>
>> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
>> set an execute breakpoint without going through
>> arch_install_hw_breakpoint. When the breakpoint fires, the system
>> crashes and hangs on the processor stuck in an endless loop inside the
>> int1 handler in hw_breakpoint.c --
>
> What is still not clear to me, can you trigger the hang not via some special
>
> kernel driver that goes outside regular APIs and messes with the state of
> the
> debug registers, but via the proper access methods, i.e. various user-space
> ABIs?
>
> Thanks,
>
> Ingo
>

Any process that can get access to the debug registers can trigger
this condition. As it stands, if restricted to the established API in
hw_breakpoint.c this bug should not occur unless someone triggers an
errant breakpoint. That being said, there is a severe bug in the code
path if for some reason an application triggers a breakpoint exception
and no event has been registered, the system will crash with no logged
output of any kind indicating why it happened. It's not severe enough
for a panic but does need to be handled gracefully just as exception
handling 101.

Sorry about the top posting. I forget sometimes.

Jeff

2015-12-14 08:26:47

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On 12/14/15, Jeff Merkey <[email protected]> wrote:
> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>
>> A: Because it messes up the order in which people normally read text.
>> Q: Why is top-posting such a bad thing?
>> A: Top-posting.
>> Q: What is the most annoying thing in e-mail?
>>
>> * Jeff Merkey <[email protected]> wrote:
>>
>>> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
>>> set an execute breakpoint without going through
>>> arch_install_hw_breakpoint. When the breakpoint fires, the system
>>> crashes and hangs on the processor stuck in an endless loop inside the
>>> int1 handler in hw_breakpoint.c --
>>
>> What is still not clear to me, can you trigger the hang not via some
>> special
>>
>> kernel driver that goes outside regular APIs and messes with the state of
>> the
>> debug registers, but via the proper access methods, i.e. various
>> user-space
>> ABIs?
>>
>> Thanks,
>>
>> Ingo
>>
>
> Any process that can get access to the debug registers can trigger
> this condition. As it stands, if restricted to the established API in
> hw_breakpoint.c this bug should not occur unless someone triggers an
> errant breakpoint. That being said, there is a severe bug in the code
> path if for some reason an application triggers a breakpoint exception
> and no event has been registered, the system will crash with no logged
> output of any kind indicating why it happened. It's not severe enough
> for a panic but does need to be handled gracefully just as exception
> handling 101.
>
> Sorry about the top posting. I forget sometimes.
>
> Jeff
>

What's more problematic is that debuggers MUST zero dr7 upon entry
then restore it to prevent breakpoints while inside a debugger
console. kgdb/kdb both fiddle with dr7 directly and bypass the API in
many cases. Any kernel module can trigger a breakpoint including
userspace debuggers. Most of these cases are handled at the bottom of
do_debug and in the perf handlers. The problem here is that the
normal handling in do_debug gets short circuited when this bug gets
hit that causes the system to get stuck in an endless interrupt cycle
with int1 firing off at the same address because the resume flag never
gets set.

The v2 patch fixes the bug, prints a message. Disabling the
breakpoint in dr7 is also worrisome since bp can be null in this code
path, but when that happens there is a singaling via the
thread.debugreg6 variable that keeps the system from locking up unless
you trigger a breakpoint outside the API (which kgdb/kdb do in some
cases, though they seem to keep a "dummy" bp registered while they
fiddle with dr7).

Jeff

2015-12-14 09:28:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy


* Jeff Merkey <[email protected]> wrote:

> On 12/14/15, Ingo Molnar <[email protected]> wrote:
> >
> > A: Because it messes up the order in which people normally read text.
> > Q: Why is top-posting such a bad thing?
> > A: Top-posting.
> > Q: What is the most annoying thing in e-mail?
> >
> > * Jeff Merkey <[email protected]> wrote:
> >
> >> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and set an
> >> execute breakpoint without going through arch_install_hw_breakpoint. When
> >> the breakpoint fires, the system crashes and hangs on the processor stuck in
> >> an endless loop inside the int1 handler in hw_breakpoint.c --
> >
> > What is still not clear to me, can you trigger the hang not via some special
> >
> > kernel driver that goes outside regular APIs and messes with the state of the
> > debug registers, but via the proper access methods, i.e. various user-space
> > ABIs?
>
> Any process that can get access to the debug registers can trigger this
> condition. [...]

A process on an unmodified Linux kernel can only modify debug registers via the
proper APIs:

> [...] As it stands, if restricted to the established API in hw_breakpoint.c
> this bug should not occur unless someone triggers an errant breakpoint. [...]

So am I interpreting your report correctly:

"If the Linux kernel is modified to change debug registers without using the
proper APIs (such as loading a module that changes hardware registers in a raw
fashion), things may break and a difficult to debug hang may occur."

right?

This key piece of information should have been part of the original report.

So I'm wondering, why does your module modify debug registers in a raw fashion?
Why doesn't it use the proper APIs?

Thanks,

Ingo

2015-12-14 17:52:34

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On 12/14/15, Ingo Molnar <[email protected]> wrote:
>
> * Jeff Merkey <[email protected]> wrote:
>
>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>> >
>> > A: Because it messes up the order in which people normally read text.
>> > Q: Why is top-posting such a bad thing?
>> > A: Top-posting.
>> > Q: What is the most annoying thing in e-mail?
>> >
>> > * Jeff Merkey <[email protected]> wrote:
>> >
>> >> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
>> >> set an
>> >> execute breakpoint without going through arch_install_hw_breakpoint.
>> >> When
>> >> the breakpoint fires, the system crashes and hangs on the processor
>> >> stuck in
>> >> an endless loop inside the int1 handler in hw_breakpoint.c --
>> >
>> > What is still not clear to me, can you trigger the hang not via some
>> > special
>> >
>> > kernel driver that goes outside regular APIs and messes with the state
>> > of the
>> > debug registers, but via the proper access methods, i.e. various
>> > user-space
>> > ABIs?
>>
>> Any process that can get access to the debug registers can trigger this
>> condition. [...]
>
> A process on an unmodified Linux kernel can only modify debug registers via
> the
> proper APIs:
>
>> [...] As it stands, if restricted to the established API in
>> hw_breakpoint.c
>> this bug should not occur unless someone triggers an errant breakpoint.
>> [...]
>
> So am I interpreting your report correctly:
>
> "If the Linux kernel is modified to change debug registers without using
> the
> proper APIs (such as loading a module that changes hardware registers in
> a raw
> fashion), things may break and a difficult to debug hang may occur."
>
> right?
>
> This key piece of information should have been part of the original report.
>
> So I'm wondering, why does your module modify debug registers in a raw
> fashion?
> Why doesn't it use the proper APIs?
>
> Thanks,
>
> Ingo
>

Hi Ingo,

This will be a lengthy reply to properly explain this to you. First
some fundamental assumptions to clear up.

1. The MDB Debugger Module does not cause this problem. This is an
existing bug in the kernel in an exception code path.
2. This bug was discovered and triggered while running a TEST HARNESS
I use to test the debugger. Among other things, I check for
unregistered breakpoints while performing tests of debugging
blacked-out sections of the OS in a special mode the debugger can
employ called DIRECT MODE. In normal mode the MDB Debugger uses the
established breakpoint API.
3. The Breakpoint API in linux was not designed for debuggers. It
was designed for probe and application profiling. It has no concept
of global SMP breakpoints, no facilities to manage them, no on/off
settings for debugger entry conditions, requiring kernel debuggers to
do it themselves.
4. I handle all of these cases correctly as does kgdb and kdb and
have to use this severely deficient interface if not in direct mode.
5. Direct mode in a debugger allows the debugger to essentially be a
sliding window over sections of code that are normally blacked out.
For example, if placed in direct mode, MDB can debug the linux
debugger API itself because it is not calling it and is using the
registers directly without a software layer the debugger is dependent
on. Using this mode, its possible to debug across interrupts,
syscalls, and areas of the OS normally "blacked out" to the debugger.
6. This interface has a bug in its main execution path for handling
int1 exceptions. It asks the OS whether or not a breakpoint is an
execute breakpoint rather than querying the hardware dr7 register
which just delivered the interrupt, and since no bp is present, the
system hangs at the same address getting the interrupt over and over
again because the resume flag was never set.
7. The way an int1 exception works ingo is when the address of an
execute breakpoint is hit, the processor will interrupt through the
int1 handler and will keep asserting the same interrupt over and over
again unless the resume flag is set and reloaded into the processor.
This is how intel processors are designed to work.
8. If any process or module sets a breakpoint outside of linux
breakpoint API in this code path the system will crash. Its A BUG,
and it's been in linux 13 years. I am certain people have seen it
while running perf stuff but since it provides no diagnostic info,
someone would just reset the system.
9. This breakpoint API needs to be rewritten to be global breakpoint
aware, have an on/off switch so when a debugger enters an int1
exception, breakpoints are globally disabled (a requirement), among
other things.

The patch simply fixes the bug in the int handler that will cause a
lockup. The perf event system, kgdb, kdb, and any one of a number of
programs can trigger this bug, and probably have. People would blame
the debugger when its a bug in the int handler.

Jeff

2015-12-14 17:56:25

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On Mon, Dec 14, 2015 at 9:52 AM, Jeff Merkey <[email protected]> wrote:
> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>
>> * Jeff Merkey <[email protected]> wrote:
>>
>>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>> >
>>> > A: Because it messes up the order in which people normally read text.
>>> > Q: Why is top-posting such a bad thing?
>>> > A: Top-posting.
>>> > Q: What is the most annoying thing in e-mail?
>>> >
>>> > * Jeff Merkey <[email protected]> wrote:
>>> >
>>> >> I trigger it by writing to the dr7 and dr1, 2, 3 or four register and
>>> >> set an
>>> >> execute breakpoint without going through arch_install_hw_breakpoint.
>>> >> When
>>> >> the breakpoint fires, the system crashes and hangs on the processor
>>> >> stuck in
>>> >> an endless loop inside the int1 handler in hw_breakpoint.c --
>>> >
>>> > What is still not clear to me, can you trigger the hang not via some
>>> > special
>>> >
>>> > kernel driver that goes outside regular APIs and messes with the state
>>> > of the
>>> > debug registers, but via the proper access methods, i.e. various
>>> > user-space
>>> > ABIs?
>>>
>>> Any process that can get access to the debug registers can trigger this
>>> condition. [...]
>>
>> A process on an unmodified Linux kernel can only modify debug registers via
>> the
>> proper APIs:
>>
>>> [...] As it stands, if restricted to the established API in
>>> hw_breakpoint.c
>>> this bug should not occur unless someone triggers an errant breakpoint.
>>> [...]
>>
>> So am I interpreting your report correctly:
>>
>> "If the Linux kernel is modified to change debug registers without using
>> the
>> proper APIs (such as loading a module that changes hardware registers in
>> a raw
>> fashion), things may break and a difficult to debug hang may occur."
>>
>> right?
>>
>> This key piece of information should have been part of the original report.
>>
>> So I'm wondering, why does your module modify debug registers in a raw
>> fashion?
>> Why doesn't it use the proper APIs?
>>
>> Thanks,
>>
>> Ingo
>>
>
> Hi Ingo,
>
> This will be a lengthy reply to properly explain this to you. First
> some fundamental assumptions to clear up.
>
> 1. The MDB Debugger Module does not cause this problem. This is an
> existing bug in the kernel in an exception code path.

> 8. If any process or module sets a breakpoint outside of linux
> breakpoint API in this code path the system will crash. Its A BUG,
> and it's been in linux 13 years. I am certain people have seen it
> while running perf stuff but since it provides no diagnostic info,
> someone would just reset the system.

Putting "BUG" in caps doesn't make it so. What's wrong with it?

> 9. This breakpoint API needs to be rewritten to be global breakpoint
> aware, have an on/off switch so when a debugger enters an int1
> exception, breakpoints are globally disabled (a requirement), among
> other things.

A "requirement" for what?

>
> The patch simply fixes the bug in the int handler that will cause a
> lockup. The perf event system, kgdb, kdb, and any one of a number of
> programs can trigger this bug, and probably have. People would blame
> the debugger when its a bug in the int handler.

You ignored feedback from me and from tglx, and you still haven't
explained why this is a bug in the first place. Maybe the code could
degrade more gracefully if you use it wrong, but the int1 handler and
the rest of the kernel are very much aware of each other, and the int1
handler's failure to do what something that isn't in the kernel wants
it to do isn't a bug.

If you submit a clean patch to improve robustness of the handler and
if the new code is at least as clean as the old code, that might be a
different story.

--Andy

2015-12-14 18:16:55

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On 12/14/15, Andy Lutomirski <[email protected]> wrote:
> On Mon, Dec 14, 2015 at 9:52 AM, Jeff Merkey <[email protected]> wrote:
>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>>
>>> * Jeff Merkey <[email protected]> wrote:
>>>
>>>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>>> >
>>>> > A: Because it messes up the order in which people normally read text.
>>>> > Q: Why is top-posting such a bad thing?
>>>> > A: Top-posting.
>>>> > Q: What is the most annoying thing in e-mail?
>>>> >
>>>> > * Jeff Merkey <[email protected]> wrote:
>>>> >
>>>> >> I trigger it by writing to the dr7 and dr1, 2, 3 or four register
>>>> >> and
>>>> >> set an
>>>> >> execute breakpoint without going through arch_install_hw_breakpoint.
>>>> >> When
>>>> >> the breakpoint fires, the system crashes and hangs on the processor
>>>> >> stuck in
>>>> >> an endless loop inside the int1 handler in hw_breakpoint.c --
>>>> >
>>>> > What is still not clear to me, can you trigger the hang not via some
>>>> > special
>>>> >
>>>> > kernel driver that goes outside regular APIs and messes with the
>>>> > state
>>>> > of the
>>>> > debug registers, but via the proper access methods, i.e. various
>>>> > user-space
>>>> > ABIs?
>>>>
>>>> Any process that can get access to the debug registers can trigger this
>>>> condition. [...]
>>>
>>> A process on an unmodified Linux kernel can only modify debug registers
>>> via
>>> the
>>> proper APIs:
>>>
>>>> [...] As it stands, if restricted to the established API in
>>>> hw_breakpoint.c
>>>> this bug should not occur unless someone triggers an errant breakpoint.
>>>> [...]
>>>
>>> So am I interpreting your report correctly:
>>>
>>> "If the Linux kernel is modified to change debug registers without
>>> using
>>> the
>>> proper APIs (such as loading a module that changes hardware registers
>>> in
>>> a raw
>>> fashion), things may break and a difficult to debug hang may occur."
>>>
>>> right?
>>>
>>> This key piece of information should have been part of the original
>>> report.
>>>
>>> So I'm wondering, why does your module modify debug registers in a raw
>>> fashion?
>>> Why doesn't it use the proper APIs?
>>>
>>> Thanks,
>>>
>>> Ingo
>>>
>>
>> Hi Ingo,
>>
>> This will be a lengthy reply to properly explain this to you. First
>> some fundamental assumptions to clear up.
>>
>> 1. The MDB Debugger Module does not cause this problem. This is an
>> existing bug in the kernel in an exception code path.
>
>> 8. If any process or module sets a breakpoint outside of linux
>> breakpoint API in this code path the system will crash. Its A BUG,
>> and it's been in linux 13 years. I am certain people have seen it
>> while running perf stuff but since it provides no diagnostic info,
>> someone would just reset the system.
>
> Putting "BUG" in caps doesn't make it so. What's wrong with it?

Sorry about that.

>
>> 9. This breakpoint API needs to be rewritten to be global breakpoint
>> aware, have an on/off switch so when a debugger enters an int1
>> exception, breakpoints are globally disabled (a requirement), among
>> other things.
>
> A "requirement" for what?
>

When you enter a debugger console you must disable all breakpoints
globally or you will get nested breakpoints inside the debugger. kgdb
and kdb do not handle this well. MDB handles it but sooner or later
you will run out of stack space if you allow it. Intel documentation
says it's a no no based on how they designed the debug facilities in
their processors ...

>>
>> The patch simply fixes the bug in the int handler that will cause a
>> lockup. The perf event system, kgdb, kdb, and any one of a number of
>> programs can trigger this bug, and probably have. People would blame
>> the debugger when its a bug in the int handler.
>
> You ignored feedback from me and from tglx, and you still haven't
> explained why this is a bug in the first place. Maybe the code could
> degrade more gracefully if you use it wrong, but the int1 handler and
> the rest of the kernel are very much aware of each other, and the int1
> handler's failure to do what something that isn't in the kernel wants
> it to do isn't a bug.
>
> If you submit a clean patch to improve robustness of the handler and
> if the new code is at least as clean as the old code, that might be a
> different story.
>
> --Andy
>

Responded to the rest of it.

2015-12-14 18:18:25

by Jeff Merkey

[permalink] [raw]
Subject: Re: [PATCH 1/1] Fix int1 recursion when no perf_bp_event is registeredy

On 12/14/15, Andy Lutomirski <[email protected]> wrote:
> On Mon, Dec 14, 2015 at 9:52 AM, Jeff Merkey <[email protected]> wrote:
>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>>
>>> * Jeff Merkey <[email protected]> wrote:
>>>
>>>> On 12/14/15, Ingo Molnar <[email protected]> wrote:
>>>> >
>>>> > A: Because it messes up the order in which people normally read text.
>>>> > Q: Why is top-posting such a bad thing?
>>>> > A: Top-posting.
>>>> > Q: What is the most annoying thing in e-mail?
>>>> >
>>>> > * Jeff Merkey <[email protected]> wrote:
>>>> >
>>>> >> I trigger it by writing to the dr7 and dr1, 2, 3 or four register
>>>> >> and
>>>> >> set an
>>>> >> execute breakpoint without going through arch_install_hw_breakpoint.
>>>> >> When
>>>> >> the breakpoint fires, the system crashes and hangs on the processor
>>>> >> stuck in
>>>> >> an endless loop inside the int1 handler in hw_breakpoint.c --
>>>> >
>>>> > What is still not clear to me, can you trigger the hang not via some
>>>> > special
>>>> >
>>>> > kernel driver that goes outside regular APIs and messes with the
>>>> > state
>>>> > of the
>>>> > debug registers, but via the proper access methods, i.e. various
>>>> > user-space
>>>> > ABIs?
>>>>
>>>> Any process that can get access to the debug registers can trigger this
>>>> condition. [...]
>>>
>>> A process on an unmodified Linux kernel can only modify debug registers
>>> via
>>> the
>>> proper APIs:
>>>
>>>> [...] As it stands, if restricted to the established API in
>>>> hw_breakpoint.c
>>>> this bug should not occur unless someone triggers an errant breakpoint.
>>>> [...]
>>>
>>> So am I interpreting your report correctly:
>>>
>>> "If the Linux kernel is modified to change debug registers without
>>> using
>>> the
>>> proper APIs (such as loading a module that changes hardware registers
>>> in
>>> a raw
>>> fashion), things may break and a difficult to debug hang may occur."
>>>
>>> right?
>>>
>>> This key piece of information should have been part of the original
>>> report.
>>>
>>> So I'm wondering, why does your module modify debug registers in a raw
>>> fashion?
>>> Why doesn't it use the proper APIs?
>>>
>>> Thanks,
>>>
>>> Ingo
>>>
>>
>> Hi Ingo,
>>
>> This will be a lengthy reply to properly explain this to you. First
>> some fundamental assumptions to clear up.
>>
>> 1. The MDB Debugger Module does not cause this problem. This is an
>> existing bug in the kernel in an exception code path.
>
>> 8. If any process or module sets a breakpoint outside of linux
>> breakpoint API in this code path the system will crash. Its A BUG,
>> and it's been in linux 13 years. I am certain people have seen it
>> while running perf stuff but since it provides no diagnostic info,
>> someone would just reset the system.
>
> Putting "BUG" in caps doesn't make it so. What's wrong with it?
>
>> 9. This breakpoint API needs to be rewritten to be global breakpoint
>> aware, have an on/off switch so when a debugger enters an int1
>> exception, breakpoints are globally disabled (a requirement), among
>> other things.
>
> A "requirement" for what?
>
>>
>> The patch simply fixes the bug in the int handler that will cause a
>> lockup. The perf event system, kgdb, kdb, and any one of a number of
>> programs can trigger this bug, and probably have. People would blame
>> the debugger when its a bug in the int handler.
>
> You ignored feedback from me and from tglx, and you still haven't
> explained why this is a bug in the first place.


It crashes the system -- crash = bug -- right?

Maybe the code could
> degrade more gracefully if you use it wrong, but the int1 handler and
> the rest of the kernel are very much aware of each other, and the int1
> handler's failure to do what something that isn't in the kernel wants
> it to do isn't a bug.
>
> If you submit a clean patch to improve robustness of the handler and
> if the new code is at least as clean as the old code, that might be a
> different story.
>
> --Andy
>