If an external debugger sets a breakpoint on a kernel function while
the device is still in the bootloader, the breakpoint is wiped out by
hw_breakpoint_reset() when the kernel boots. To fix this, check
MDSCR_EL1.HDE in hw_breakpoint_reset(). When MDSCR_EL1.HDE is 0b1,
halting debug is enabled, so don't reset the debug registers.
Signed-off-by: Tingwei Zhang <[email protected]>
---
arch/arm64/include/asm/debug-monitors.h | 1 +
arch/arm64/kernel/hw_breakpoint.c | 19 +++++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 7619f473155f..8dc2c28791a0 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -18,6 +18,7 @@
/* MDSCR_EL1 enabling bits */
#define DBG_MDSCR_KDE (1 << 13)
+#define DBG_MDSCR_HDE (1 << 14)
#define DBG_MDSCR_MDE (1 << 15)
#define DBG_MDSCR_MASK ~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 0b727edf4104..0180306f74d7 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -927,6 +927,17 @@ void hw_breakpoint_thread_switch(struct task_struct *next)
!next_debug_info->wps_disabled);
}
+/*
+ * Check if halting debug mode is enabled.
+ */
+static u32 hde_enabled(void)
+{
+	u32 mdscr;
+
+	asm volatile("mrs %0, mdscr_el1" : "=r" (mdscr));
+	return (mdscr & DBG_MDSCR_HDE);
+}
+
/*
* CPU initialisation.
*/
@@ -934,6 +945,14 @@ static int hw_breakpoint_reset(unsigned int cpu)
{
int i;
struct perf_event **slots;
+
+	/*
+	 * When halting debug mode is enabled, a breakpoint may already have
+	 * been set by an external debugger. Don't reset the debug registers
+	 * here, to preserve the external debugger's breakpoints.
+	 */
+	if (hde_enabled())
+		return 0;
/*
* When a CPU goes through cold-boot, it does not have any installed
* slot, so it is safe to share the same function for restoring and
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
On Sat, Mar 28, 2020 at 04:32:09PM +0800, Tingwei Zhang wrote:
> If external debugger sets a breakpoint for one Kernel function
> when device is in bootloader mode and loads Kernel, this breakpoint
> will be wiped out in hw_breakpoint_reset(). To fix this, check
> MDSCR_EL1.HDE in hw_breakpoint_reset(). When MDSCR_EL1.HDE is
> 0b1, halting debug is enabled. Don't reset debug registers in this case.
I don't think this is sufficient, because the kernel can still
subsequently mess with breakpoints, and the HW debugger might not be
attached at this point in time anyhow.
I reckon this should hang off the existing "nodebugmon" command line
option, and we shouldn't use HW breakpoints at all when that is passed.
Then you can pass that to prevent the kernel stomping on the external
debugger.
Will, thoughts?
Mark.
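For illustration only, the gating Mark describes could look roughly like the user-space sketch below. It is not kernel code: sim_bcr[], sim_debug_enabled and sim_hw_breakpoint_reset() are hypothetical stand-ins, and the register array only simulates the breakpoint control registers. The assumption is that the reset path checks a flag mirroring "nodebugmon" and leaves the registers untouched when the kernel has been told not to use the debug monitors.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_BRPS 6	/* hypothetical number of breakpoint slots */

/* Simulated breakpoint control registers; real code would write sysregs. */
static uint64_t sim_bcr[NUM_BRPS];

/* Hypothetical mirror of "nodebugmon": false when the option was passed. */
static bool sim_debug_enabled = true;

/*
 * Model of a reset hook that leaves the registers alone when the kernel
 * has been told not to use the debug monitors.
 * Returns the number of registers it cleared.
 */
static int sim_hw_breakpoint_reset(void)
{
	int i, cleared = 0;

	if (!sim_debug_enabled)
		return 0;	/* leave the registers to the external debugger */

	for (i = 0; i < NUM_BRPS; i++) {
		sim_bcr[i] = 0;
		cleared++;
	}
	return cleared;
}
```

With the flag cleared, a value an external debugger placed in sim_bcr[] survives the reset call; with the flag set, every slot is zeroed as before.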
On Mon, Mar 30, 2020 at 01:39:46PM +0100, Mark Rutland wrote:
> I don't think this is sufficient, because the kernel can still
> subsequently mess with breakpoints, and the HW debugger might not be
> attached at this point in time anyhow.
>
> I reckon this should hang off the existing "nodebugmon" command line
> option, and we shouldn't use HW breakpoints at all when that is passed.
> Then you can pass that to prevent the kernel stomping on the external
> debugger.
>
> Will, thoughts?
I was going to suggest the same thing, although we will also need to take
care to reset the registers if "nodebugmon" is toggled at runtime via the
"debug_enabled" file in debugfs.
Will
On 2020-03-30 21:42, Will Deacon wrote:
> I was going to suggest the same thing, although we will also need to take
> care to reset the registers if "nodebugmon" is toggled at runtime via the
> "debug_enabled" file in debugfs.
>
> Will
Thanks for the suggestion, Mark and Will. It's a great idea to use
"nodebugmon". When "nodebugmon" is set, the kernel won't change HW
breakpoints.
As for resetting the registers after "debug_enabled" is toggled, I wonder
whether we would be adding unnecessary complexity here. If we take that
approach, we would hook the "debug_enabled" interface and use
smp_call_function_single() to call hw_breakpoint_reset() on each CPU, wait
for all CPUs to finish, and then change "debug_enabled". The external
debugger clears its breakpoints when it detaches from the device and
restores them when it attaches again.
Assume "debug_enabled" is changed to 1 after the external debugger has
detached from the device: the debugger will already have cleared the
breakpoint registers. If the debugger is still attached, there is nothing
the kernel can do to stop it from restoring or programming the breakpoint
registers.
What do you think of this?
Thanks,
Tingwei
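To make the ordering described above concrete, here is a rough user-space sketch, not kernel code. The names sim_bcr, sim_reset_cpu() and sim_set_debug_enabled() are hypothetical, and the plain loop over CPUs stands in for the cross-calls that would run hw_breakpoint_reset() on each CPU and wait for completion. The only point it illustrates is that every CPU is reset before the new "debug_enabled" value is published.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_NR_CPUS 4	/* hypothetical CPU count */
#define SIM_NUM_BRPS 6	/* hypothetical breakpoint slots per CPU */

/* Simulated per-CPU breakpoint control registers. */
static uint64_t sim_bcr[SIM_NR_CPUS][SIM_NUM_BRPS];

/* Simulated "debug_enabled" flag; starts out false ("nodebugmon" passed). */
static bool sim_debug_enabled;

/* Stand-in for running hw_breakpoint_reset() on one CPU. */
static void sim_reset_cpu(int cpu)
{
	int i;

	for (i = 0; i < SIM_NUM_BRPS; i++)
		sim_bcr[cpu][i] = 0;
}

/*
 * Stand-in for the debugfs write handler: when switching from disabled
 * to enabled, reset every CPU first and only then publish the new value.
 */
static void sim_set_debug_enabled(bool enable)
{
	int cpu;

	if (enable && !sim_debug_enabled) {
		/* In the kernel this would be a cross-call that waits. */
		for (cpu = 0; cpu < SIM_NR_CPUS; cpu++)
			sim_reset_cpu(cpu);
	}
	sim_debug_enabled = enable;
}
```

As noted in the mail above, this does nothing to stop a debugger that is still attached from reprogramming the registers afterwards.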
On Tue, Mar 31, 2020 at 10:39:42AM +0800, [email protected] wrote:
> Thanks for the suggestion, Mark and Will. It's a great idea to use
> "nodebugmon". When "nodebugmon" is set, Kernel won't change HW breakpoints.
>
> For reset the registers after "debug_enabled" is toggled, I'm thinking if
> we are adding unnecessary complexity here. If we take that approach, we will
> hook "debug_enabled" interface and use smp_call_function_single() to call
> hw_breakpoint_reset() on each CPU. Wait for all CPUs' execution done and
> change "debug_enabled". External debugger would clear the breakpoints when
> it detaches the device and restores its breakpoints when attaches the
> device.
> Assume debug_enabled is changed to one after external debugger detaches the
> device. Debugger would already clear the breakpoint registers. If debugger is
> still attached, there's nothing Kernel can do to stop it restoring/programming
> the breakpoint registers.
>
> What do you think of this?
It's all a bit of a mess. Looking at it some more, why can't the external
debugger simply trap access to the debug registers using EDSCR.TDA? That
way, we don't have to change anything in the kernel.
Will
On 2020-03-31 15:41, Will Deacon wrote:
> It's all a bit of a mess. Looking at it some more, why can't the external
> debugger simply trap access to the debug registers using EDSCR.TDA? That
> way, we don't have to change anything in the kernel.
>
> Will
The external debugger already has the ability to trap accesses to the debug
registers. What do we expect the debugger to do after the core is stopped?
Skip that msr instruction and continue running?
Tingwei
On Tue, Mar 31, 2020 at 07:33:38PM +0800, [email protected] wrote:
> External debugger has the function to trap access to debug registers now.
> What do we expect debugger to do after core is stopped? Skip that msr
> instruction and continue to run?
The nicest thing to do would probably be to record all the accesses made
by the OS so that it can emulate reads and replay writes when external
debugging is over. Given that you'd still be expecting to pass "nodebugmon",
the emulation should be pretty straightforward, I think.
Will
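As a rough illustration of the record/emulate/replay idea, a debugger-side shadow of the trapped registers might look like the sketch below. Everything here is hypothetical: struct dbg_shadow, the register indexing and the three helpers are invented for the example, and returning 0 for a register the OS never wrote is an assumption rather than architected behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBG_NUM_REGS 16	/* hypothetical number of trapped debug registers */

struct dbg_shadow {
	uint64_t val[DBG_NUM_REGS];	/* last value the OS wrote */
	bool written[DBG_NUM_REGS];	/* true once the OS has written it */
};

/* Trapped OS write: record it and leave the hardware register untouched. */
static void shadow_write(struct dbg_shadow *s, unsigned int reg, uint64_t v)
{
	if (reg >= DBG_NUM_REGS)
		return;
	s->val[reg] = v;
	s->written[reg] = true;
}

/* Trapped OS read: return the last OS write, or 0 if there was none. */
static uint64_t shadow_read(const struct dbg_shadow *s, unsigned int reg)
{
	if (reg >= DBG_NUM_REGS || !s->written[reg])
		return 0;
	return s->val[reg];
}

/* On detach: replay recorded writes into hw[]; returns how many. */
static unsigned int shadow_replay(const struct dbg_shadow *s, uint64_t *hw)
{
	unsigned int reg, n = 0;

	for (reg = 0; reg < DBG_NUM_REGS; reg++) {
		if (s->written[reg]) {
			hw[reg] = s->val[reg];
			n++;
		}
	}
	return n;
}
```

While the debugger is attached, the OS sees its own values on read-back and the debugger's breakpoints stay in the real registers; the OS values reach the hardware only at replay time.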
On 2020-03-31 19:45, Will Deacon wrote:
> The nicest thing to do would probably be to record all the accesses made
> by the OS so that it can emulate reads and replay writes when external
> debugging is over. Given that you'd still be expecting to pass "nodebugmon",
> the emulation should be pretty straightforward, I think.
>
> Will
Will,
To provide an update: I've worked with the external debugger vendor on
this. The external debugger can now trap writes to the debug registers and
ignore them. This is the first step.
Thanks,
Tingwei
On Tue, Apr 21, 2020 at 11:49:11AM +0800, [email protected] wrote:
> To provide an update on this, I've worked with external debugger vendor on
> this. Now external debugger can trap the write to debug registers and ignore
> the write. This is the first step.
Thanks for the update! Please let us know if you run into any unforeseen
problems.
Will