2020-07-16 15:23:08

by Daniel Thompson

Subject: [PATCH v2 0/3] kgdb: Honour the kprobe blacklist when setting breakpoints

kgdb has traditionally adopted a no safety rails approach to breakpoint
placement. If the debugger is commanded to place a breakpoint at an
address then it will do so even if that breakpoint results in kgdb
becoming inoperable.

A stop-the-world debugger with memory peek/poke intrinsically provides
its operator with the means to hose their system in all manner of
exciting ways (not least because stopping-the-world is already a DoS
attack ;-) ). Nevertheless the current no safety rail approach is
difficult to defend, especially given kprobes can provide us with plenty
of machinery to mark the parts of the kernel where breakpointing is
discouraged.

This patchset introduces some safety rails by using the existing kprobes
infrastructure and ensures this will be enabled by default on
architectures that implement kprobes. At present it does not cover
absolutely all locations where breakpoints can cause trouble but it will
block off several avenues, including the architecture specific parts
that are handled by arch_within_kprobe_blacklist().
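
As a rough illustration of the kind of check this series relies on, the
sketch below models a blocklist as a table of address ranges and refuses
breakpoint requests that fall inside one. It is a user-space toy and not
the kernel code: toy_range, toy_within_blocklist(),
toy_validate_break_address() and all of the addresses are made up for the
example. In the series itself the lookup is delegated to
within_kprobe_blacklist().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one blocklisted function: [start, end). */
struct toy_range {
	unsigned long start;
	unsigned long end;
};

/* Made-up addresses, for illustration only. */
static const struct toy_range toy_blocklist[] = {
	{ 0x1000, 0x1040 },
	{ 0x2000, 0x2100 },
};

static bool toy_within_blocklist(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(toy_blocklist) / sizeof(toy_blocklist[0]); i++) {
		if (addr >= toy_blocklist[i].start &&
		    addr < toy_blocklist[i].end)
			return true;
	}
	return false;
}

/* Same shape as the new check: refuse early with -EINVAL. */
int toy_validate_break_address(unsigned long addr)
{
	if (toy_within_blocklist(addr))
		return -EINVAL;
	return 0;
}

/* Usage: addresses inside a range are refused, everything else passes. */
void toy_blocklist_demo(void)
{
	assert(toy_validate_break_address(0x1010) == -EINVAL);
	assert(toy_validate_break_address(0x3000) == 0);
}
```

The range ends are exclusive in this toy, so the first byte after a
blocked function is treated as fair game.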


Daniel Thompson (3):
kgdb: Honour the kprobe blocklist when setting breakpoints
kgdb: Use the kprobe blocklist to limit single stepping
kgdb: Add NOKPROBE labels on the trap handler functions

include/linux/kgdb.h | 19 +++++++++++++++++++
kernel/debug/debug_core.c | 25 +++++++++++++++++++++++++
kernel/debug/gdbstub.c | 10 +++++++++-
kernel/debug/kdb/kdb_bp.c | 17 +++++++++++------
kernel/debug/kdb/kdb_main.c | 10 ++++++++--
lib/Kconfig.kgdb | 14 ++++++++++++++
6 files changed, 86 insertions(+), 9 deletions(-)

--
2.25.4


2020-07-16 15:23:14

by Daniel Thompson

Subject: [PATCH v2 3/3] kgdb: Add NOKPROBE labels on the trap handler functions

Currently kgdb honours the kprobe blocklist but doesn't place its own
trap handling code on the list. Add labels to discourage attempting to
use kgdb to debug itself.

These changes do not make it impossible to provoke recursive trapping
since they do not cover all the calls that can be made on kgdb's entry
logic. However going much further whilst we are sharing the kprobe
blocklist risks reducing the capabilities of kprobe and this would be a
bad trade off (especially so given kgdb's users are currently conditioned
to avoid recursive traps).

Signed-off-by: Daniel Thompson <[email protected]>
---
kernel/debug/debug_core.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 4b59bcc90c5d..b056afb1beec 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -183,6 +183,7 @@ int __weak kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
return copy_to_kernel_nofault((char *)bpt->bpt_addr,
(char *)bpt->saved_instr, BREAK_INSTR_SIZE);
}
+NOKPROBE_SYMBOL(kgdb_arch_remove_breakpoint);

int __weak kgdb_validate_break_address(unsigned long addr)
{
@@ -315,6 +316,7 @@ static void kgdb_flush_swbreak_addr(unsigned long addr)
/* Force flush instruction cache if it was outside the mm */
flush_icache_range(addr, addr + BREAK_INSTR_SIZE);
}
+NOKPROBE_SYMBOL(kgdb_flush_swbreak_addr);

/*
* SW breakpoint management:
@@ -405,6 +407,7 @@ int dbg_deactivate_sw_breakpoints(void)
}
return ret;
}
+NOKPROBE_SYMBOL(dbg_deactivate_sw_breakpoints);

int dbg_remove_sw_break(unsigned long addr)
{
@@ -573,6 +576,7 @@ static int kgdb_reenter_check(struct kgdb_state *ks)

return 1;
}
+NOKPROBE_SYMBOL(kgdb_reenter_check);

static void dbg_touch_watchdogs(void)
{
@@ -811,6 +815,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,

return kgdb_info[cpu].ret_state;
}
+NOKPROBE_SYMBOL(kgdb_cpu_enter);

/*
* kgdb_handle_exception() - main entry point from a kernel exception
@@ -855,6 +860,7 @@ kgdb_handle_exception(int evector, int signo, int ecode, struct pt_regs *regs)
arch_kgdb_ops.enable_nmi(1);
return ret;
}
+NOKPROBE_SYMBOL(kgdb_handle_exception);

/*
* GDB places a breakpoint at this function to know dynamically loaded objects.
@@ -889,6 +895,7 @@ int kgdb_nmicallback(int cpu, void *regs)
#endif
return 1;
}
+NOKPROBE_SYMBOL(kgdb_nmicallback);

int kgdb_nmicallin(int cpu, int trapnr, void *regs, int err_code,
atomic_t *send_ready)
@@ -914,6 +921,7 @@ int kgdb_nmicallin(int cpu, int trapnr, void *regs, int err_code,
#endif
return 1;
}
+NOKPROBE_SYMBOL(kgdb_nmicallin);

static void kgdb_console_write(struct console *co, const char *s,
unsigned count)
--
2.25.4

2020-07-16 15:23:18

by Daniel Thompson

Subject: [PATCH v2 1/3] kgdb: Honour the kprobe blocklist when setting breakpoints

Currently kgdb has absolutely no safety rails in place to discourage or
prevent a user from placing a breakpoint in dangerous places such as
the debugger's own trap entry/exit and other places where it is not safe
to take synchronous traps.

Introduce a new config symbol KGDB_HONOUR_BLOCKLIST and modify the
default implementation of kgdb_validate_break_address() so that we use
the kprobe blocklist to prohibit instrumentation of critical functions
if the config symbol is set. The config symbol dependencies are set to
ensure that the blocklist will be enabled by default if we enable KGDB
and are compiling for an architecture where we HAVE_KPROBES.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Daniel Thompson <[email protected]>
---
include/linux/kgdb.h | 18 ++++++++++++++++++
kernel/debug/debug_core.c | 4 ++++
kernel/debug/kdb/kdb_bp.c | 9 +++++++++
lib/Kconfig.kgdb | 14 ++++++++++++++
4 files changed, 45 insertions(+)

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 529116b0cabe..7caba4604edc 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -16,6 +16,7 @@
#include <linux/linkage.h>
#include <linux/init.h>
#include <linux/atomic.h>
+#include <linux/kprobes.h>
#ifdef CONFIG_HAVE_ARCH_KGDB
#include <asm/kgdb.h>
#endif
@@ -323,6 +324,23 @@ extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, int err_code,
atomic_t *snd_rdy);
extern void gdbstub_exit(int status);

+/*
+ * kgdb and kprobes both use the same (kprobe) blocklist (which makes sense
+ * given they are both typically hooked up to the same trap meaning on most
+ * architectures one cannot be used to debug the other)
+ *
+ * However on architectures where kprobes is not (yet) implemented we permit
+ * breakpoints everywhere rather than blocking everything by default.
+ */
+static inline bool kgdb_within_blocklist(unsigned long addr)
+{
+#ifdef CONFIG_KGDB_HONOUR_BLOCKLIST
+ return within_kprobe_blacklist(addr);
+#else
+ return false;
+#endif
+}
+
extern int kgdb_single_step;
extern atomic_t kgdb_active;
#define in_dbg_master() \
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 9e5934780f41..133a361578dc 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -188,6 +188,10 @@ int __weak kgdb_validate_break_address(unsigned long addr)
{
struct kgdb_bkpt tmp;
int err;
+
+ if (kgdb_within_blocklist(addr))
+ return -EINVAL;
+
/* Validate setting the breakpoint and then removing it. If the
* remove fails, the kernel needs to emit a bad message because we
* are deep trouble not being able to put things back the way we
diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
index d7ebb2c79cb8..ec4940146612 100644
--- a/kernel/debug/kdb/kdb_bp.c
+++ b/kernel/debug/kdb/kdb_bp.c
@@ -306,6 +306,15 @@ static int kdb_bp(int argc, const char **argv)
if (!template.bp_addr)
return KDB_BADINT;

+ /*
+ * This check is redundant (since the breakpoint machinery should
+ * be doing the same check during kdb_bp_install) but gives the
+ * user immediate feedback.
+ */
+ diag = kgdb_validate_break_address(template.bp_addr);
+ if (diag)
+ return diag;
+
/*
* Find an empty bp structure to allocate
*/
diff --git a/lib/Kconfig.kgdb b/lib/Kconfig.kgdb
index ffa7a76de086..9d0d408f81b1 100644
--- a/lib/Kconfig.kgdb
+++ b/lib/Kconfig.kgdb
@@ -19,6 +19,20 @@ menuconfig KGDB

if KGDB

+config KGDB_HONOUR_BLOCKLIST
+ bool "KGDB: use kprobe blocklist to prohibit unsafe breakpoints"
+ depends on HAVE_KPROBES
+ select KPROBES
+ default y
+ help
+ If set to Y the debug core will use the kprobe blocklist to
+ identify symbols where it is unsafe to set breakpoints.
+ In particular this disallows instrumentation of functions
+ called during debug trap handling and thus makes it very
+ difficult to inadvertently provoke recursive trap handling.
+
+ If unsure, say Y.
+
config KGDB_SERIAL_CONSOLE
tristate "KGDB: use kgdb over the serial console"
select CONSOLE_POLL
--
2.25.4

2020-07-16 15:23:53

by Daniel Thompson

Subject: [PATCH v2 2/3] kgdb: Use the kprobe blocklist to limit single stepping

If we are running in a part of the kernel that dislikes breakpoint
debugging then it is very unlikely to be safe to single step. Add
some safety rails to prevent stepping through anything on the kprobe
blocklist.

As part of this kdb_ss() will no longer set the DOING_SS flags when it
requests a step. This is safe because this flag is already redundant,
returning KDB_CMD_SS is all that is needed to request a step (and this
saves us from having to unset the flag if the safety check fails).

Signed-off-by: Daniel Thompson <[email protected]>
---
include/linux/kgdb.h | 1 +
kernel/debug/debug_core.c | 13 +++++++++++++
kernel/debug/gdbstub.c | 10 +++++++++-
kernel/debug/kdb/kdb_bp.c | 8 ++------
kernel/debug/kdb/kdb_main.c | 10 ++++++++--
5 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 7caba4604edc..aefe823998cb 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -214,6 +214,7 @@ extern void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc);

/* Optional functions. */
extern int kgdb_validate_break_address(unsigned long addr);
+extern int kgdb_validate_single_step_address(unsigned long addr);
extern int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt);
extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 133a361578dc..4b59bcc90c5d 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -208,6 +208,19 @@ int __weak kgdb_validate_break_address(unsigned long addr)
return err;
}

+int __weak kgdb_validate_single_step_address(unsigned long addr)
+{
+ /*
+ * Disallow stepping when we are executing code that is marked
+ * as unsuitable for breakpointing... stepping won't be safe
+ * either!
+ */
+ if (kgdb_within_blocklist(addr))
+ return -EINVAL;
+
+ return 0;
+}
+
unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs)
{
return instruction_pointer(regs);
diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
index 61774aec46b4..f1c88007cc2b 100644
--- a/kernel/debug/gdbstub.c
+++ b/kernel/debug/gdbstub.c
@@ -1041,8 +1041,16 @@ int gdb_serial_stub(struct kgdb_state *ks)
if (tmp == 0)
break;
/* Fall through - on tmp < 0 */
- case 'c': /* Continue packet */
case 's': /* Single step packet */
+ error = kgdb_validate_single_step_address(
+ kgdb_arch_pc(ks->ex_vector,
+ ks->linux_regs));
+ if (error != 0) {
+ error_packet(remcom_out_buffer, error);
+ break;
+ }
+ fallthrough;
+ case 'c': /* Continue packet */
if (kgdb_contthread && kgdb_contthread != current) {
/* Can't switch threads in kgdb */
error_packet(remcom_out_buffer, -EINVAL);
diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
index ec4940146612..4853c413f579 100644
--- a/kernel/debug/kdb/kdb_bp.c
+++ b/kernel/debug/kdb/kdb_bp.c
@@ -507,18 +507,14 @@ static int kdb_bc(int argc, const char **argv)
* None.
* Remarks:
*
- * Set the arch specific option to trigger a debug trap after the next
- * instruction.
+ * KDB_CMD_SS is a command that our caller acts on to effect the step.
*/

static int kdb_ss(int argc, const char **argv)
{
if (argc != 0)
return KDB_ARGCOUNT;
- /*
- * Set trace flag and go.
- */
- KDB_STATE_SET(DOING_SS);
+
return KDB_CMD_SS;
}

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 5c7949061671..cd40bf780b93 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -1189,7 +1189,7 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
kdb_dbtrap_t db_result)
{
char *cmdbuf;
- int diag;
+ int diag, res;
struct task_struct *kdb_current =
kdb_curr_task(raw_smp_processor_id());

@@ -1346,10 +1346,16 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
}
if (diag == KDB_CMD_GO
|| diag == KDB_CMD_CPU
- || diag == KDB_CMD_SS
|| diag == KDB_CMD_KGDB)
break;

+ if (diag == KDB_CMD_SS) {
+ res = kgdb_validate_single_step_address(instruction_pointer(regs));
+ if (res == 0)
+ break;
+ diag = res;
+ }
+
if (diag)
kdb_cmderror(diag);
}
--
2.25.4

2020-07-17 13:51:31

by Masami Hiramatsu

Subject: Re: [PATCH v2 0/3] kgdb: Honour the kprobe blacklist when setting breakpoints

Hi Daniel,

On Thu, 16 Jul 2020 16:19:40 +0100
Daniel Thompson <[email protected]> wrote:

> kgdb has traditionally adopted a no safety rails approach to breakpoint
> placement. If the debugger is commanded to place a breakpoint at an
> address then it will do so even if that breakpoint results in kgdb
> becoming inoperable.
>
> A stop-the-world debugger with memory peek/poke intrinsically provides
> its operator with the means to hose their system in all manner of
> exciting ways (not least because stopping-the-world is already a DoS
> attack ;-) ). Nevertheless the current no safety rail approach is
> difficult to defend, especially given kprobes can provide us with plenty
> of machinery to mark the parts of the kernel where breakpointing is
> discouraged.
>
> This patchset introduces some safety rails by using the existing kprobes
> infrastructure and ensures this will be enabled by default on
> architectures that implement kprobes. At present it does not cover
> absolutely all locations where breakpoints can cause trouble but it will
> block off several avenues, including the architecture specific parts
> that are handled by arch_within_kprobe_blacklist().

This series looks good to me.

Acked-by: Masami Hiramatsu <[email protected]>

To fix the build error with ipw2x00 driver, please feel free to
include my fix patch.

Thank you,

>
>
> Daniel Thompson (3):
> kgdb: Honour the kprobe blocklist when setting breakpoints
> kgdb: Use the kprobe blocklist to limit single stepping
> kgdb: Add NOKPROBE labels on the trap handler functions
>
> include/linux/kgdb.h | 19 +++++++++++++++++++
> kernel/debug/debug_core.c | 25 +++++++++++++++++++++++++
> kernel/debug/gdbstub.c | 10 +++++++++-
> kernel/debug/kdb/kdb_bp.c | 17 +++++++++++------
> kernel/debug/kdb/kdb_main.c | 10 ++++++++--
> lib/Kconfig.kgdb | 14 ++++++++++++++
> 6 files changed, 86 insertions(+), 9 deletions(-)
>
> --
> 2.25.4
>


--
Masami Hiramatsu <[email protected]>

2020-07-17 22:40:34

by Doug Anderson

Subject: Re: [PATCH v2 1/3] kgdb: Honour the kprobe blocklist when setting breakpoints

Hi,

On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
<[email protected]> wrote:
>
> Currently kgdb has absolutely no safety rails in place to discourage or
> prevent a user from placing a breakpoint in dangerous places such as
> the debugger's own trap entry/exit and other places where it is not safe
> to take synchronous traps.
>
> Introduce a new config symbol KGDB_HONOUR_BLOCKLIST and modify the
> default implementation of kgdb_validate_break_address() so that we use
> the kprobe blocklist to prohibit instrumentation of critical functions
> if the config symbol is set. The config symbol dependencies are set to
> ensure that the blocklist will be enabled by default if we enable KGDB
> and are compiling for an architecture where we HAVE_KPROBES.
>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Daniel Thompson <[email protected]>
> ---
> include/linux/kgdb.h | 18 ++++++++++++++++++
> kernel/debug/debug_core.c | 4 ++++
> kernel/debug/kdb/kdb_bp.c | 9 +++++++++
> lib/Kconfig.kgdb | 14 ++++++++++++++
> 4 files changed, 45 insertions(+)

Seems reasonable to me.

Reviewed-by: Douglas Anderson <[email protected]>

2020-07-17 22:42:31

by Doug Anderson

Subject: Re: [PATCH v2 3/3] kgdb: Add NOKPROBE labels on the trap handler functions

Hi,

On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
<[email protected]> wrote:
>
> Currently kgdb honours the kprobe blocklist but doesn't place its own
> trap handling code on the list. Add labels to discourage attempting to
> use kgdb to debug itself.
>
> These changes do not make it impossible to provoke recursive trapping
> since they do not cover all the calls that can be made on kgdb's entry
> logic. However going much further whilst we are sharing the kprobe
> blocklist risks reducing the capabilities of kprobe and this would be a
> bad trade off (especially so given kgdb's users are currently conditioned
> to avoid recursive traps).
>
> Signed-off-by: Daniel Thompson <[email protected]>
> ---
> kernel/debug/debug_core.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

I could just be missing something, but...

I understand not adding "NOKPROBE_SYMBOL" to generic kernel functions
that kgdb happens to call, but I'm not quite sure I understand why all
of the kdb / kgdb code itself shouldn't be in the blocklist. I
certainly don't object to the functions you added to the blocklist, I
guess I'm just trying to understand why it's a bad idea to add more or
how you came up with the list of functions that you did.

2020-07-17 22:43:10

by Doug Anderson

Subject: Re: [PATCH v2 2/3] kgdb: Use the kprobe blocklist to limit single stepping

Hi,

On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
<[email protected]> wrote:
>
> If we are running in a part of the kernel that dislikes breakpoint
> debugging then it is very unlikely to be safe to single step. Add
> some safety rails to prevent stepping through anything on the kprobe
> blocklist.
>
> As part of this kdb_ss() will no longer set the DOING_SS flags when it
> requests a step. This is safe because this flag is already redundant,
> returning KDB_CMD_SS is all that is needed to request a step (and this
> saves us from having to unset the flag if the safety check fails).
>
> Signed-off-by: Daniel Thompson <[email protected]>
> ---
> include/linux/kgdb.h | 1 +
> kernel/debug/debug_core.c | 13 +++++++++++++
> kernel/debug/gdbstub.c | 10 +++++++++-
> kernel/debug/kdb/kdb_bp.c | 8 ++------
> kernel/debug/kdb/kdb_main.c | 10 ++++++++--
> 5 files changed, 33 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 7caba4604edc..aefe823998cb 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -214,6 +214,7 @@ extern void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc);
>
> /* Optional functions. */
> extern int kgdb_validate_break_address(unsigned long addr);
> +extern int kgdb_validate_single_step_address(unsigned long addr);
> extern int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt);
> extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 133a361578dc..4b59bcc90c5d 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -208,6 +208,19 @@ int __weak kgdb_validate_break_address(unsigned long addr)
> return err;
> }
>
> +int __weak kgdb_validate_single_step_address(unsigned long addr)
> +{
> + /*
> + * Disallow stepping when we are executing code that is marked
> + * as unsuitable for breakpointing... stepping won't be safe
> + * either!
> + */
> + if (kgdb_within_blocklist(addr))
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs)
> {
> return instruction_pointer(regs);
> diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
> index 61774aec46b4..f1c88007cc2b 100644
> --- a/kernel/debug/gdbstub.c
> +++ b/kernel/debug/gdbstub.c
> @@ -1041,8 +1041,16 @@ int gdb_serial_stub(struct kgdb_state *ks)
> if (tmp == 0)
> break;
> /* Fall through - on tmp < 0 */
> - case 'c': /* Continue packet */
> case 's': /* Single step packet */
> + error = kgdb_validate_single_step_address(
> + kgdb_arch_pc(ks->ex_vector,
> + ks->linux_regs));

I'm a little confused. Isn't this like saying "if
(i_am_standing_in_acid) dont_step_into_acid"?

Specifically you're checking the _current_ PC to see if it's in the
blocklist, right? ...but you've already (effectively) dropped into
the debugger at that location, so if it really was a problem wouldn't
we already be in trouble?

What you really want (I think?) is to know if the instruction that
you're stepping into is in the blocklist, right? ...but you can't
know that because it requires a full instruction emulator (that's why
CPUs have "single step mode").

I guess you get a marginal benefit if someone manually set their
instruction pointer to be an address in the middle of a blocklisted
function and then tried to step, but I'm not sure that's really
something we need to add code for?

It feels like the right solution is that the architecture-specific
single-step code should simply consider a single-step through a
blocklisted area to be a step through one giant instruction.


> + if (error != 0) {
> + error_packet(remcom_out_buffer, error);
> + break;
> + }
> + fallthrough;
> + case 'c': /* Continue packet */
> if (kgdb_contthread && kgdb_contthread != current) {
> /* Can't switch threads in kgdb */
> error_packet(remcom_out_buffer, -EINVAL);
> diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
> index ec4940146612..4853c413f579 100644
> --- a/kernel/debug/kdb/kdb_bp.c
> +++ b/kernel/debug/kdb/kdb_bp.c
> @@ -507,18 +507,14 @@ static int kdb_bc(int argc, const char **argv)
> * None.
> * Remarks:
> *
> - * Set the arch specific option to trigger a debug trap after the next
> - * instruction.
> + * KDB_CMD_SS is a command that our caller acts on to effect the step.
> */
>
> static int kdb_ss(int argc, const char **argv)
> {
> if (argc != 0)
> return KDB_ARGCOUNT;
> - /*
> - * Set trace flag and go.
> - */
> - KDB_STATE_SET(DOING_SS);
> +
> return KDB_CMD_SS;
> }
>
> diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
> index 5c7949061671..cd40bf780b93 100644
> --- a/kernel/debug/kdb/kdb_main.c
> +++ b/kernel/debug/kdb/kdb_main.c
> @@ -1189,7 +1189,7 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
> kdb_dbtrap_t db_result)
> {
> char *cmdbuf;
> - int diag;
> + int diag, res;
> struct task_struct *kdb_current =
> kdb_curr_task(raw_smp_processor_id());
>
> @@ -1346,10 +1346,16 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
> }
> if (diag == KDB_CMD_GO
> || diag == KDB_CMD_CPU
> - || diag == KDB_CMD_SS
> || diag == KDB_CMD_KGDB)
> break;
>
> + if (diag == KDB_CMD_SS) {
> + res = kgdb_validate_single_step_address(instruction_pointer(regs));

Is it legit to use instruction_pointer() directly? Should you be
calling kgdb_arch_pc() ...or does that just account for having just
hit a breakpoint?

2020-07-20 08:10:41

by Daniel Thompson

Subject: Re: [PATCH v2 2/3] kgdb: Use the kprobe blocklist to limit single stepping

On Fri, Jul 17, 2020 at 03:39:51PM -0700, Doug Anderson wrote:
> Hi,
>
> On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> <[email protected]> wrote:
> >
> > If we are running in a part of the kernel that dislikes breakpoint
> > debugging then it is very unlikely to be safe to single step. Add
> > some safety rails to prevent stepping through anything on the kprobe
> > blocklist.
> >
> > As part of this kdb_ss() will no longer set the DOING_SS flags when it
> > requests a step. This is safe because this flag is already redundant,
> > returning KDB_CMD_SS is all that is needed to request a step (and this
> > saves us from having to unset the flag if the safety check fails).
> >
> > Signed-off-by: Daniel Thompson <[email protected]>
> > ---
> > include/linux/kgdb.h | 1 +
> > kernel/debug/debug_core.c | 13 +++++++++++++
> > kernel/debug/gdbstub.c | 10 +++++++++-
> > kernel/debug/kdb/kdb_bp.c | 8 ++------
> > kernel/debug/kdb/kdb_main.c | 10 ++++++++--
> > 5 files changed, 33 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 7caba4604edc..aefe823998cb 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -214,6 +214,7 @@ extern void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc);
> >
> > /* Optional functions. */
> > extern int kgdb_validate_break_address(unsigned long addr);
> > +extern int kgdb_validate_single_step_address(unsigned long addr);
> > extern int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt);
> > extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
> >
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index 133a361578dc..4b59bcc90c5d 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -208,6 +208,19 @@ int __weak kgdb_validate_break_address(unsigned long addr)
> > return err;
> > }
> >
> > +int __weak kgdb_validate_single_step_address(unsigned long addr)
> > +{
> > + /*
> > + * Disallow stepping when we are executing code that is marked
> > + * as unsuitable for breakpointing... stepping won't be safe
> > + * either!
> > + */
> > + if (kgdb_within_blocklist(addr))
> > + return -EINVAL;
> > +
> > + return 0;
> > +}
> > +
> > unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs)
> > {
> > return instruction_pointer(regs);
> > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
> > index 61774aec46b4..f1c88007cc2b 100644
> > --- a/kernel/debug/gdbstub.c
> > +++ b/kernel/debug/gdbstub.c
> > @@ -1041,8 +1041,16 @@ int gdb_serial_stub(struct kgdb_state *ks)
> > if (tmp == 0)
> > break;
> > /* Fall through - on tmp < 0 */
> > - case 'c': /* Continue packet */
> > case 's': /* Single step packet */
> > + error = kgdb_validate_single_step_address(
> > + kgdb_arch_pc(ks->ex_vector,
> > + ks->linux_regs));
>
> I'm a little confused. Isn't this like saying "if
> (i_am_standing_in_acid) dont_step_into_acid"?

I describe it more as:

if (we_know_there_is_acid_nearby)
dont_step_forward

It is possible we are currently stepping in acid but it is also possible
(and reasonably likely) that we haven't stepped in it yet but will do so
soon.


> Specifically you're checking the _current_ PC to see if it's in the
> blocklist, right? ...but you've already (effectively) dropped into
> the debugger at that location, so if it really was a problem wouldn't
> we already be in trouble?

The basic use case is where someone is stepping and we reach a PC that
would be blocked for a breakpoint. This will typically be due (although
I think it does generalize) to a function call and the safety rail will
be reached after we have jumped to the blocked function but before we
actually execute any instructions within it.

Or putting it another way, there is no reason to worry if we start
somewhere "safe" and start stepping towards something on the blocklist.
We won't melt our shoes!
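
To make that scenario concrete, here is a small user-space model (not
kernel code). The "program" is a made-up list of PCs at which the CPU is
stopped, in order, and the loop checks the current PC before each step,
which is the shape of the default check in this patch. Stepping is
refused once the PC lands on the first instruction of a blocked range.
Every name and address below is invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical blocked function occupying [0x2000, 0x2010). */
#define TOY_BLOCKED_START 0x2000UL
#define TOY_BLOCKED_END   0x2010UL

static bool toy_within_blocklist(unsigned long pc)
{
	return pc >= TOY_BLOCKED_START && pc < TOY_BLOCKED_END;
}

/* Same shape as the default check: look only at the current PC. */
static int toy_validate_single_step_address(unsigned long pc)
{
	return toy_within_blocklist(pc) ? -EINVAL : 0;
}

/*
 * trace[i] is the PC the CPU is stopped at when step request i arrives.
 * Returns the index of the first refused request, or n if every request
 * in the trace was allowed.
 */
size_t toy_step_until_refused(const unsigned long *trace, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (toy_validate_single_step_address(trace[i]))
			break;	/* stopped at trace[i]; it has not executed */
	}
	return i;
}

/* Safe code calls into the blocked function on its second instruction. */
void toy_step_demo(void)
{
	const unsigned long trace[] = { 0x1000, 0x1004, 0x2000, 0x2004 };

	/* Two steps are allowed; the third request, at 0x2000, is not. */
	assert(toy_step_until_refused(trace, 4) == 2);
}
```

In this model the refusal happens with the PC at 0x2000, so no
instruction inside the blocked range has run by the time the user is
told to stop stepping.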

There are more complex cases when we drop into the debugger in the
middle of blocked code with a not-breakpoint-or-step trap. You're right
that we'd be in trouble and the debugger is probably a bit fragile.
However that certainly doesn't mean blocking stepping at this point
is a bad thing!


> What you really want (I think?) is to know if the instruction that
> you're stepping into is in the blocklist, right? ...but you can't
> know that because it requires a full instruction emulator (that's why
> CPUs have "single step mode").

As above, I don't think this is needed, but if an architecture did need
it then it could override the default implementation.


> I guess you get a marginal benefit if someone manually set their
> instruction pointer to be an address in the middle of a blocklisted
> function and then tried to step, but I'm not sure that's really
> something we need to add code for?

Perhaps off-topic given this isn't why we add the safety rails but...

I think people who directly set PC should be regarded as very
sophisticated users (and therefore do not need safety rails) so I have
little interest in honouring the blocklist for direct writes to the
PC. More generally sophisticated users should be able to find
KGDB_HONOUR_BLOCKLIST pretty quickly if they need to!


> It feels like the right solution is that the architecture-specific
> single-step code should simply consider a single-step through a
> blocklisted area to be a step through one giant instruction.

For kgdb this feature is already implemented (next or finish).


> > + if (error != 0) {
> > + error_packet(remcom_out_buffer, error);
> > + break;
> > + }
> > + fallthrough;
> > + case 'c': /* Continue packet */
> > if (kgdb_contthread && kgdb_contthread != current) {
> > /* Can't switch threads in kgdb */
> > error_packet(remcom_out_buffer, -EINVAL);
> > diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
> > index ec4940146612..4853c413f579 100644
> > --- a/kernel/debug/kdb/kdb_bp.c
> > +++ b/kernel/debug/kdb/kdb_bp.c
> > @@ -507,18 +507,14 @@ static int kdb_bc(int argc, const char **argv)
> > * None.
> > * Remarks:
> > *
> > - * Set the arch specific option to trigger a debug trap after the next
> > - * instruction.
> > + * KDB_CMD_SS is a command that our caller acts on to effect the step.
> > */
> >
> > static int kdb_ss(int argc, const char **argv)
> > {
> > if (argc != 0)
> > return KDB_ARGCOUNT;
> > - /*
> > - * Set trace flag and go.
> > - */
> > - KDB_STATE_SET(DOING_SS);
> > +
> > return KDB_CMD_SS;
> > }
> >
> > diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
> > index 5c7949061671..cd40bf780b93 100644
> > --- a/kernel/debug/kdb/kdb_main.c
> > +++ b/kernel/debug/kdb/kdb_main.c
> > @@ -1189,7 +1189,7 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
> > kdb_dbtrap_t db_result)
> > {
> > char *cmdbuf;
> > - int diag;
> > + int diag, res;
> > struct task_struct *kdb_current =
> > kdb_curr_task(raw_smp_processor_id());
> >
> > @@ -1346,10 +1346,16 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
> > }
> > if (diag == KDB_CMD_GO
> > || diag == KDB_CMD_CPU
> > - || diag == KDB_CMD_SS
> > || diag == KDB_CMD_KGDB)
> > break;
> >
> > + if (diag == KDB_CMD_SS) {
> > + res = kgdb_validate_single_step_address(instruction_pointer(regs));
>
> Is it legit to use instruction_pointer() directly? Should you be
> calling kgdb_arch_pc() ...or does that just account for having just
> hit a breakpoint?

I decided between kgdb_arch_pc() and instruction_pointer() based on the
usage of regs in the rest of this file (which is exclusively
instruction_pointer() ). I didn't want the lookup to mismatch what the
user has been told in the console.

On the other hand, it did cross my mind that every PC lookup could be
broken and I made a note for the future...


Daniel.

2020-07-20 08:15:57

by Daniel Thompson

Subject: Re: [PATCH v2 3/3] kgdb: Add NOKPROBE labels on the trap handler functions

On Fri, Jul 17, 2020 at 03:39:58PM -0700, Doug Anderson wrote:
> Hi,
>
> On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> <[email protected]> wrote:
> >
> > Currently kgdb honours the kprobe blocklist but doesn't place its own
> > trap handling code on the list. Add labels to discourage attempting to
> > use kgdb to debug itself.
> >
> > These changes do not make it impossible to provoke recursive trapping
> > since they do not cover all the calls that can be made on kgdb's entry
> > logic. However going much further whilst we are sharing the kprobe
> > blocklist risks reducing the capabilities of kprobe and this would be a
> > bad trade off (especially so given kgdb's users are currently conditioned
> > to avoid recursive traps).
> >
> > Signed-off-by: Daniel Thompson <[email protected]>
> > ---
> > kernel/debug/debug_core.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
>
> I could just be missing something, but...
>
> I understand not adding "NOKPROBE_SYMBOL" to generic kernel functions
> that kgdb happens to call, but I'm not quite sure I understand why all
> of the kdb / kgdb code itself shouldn't be in the blocklist. I
> certainly don't object to the functions you added to the blocklist, I
> guess I'm just trying to understand why it's a bad idea to add more or
> how you came up with the list of functions that you did.

Relatively early in the trap handler execution (just after we bring the
other CPUs to a halt) all breakpoints are replaced with the original
opcodes. Therefore I only marked up functions that run between the trap
firing and the breakpoints being removed (and also between the
breakpoints being reinstated and trap exit).
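
As a rough illustration of that window, here is a small userspace
model. It is not kernel code and every name in it is hypothetical: a
"call" traps only when a breakpoint has been requested on the function
and breakpoint opcodes are currently armed in memory.

```c
#include <stdbool.h>

/* Userspace model only; none of these names are kernel APIs. */
enum model_fn { FN_TRAP_ENTRY, FN_DEBUGGER_BODY, FN_TRAP_EXIT, FN_COUNT };

struct model_state {
	bool bp_requested[FN_COUNT]; /* breakpoints the user asked for */
	bool bp_armed;               /* are breakpoint opcodes in memory? */
	int recursive_traps;         /* breakpoints hit inside the handler */
};

/* "Calling" a function traps only if its breakpoint is requested and armed. */
static void model_call(struct model_state *s, enum model_fn fn)
{
	if (s->bp_armed && s->bp_requested[fn])
		s->recursive_traps++;
}

/* Returns how many recursive traps one pass through the handler provokes. */
static int model_handle_trap(struct model_state *s)
{
	s->recursive_traps = 0;
	model_call(s, FN_TRAP_ENTRY);    /* runs with breakpoints still armed */
	s->bp_armed = false;             /* original opcodes restored */
	model_call(s, FN_DEBUGGER_BODY); /* a breakpoint here never fires */
	s->bp_armed = true;              /* breakpoints reinstated */
	model_call(s, FN_TRAP_EXIT);     /* armed again, so exposed too */
	return s->recursive_traps;
}
```

In the model a breakpoint on FN_DEBUGGER_BODY is accepted but never
fires, while breakpoints on the entry and exit functions each provoke a
recursive trap. Only those two would need a NOKPROBE label.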


Daniel.

2020-07-21 21:05:27

by Doug Anderson

Subject: Re: [PATCH v2 2/3] kgdb: Use the kprobe blocklist to limit single stepping

Hi,

On Mon, Jul 20, 2020 at 1:08 AM Daniel Thompson
<[email protected]> wrote:
>
> On Fri, Jul 17, 2020 at 03:39:51PM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> > <[email protected]> wrote:
> > >
> > > If we are running in a part of the kernel that dislikes breakpoint
> > > debugging then it is very unlikely to be safe to single step. Add
> > > some safety rails to prevent stepping through anything on the kprobe
> > > blocklist.
> > >
> > > As part of this kdb_ss() will no longer set the DOING_SS flags when it
> > > requests a step. This is safe because this flag is already redundant,
> > > returning KDB_CMD_SS is all that is needed to request a step (and this
> > > saves us from having to unset the flag if the safety check fails).
> > >
> > > Signed-off-by: Daniel Thompson <[email protected]>
> > > ---
> > > include/linux/kgdb.h | 1 +
> > > kernel/debug/debug_core.c | 13 +++++++++++++
> > > kernel/debug/gdbstub.c | 10 +++++++++-
> > > kernel/debug/kdb/kdb_bp.c | 8 ++------
> > > kernel/debug/kdb/kdb_main.c | 10 ++++++++--
> > > 5 files changed, 33 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > index 7caba4604edc..aefe823998cb 100644
> > > --- a/include/linux/kgdb.h
> > > +++ b/include/linux/kgdb.h
> > > @@ -214,6 +214,7 @@ extern void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc);
> > >
> > > /* Optional functions. */
> > > extern int kgdb_validate_break_address(unsigned long addr);
> > > +extern int kgdb_validate_single_step_address(unsigned long addr);
> > > extern int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt);
> > > extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
> > >
> > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > index 133a361578dc..4b59bcc90c5d 100644
> > > --- a/kernel/debug/debug_core.c
> > > +++ b/kernel/debug/debug_core.c
> > > @@ -208,6 +208,19 @@ int __weak kgdb_validate_break_address(unsigned long addr)
> > > return err;
> > > }
> > >
> > > +int __weak kgdb_validate_single_step_address(unsigned long addr)
> > > +{
> > > + /*
> > > + * Disallow stepping when we are executing code that is marked
> > > + * as unsuitable for breakpointing... stepping won't be safe
> > > + * either!
> > > + */
> > > + if (kgdb_within_blocklist(addr))
> > > + return -EINVAL;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs)
> > > {
> > > return instruction_pointer(regs);
> > > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
> > > index 61774aec46b4..f1c88007cc2b 100644
> > > --- a/kernel/debug/gdbstub.c
> > > +++ b/kernel/debug/gdbstub.c
> > > @@ -1041,8 +1041,16 @@ int gdb_serial_stub(struct kgdb_state *ks)
> > > if (tmp == 0)
> > > break;
> > > /* Fall through - on tmp < 0 */
> > > - case 'c': /* Continue packet */
> > > case 's': /* Single step packet */
> > > + error = kgdb_validate_single_step_address(
> > > + kgdb_arch_pc(ks->ex_vector,
> > > + ks->linux_regs));
> >
> > I'm a little confused. Isn't this like saying "if
> > (i_am_standing_in_acid) dont_step_into_acid"?
>
> I describe it more as:
>
> if (we_know_there_is_acid_nearby)
> dont_step_forward
>
> It is possible we are currently stepping in acid but it is also possible
> (and reasonably likely) that we haven't stepped in it yet but will do so
> soon.
>
>
> > Specifically you're checking the _current_ PC to see if it's in the
> > blocklist, right? ...but you've already (effectively) dropped into
> > the debugger at that location, so if it really was a problem wouldn't
> > we already be in trouble?
>
> The basic use case is where someone is stepping and we reach a PC that
> would be blocked for a breakpoint. This will typically be due (although
> I think it does generalize) to a function call and the safety rail will
> be reached after we have jumped to the blocked function but before we
> actually execute any instructions within it.
>
> Or putting it another way, there is no reason to worry if we start
> somewhere "safe" and start stepping towards something on the blocklist.
> We won't melt our shoes!

I guess I still don't totally get it. So let's say we have:

void dont_trace_this(...)
{
thing_not_to_trace_1();
thing_not_to_trace_2();
don_t_trace = this;
}
NOKPROBE_SYMBOL(dont_trace_this);

void trace_me()
{
sing();
dance();
dont_trace_this();
party();
}

So presumably the dont_trace_this() function is marked as
NOKPROBE_SYMBOL because it's called by the kprobe handling code or by
kgdb, right? So if we had a breakpoint there then we'd just have
infinite recursion. Thus we want to prevent putting breakpoints
anywhere in this function. Even though dont_trace_this() is also
called from the trace_me() function it doesn't matter--we still can't
put breakpoints in it because it would cause problems with the
debugger.

Now, I guess the question is: why exactly do we need to prevent single
stepping in dont_trace_this()? In the case above where
dont_trace_this() is called from trace_me() it would actually be OK to
single step it, right? ...unless this is on a CPU that doesn't have a
"single step mode" and has to implement stepping by breakpoints, of
course.

...but maybe I'm confused and there is a reason that we shouldn't
allow single stepping into dont_trace_this() when called from
trace_me(). If that is the case, I'm wondering why it's OK to step
and stop on the first instruction of the function but it's not OK to
step and stop through the other instructions in the function.

-Doug

2020-07-21 21:22:53

by Doug Anderson

Subject: Re: [PATCH v2 3/3] kgdb: Add NOKPROBE labels on the trap handler functions

Hi,

On Mon, Jul 20, 2020 at 1:13 AM Daniel Thompson
<[email protected]> wrote:
>
> On Fri, Jul 17, 2020 at 03:39:58PM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> > <[email protected]> wrote:
> > >
> > > Currently kgdb honours the kprobe blocklist but doesn't place its own
> > > trap handling code on the list. Add labels to discourage attempting to
> > > use kgdb to debug itself.
> > >
> > > These changes do not make it impossible to provoke recursive trapping
> > > since they do not cover all the calls that can be made on kgdb's entry
> > > logic. However going much further whilst we are sharing the kprobe
> > > blocklist risks reducing the capabilities of kprobe and this would be a
> > > bad trade off (especially so given kgdb's users are currently conditioned
> > > to avoid recursive traps).
> > >
> > > Signed-off-by: Daniel Thompson <[email protected]>
> > > ---
> > > kernel/debug/debug_core.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> >
> > I could just be missing something, but...
> >
> > I understand not adding "NOKPROBE_SYMBOL" to generic kernel functions
> > that kgdb happens to call, but I'm not quite sure I understand why all
> > of the kdb / kgdb code itself shouldn't be in the blocklist. I
> > certainly don't object to the functions you added to the blocklist, I
> > guess I'm just trying to understand why it's a bad idea to add more or
> > how you came up with the list of functions that you did.
>
> Relatively early in the trap handler execution (just after we bring the
> other CPUs to a halt) all breakpoints are replaced with the original
> opcodes. Therefore I only marked up functions that run between the trap
> firing and the breakpoints being removed (and also between the
> breakpoints being reinstated and trap exit).

Ah, OK! Could that be added to the commit message?

Also, shouldn't you mark kgdb_arch_set_breakpoint()? What about
dbg_activate_sw_breakpoints()? I haven't gone and extensively
searched, but those two jump out to me as ones that were missed.

I suppose that means that if someone tried to set a breakpoint on a
kgdb function that wasn't one of the ones that you listed then the
system would happily report that the breakpoint has been set (no error
given) but that the breakpoint would just have no effect? It wouldn't
crash (which is good), it just wouldn't detect that the breakpoint was
useless. However, if these were in the NOKPROBE_SYMBOL then you'd get
a nice error message. Is there no way we could just mark everything
using a linker script or somesuch?

-Doug

2020-09-04 15:27:20

by Daniel Thompson

Subject: Re: [PATCH v2 2/3] kgdb: Use the kprobe blocklist to limit single stepping

On Tue, Jul 21, 2020 at 02:04:45PM -0700, Doug Anderson wrote:
> On Mon, Jul 20, 2020 at 1:08 AM Daniel Thompson
> <[email protected]> wrote:
> > On Fri, Jul 17, 2020 at 03:39:51PM -0700, Doug Anderson wrote:
> > > On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> > > <[email protected]> wrote:
> > > >
> > > > If we are running in a part of the kernel that dislikes breakpoint
> > > > debugging then it is very unlikely to be safe to single step. Add
> > > > some safety rails to prevent stepping through anything on the kprobe
> > > > blocklist.
> > > >
> > > > As part of this kdb_ss() will no longer set the DOING_SS flags when it
> > > > requests a step. This is safe because this flag is already redundant,
> > > > returning KDB_CMD_SS is all that is needed to request a step (and this
> > > > saves us from having to unset the flag if the safety check fails).
> > > >
> > > > Signed-off-by: Daniel Thompson <[email protected]>
> > > > ---
> > > > include/linux/kgdb.h | 1 +
> > > > kernel/debug/debug_core.c | 13 +++++++++++++
> > > > kernel/debug/gdbstub.c | 10 +++++++++-
> > > > kernel/debug/kdb/kdb_bp.c | 8 ++------
> > > > kernel/debug/kdb/kdb_main.c | 10 ++++++++--
> > > > 5 files changed, 33 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > > index 7caba4604edc..aefe823998cb 100644
> > > > --- a/include/linux/kgdb.h
> > > > +++ b/include/linux/kgdb.h
> > > > @@ -214,6 +214,7 @@ extern void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc);
> > > >
> > > > /* Optional functions. */
> > > > extern int kgdb_validate_break_address(unsigned long addr);
> > > > +extern int kgdb_validate_single_step_address(unsigned long addr);
> > > > extern int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt);
> > > > extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
> > > >
> > > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > > index 133a361578dc..4b59bcc90c5d 100644
> > > > --- a/kernel/debug/debug_core.c
> > > > +++ b/kernel/debug/debug_core.c
> > > > @@ -208,6 +208,19 @@ int __weak kgdb_validate_break_address(unsigned long addr)
> > > > return err;
> > > > }
> > > >
> > > > +int __weak kgdb_validate_single_step_address(unsigned long addr)
> > > > +{
> > > > + /*
> > > > + * Disallow stepping when we are executing code that is marked
> > > > + * as unsuitable for breakpointing... stepping won't be safe
> > > > + * either!
> > > > + */
> > > > + if (kgdb_within_blocklist(addr))
> > > > + return -EINVAL;
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs)
> > > > {
> > > > return instruction_pointer(regs);
> > > > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
> > > > index 61774aec46b4..f1c88007cc2b 100644
> > > > --- a/kernel/debug/gdbstub.c
> > > > +++ b/kernel/debug/gdbstub.c
> > > > @@ -1041,8 +1041,16 @@ int gdb_serial_stub(struct kgdb_state *ks)
> > > > if (tmp == 0)
> > > > break;
> > > > /* Fall through - on tmp < 0 */
> > > > - case 'c': /* Continue packet */
> > > > case 's': /* Single step packet */
> > > > + error = kgdb_validate_single_step_address(
> > > > + kgdb_arch_pc(ks->ex_vector,
> > > > + ks->linux_regs));
> > >
> > > I'm a little confused. Isn't this like saying "if
> > > (i_am_standing_in_acid) dont_step_into_acid"?
> >
> > I describe it more as:
> >
> > if (we_know_there_is_acid_nearby)
> > dont_step_forward
> >
> > It is possible we are currently stepping in acid but it is also possible
> > (and reasonably likely) that we haven't stepped in it yet but will do so
> > soon.
> >
> >
> > > Specifically you're checking the _current_ PC to see if it's in the
> > > blocklist, right? ...but you've already (effectively) dropped into
> > > the debugger at that location, so if it really was a problem wouldn't
> > > we already be in trouble?
> >
> > The basic use case is where someone is stepping and we reach a PC that
> > would be blocked for a breakpoint. This will typically be due (although
> > I think it does generalize) to a function call and the safety rail will
> > be reached after we have jumped to the blocked function but before we
> > actually execute any instructions within it.
> >
> > Or putting it another way, there is no reason to worry if we start
> > somewhere "safe" and start stepping towards something on the blocklist.
> > We won't melt our shoes!
>
> I guess I still don't totally get it. So let's say we have:
>
> void dont_trace_this(...)
> {
> thing_not_to_trace_1();
> thing_not_to_trace_2();
> don_t_trace = this;
> }
> NOKPROBE_SYMBOL(dont_trace_this);
>
> void trace_me()
> {
> sing();
> dance();
> dont_trace_this();
> party();
> }
>
> So presumably the dont_trace_this() function is marked as
> NOKPROBE_SYMBOL because it's called by the kprobe handling code or by
> kgdb, right? So if we had a breakpoint there then we'd just have
> infinite recursion. Thus we want to prevent putting breakpoints
> anywhere in this function. Even though dont_trace_this() is also
> called from the trace_me() function it doesn't matter--we still can't
> put breakpoints in it because it would cause problems with the
> debugger.
>
> Now, I guess the question is: why exactly do we need to prevent single
> stepping in dont_trace_this()? In the case above where
> dont_trace_this() is called from trace_me() it would actually be OK to
> single step it, right? ...unless this is on a CPU that doesn't have a
> "single step mode" and has to implement stepping by breakpoints, of
> course.

I think you are persuading me.

Although I can think of plenty of places where it isn't safe to step, I'm
struggling to think of any way for us to end up stopped in the debugger
in those places, and certainly not without setting the catastrophic
(which is the only safety rail currently extant).

That means I don't think I can put up a strong enough case that this
patch is better than doing nothing!

I'll drop it for now.

> ...but maybe I'm confused

I think on the whole you've expressed things more lucidly than I have!
Nevertheless...

> and there is a reason that we shouldn't
> allow single stepping into dont_trace_this() when called from
> trace_me(). If that is the case, I'm wondering why it's OK to step
> and stop on the first instruction of the function but it's not OK to
> step and stop through the other instructions in the function.

... when we stop on the first instruction of a function then we have not
actually executed any part of it. In other words we haven't executed
anything on the blocklist.
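
A minimal userspace sketch of that behaviour follows. It is a model
under stated assumptions: the addresses and the single blocklist range
are invented, and the model_* names are hypothetical stand-ins for
kgdb_within_blocklist() and kgdb_validate_single_step_address() from
the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: addresses and ranges are invented. */
struct model_range {
	unsigned long start; /* first address of a blocklisted function */
	unsigned long end;   /* one past its last address */
};

static const struct model_range model_blocklist[] = {
	{ 0x2000, 0x2040 }, /* stands in for dont_trace_this() */
};

static bool model_within_blocklist(unsigned long addr)
{
	size_t i;
	size_t n = sizeof(model_blocklist) / sizeof(model_blocklist[0]);

	for (i = 0; i < n; i++)
		if (addr >= model_blocklist[i].start &&
		    addr < model_blocklist[i].end)
			return true;
	return false;
}

/* Same shape as the check in the patch: refuse to step from a blocked PC. */
static int model_validate_single_step_address(unsigned long addr)
{
	return model_within_blocklist(addr) ? -EINVAL : 0;
}
```

If the call in trace_me() sits at 0x1010, a step requested there is
allowed and stops at 0x2000 with nothing on the blocklist executed yet.
A further step requested at 0x2000 is refused with -EINVAL.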

Of course the whole issue is moot for now.


Daniel.

2020-09-04 15:42:06

by Daniel Thompson

Subject: Re: [PATCH v2 3/3] kgdb: Add NOKPROBE labels on the trap handler functions

On Tue, Jul 21, 2020 at 02:22:08PM -0700, Doug Anderson wrote:
> Hi,
>
> On Mon, Jul 20, 2020 at 1:13 AM Daniel Thompson
> <[email protected]> wrote:
> >
> > On Fri, Jul 17, 2020 at 03:39:58PM -0700, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Thu, Jul 16, 2020 at 8:20 AM Daniel Thompson
> > > <[email protected]> wrote:
> > > >
> > > > Currently kgdb honours the kprobe blocklist but doesn't place its own
> > > > trap handling code on the list. Add labels to discourage attempting to
> > > > use kgdb to debug itself.
> > > >
> > > > These changes do not make it impossible to provoke recursive trapping
> > > > since they do not cover all the calls that can be made on kgdb's entry
> > > > logic. However going much further whilst we are sharing the kprobe
> > > > blocklist risks reducing the capabilities of kprobe and this would be a
> > > > bad trade off (especially so given kgdb's users are currently conditioned
> > > > to avoid recursive traps).
> > > >
> > > > Signed-off-by: Daniel Thompson <[email protected]>
> > > > ---
> > > > kernel/debug/debug_core.c | 8 ++++++++
> > > > 1 file changed, 8 insertions(+)
> > >
> > > I could just be missing something, but...
> > >
> > > I understand not adding "NOKPROBE_SYMBOL" to generic kernel functions
> > > that kgdb happens to call, but I'm not quite sure I understand why all
> > > of the kdb / kgdb code itself shouldn't be in the blocklist. I
> > > certainly don't object to the functions you added to the blocklist, I
> > > guess I'm just trying to understand why it's a bad idea to add more or
> > > how you came up with the list of functions that you did.
> >
> > Relatively early in the trap handler execution (just after we bring the
> > other CPUs to a halt) all breakpoints are replaced with the original
> > opcodes. Therefore I only marked up functions that run between the trap
> > firing and the breakpoints being removed (and also between the
> > breakpoints being reinstated and trap exit).
>
> Ah, OK! Could that be added to the commit message?

Will do.

> Also, shouldn't you mark kgdb_arch_set_breakpoint()? What about
> dbg_activate_sw_breakpoints()? I haven't gone and extensively
> searched, but those two jump out to me as ones that were missed.

Agreed. I think I over-focussed on the entry path. I will review the
exit path more closely.

> I suppose that means that if someone tried to set a breakpoint on a
> kgdb function that wasn't one of the ones that you listed then the
> system would happily report that the breakpoint has been set (no error
> given) but that the breakpoint would just have no effect? It wouldn't
> crash (which is good), it just wouldn't detect that the breakpoint was
> useless.

Assuming the kgdb function is used exclusively from the trap handler
then this is correct.

> However, if these were in the NOKPROBE_SYMBOL then you'd get
> a nice error message. Is there no way we could just mark everything
> using a linker script or somesuch?

You'd still get odd effects with library functions that are used inside
and outside the debugger (which can be breakpointed but don't trigger
inside kgdb). Arguably the effect is clearer to users if they can see
kdb/kgdb functions behaving the same way as library functions. It's odd
but it won't promote false expectations from users.


Daniel.