2020-04-21 21:16:48

by Doug Anderson

Subject: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles

This whole pile of patches was motivated by me trying to get kgdb to
work properly on a platform where my serial driver ended up being hit
by the -EPROBE_DEFER virus (it wasn't practicing social distancing
from other drivers). Specifically my serial driver's parent device
depended on a resource that wasn't available when its probe was first
called. It returned -EPROBE_DEFER which meant that when "kgdboc"
tried to run its setup the serial driver wasn't there. Unfortunately
"kgdboc" never tried again, so that meant that kgdb was disabled until
I manually enabled it via sysfs.

While I could try to figure out how to get around the -EPROBE_DEFER
somehow, the above problems could happen to anyone and -EPROBE_DEFER
is generally considered something you just have to live with. In any
case the current "kgdboc" setup is a bit of a race waiting to happen.
I _think_ I saw during early testing that even adding a msleep() in
the typical serial driver's probe() is enough to trigger similar
issues.

I decided that for the above race the best attitude to get kgdb to
register at boot was probably "if you can't beat 'em, join 'em".
Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
driver uses it it's no longer a virus). It does so a little awkwardly
because "kgdboc" hasn't normally had a "struct device" associated with
it, but it's really not _that_ ugly to make a platform device and
seems less ugly than alternatives.
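
As a rough illustration of the retry idea (a user-space sketch, not the
code in this series), the probe below keeps asking to be called again
until the tty it wants has shown up. Every model_* name is hypothetical;
only the numeric value 517 is taken from the kernel's EPROBE_DEFER.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Model constant; the kernel's EPROBE_DEFER has this numeric value. */
#define MODEL_EPROBE_DEFER 517
#define MODEL_MAX_TTYS 4

/* Hypothetical stand-in for "tty drivers registered so far". */
struct model_tty_registry {
	const char *names[MODEL_MAX_TTYS];
	size_t count;
};

/* Called when a (possibly deferred) serial driver finally probes. */
static bool model_tty_register(struct model_tty_registry *reg,
			       const char *name)
{
	if (reg->count >= MODEL_MAX_TTYS)
		return false;
	reg->names[reg->count++] = name;
	return true;
}

static bool model_tty_present(const struct model_tty_registry *reg,
			      const char *name)
{
	for (size_t i = 0; i < reg->count; i++)
		if (strcmp(reg->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Probe for the debugger's device: succeed once the wanted tty exists,
 * otherwise ask the (modeled) driver core to call us again later.
 */
static int model_kgdboc_probe(const struct model_tty_registry *reg,
			      const char *wanted)
{
	if (!model_tty_present(reg, wanted))
		return -MODEL_EPROBE_DEFER;
	return 0;
}
```

In this model a first model_kgdboc_probe() call against an empty
registry returns -MODEL_EPROBE_DEFER, and the same call succeeds once
model_tty_register() has added the tty name.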

Unfortunately now on my system the debugger is one of the last things
to register at boot. That's OK for debugging problems that show up
significantly after boot, but isn't so hot for all the boot problems
that I end up debugging. This motivated me to try to get something
working a little earlier.

My first attempt was to try to get the existing "ekgdboc" to work
earlier. I tried that for a bit until I realized that it needed to
work at the tty layer and I couldn't find any serial drivers that
managed to register themselves to the tty layer super early at boot.
The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
of a special snowflake. Trying to get my serial driver and all its
dependencies to probe normally and register the tty driver super early
at boot seemed like a bad way to go. In fact, all the complexity
needed to do something like this is why the system already has a
special concept of a "boot console" that lives only long enough to
transition to the normal console.

Leveraging the boot console seemed like a good way to go and that's
what this series does. I found that consoles could have a read()
function, though I couldn't find anyone who implemented it. I
implemented it for two serial drivers for the devices I had easy
access to, on the assumption that a boot console's read() and write()
are polling-compatible (seems sane, I think).

Now anyone who makes a small change to their serial driver can easily
enable early kgdb debugging!

The devices I had for testing were:
- arm32: rk3288-veyron-jerry
- arm64: rk3399-gru-kevin
- arm64: qcom-sc7180-trogdor (not mainline yet)

I tested this series on all of the above. I tried various combinations
of enabling/disabling options and hopefully caught the corner cases,
but I'd appreciate any extra testing people can do. Notably I didn't
test on x86, but (I think) I didn't touch much there so I shouldn't
have broken anything.

When testing I found a few problems with actually dropping into the
debugger super early on arm and arm64 devices. Patches in this series
should help with this. For arm I just avoid dropping into the
debugger until a little later and for arm64 I actually enable
debugging super early.

I realize that bits of this series might feel a little hacky, though
I've tried to do things in the cleanest way I could without overly
interfering with the rest of the kernel. If you hate the way I
solved a problem I would love it if you could provide guidance on how
you think I could solve the problem better.

This series (and my comments / documentation / commit messages) is
now long enough that my eyes glaze over when I try to read it all over
to double-check. I've nonetheless tried to double-check it, but I'm
pretty sure I did something stupid. Thank you ahead of time for
pointing it out to me so I can fix it in v3. If somehow I managed to
not do anything stupid (really?) then thank you for double-checking me
anyway.

Changes in v2:
- ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
- ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
- Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
- Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp

Douglas Anderson (9):
kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
Revert "kgdboc: disable the console lock when in kgdb"
kgdboc: Use a platform device to handle tty drivers showing up late
kgdb: Delay "kgdbwait" to dbg_late_init() by default
arm64: Add call_break_hook() to early_brk64() for early kgdb
kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
Documentation: kgdboc: Document new earlycon_kgdboc parameter
serial: qcom_geni_serial: Support earlycon_kgdboc
serial: 8250_early: Support earlycon_kgdboc

.../admin-guide/kernel-parameters.txt | 20 ++
Documentation/dev-tools/kgdb.rst | 14 +
arch/arm64/include/asm/debug-monitors.h | 2 +
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/kgdb.c | 5 +
arch/arm64/kernel/traps.c | 3 +
arch/x86/kernel/kgdb.c | 5 +
drivers/misc/kgdbts.c | 2 +-
drivers/tty/mips_ejtag_fdc.c | 2 +-
drivers/tty/serial/8250/8250_early.c | 23 ++
drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
drivers/tty/serial/qcom_geni_serial.c | 32 +++
drivers/usb/early/ehci-dbgp.c | 2 +-
include/linux/kgdb.h | 25 +-
kernel/debug/debug_core.c | 48 +++-
15 files changed, 400 insertions(+), 47 deletions(-)

--
2.26.1.301.g55bc3eb7cb9-goog


2020-04-21 21:16:59

by Doug Anderson

Subject: [PATCH v2 4/9] kgdb: Delay "kgdbwait" to dbg_late_init() by default

Using kgdb requires at least some level of architecture-level
initialization. If nothing else, it relies on the architecture to
pass breakpoints / crashes onto kgdb.

On some architectures this all works super early; specifically, it
starts working at some point before Linux parses early_param's. On
other architectures it doesn't. A survey of a few platforms:

a) x86: Presumably it all works early since "ekgdboc" is documented to
work here.
b) arm64: Catching crashes works; with a simple patch breakpoints can
also be made to work.
c) arm: Nothing in kgdb works until
paging_init() -> devicemaps_init() -> early_trap_init()

Let's be conservative and, by default, process "kgdbwait" (which tells
the kernel to drop into the debugger ASAP at boot) a bit later at
dbg_late_init() time. If an architecture has tested it and wants to
re-enable super early debugging, they can implement the weak function
kgdb_arch_can_debug_early() to return true. We'll do this for x86 to
start. It should be noted that dbg_late_init() is still called quite
early in the system.

Note that this patch doesn't affect when kgdb runs its init. If kgdb
is set to initialize early it will still initialize when parsing
early_param's. This patch _only_ inhibits the initial breakpoint
from "kgdbwait". This means:

* Without any extra patches arm64 platforms will at least catch
crashes after kgdb inits.
* arm platforms will catch crashes (and could handle a hardcoded
kgdb_breakpoint()) any time after early_trap_init() runs, even
before dbg_late_init().
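
The rules above can be modeled as one small predicate (a user-space
sketch, not the kernel code). The three sites mirror the three places
the diff below checks kgdb_break_asap; the struct and every model_*
name are hypothetical.

```c
#include <stdbool.h>

/* The three places in the patch where the initial breakpoint can fire. */
enum model_site {
	MODEL_SITE_EARLY_PARAM,	/* opt_kgdb_wait() */
	MODEL_SITE_IO_REGISTER,	/* kgdb_register_io_module() */
	MODEL_SITE_LATE_INIT,	/* dbg_late_init() */
};

struct model_kgdb_state {
	bool break_asap;	/* "kgdbwait" seen and not yet acted on */
	bool io_registered;	/* an I/O module is already registered */
	bool is_early;		/* dbg_late_init() has not run yet */
	bool arch_early_ok;	/* arch says early debugging is safe */
};

static bool model_should_break(enum model_site site,
			       const struct model_kgdb_state *st)
{
	switch (site) {
	case MODEL_SITE_EARLY_PARAM:
		/* Parsing "kgdbwait" is what sets break_asap. */
		return st->io_registered && st->arch_early_ok;
	case MODEL_SITE_IO_REGISTER:
		return st->break_asap &&
		       (!st->is_early || st->arch_early_ok);
	case MODEL_SITE_LATE_INIT:
		return st->io_registered && st->break_asap;
	}
	return false;
}
```

With arch_early_ok false (the default weak implementation) the only
site that can fire during early boot is MODEL_SITE_LATE_INIT; with it
true, the first two sites behave as they did before this patch.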

Signed-off-by: Douglas Anderson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
---

Changes in v2: None

arch/x86/kernel/kgdb.c | 5 +++++
include/linux/kgdb.h | 22 ++++++++++++++++++++++
kernel/debug/debug_core.c | 29 +++++++++++++++++++----------
3 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index c44fe7d8d9a4..60c47787c588 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -673,6 +673,11 @@ void kgdb_arch_late(void)
}
}

+bool kgdb_arch_can_debug_early(void)
+{
+ return true;
+}
+
/**
* kgdb_arch_exit - Perform any architecture specific uninitalization.
*
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index b072aeb1fd78..7371517aeacc 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -226,6 +226,28 @@ extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
*/
extern void kgdb_arch_late(void);

+/**
+ * kgdb_arch_can_debug_early - Check if OK to break before dbg_late_init()
+ *
+ * If an architecture can definitely handle entering the debugger when
+ * early_param's are parsed then it can override this function to return
+ * true. Otherwise if "kgdbwait" is passed on the kernel command line it
+ * won't actually be processed until dbg_late_init() just after the call
+ * to kgdb_arch_late() is made.
+ *
+ * NOTE: Even if this returns false we will still try to register kgdb to
+ * handle breakpoints and crashes when early_param's are parsed, we just
+ * won't act on the "kgdbwait" parameter until dbg_late_init(). If you
+ * get a crash and try to drop into kgdb somewhere between these two
+ * places you might or might not end up being able to use kgdb depending
+ * on exactly how far along the architecture has initted.
+ *
+ * ALSO: dbg_late_init() is actually still fairly early in the system
+ * boot process.
+ *
+ * Return: true if platform can handle kgdb early.
+ */
+extern bool kgdb_arch_can_debug_early(void);

/**
* struct kgdb_arch - Describe architecture specific values.
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 950dc667c823..8f178239856d 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -950,16 +950,32 @@ void kgdb_panic(const char *msg)
kgdb_breakpoint();
}

+static void kgdb_initial_breakpoint(void)
+{
+ kgdb_break_asap = 0;
+
+ pr_crit("Waiting for connection from remote gdb...\n");
+ kgdb_breakpoint();
+}
+
void __weak kgdb_arch_late(void)
{
}

+bool __weak kgdb_arch_can_debug_early(void)
+{
+ return false;
+}
+
void __init dbg_late_init(void)
{
dbg_is_early = false;
if (kgdb_io_module_registered)
kgdb_arch_late();
kdb_init(KDB_INIT_FULL);
+
+ if (kgdb_io_module_registered && kgdb_break_asap)
+ kgdb_initial_breakpoint();
}

static int
@@ -1055,14 +1071,6 @@ void kgdb_schedule_breakpoint(void)
}
EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);

-static void kgdb_initial_breakpoint(void)
-{
- kgdb_break_asap = 0;
-
- pr_crit("Waiting for connection from remote gdb...\n");
- kgdb_breakpoint();
-}
-
/**
* kgdb_register_io_module - register KGDB IO module
* @new_dbg_io_ops: the io ops vector
@@ -1099,7 +1107,8 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
/* Arm KGDB now. */
kgdb_register_callbacks();

- if (kgdb_break_asap)
+ if (kgdb_break_asap &&
+ (!dbg_is_early || kgdb_arch_can_debug_early()))
kgdb_initial_breakpoint();

return 0;
@@ -1169,7 +1178,7 @@ static int __init opt_kgdb_wait(char *str)
kgdb_break_asap = 1;

kdb_init(KDB_INIT_EARLY);
- if (kgdb_io_module_registered)
+ if (kgdb_io_module_registered && kgdb_arch_can_debug_early())
kgdb_initial_breakpoint();

return 0;
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:17:06

by Doug Anderson

Subject: [PATCH v2 8/9] serial: qcom_geni_serial: Support earlycon_kgdboc

Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.

We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2: None

drivers/tty/serial/qcom_geni_serial.c | 32 +++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 6119090ce045..4563d152b39e 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1090,6 +1090,36 @@ static void qcom_geni_serial_earlycon_write(struct console *con,
__qcom_geni_serial_console_write(&dev->port, s, n);
}

+#ifdef CONFIG_CONSOLE_POLL
+static int qcom_geni_serial_earlycon_read(struct console *con,
+ char *s, unsigned int n)
+{
+ struct earlycon_device *dev = con->data;
+ struct uart_port *uport = &dev->port;
+ int num_read = 0;
+ int ch;
+
+ while (num_read < n) {
+ ch = qcom_geni_serial_get_char(uport);
+ if (ch == NO_POLL_CHAR)
+ break;
+ s[num_read++] = ch;
+ }
+
+ return num_read;
+}
+
+static void __init qcom_geni_serial_enable_early_read(struct geni_se *se,
+ struct console *con)
+{
+ geni_se_setup_s_cmd(se, UART_START_READ, 0);
+ con->read = qcom_geni_serial_earlycon_read;
+}
+#else
+static inline void qcom_geni_serial_enable_early_read(struct geni_se *se,
+ struct console *con) { ; }
+#endif
+
static int __init qcom_geni_serial_earlycon_setup(struct earlycon_device *dev,
const char *opt)
{
@@ -1136,6 +1166,8 @@ static int __init qcom_geni_serial_earlycon_setup(struct earlycon_device *dev,

dev->con->write = qcom_geni_serial_earlycon_write;
dev->con->setup = NULL;
+ qcom_geni_serial_enable_early_read(&se, dev->con);
+
return 0;
}
OF_EARLYCON_DECLARE(qcom_geni, "qcom,geni-debug-uart",
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:17:16

by Doug Anderson

Subject: [PATCH v2 9/9] serial: 8250_early: Support earlycon_kgdboc

Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.

We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.
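
To show what "polling" means for read() here, a user-space sketch (not
the driver code) follows: it copies whatever bytes are already waiting
and returns without blocking. The model_uart type and its helpers are
hypothetical; only the data-ready bit value matches UART_LSR_DR.

```c
#include <stddef.h>

/* Same bit value as the 8250's "data ready" flag (UART_LSR_DR). */
#define MODEL_LSR_DR 0x01

/* Hypothetical UART whose receive FIFO is backed by a byte array. */
struct model_uart {
	const char *rx;
	size_t rx_len;
	size_t rx_pos;
};

/* Report "data ready" while unread bytes remain. */
static unsigned int model_read_lsr(const struct model_uart *u)
{
	return u->rx_pos < u->rx_len ? MODEL_LSR_DR : 0;
}

/* Pop one byte; callers must check model_read_lsr() first. */
static char model_read_rx(struct model_uart *u)
{
	return u->rx[u->rx_pos++];
}

/* Copy up to count bytes that are already waiting; never block. */
static int model_early_read(struct model_uart *u, char *s,
			    unsigned int count)
{
	unsigned int num_read = 0;

	while (num_read < count) {
		if (!(model_read_lsr(u) & MODEL_LSR_DR))
			break;
		s[num_read++] = model_read_rx(u);
	}

	return (int)num_read;
}
```

A return value of 0 simply means nothing was waiting, which is the
case a polling caller maps to "no character available".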

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2: None

drivers/tty/serial/8250/8250_early.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/drivers/tty/serial/8250/8250_early.c b/drivers/tty/serial/8250/8250_early.c
index 5cd8c36c8fcc..70d7826788f5 100644
--- a/drivers/tty/serial/8250/8250_early.c
+++ b/drivers/tty/serial/8250/8250_early.c
@@ -109,6 +109,28 @@ static void early_serial8250_write(struct console *console,
uart_console_write(port, s, count, serial_putc);
}

+#ifdef CONFIG_CONSOLE_POLL
+static int early_serial8250_read(struct console *console,
+ char *s, unsigned int count)
+{
+ struct earlycon_device *device = console->data;
+ struct uart_port *port = &device->port;
+ unsigned int status;
+ int num_read = 0;
+
+ while (num_read < count) {
+ status = serial8250_early_in(port, UART_LSR);
+ if (!(status & UART_LSR_DR))
+ break;
+ s[num_read++] = serial8250_early_in(port, UART_RX);
+ }
+
+ return num_read;
+}
+#else
+#define early_serial8250_read NULL
+#endif
+
static void __init init_port(struct earlycon_device *device)
{
struct uart_port *port = &device->port;
@@ -149,6 +171,7 @@ int __init early_serial8250_setup(struct earlycon_device *device,
init_port(device);

device->con->write = early_serial8250_write;
+ device->con->read = early_serial8250_read;
return 0;
}
EARLYCON_DECLARE(uart8250, early_serial8250_setup);
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:17:19

by Doug Anderson

Subject: [PATCH v2 7/9] Documentation: kgdboc: Document new earlycon_kgdboc parameter

The recent patch ("kgdboc: Add earlycon_kgdboc to support early kgdb
using boot consoles") adds a new kernel command line parameter.
Document it.

Note that the patch adding the feature does some comparing/contrasting
of "earlycon_kgdboc" vs. the existing "ekgdboc". See that patch for
more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
and just makes "kgdboc" do its normal initialization early (only works
if your tty driver is already ready). The new "earlycon_kgdboc" works
in combination with "kgdboc" and is backed by boot consoles.

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2: None

.../admin-guide/kernel-parameters.txt | 20 +++++++++++++++++++
Documentation/dev-tools/kgdb.rst | 14 +++++++++++++
2 files changed, 34 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f2a93c8679e8..588625ec2993 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1132,6 +1132,22 @@
address must be provided, and the serial port must
already be setup and configured.

+ earlycon_kgdboc= [KGDB,HW]
+ If the boot console provides the ability to read
+ characters and can work in polling mode, you can use
+ this parameter to tell kgdb to use it as a backend
+ until the normal console is registered. Intended to
+ be used together with the kgdboc parameter which
+ specifies the normal console to transition to.
+
+ The name of the early console should be specified
+ as the value of this parameter. Note that the name of
+ the early console might be different than the tty
+ name passed to kgdboc. If only one boot console with
+ a read() function is enabled it's OK to leave the
+ value blank and the first boot console that implements
+ read() will be picked.
+
earlyprintk= [X86,SH,ARM,M68k,S390]
earlyprintk=vga
earlyprintk=sclp
@@ -1190,6 +1206,10 @@
This is designed to be used in conjunction with
the boot argument: earlyprintk=vga

+ This parameter works in place of the kgdboc parameter
+ but can only be used if the backing tty is available
+ very early in the boot process.
+
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}

diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
index d38be58f872a..c0b321403d9a 100644
--- a/Documentation/dev-tools/kgdb.rst
+++ b/Documentation/dev-tools/kgdb.rst
@@ -274,6 +274,20 @@ don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as
on the initial connect, or to use a debugger proxy that allows an
unmodified gdb to do the debugging.

+Kernel parameter: ``earlycon_kgdboc``
+-------------------------------------
+
+If you specify the kernel parameter ``earlycon_kgdboc`` and your serial
+driver registers a boot console that supports polling (doesn't need
+interrupts and implements a nonblocking read() function) kgdb will attempt
+to work using the boot console until it can transition to the regular
+tty driver specified by the ``kgdboc`` parameter.
+
+Normally there is only one boot console (especially that implements the
+read() function) so just adding ``earlycon_kgdboc`` on its own is
+sufficient to make this work. If you have more than one boot console you
+can add the boot console's name to differentiate.
+
Kernel parameter: ``kgdbwait``
------------------------------

--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:17:25

by Doug Anderson

Subject: [PATCH v2 5/9] arm64: Add call_break_hook() to early_brk64() for early kgdb

In order to make early kgdb work properly we need early_brk64() to be
able to call into it. This is as easy as adding a call into
call_break_hook() just like we do later in the normal brk_handler().

Once we do this we can let kgdb know that it can break into the
debugger a little earlier (specifically when parsing early_param's).

NOTE: without this patch it turns out that arm64 can't do breakpoints
even at dbg_late_init(), so if we decide something about this patch is
wrong we might need to move dbg_late_init() a little later.
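
The dispatch order this patch sets up can be sketched in user space
like so (not the arm64 code): try the registered break hooks first and
only then fall back to the bug handler. All model_* names and both
immediate values are hypothetical, and matching is simplified to an
exact compare on the immediate.

```c
#include <stddef.h>

enum { MODEL_HOOK_HANDLED = 0, MODEL_HOOK_ERROR = 1 };

/* Hypothetical brk immediates, for illustration only. */
#define MODEL_KGDB_IMM 0x400u
#define MODEL_BUG_IMM  0x800u

struct model_break_hook {
	unsigned int imm;		/* immediate this hook claims */
	int (*fn)(unsigned int imm);	/* handler for that immediate */
};

/* Stand-in for the hook kgdb would register. */
static int model_kgdb_hook(unsigned int imm)
{
	(void)imm;
	return MODEL_HOOK_HANDLED;
}

/* Walk the hooks; the first one claiming the immediate handles it. */
static int model_call_break_hook(const struct model_break_hook *hooks,
				 size_t n, unsigned int imm)
{
	for (size_t i = 0; i < n; i++)
		if (hooks[i].imm == imm && hooks[i].fn)
			return hooks[i].fn(imm);
	return MODEL_HOOK_ERROR;
}

static int model_bug_handler(unsigned int imm)
{
	return imm == MODEL_BUG_IMM ? MODEL_HOOK_HANDLED : MODEL_HOOK_ERROR;
}

/* Returns 0 if something handled the brk, nonzero otherwise. */
static int model_early_brk(const struct model_break_hook *hooks,
			   size_t n, unsigned int imm)
{
	if (model_call_break_hook(hooks, n, imm) == MODEL_HOOK_HANDLED)
		return 0;

	return model_bug_handler(imm) != MODEL_HOOK_HANDLED;
}
```

With no hooks registered the model behaves like the pre-patch early
handler: only the bug immediate is handled.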

Signed-off-by: Douglas Anderson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---

Changes in v2: None

arch/arm64/include/asm/debug-monitors.h | 2 ++
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/kgdb.c | 5 +++++
arch/arm64/kernel/traps.c | 3 +++
4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 7619f473155f..2d82a0314d29 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -97,6 +97,8 @@ void unregister_user_break_hook(struct break_hook *hook);
void register_kernel_break_hook(struct break_hook *hook);
void unregister_kernel_break_hook(struct break_hook *hook);

+int call_break_hook(struct pt_regs *regs, unsigned int esr);
+
u8 debug_monitors_arch(void);

enum dbg_active_el {
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 48222a4760c2..59c353dfc8e9 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
unregister_debug_hook(&hook->node);
}

-static int call_break_hook(struct pt_regs *regs, unsigned int esr)
+int call_break_hook(struct pt_regs *regs, unsigned int esr)
{
struct break_hook *hook;
struct list_head *list;
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 43119922341f..96a47af870bc 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -301,6 +301,11 @@ static struct notifier_block kgdb_notifier = {
.priority = -INT_MAX,
};

+bool kgdb_arch_can_debug_early(void)
+{
+ return true;
+}
+
/*
* kgdb_arch_init - Perform any architecture specific initialization.
* This function will handle the initialization of any architecture
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index cf402be5c573..a8173f0c1774 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
#endif
+ if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
+ return 0;
+
return bug_handler(regs, esr) != DBG_HOOK_HANDLED;
}

--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:17:40

by Doug Anderson

Subject: [PATCH v2 2/9] Revert "kgdboc: disable the console lock when in kgdb"

This reverts commit 81eaadcae81b4c1bf01649a3053d1f54e2d81cf1.

Commit 81eaadcae81b ("kgdboc: disable the console lock when in kgdb")
is no longer needed now that we have the patch ("kgdb: Disable
WARN_CONSOLE_UNLOCKED for all kgdb"). Revert it.

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2:
- ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.

drivers/tty/serial/kgdboc.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index c9f94fa82be4..8a1a4d1b6768 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -275,14 +275,10 @@ static void kgdboc_pre_exp_handler(void)
/* Increment the module count when the debugger is active */
if (!kgdb_connected)
try_module_get(THIS_MODULE);
-
- atomic_inc(&ignore_console_lock_warning);
}

static void kgdboc_post_exp_handler(void)
{
- atomic_dec(&ignore_console_lock_warning);
-
/* decrement the module count when the debugger detaches */
if (!kgdb_connected)
module_put(THIS_MODULE);
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:18:24

by Doug Anderson

Subject: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles

We want to enable kgdb to debug the early parts of the kernel.
Unfortunately kgdb normally is a client of the tty API in the kernel
and serial drivers don't register to the tty layer until fairly late
in the boot process.

Serial drivers do, however, commonly register a boot console. Let's
enable the kgdboc driver to work with boot consoles to provide early
debugging.

This change co-opts the existing read() function pointer that's part
of "struct console". It's assumed that if a boot console (with the
flag CON_BOOT) has implemented read() then both its read() and write()
functions are polling functions. That means they work without
interrupts and read() will return immediately (with 0 bytes read) if
there's nothing to read. This should be a safe assumption since it
appears that no current boot consoles implement read() right now and
there seems no reason to do so unless they wanted to support
"earlycon_kgdboc".

The console API isn't really intended to have clients work with it
like we're doing. Specifically there doesn't appear to be any way for
clients to be notified about a boot console being unregistered. We'll
work around this by checking that our console is still valid before
using it. We'll also try to transition off of the boot console and
onto the "tty" API as quickly as possible.
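
The "check before use" workaround can be sketched in user space as
below (not the kgdboc code): walk the list of registered consoles and
stop doing I/O through a console that is no longer on it. The list
layout and every model_* name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked list of registered consoles. */
struct model_console {
	const char *name;
	struct model_console *next;
};

/* What the early I/O module remembers about its boot console. */
struct model_earlycon_io {
	const struct model_console *earlycon;
	bool neutered;
};

/* True if target is still linked into the list starting at head. */
static bool model_console_registered(const struct model_console *head,
				     const struct model_console *target)
{
	for (const struct model_console *con = head; con; con = con->next)
		if (con == target)
			return true;
	return false;
}

/* Run on debugger entry: give up on a console that has vanished. */
static void model_pre_exception(struct model_earlycon_io *io,
				const struct model_console *head)
{
	if (io->earlycon && !io->neutered &&
	    !model_console_registered(head, io->earlycon))
		io->neutered = true;
}

/* Read/write paths consult this before touching the console. */
static bool model_can_do_io(const struct model_earlycon_io *io)
{
	return io->earlycon && !io->neutered;
}
```

The point of the "neutered" flag in this model is that the I/O paths
degrade to doing nothing instead of dereferencing a console that has
already been unregistered.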

The normal/expected way to make all this work is to use
"earlycon_kgdboc" and "kgdboc" together. You should point them both
to the same physical serial connection. At boot time, as the system
transitions from the boot console to the normal console, kgdb will
switch over. If you don't use things in the normal/expected way it's
a bit of a buyer-beware situation. Things thought about:

- If you specify only "earlycon_kgdboc" but not "kgdboc" you still
might end up dropping into kgdb upon a crash/sysrq but you may not
be able to type.
- If you use "keep_bootcon" (which is already a bit of a buyer-beware
option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
trying to use your boot console for kgdb.
- If your "earlycon_kgdboc" and "kgdboc" devices are not the same
device things should work OK, but it'll be your job to switch over
which device you're monitoring (including figuring out how to switch
over gdb in-flight if you're using it).

When trying to enable "earlycon_kgdboc" it should be noted that the
names that are registered through the boot console layer and the tty
layer are not the same for the same port. For example when debugging
on one board I'd need to pass "earlycon_kgdboc=qcom_geni
kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
console name is a pain and there will rarely be more than one boot
console enabled, you can provide the "earlycon_kgdboc" parameter
without specifying the name of the boot console. In this case we'll
just pick the first boot console we find that implements read().
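
The name matching described above can be sketched in user space like
this (not the kgdboc code): an empty or missing name matches the first
boot console that has a read() function, otherwise the name must match
exactly. The struct, its boolean fields and the function name are
hypothetical simplifications of the real console flags and pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a registered console. */
struct model_bootcon {
	const char *name;
	bool is_boot;		/* registered as a boot console */
	bool has_read;		/* implements read() */
	const struct model_bootcon *next;
};

/*
 * Return the first boot console with read() whose name matches opt.
 * A NULL or empty opt matches any name. Returns NULL if none fits.
 */
static const struct model_bootcon *
model_pick_earlycon(const struct model_bootcon *head, const char *opt)
{
	for (const struct model_bootcon *con = head; con; con = con->next) {
		if (!con->is_boot || !con->has_read)
			continue;
		if (!opt || !opt[0] || strcmp(con->name, opt) == 0)
			return con;
	}

	return NULL;
}
```

Note that a console named on the command line is still rejected in
this model if it lacks read(), since kgdb could not take input from it.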

This new "earlycon_kgdboc" parameter should be contrasted to the
existing "ekgdboc" parameter. While both provide a way to debug very
early, the usage and mechanisms are quite different. Specifically
"earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
there is a transition from one to the other. The "ekgdboc" parameter,
on the other hand, replaces the "kgdboc" parameter. It runs the same
logic as the "kgdboc" parameter but just relies on your TTY driver
being present super early. The only known usage of the old "ekgdboc"
parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
be noted that "kbd" has special treatment allowing it to init early as
a tty device.

Signed-off-by: Douglas Anderson <[email protected]>
---
This patch touches files in several different subsystems, but outside
of kgdb itself it only touches a single line per file and that line is
related to kgdb. I'm assuming this can all go through the kgdb tree,
but if needed I can always introduce a new API call instead of
modifying the old one and then just have the old API call be a thin
wrapper on the new one.

Changes in v2:
- Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
- Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp

drivers/misc/kgdbts.c | 2 +-
drivers/tty/mips_ejtag_fdc.c | 2 +-
drivers/tty/serial/kgdboc.c | 132 +++++++++++++++++++++++++++++++++-
drivers/usb/early/ehci-dbgp.c | 2 +-
include/linux/kgdb.h | 3 +-
kernel/debug/debug_core.c | 15 +++-
6 files changed, 149 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/kgdbts.c b/drivers/misc/kgdbts.c
index bccd341e9ae1..5c4e4a8771cf 100644
--- a/drivers/misc/kgdbts.c
+++ b/drivers/misc/kgdbts.c
@@ -1077,7 +1077,7 @@ static int configure_kgdbts(void)
final_ack = 0;
run_plant_and_detach_test(1);

- err = kgdb_register_io_module(&kgdbts_io_ops);
+ err = kgdb_register_io_module(&kgdbts_io_ops, false);
if (err) {
configured = 0;
return err;
diff --git a/drivers/tty/mips_ejtag_fdc.c b/drivers/tty/mips_ejtag_fdc.c
index 21e76a2ec182..68817cca39ce 100644
--- a/drivers/tty/mips_ejtag_fdc.c
+++ b/drivers/tty/mips_ejtag_fdc.c
@@ -1265,7 +1265,7 @@ static struct kgdb_io kgdbfdc_io_ops = {

static int __init kgdbfdc_init(void)
{
- kgdb_register_io_module(&kgdbfdc_io_ops);
+ kgdb_register_io_module(&kgdbfdc_io_ops, false);
return 0;
}
early_initcall(kgdbfdc_init);
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 519d8cfbfbed..2f526f2d2bea 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -21,6 +21,7 @@
#include <linux/input.h>
#include <linux/module.h>
#include <linux/platform_device.h>
+#include <linux/serial_core.h>

#define MAX_CONFIG_LEN 40

@@ -42,6 +43,14 @@ static int kgdb_tty_line;

static struct platform_device *kgdboc_pdev;

+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+static struct kgdb_io earlycon_kgdboc_io_ops;
+struct console *earlycon;
+bool earlycon_neutered;
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+#define earlycon NULL
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -135,8 +144,46 @@ static void kgdboc_unregister_kbd(void)
#define kgdboc_restore_input()
#endif /* ! CONFIG_KDB_KEYBOARD */

+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+
+static void cleanup_earlycon(bool unregister)
+{
+ if (earlycon && unregister)
+ kgdb_unregister_io_module(&earlycon_kgdboc_io_ops);
+ earlycon = NULL;
+}
+
+static bool is_earlycon_still_valid(void)
+{
+ struct console *con;
+
+ for_each_console(con)
+ if (con == earlycon)
+ return true;
+ return false;
+}
+
+static void cleanup_earlycon_if_invalid(void)
+{
+ console_lock();
+ if (earlycon && (earlycon_neutered || !is_earlycon_still_valid())) {
+ pr_warn("earlycon vanished; unregistering\n");
+ cleanup_earlycon(true);
+ }
+ console_unlock();
+}
+
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
+static inline void cleanup_earlycon(bool unregister) { ; }
+static inline void cleanup_earlycon_if_invalid(void) { ; }
+
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
static void cleanup_kgdboc(void)
{
+ cleanup_earlycon(true);
+
if (configured != 1)
return;

@@ -188,9 +235,10 @@ static int configure_kgdboc(void)
kgdb_tty_line = tty_line;

do_register:
- err = kgdb_register_io_module(&kgdboc_io_ops);
+ err = kgdb_register_io_module(&kgdboc_io_ops, earlycon != NULL);
if (err)
goto noconfig;
+ cleanup_earlycon(false);

err = kgdb_register_nmi_console();
if (err)
@@ -206,6 +254,14 @@ static int configure_kgdboc(void)
kgdboc_unregister_kbd();
configured = 0;

+ /*
+ * Each time we run configure_kgdboc() but don't find a console, use
+ * that as a chance to validate that our earlycon didn't vanish on
+ * us. If it vanished we should unregister which will disable kgdb
+ * if we're the last I/O module.
+ */
+ cleanup_earlycon_if_invalid();
+
return err;
}

@@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
}

early_param("ekgdboc", kgdboc_early_init);
+
+static int earlycon_kgdboc_get_char(void)
+{
+	char c;
+
+	if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
+		return NO_POLL_CHAR;
+
+	return c;
+}
+
+static void earlycon_kgdboc_put_char(u8 chr)
+{
+	if (!earlycon_neutered)
+		earlycon->write(earlycon, &chr, 1);
+}
+
+static void earlycon_kgdboc_pre_exp_handler(void)
+{
+	/*
+	 * We don't get notified when the boot console is unregistered.
+	 * Double-check when we enter the debugger. Unfortunately we
+	 * can't really unregister ourselves now, but at least don't crash.
+	 */
+	if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
+		pr_warn("Neutering kgdb since boot console vanished\n");
+		earlycon_neutered = true;
+	}
+}
+
+static struct kgdb_io earlycon_kgdboc_io_ops = {
+	.name = "earlycon_kgdboc",
+	.read_char = earlycon_kgdboc_get_char,
+	.write_char = earlycon_kgdboc_put_char,
+	.pre_exception = earlycon_kgdboc_pre_exp_handler,
+	.is_console = true,
+};
+
+static int __init earlycon_kgdboc_init(char *opt)
+{
+ struct console *con;
+
+ kdb_init(KDB_INIT_EARLY);
+
+ /*
+ * Look for a matching console, or if the name was left blank just
+ * pick the first one we find.
+ */
+ console_lock();
+ for_each_console(con) {
+ if (con->write && con->read &&
+ (con->flags & (CON_BOOT | CON_ENABLED)) &&
+ (!opt || !opt[0] || strcmp(con->name, opt) == 0))
+ break;
+ }
+ console_unlock();
+
+ if (!con) {
+ pr_info("Couldn't find kgdb earlycon\n");
+ return 0;
+ }
+
+ earlycon = con;
+ pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
+ if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
+ earlycon = NULL;
+ pr_info("Failed to register kgdb with earlycon\n");
+ return 0;
+ }
+
+ return 0;
+}
+
+early_param("earlycon_kgdboc", earlycon_kgdboc_init);
#endif /* CONFIG_KGDB_SERIAL_CONSOLE */

module_init(init_kgdboc);
diff --git a/drivers/usb/early/ehci-dbgp.c b/drivers/usb/early/ehci-dbgp.c
index ea0d531c63e2..bb04c688e094 100644
--- a/drivers/usb/early/ehci-dbgp.c
+++ b/drivers/usb/early/ehci-dbgp.c
@@ -1057,7 +1057,7 @@ static int __init kgdbdbgp_parse_config(char *str)
ptr++;
kgdbdbgp_wait_time = simple_strtoul(ptr, &ptr, 10);
}
- kgdb_register_io_module(&kgdbdbgp_io_ops);
+ kgdb_register_io_module(&kgdbdbgp_io_ops, false);
kgdbdbgp_io_ops.is_console = early_dbgp_console.index != -1;

return 0;
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 7371517aeacc..2e86307f2683 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -323,7 +323,8 @@ static inline int kgdb_unregister_nmi_console(void) { return 0; }
static inline bool kgdb_nmi_poll_knock(void) { return 1; }
#endif

-extern int kgdb_register_io_module(struct kgdb_io *local_kgdb_io_ops);
+extern int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops,
+ bool replace);
extern void kgdb_unregister_io_module(struct kgdb_io *local_kgdb_io_ops);
extern struct kgdb_io *dbg_io_ops;

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 8f178239856d..1b5435c6d92a 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
/**
* kgdb_register_io_module - register KGDB IO module
* @new_dbg_io_ops: the io ops vector
+ * @replace: If true it's OK if there were old ops. This is used
+ * to transition from early kgdb to normal kgdb. It's
+ * assumed these are the same device so kgdb can continue.
*
* Register it with the KGDB core.
*/
-int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
+int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)
{
+ struct kgdb_io *old_dbg_io_ops;
int err;

spin_lock(&kgdb_registration_lock);

- if (dbg_io_ops) {
+ old_dbg_io_ops = dbg_io_ops;
+ if (dbg_io_ops && !replace) {
spin_unlock(&kgdb_registration_lock);

pr_err("Another I/O driver is already registered with KGDB\n");
@@ -1102,6 +1107,12 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)

spin_unlock(&kgdb_registration_lock);

+ if (replace) {
+ pr_info("Replaced I/O driver %s with %s\n",
+ old_dbg_io_ops->name, new_dbg_io_ops->name);
+ return 0;
+ }
+
pr_info("Registered I/O driver %s\n", new_dbg_io_ops->name);

/* Arm KGDB now. */
--
2.26.1.301.g55bc3eb7cb9-goog
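
The "replace" argument added to kgdb_register_io_module() in the patch above can be illustrated with a small userspace sketch. This is not the kernel code: struct io_ops, cur_ops and register_io() are hypothetical stand-ins for struct kgdb_io, dbg_io_ops and kgdb_register_io_module(), locking is left out, and the -EBUSY return value is an assumption because the hunk above does not show that line.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kgdb_io; only the name matters here. */
struct io_ops {
	const char *name;
};

/* Hypothetical stand-in for the global dbg_io_ops pointer. */
static struct io_ops *cur_ops;

/*
 * Register new_ops. If something is already registered, fail unless the
 * caller passed replace == true, in which case the old ops are swapped out
 * (the early-kgdb to normal-kgdb hand-off described in the kerneldoc).
 * The -EBUSY value is an assumption of this sketch.
 */
static int register_io(struct io_ops *new_ops, bool replace)
{
	assert(new_ops != NULL);

	if (cur_ops && !replace)
		return -EBUSY;

	cur_ops = new_ops;
	return 0;
}
```

With an "earlycon_kgdboc" ops structure registered first, a later register_io(&kgdboc, false) would be refused, while register_io(&kgdboc, true) would take over from the early ops.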

2020-04-21 21:18:31

by Doug Anderson

Subject: [PATCH v2 3/9] kgdboc: Use a platform device to handle tty drivers showing up late

If you build CONFIG_KGDB_SERIAL_CONSOLE into the kernel then you
should be able to have KGDB init itself at bootup by specifying the
"kgdboc=..." kernel command line parameter. This has worked OK for me
for many years, but it stopped working on a new device I switched to.

The problem is that on this new device the serial driver gets its
probe deferred. Now when kgdb initializes it can't find the tty
driver and when it gives up it never tries again.

We could try to find ways to move up the initialization of the serial
driver and such a thing might be worthwhile, but it's nice to be
robust against serial drivers that load late. We could move kgdb to
init itself later but that penalizes our ability to debug early boot
code on systems where the driver inits early. We could roll our own
system of detecting when new tty drivers get loaded and then use that
to figure out when kgdb can init, but that's ugly.

Instead, let's jump on the -EPROBE_DEFER bandwagon. We'll create a
singleton instance of a "kgdboc" platform device. If we can't find
our tty device when the singleton "kgdboc" probes we'll return
-EPROBE_DEFER which means that the system will call us back later to
try again when the tty device might be there.
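
As a rough illustration of the idea (a userspace sketch, not the driver core), the retry behaviour can be modelled as below. The names tty_ready, configure_stub(), probe_stub() and probe_until_bound() are made up for this sketch, and EPROBE_DEFER is defined locally because it is a kernel-internal errno that userspace headers do not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno, defined locally for this sketch. */
#define EPROBE_DEFER 517

static bool tty_ready;      /* stand-in for "the tty driver has registered" */
static int configured = -1; /* -1 = init not run, 0 = unconfigured, 1 = configured */

/* Hypothetical stand-in for configure_kgdboc(): fails until the tty exists. */
static int configure_stub(void)
{
	if (!tty_ready) {
		configured = 0;
		return -ENODEV;
	}
	configured = 1;
	return 0;
}

/* Same shape as the probe in this patch: turn "no device" into "try later". */
static int probe_stub(void)
{
	int ret = 0;

	if (configured != 1) {
		ret = configure_stub();
		if (ret == -ENODEV)
			ret = -EPROBE_DEFER;
	}
	return ret;
}

/*
 * Toy version of the retry loop: keep calling probe until it stops
 * deferring. The tty "appears" after ready_after failed attempts.
 * Returns the number of probe calls that were made.
 */
static int probe_until_bound(int ready_after)
{
	int calls = 0;

	assert(ready_after >= 0);
	tty_ready = false;
	configured = -1;

	for (;;) {
		if (calls == ready_after)
			tty_ready = true;
		calls++;
		if (probe_stub() != -EPROBE_DEFER)
			break;
	}
	return calls;
}
```

The real driver core does not spin like this; it retries deferred probes when other drivers bind. The sketch only shows why returning -EPROBE_DEFER instead of -ENODEV gives kgdboc another chance.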

We won't fully transition all of the kgdboc to a platform device
because early kgdb initialization (via the "ekgdboc" kernel command
line parameter) still runs before the platform device has been
created. The kgdb platform device is merely used as a convenient way
to hook into the system's normal probe deferral mechanisms.

As part of this, we'll ever-so-slightly change how the "kgdboc=..."
kernel command line parameter works. Previously if you booted up and
kgdb couldn't find the tty driver then later reading
'/sys/module/kgdboc/parameters/kgdboc' would return a blank string.
Now kgdb will keep track of the string that came as part of the
command line and give it back to you. It's expected that this should
be an OK change.

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2: None

drivers/tty/serial/kgdboc.c | 126 +++++++++++++++++++++++++++++-------
1 file changed, 101 insertions(+), 25 deletions(-)

diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 8a1a4d1b6768..519d8cfbfbed 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -20,6 +20,7 @@
#include <linux/vt_kern.h>
#include <linux/input.h>
#include <linux/module.h>
+#include <linux/platform_device.h>

#define MAX_CONFIG_LEN 40

@@ -27,6 +28,7 @@ static struct kgdb_io kgdboc_io_ops;

/* -1 = init not run yet, 0 = unconfigured, 1 = configured. */
static int configured = -1;
+DEFINE_MUTEX(config_mutex);

static char config[MAX_CONFIG_LEN];
static struct kparam_string kps = {
@@ -38,6 +40,8 @@ static int kgdboc_use_kms; /* 1 if we use kernel mode switching */
static struct tty_driver *kgdb_tty_driver;
static int kgdb_tty_line;

+static struct platform_device *kgdboc_pdev;
+
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -133,11 +137,13 @@ static void kgdboc_unregister_kbd(void)

static void cleanup_kgdboc(void)
{
+ if (configured != 1)
+ return;
+
if (kgdb_unregister_nmi_console())
return;
kgdboc_unregister_kbd();
- if (configured == 1)
- kgdb_unregister_io_module(&kgdboc_io_ops);
+ kgdb_unregister_io_module(&kgdboc_io_ops);
}

static int configure_kgdboc(void)
@@ -198,20 +204,79 @@ static int configure_kgdboc(void)
kgdb_unregister_io_module(&kgdboc_io_ops);
noconfig:
kgdboc_unregister_kbd();
- config[0] = 0;
configured = 0;
- cleanup_kgdboc();

return err;
}

+static int kgdboc_probe(struct platform_device *pdev)
+{
+	int ret = 0;
+
+	mutex_lock(&config_mutex);
+	if (configured != 1) {
+		ret = configure_kgdboc();
+
+		/* Convert "no device" to "defer" so we'll keep trying */
+		if (ret == -ENODEV)
+			ret = -EPROBE_DEFER;
+	}
+	mutex_unlock(&config_mutex);
+
+	return ret;
+}
+
+static struct platform_driver kgdboc_platform_driver = {
+	.probe = kgdboc_probe,
+	.driver = {
+		.name = "kgdboc",
+		.suppress_bind_attrs = true,
+	},
+};
+
static int __init init_kgdboc(void)
{
- /* Already configured? */
- if (configured == 1)
+ int ret;
+
+ /*
+ * kgdboc is a little bit of an odd "platform_driver". It can be
+ * up and running long before the platform_driver object is
+ * created and thus doesn't actually store anything in it. There's
+ * only one instance of kgdb so everything is stored as global state.
+ * The platform_driver is only created so that we can leverage the
+ * kernel's mechanisms (like -EPROBE_DEFER) to call us when our
+ * underlying tty is ready. Here we init our platform driver and
+ * then create the single kgdboc instance.
+ */
+ ret = platform_driver_register(&kgdboc_platform_driver);
+ if (ret)
+ return ret;
+
+ kgdboc_pdev = platform_device_alloc("kgdboc", PLATFORM_DEVID_NONE);
+ if (!kgdboc_pdev) {
+ ret = -ENOMEM;
+ goto err_did_register;
+ }
+
+ ret = platform_device_add(kgdboc_pdev);
+ if (!ret)
return 0;

- return configure_kgdboc();
+ platform_device_put(kgdboc_pdev);
+
+err_did_register:
+ platform_driver_unregister(&kgdboc_platform_driver);
+ return ret;
+}
+
+static void exit_kgdboc(void)
+{
+ mutex_lock(&config_mutex);
+ cleanup_kgdboc();
+ mutex_unlock(&config_mutex);
+
+ platform_device_unregister(kgdboc_pdev);
+ platform_driver_unregister(&kgdboc_platform_driver);
}

static int kgdboc_get_char(void)
@@ -234,24 +299,20 @@ static int param_set_kgdboc_var(const char *kmessage,
const struct kernel_param *kp)
{
size_t len = strlen(kmessage);
+ int ret = 0;

if (len >= MAX_CONFIG_LEN) {
pr_err("config string too long\n");
return -ENOSPC;
}

- /* Only copy in the string if the init function has not run yet */
- if (configured < 0) {
- strcpy(config, kmessage);
- return 0;
- }
-
if (kgdb_connected) {
pr_err("Cannot reconfigure while KGDB is connected.\n");
-
return -EBUSY;
}

+ mutex_lock(&config_mutex);
+
strcpy(config, kmessage);
/* Chop out \n char as a result of echo */
if (len && config[len - 1] == '\n')
@@ -260,8 +321,30 @@ static int param_set_kgdboc_var(const char *kmessage,
if (configured == 1)
cleanup_kgdboc();

- /* Go and configure with the new params. */
- return configure_kgdboc();
+ /*
+ * Configure with the new params as long as init already ran.
+ * Note that we can get called before init if someone loads us
+ * with "modprobe kgdboc kgdboc=..." or if they happen to use the
+ * odd syntax of "kgdboc.kgdboc=..." on the kernel command line.
+ */
+ if (configured >= 0)
+ ret = configure_kgdboc();
+
+ /*
+ * If we couldn't configure then clear out the config. Note that
+ * specifying an invalid config on the kernel command line vs.
+ * through sysfs have slightly different behaviors. If we fail
+ * to configure what was specified on the kernel command line
+ * we'll leave it in the 'config' and return -EPROBE_DEFER from
+ * our probe. When specified through sysfs userspace is
+ * responsible for loading the tty driver before setting up.
+ */
+ if (ret)
+ config[0] = '\0';
+
+ mutex_unlock(&config_mutex);
+
+ return ret;
}

static int dbg_restore_graphics;
@@ -320,15 +403,8 @@ __setup("kgdboc=", kgdboc_option_setup);
/* This is only available if kgdboc is a built in for early debugging */
static int __init kgdboc_early_init(char *opt)
{
- /* save the first character of the config string because the
- * init routine can destroy it.
- */
- char save_ch;
-
kgdboc_option_setup(opt);
- save_ch = config[0];
- init_kgdboc();
- config[0] = save_ch;
+ configure_kgdboc();
return 0;
}

@@ -336,7 +412,7 @@ early_param("ekgdboc", kgdboc_early_init);
#endif /* CONFIG_KGDB_SERIAL_CONSOLE */

module_init(init_kgdboc);
-module_exit(cleanup_kgdboc);
+module_exit(exit_kgdboc);
module_param_call(kgdboc, param_set_kgdboc_var, param_get_string, &kps, 0644);
MODULE_PARM_DESC(kgdboc, "<serial_device>[,baud]");
MODULE_DESCRIPTION("KGDB Console TTY Driver");
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-21 21:20:55

by Doug Anderson

Subject: [PATCH v2 1/9] kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb

In commit 81eaadcae81b ("kgdboc: disable the console lock when in
kgdb") we avoided the WARN_CONSOLE_UNLOCKED() yell when we were in
kgdboc. That still works fine, but it turns out that we get a similar
yell when using other I/O drivers. One example is the "I/O driver"
for the kgdb test suite (kgdbts). When I enabled that I again got the
same yells.

Even though "kgdbts" doesn't actually interact with the user over the
console, using it still causes kgdb to print to the consoles. That
trips the same warning:
con_is_visible+0x60/0x68
con_scroll+0x110/0x1b8
lf+0x4c/0xc8
vt_console_print+0x1b8/0x348
vkdb_printf+0x320/0x89c
kdb_printf+0x68/0x90
kdb_main_loop+0x190/0x860
kdb_stub+0x2cc/0x3ec
kgdb_cpu_enter+0x268/0x744
kgdb_handle_exception+0x1a4/0x200
kgdb_compiled_brk_fn+0x34/0x44
brk_handler+0x7c/0xb8
do_debug_exception+0x1b4/0x228

Let's increment/decrement the "ignore_console_lock_warning" variable
all the time when we enter the debugger.

This will allow us to later revert commit 81eaadcae81b ("kgdboc:
disable the console lock when in kgdb").
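
The effect of bumping the counter for the whole time we are in the debugger can be sketched in userspace as follows. All names here are stand-ins invented for the sketch: ignore_warning models "ignore_console_lock_warning", warn_console_unlocked() models the check behind WARN_CONSOLE_UNLOCKED(), and debugger_enter() / debugger_exit() model the two spots in kgdb_cpu_enter() touched by this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int ignore_warning; /* stand-in for ignore_console_lock_warning */
static int warnings_emitted;      /* counts how often the "warning" fired */

/* Stand-in for the warning: complain only if nobody asked us to keep quiet. */
static void warn_console_unlocked(bool console_locked)
{
	if (!console_locked && atomic_load(&ignore_warning) == 0)
		warnings_emitted++;
}

/* Entering the debugger suppresses the warning for any I/O driver. */
static void debugger_enter(void)
{
	atomic_fetch_add(&ignore_warning, 1);
}

/* Leaving the debugger re-arms the warning once the count drops to zero. */
static void debugger_exit(void)
{
	int old = atomic_fetch_sub(&ignore_warning, 1);

	assert(old > 0); /* an exit must be paired with an earlier enter */
	(void)old;
}
```

Because this is a counter rather than a flag, nested or concurrent entries stay balanced: the warning only comes back after the last debugger_exit().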

Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v2:
- ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.

kernel/debug/debug_core.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 2b7c9b67931d..950dc667c823 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -668,6 +668,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
if (kgdb_skipexception(ks->ex_vector, ks->linux_regs))
goto kgdb_restore;

+ atomic_inc(&ignore_console_lock_warning);
+
/* Call the I/O driver's pre_exception routine */
if (dbg_io_ops->pre_exception)
dbg_io_ops->pre_exception();
@@ -740,6 +742,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
if (dbg_io_ops->post_exception)
dbg_io_ops->post_exception();

+ atomic_dec(&ignore_console_lock_warning);
+
if (!kgdb_single_step) {
raw_spin_unlock(&dbg_slave_lock);
/* Wait till all the CPUs have quit from the debugger. */
--
2.26.1.301.g55bc3eb7cb9-goog

2020-04-23 14:16:37

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles

On Tue, Apr 21, 2020 at 02:14:38PM -0700, Douglas Anderson wrote:
> This whole pile of patches was motivated by me trying to get kgdb to
> work properly on a platform where my serial driver ended up being hit
> by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> from other drivers). Specifically my serial driver's parent device
> depended on a resource that wasn't available when its probe was first
> called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> tried to run its setup the serial driver wasn't there. Unfortunately
> "kgdboc" never tried again, so that meant that kgdb was disabled until
> I manually enabled it via sysfs.
>
> While I could try to figure out how to get around the -EPROBE_DEFER
> somehow, the above problems could happen to anyone and -EPROBE_DEFER
> is generally considered something you just have to live with. In any
> case the current "kgdboc" setup is a bit of a race waiting to happen.
> I _think_ I saw during early testing that even adding a msleep() in
> the typical serial driver's probe() is enough to trigger similar
> issues.
>
> I decided that for the above race the best attitude to get kgdb to
> register at boot was probably "if you can't beat 'em, join 'em".
> Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> driver uses it it's no longer a virus). It does so a little awkwardly
> because "kgdboc" hasn't normally had a "struct device" associated with
> it, but it's really not _that_ ugly to make a platform device and
> seems less ugly than alternatives.
>
> Unfortunately now on my system the debugger is one of the last things
> to register at boot. That's OK for debugging problems that show up
> significantly after boot, but isn't so hot for all the boot problems
> that I end up debugging. This motivated me to try to get something
> working a little earlier.
>
> My first attempt was to try to get the existing "ekgdboc" to work
> earlier. I tried that for a bit until I realized that it needed to
> work at the tty layer and I couldn't find any serial drivers that
> managed to register themselves to the tty layer super early at boot.
> The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> of a special snowflake. Trying to get my serial driver and all its
> dependencies to probe normally and register the tty driver super early
> at boot seemed like a bad way to go. In fact, all the complexity
> needed to do something like this is why the system already has a
> special concept of a "boot console" that lives only long enough to
> transition to the normal console.
>
> Leveraging the boot console seemed like a good way to go and that's
> what this series does. I found that consoles could have a read()
> function, though I couldn't find anyone who implemented it. I
> implemented it for two serial drivers for the devices I had easy
> access to, making the assumption that for boot consoles that we could
> assume read() and write() were polling-compatible (seems sane I
> think).
>
> Now anyone who makes a small change to their serial driver can easily
> enable early kgdb debugging!
>
> The devices I had for testing were:
> - arm32: rk3288-veyron-jerry
> - arm64: rk3399-gru-kevin
> - arm64: qcom-sc7180-trogdor (not mainline yet)
>
> These are the devices I tested this series on. I tried to test
> various combinations of enabling/disabling various options and I
> hopefully caught the corner cases, but I'd appreciate any extra
> testing people can do. Notably I didn't test on x86, but (I think) I
> didn't touch much there so I shouldn't have broken anything.
>
> When testing I found a few problems with actually dropping into the
> debugger super early on arm and arm64 devices. Patches in this series
> should help with this. For arm I just avoid dropping into the
> debugger until a little later and for arm64 I actually enable
> debugging super early.
>
> I realize that bits of this series might feel a little hacky, though
> I've tried to do things in the cleanest way I could without overly
> interfering with the rest of the kernel. If you hate the way I
> solved a problem I would love it if you could provide guidance on how
> you think I could solve the problem better.
>
> This series (and my comments / documentation / commit messages) are
> now long enough that my eyes glaze over when I try to read it all over
> to double-check. I've nonetheless tried to double-check it, but I'm
> pretty sure I did something stupid. Thank you ahead of time for
> pointing it out to me so I can fix it in v3. If somehow I managed to
> not do anything stupid (really?) then thank you for double-checking me
> anyway.
>
> Changes in v2:
> - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
>
> Douglas Anderson (9):
> kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> Revert "kgdboc: disable the console lock when in kgdb"
> kgdboc: Use a platform device to handle tty drivers showing up late
> kgdb: Delay "kgdbwait" to dbg_late_init() by default
> arm64: Add call_break_hook() to early_brk64() for early kgdb
> kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> Documentation: kgdboc: Document new earlycon_kgdboc parameter
> serial: qcom_geni_serial: Support earlycon_kgdboc
> serial: 8250_early: Support earlycon_kgdboc
>
> .../admin-guide/kernel-parameters.txt | 20 ++
> Documentation/dev-tools/kgdb.rst | 14 +
> arch/arm64/include/asm/debug-monitors.h | 2 +
> arch/arm64/kernel/debug-monitors.c | 2 +-
> arch/arm64/kernel/kgdb.c | 5 +
> arch/arm64/kernel/traps.c | 3 +
> arch/x86/kernel/kgdb.c | 5 +
> drivers/misc/kgdbts.c | 2 +-
> drivers/tty/mips_ejtag_fdc.c | 2 +-
> drivers/tty/serial/8250/8250_early.c | 23 ++
> drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> drivers/tty/serial/qcom_geni_serial.c | 32 +++
> drivers/usb/early/ehci-dbgp.c | 2 +-
> include/linux/kgdb.h | 25 +-
> kernel/debug/debug_core.c | 48 +++-
> 15 files changed, 400 insertions(+), 47 deletions(-)

Reviewed-by: Greg Kroah-Hartman <[email protected]>

2020-04-24 08:35:49

by Sumit Garg

Subject: Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles

Hi Doug,

On Wed, 22 Apr 2020 at 02:45, Douglas Anderson <[email protected]> wrote:
>
> This whole pile of patches was motivated by me trying to get kgdb to
> work properly on a platform where my serial driver ended up being hit
> by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> from other drivers). Specifically my serial driver's parent device
> depended on a resource that wasn't available when its probe was first
> called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> tried to run its setup the serial driver wasn't there. Unfortunately
> "kgdboc" never tried again, so that meant that kgdb was disabled until
> I manually enabled it via sysfs.
>
> While I could try to figure out how to get around the -EPROBE_DEFER
> somehow, the above problems could happen to anyone and -EPROBE_DEFER
> is generally considered something you just have to live with. In any
> case the current "kgdboc" setup is a bit of a race waiting to happen.
> I _think_ I saw during early testing that even adding a msleep() in
> the typical serial driver's probe() is enough to trigger similar
> issues.
>
> I decided that for the above race the best attitude to get kgdb to
> register at boot was probably "if you can't beat 'em, join 'em".
> Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> driver uses it it's no longer a virus). It does so a little awkwardly
> because "kgdboc" hasn't normally had a "struct device" associated with
> it, but it's really not _that_ ugly to make a platform device and
> seems less ugly than alternatives.
>
> Unfortunately now on my system the debugger is one of the last things
> to register at boot. That's OK for debugging problems that show up
> significantly after boot, but isn't so hot for all the boot problems
> that I end up debugging. This motivated me to try to get something
> working a little earlier.
>
> My first attempt was to try to get the existing "ekgdboc" to work
> earlier. I tried that for a bit until I realized that it needed to
> work at the tty layer and I couldn't find any serial drivers that
> managed to register themselves to the tty layer super early at boot.
> The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> of a special snowflake. Trying to get my serial driver and all its
> dependencies to probe normally and register the tty driver super early
> at boot seemed like a bad way to go. In fact, all the complexity
> needed to do something like this is why the system already has a
> special concept of a "boot console" that lives only long enough to
> transition to the normal console.
>
> Leveraging the boot console seemed like a good way to go and that's
> what this series does. I found that consoles could have a read()
> function, though I couldn't find anyone who implemented it. I
> implemented it for two serial drivers for the devices I had easy
> access to, making the assumption that for boot consoles that we could
> assume read() and write() were polling-compatible (seems sane I
> think).
>
> Now anyone who makes a small change to their serial driver can easily
> enable early kgdb debugging!
>
> The devices I had for testing were:
> - arm32: rk3288-veyron-jerry
> - arm64: rk3399-gru-kevin
> - arm64: qcom-sc7180-trogdor (not mainline yet)
>
> These are the devices I tested this series on. I tried to test
> various combinations of enabling/disabling various options and I
> hopefully caught the corner cases, but I'd appreciate any extra
> testing people can do.

earlycon_kgdboc sounds like a really cool feature. So I gave it a try
on my arm64 machine (Developerbox) and it works like a charm. So for
patch 6/9 you can add:

Tested-by: Sumit Garg <[email protected]>

Plus, in order to enable earlycon_kgdboc on Developerbox I had to
implement the read() function in the early console driver for
amba-pl011 (see patch [1]). It would be great if you could pick that
patch [1] too as part of this series.

[1] https://lkml.org/lkml/2020/4/24/173

-Sumit

> Notably I didn't test on x86, but (I think) I
> didn't touch much there so I shouldn't have broken anything.
>
> When testing I found a few problems with actually dropping into the
> debugger super early on arm and arm64 devices. Patches in this series
> should help with this. For arm I just avoid dropping into the
> debugger until a little later and for arm64 I actually enable
> debugging super early.
>
> I realize that bits of this series might feel a little hacky, though
> I've tried to do things in the cleanest way I could without overly
> interfering with the rest of the kernel. If you hate the way I
> solved a problem I would love it if you could provide guidance on how
> you think I could solve the problem better.
>
> This series (and my comments / documentation / commit messages) are
> now long enough that my eyes glaze over when I try to read it all over
> to double-check. I've nonetheless tried to double-check it, but I'm
> pretty sure I did something stupid. Thank you ahead of time for
> pointing it out to me so I can fix it in v3. If somehow I managed to
> not do anything stupid (really?) then thank you for double-checking me
> anyway.
>
> Changes in v2:
> - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
>
> Douglas Anderson (9):
> kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> Revert "kgdboc: disable the console lock when in kgdb"
> kgdboc: Use a platform device to handle tty drivers showing up late
> kgdb: Delay "kgdbwait" to dbg_late_init() by default
> arm64: Add call_break_hook() to early_brk64() for early kgdb
> kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> Documentation: kgdboc: Document new earlycon_kgdboc parameter
> serial: qcom_geni_serial: Support earlycon_kgdboc
> serial: 8250_early: Support earlycon_kgdboc
>
> .../admin-guide/kernel-parameters.txt | 20 ++
> Documentation/dev-tools/kgdb.rst | 14 +
> arch/arm64/include/asm/debug-monitors.h | 2 +
> arch/arm64/kernel/debug-monitors.c | 2 +-
> arch/arm64/kernel/kgdb.c | 5 +
> arch/arm64/kernel/traps.c | 3 +
> arch/x86/kernel/kgdb.c | 5 +
> drivers/misc/kgdbts.c | 2 +-
> drivers/tty/mips_ejtag_fdc.c | 2 +-
> drivers/tty/serial/8250/8250_early.c | 23 ++
> drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> drivers/tty/serial/qcom_geni_serial.c | 32 +++
> drivers/usb/early/ehci-dbgp.c | 2 +-
> include/linux/kgdb.h | 25 +-
> kernel/debug/debug_core.c | 48 +++-
> 15 files changed, 400 insertions(+), 47 deletions(-)
>
> --
> 2.26.1.301.g55bc3eb7cb9-goog
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2020-04-24 10:15:46

by Daniel Thompson

Subject: Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles

On Fri, Apr 24, 2020 at 02:02:51PM +0530, Sumit Garg wrote:
> Hi Doug,
>
> On Wed, 22 Apr 2020 at 02:45, Douglas Anderson <[email protected]> wrote:
> >
> > This whole pile of patches was motivated by me trying to get kgdb to
> > work properly on a platform where my serial driver ended up being hit
> > by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> > from other drivers). Specifically my serial driver's parent device
> > depended on a resource that wasn't available when its probe was first
> > called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> > tried to run its setup the serial driver wasn't there. Unfortunately
> > "kgdboc" never tried again, so that meant that kgdb was disabled until
> > I manually enabled it via sysfs.
> >
> > While I could try to figure out how to get around the -EPROBE_DEFER
> > somehow, the above problems could happen to anyone and -EPROBE_DEFER
> > is generally considered something you just have to live with. In any
> > case the current "kgdboc" setup is a bit of a race waiting to happen.
> > I _think_ I saw during early testing that even adding a msleep() in
> > the typical serial driver's probe() is enough to trigger similar
> > issues.
> >
> > I decided that for the above race the best attitude to get kgdb to
> > register at boot was probably "if you can't beat 'em, join 'em".
> > Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> > driver uses it it's no longer a virus). It does so a little awkwardly
> > because "kgdboc" hasn't normally had a "struct device" associated with
> > it, but it's really not _that_ ugly to make a platform device and
> > seems less ugly than alternatives.
> >
> > Unfortunately now on my system the debugger is one of the last things
> > to register at boot. That's OK for debugging problems that show up
> > significantly after boot, but isn't so hot for all the boot problems
> > that I end up debugging. This motivated me to try to get something
> > working a little earlier.
> >
> > My first attempt was to try to get the existing "ekgdboc" to work
> > earlier. I tried that for a bit until I realized that it needed to
> > work at the tty layer and I couldn't find any serial drivers that
> > managed to register themselves to the tty layer super early at boot.
> > The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> > of a special snowflake. Trying to get my serial driver and all its
> > dependencies to probe normally and register the tty driver super early
> > at boot seemed like a bad way to go. In fact, all the complexity
> > needed to do something like this is why the system already has a
> > special concept of a "boot console" that lives only long enough to
> > transition to the normal console.
> >
> > Leveraging the boot console seemed like a good way to go and that's
> > what this series does. I found that consoles could have a read()
> > function, though I couldn't find anyone who implemented it. I
> > implemented it for two serial drivers for the devices I had easy
> > access to, making the assumption that for boot consoles that we could
> > assume read() and write() were polling-compatible (seems sane I
> > think).
> >
> > Now anyone who makes a small change to their serial driver can easily
> > enable early kgdb debugging!
> >
> > The devices I had for testing were:
> > - arm32: rk3288-veyron-jerry
> > - arm64: rk3399-gru-kevin
> > - arm64: qcom-sc7180-trogdor (not mainline yet)
> >
> > These are the devices I tested this series on. I tried to test
> > various combinations of enabling/disabling various options and I
> > hopefully caught the corner cases, but I'd appreciate any extra
> > testing people can do.
>
> earlycon_kgdboc sounds like a really cool feature. So I gave it a try
> on my arm64 machine (Developerbox) and it works like a charm. So for
> patch 6/9 you can add:
>
> Tested-by: Sumit Garg <[email protected]>
>
> Plus, in order to enable earlycon_kgdboc on Developerbox I had to
> implement the read() function in the early console driver for
> amba-pl011 (see patch [1]). It would be great if you could pick that
> patch [1] too as part of this series.
>
> [1] https://lkml.org/lkml/2020/4/24/173

I think PL011 support is also useful for getting this feature integrated
into the test suite too!


Daniel.


>
> -Sumit
>
> > Notably I didn't test on x86, but (I think) I
> > didn't touch much there so I shouldn't have broken anything.
> >
> > When testing I found a few problems with actually dropping into the
> > debugger super early on arm and arm64 devices. Patches in this series
> > should help with this. For arm I just avoid dropping into the
> > debugger until a little later and for arm64 I actually enable
> > debugging super early.
> >
> > I realize that bits of this series might feel a little hacky, though
> > I've tried to do things in the cleanest way I could without overly
> > interfering with the rest of the kernel. If you hate the way I
> > solved a problem I would love it if you could provide guidance on how
> > you think I could solve the problem better.
> >
> > This series (and my comments / documentation / commit messages) are
> > now long enough that my eyes glaze over when I try to read it all over
> > to double-check. I've nonetheless tried to double-check it, but I'm
> > pretty sure I did something stupid. Thank you ahead of time for
> > pointing it out to me so I can fix it in v3. If somehow I managed to
> > not do anything stupid (really?) then thank you for double-checking me
> > anyway.
> >
> > Changes in v2:
> > - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> > - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> > - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> > - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
> >
> > Douglas Anderson (9):
> > kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> > Revert "kgdboc: disable the console lock when in kgdb"
> > kgdboc: Use a platform device to handle tty drivers showing up late
> > kgdb: Delay "kgdbwait" to dbg_late_init() by default
> > arm64: Add call_break_hook() to early_brk64() for early kgdb
> > kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> > Documentation: kgdboc: Document new earlycon_kgdboc parameter
> > serial: qcom_geni_serial: Support earlycon_kgdboc
> > serial: 8250_early: Support earlycon_kgdboc
> >
> > .../admin-guide/kernel-parameters.txt | 20 ++
> > Documentation/dev-tools/kgdb.rst | 14 +
> > arch/arm64/include/asm/debug-monitors.h | 2 +
> > arch/arm64/kernel/debug-monitors.c | 2 +-
> > arch/arm64/kernel/kgdb.c | 5 +
> > arch/arm64/kernel/traps.c | 3 +
> > arch/x86/kernel/kgdb.c | 5 +
> > drivers/misc/kgdbts.c | 2 +-
> > drivers/tty/mips_ejtag_fdc.c | 2 +-
> > drivers/tty/serial/8250/8250_early.c | 23 ++
> > drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> > drivers/tty/serial/qcom_geni_serial.c | 32 +++
> > drivers/usb/early/ehci-dbgp.c | 2 +-
> > include/linux/kgdb.h | 25 +-
> > kernel/debug/debug_core.c | 48 +++-
> > 15 files changed, 400 insertions(+), 47 deletions(-)
> >
> > --
> > 2.26.1.301.g55bc3eb7cb9-goog

2020-04-27 13:42:30

by Daniel Thompson

Subject: Re: [PATCH v2 1/9] kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb

On Tue, Apr 21, 2020 at 02:14:39PM -0700, Douglas Anderson wrote:
> In commit 81eaadcae81b ("kgdboc: disable the console lock when in
> kgdb") we avoided the WARN_CONSOLE_UNLOCKED() yell when we were in
> kgdboc. That still works fine, but it turns out that we get a similar
> yell when using other I/O drivers. One example is the "I/O driver"
> for the kgdb test suite (kgdbts). When I enabled that I again got the
> same yells.
>
> Even though "kgdbts" doesn't actually interact with the user over the
> console, using it still causes kgdb to print to the consoles. That
> trips the same warning:
> con_is_visible+0x60/0x68
> con_scroll+0x110/0x1b8
> lf+0x4c/0xc8
> vt_console_print+0x1b8/0x348
> vkdb_printf+0x320/0x89c
> kdb_printf+0x68/0x90
> kdb_main_loop+0x190/0x860
> kdb_stub+0x2cc/0x3ec
> kgdb_cpu_enter+0x268/0x744
> kgdb_handle_exception+0x1a4/0x200
> kgdb_compiled_brk_fn+0x34/0x44
> brk_handler+0x7c/0xb8
> do_debug_exception+0x1b4/0x228
>
> Let's increment/decrement the "ignore_console_lock_warning" variable
> all the time when we enter the debugger.
>
> This will allow us to later revert commit 81eaadcae81b ("kgdboc:
> disable the console lock when in kgdb").
>
> Signed-off-by: Douglas Anderson <[email protected]>

Reviewed-by: Daniel Thompson <[email protected]>
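
For readers following along, the increment/decrement idea in the commit message can be sketched in plain user-space C. This is only an illustration under assumptions: `debugger_enter()`, `debugger_exit()` and `console_lock_warning_enabled()` are hypothetical names, and only `ignore_console_lock_warning` is taken from the commit message (it is modelled here as a plain int).

```c
#include <stdbool.h>

/* Stand-in for the kernel's counter; modelled here as a plain int. */
static int ignore_console_lock_warning;

/* Hypothetical helper: true when an unlocked-console warning would fire. */
static bool console_lock_warning_enabled(void)
{
	return ignore_console_lock_warning == 0;
}

/* Hypothetical entry hook: every debugger entry bumps the counter ... */
static void debugger_enter(void)
{
	ignore_console_lock_warning++;
}

/* ... and every exit drops it again, so nested entries stay balanced. */
static void debugger_exit(void)
{
	ignore_console_lock_warning--;
}
```

Using a counter rather than a boolean means the warning stays suppressed until the outermost exit, whichever I/O driver happens to be in use.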

2020-04-27 13:46:02

by Daniel Thompson

Subject: Re: [PATCH v2 2/9] Revert "kgdboc: disable the console lock when in kgdb"

On Tue, Apr 21, 2020 at 02:14:40PM -0700, Douglas Anderson wrote:
> This reverts commit 81eaadcae81b4c1bf01649a3053d1f54e2d81cf1.
>
> Commit 81eaadcae81b ("kgdboc: disable the console lock when in kgdb")
> is no longer needed now that we have the patch ("kgdb: Disable
> WARN_CONSOLE_UNLOCKED for all kgdb"). Revert it.
>
> Signed-off-by: Douglas Anderson <[email protected]>

Reviewed-by: Daniel Thompson <[email protected]>

2020-04-27 14:24:53

by Daniel Thompson

Subject: Re: [PATCH v2 3/9] kgdboc: Use a platform device to handle tty drivers showing up late

On Tue, Apr 21, 2020 at 02:14:41PM -0700, Douglas Anderson wrote:
> If you build CONFIG_KGDB_SERIAL_CONSOLE into the kernel then you
> should be able to have KGDB init itself at bootup by specifying the
> "kgdboc=..." kernel command line parameter. This has worked OK for me
> for many years, but on a new device I switched to it stopped working.
>
> The problem is that on this new device the serial driver gets its
> probe deferred. Now when kgdb initializes it can't find the tty
> driver and when it gives up it never tries again.
>
> We could try to find ways to move up the initialization of the serial
> driver and such a thing might be worthwhile, but it's nice to be
> robust against serial drivers that load late. We could move kgdb to
> init itself later but that penalizes our ability to debug early boot
> code on systems where the driver inits early. We could roll our own
> system of detecting when new tty drivers get loaded and then use that
> to figure out when kgdb can init, but that's ugly.
>
> Instead, let's jump on the -EPROBE_DEFER bandwagon. We'll create a
> singleton instance of a "kgdboc" platform device. If we can't find
> our tty device when the singleton "kgdboc" probes we'll return
> -EPROBE_DEFER which means that the system will call us back later to
> try again when the tty device might be there.
>
> We won't fully transition all of the kgdboc to a platform device
> because early kgdb initialization (via the "ekgdboc" kernel command
> line parameter) still runs before the platform device has been
> created. The kgdb platform device is merely used as a convenient way
> to hook into the system's normal probe deferral mechanisms.
>
> As part of this, we'll ever-so-slightly change how the "kgdboc=..."
> kernel command line parameter works. Previously if you booted up and
> kgdb couldn't find the tty driver then later reading
> '/sys/module/kgdboc/parameters/kgdboc' would return a blank string.
> Now kgdb will keep track of the string that came as part of the
> command line and give it back to you. It's expected that this should
> be an OK change.
>
> Signed-off-by: Douglas Anderson <[email protected]>

Reviewed-by: Daniel Thompson <[email protected]>
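
The probe-deferral behaviour described in the commit message can be sketched in user-space C roughly as follows. Everything here is a simplification under assumptions: `tty_driver_ready`, `kgdboc_probe_sketch()` and `run_deferred_probe()` are hypothetical names, and the retry loop merely stands in for the driver core, which in reality re-runs deferred probes when other drivers bind rather than in a tight loop.

```c
#include <stdbool.h>

#define EPROBE_DEFER 517	/* same value the kernel uses */

/* Hypothetical flag: set once the serial (tty) driver has registered. */
static bool tty_driver_ready;

/* Hypothetical probe for the singleton "kgdboc" device. */
static int kgdboc_probe_sketch(void)
{
	if (!tty_driver_ready)
		return -EPROBE_DEFER;	/* ask to be called again later */
	return 0;			/* tty found, kgdb can register */
}

/*
 * Hypothetical stand-in for the driver core: retry the probe up to
 * max_tries times, marking the tty ready before attempt ready_at
 * (0-based). Returns the number of attempts made, or -1 if the probe
 * never succeeded.
 */
static int run_deferred_probe(int max_tries, int ready_at)
{
	for (int attempt = 0; attempt < max_tries; attempt++) {
		if (attempt == ready_at)
			tty_driver_ready = true;
		if (kgdboc_probe_sketch() == 0)
			return attempt + 1;
	}
	return -1;
}
```

The point of the sketch is that kgdboc no longer gives up after the first failed lookup: it reports "not yet" and is called back until the tty driver exists.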

2020-04-27 15:45:39

by Daniel Thompson

Subject: Re: [PATCH v2 4/9] kgdb: Delay "kgdbwait" to dbg_late_init() by default

On Tue, Apr 21, 2020 at 02:14:42PM -0700, Douglas Anderson wrote:
> Using kgdb requires at least some level of architecture-level
> initialization. If nothing else, it relies on the architecture to
> pass breakpoints / crashes onto kgdb.
>
> On some architectures this all works super early, specifically it
> starts working at some point in time before Linux parses
> early_params's. On other architectures it doesn't. A survey of a few
> platforms:
>
> a) x86: Presumably it all works early since "ekgdboc" is documented to
> work here.
> b) arm64: Catching crashes works; with a simple patch breakpoints can
> also be made to work.
> c) arm: Nothing in kgdb works until
> paging_init() -> devicemaps_init() -> early_trap_init()
>
> Let's be conservative and, by default, process "kgdbwait" (which tells
> the kernel to drop into the debugger ASAP at boot) a bit later at
> dbg_late_init() time. If an architecture has tested it and wants to
> re-enable super early debugging, they can implement the weak function
> kgdb_arch_can_debug_early() to return true. We'll do this for x86 to
> start. It should be noted that dbg_late_init() is still called quite
> early in the system.
>
> Note that this patch doesn't affect when kgdb runs its init. If kgdb
> is set to initialize early it will still initialize when parsing
> early_params's. This patch _only_ inhibits the initial breakpoint
> from "kgdbwait". This means:
>
> * Without any extra patches arm64 platforms will at least catch
> crashes after kgdb inits.
> * arm platforms will catch crashes (and could handle a hardcoded
> kgdb_breakpoint()) any time after early_trap_init() runs, even
> before dbg_late_init().
>
> Signed-off-by: Douglas Anderson <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>

Overall this looks good but there is a small quibble below...


> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index b072aeb1fd78..7371517aeacc 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -226,6 +226,28 @@ extern int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt);
> */
> extern void kgdb_arch_late(void);
>
> +/**
> + * kgdb_arch_can_debug_early - Check if OK to break before dbg_late_init()
> + *
> + * If an architecture can definitely handle entering the debugger when
> + * early_param's are parsed then it can override this function to return
> + * true. Otherwise if "kgdbwait" is passed on the kernel command line it
> + * won't actually be processed until dbg_late_init() just after the call
> + * to kgdb_arch_late() is made.
> + *
> + * NOTE: Even if this returns false we will still try to register kgdb to
> + * handle breakpoints and crashes when early_params's are parsed, we just
> + * won't act on the "kgdbwait" parameter until dbg_late_init(). If you
> + * get a crash and try to drop into kgdb somewhere between these two
> + * places you might or might not end up being able to use kgdb depending
> + * on exactly how far along the architecture has initted.
> + *
> + * ALSO: dbg_late_init() is actually still fairly early in the system
> + * boot process.
> + *
> + * Return: true if platform can handle kgdb early.
> + */
> +extern bool kgdb_arch_can_debug_early(void);

Does this need to be a function? It looks like every implementation
either returns true or returns false (e.g. CONFIG_ARCH_HAVE_EARLY_DEBUG
would do the same thing).


Daniel.
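
To make the two options concrete, here is a small user-space sketch, assuming GCC or Clang for the weak attribute. The weak default mirrors the function declared in the patch; `can_debug_early_from_config()` is a hypothetical helper, and defining CONFIG_ARCH_HAVE_EARLY_DEBUG by hand is only for illustration (in the kernel it would come from Kconfig).

```c
#include <stdbool.h>

/*
 * Option 1 (what the patch does): a weak default that an architecture
 * can override by providing a strong definition of the same symbol.
 */
__attribute__((weak)) bool kgdb_arch_can_debug_early(void)
{
	return false;
}

/*
 * Option 2 (the review suggestion): a compile-time symbol. It is
 * defined by hand here purely so the sketch has something to test.
 */
#define CONFIG_ARCH_HAVE_EARLY_DEBUG 1

/* Hypothetical helper showing how the config symbol would be consumed. */
static bool can_debug_early_from_config(void)
{
#ifdef CONFIG_ARCH_HAVE_EARLY_DEBUG
	return true;
#else
	return false;
#endif
}
```

Since the answer never changes at run time, the config symbol lets the compiler drop the unused branch, whereas the weak function keeps the decision next to the architecture code.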

2020-04-27 16:39:12

by Daniel Thompson

Subject: Re: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles

On Tue, Apr 21, 2020 at 02:14:44PM -0700, Douglas Anderson wrote:
> We want to enable kgdb to debug the early parts of the kernel.
> Unfortunately kgdb normally is a client of the tty API in the kernel
> and serial drivers don't register to the tty layer until fairly late
> in the boot process.
>
> Serial drivers do, however, commonly register a boot console. Let's
> enable the kgdboc driver to work with boot consoles to provide early
> debugging.
>
> This change co-opts the existing read() function pointer that's part
> of "struct console". It's assumed that if a boot console (with the
> flag CON_BOOT) has implemented read() that both the read() and write()
> function are polling functions. That means they work without
> interrupts and read() will return immediately (with 0 bytes read) if
> there's nothing to read. This should be a safe assumption since it
> appears that no current boot consoles implement read() right now and
> there seems no reason to do so unless they wanted to support
> "earlycon_kgdboc".
>
> The console API isn't really intended to have clients work with it
> like we're doing. Specifically there doesn't appear to be any way for
> clients to be notified about a boot console being unregistered. We'll
> work around this by checking that our console is still valid before
> using it. We'll also try to transition off of the boot console and
> onto the "tty" API as quickly as possible.
>
> The normal/expected way to make all this work is to use
> "earlycon_kgdboc" and "kgdboc" together. You should point them both
> to the same physical serial connection. At boot time, as the system
> transitions from the boot console to the normal console, kgdb will
> switch over. If you don't use things in the normal/expected way it's
> a bit of a buyer-beware situation. Things thought about:
>
> - If you specify only "earlycon_kgdboc" but not "kgdboc" you still
> might end up dropping into kgdb upon a crash/sysrq but you may not
> be able to type.
> - If you use "keep_bootcon" (which is already a bit of a buyer-beware
> option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
> trying to use your boot console for kgdb.
> - If your "earlycon_kgdboc" and "kgdboc" devices are not the same
> device things should work OK, but it'll be your job to switch over
> which device you're monitoring (including figuring out how to switch
> over gdb in-flight if you're using it).
>
> When trying to enable "earlycon_kgdboc" it should be noted that the
> names that are registered through the boot console layer and the tty
> layer are not the same for the same port. For example when debugging
> on one board I'd need to pass "earlycon_kgdboc=qcom_geni
> kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
> console name is a pain and there will rarely be more than one boot
> console enabled, you can provide the "earlycon_kgdboc" parameter
> without specifying the name of the boot console. In this case we'll
> just pick the first boot console that implements read() that we find.
>
> This new "earlycon_kgdboc" parameter should be contrasted to the
> existing "ekgdboc" parameter. While both provide a way to debug very
> early, the usage and mechanisms are quite different. Specifically
> "earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
> there is a transition from one to the other. The "ekgdboc" parameter,
> on the other hand, replaces the "kgdboc" parameter. It runs the same
> logic as the "kgdboc" parameter but just relies on your TTY driver
> being present super early. The only known usage of the old "ekgdboc"
> parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
> be noted that "kbd" has special treatment allowing it to init early as
> a tty device.
>
> Signed-off-by: Douglas Anderson <[email protected]>

Again, very happy with the overall approach, just a few quibbles.


> ---
> This patch touches files in several different subsystems, but it
> touches a single line and that line is related to kgdb. I'm assuming
> this can all go through the kgdb tree, but if needed I can always
> introduce a new API call instead of modifying the old one and then
> just have the old API call be a thin wrapper on the new one.

Funny you should say that!

I don't really like that extra argument although it is nothing to do
with simplifying merges...


> diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> index 519d8cfbfbed..2f526f2d2bea 100644
> --- a/drivers/tty/serial/kgdboc.c
> +++ b/drivers/tty/serial/kgdboc.c
> @@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
> }
>
> early_param("ekgdboc", kgdboc_early_init);
> +
> +static int earlycon_kgdboc_get_char(void)
> +{
> + char c;
> +
> + if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
> + return NO_POLL_CHAR;
> +
> + return c;
> +}
> +
> +static void earlycon_kgdboc_put_char(u8 chr)
> +{
> + if (!earlycon_neutered)
> + earlycon->write(earlycon, &chr, 1);
> +}
> +
> +static void earlycon_kgdboc_pre_exp_handler(void)
> +{
> + /*
> + * We don't get notified when the boot console is unregistered.
> + * Double-check when we enter the debugger. Unfortunately we
> + * can't really unregister ourselves now, but at least don't crash.
> + */
> + if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
> + pr_warn("Neutering kgdb since boot console vanished\n");
> + earlycon_neutered = true;

This is, IMHO, too subtle.

I don't think this is merely a warning with a gentle message about
neutering. IIUC the system is (or will shortly be) dead in the water.
After diligently stopping all the CPUs the debug-core will then start
waiting for a character that cannot possibly come!

I think this might be one of those vanishingly rare places where
panicking might actually be the right thing to do... although only after
"neutering" the kgdb panic handler first ;-).
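
A user-space sketch of the check under discussion may help show why a neutered console leaves the debugger with nothing to talk to. The body of is_earlycon_still_valid() is not quoted in this thread, so the list walk below is a guess; `struct console_sketch` and `earlycon_pre_exception_sketch()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct console list. */
struct console_sketch {
	const char *name;
	struct console_sketch *next;
};

static struct console_sketch *console_list;	/* registered consoles */
static struct console_sketch *earlycon;		/* console kgdb is using */
static bool earlycon_neutered;

/* Guess at the check: is our console still on the registered list? */
static bool is_earlycon_still_valid(void)
{
	for (struct console_sketch *con = console_list; con; con = con->next)
		if (con == earlycon)
			return true;
	return false;
}

/*
 * Called on debugger entry. Returns true if kgdb I/O is still usable;
 * once the console has vanished, I/O stays disabled for good.
 */
static bool earlycon_pre_exception_sketch(void)
{
	if (earlycon && !earlycon_neutered && !is_earlycon_still_valid())
		earlycon_neutered = true;
	return earlycon && !earlycon_neutered;
}
```

Once the flag is set, every read reports "no character", which is the dead-in-the-water situation described above.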


> + }
> +}
> +
> +static struct kgdb_io earlycon_kgdboc_io_ops = {
> + .name = "earlycon_kgdboc",
> + .read_char = earlycon_kgdboc_get_char,
> + .write_char = earlycon_kgdboc_put_char,
> + .pre_exception = earlycon_kgdboc_pre_exp_handler,
> + .is_console = true,
> +};
> +
> +static int __init earlycon_kgdboc_init(char *opt)
> +{
> + struct console *con;
> +
> + kdb_init(KDB_INIT_EARLY);

This is normally taken care of by debug-core.c. Could this be
integrated into kgdb_register_io_module()?


> +
> + /*
> + * Look for a matching console, or if the name was left blank just
> + * pick the first one we find.
> + */
> + console_lock();
> + for_each_console(con) {
> + if (con->write && con->read &&
> + (con->flags & (CON_BOOT | CON_ENABLED)) &&
> + (!opt || !opt[0] || strcmp(con->name, opt) == 0))
> + break;
> + }
> + console_unlock();
> +
> + if (!con) {
> + pr_info("Couldn't find kgdb earlycon\n");
> + return 0;
> + }
> +
> + earlycon = con;
> + pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
> + if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
> + earlycon = NULL;
> + pr_info("Failed to register kgdb with earlycon\n");
> + return 0;
> + }
> +
> + return 0;
> +}
> +
> +early_param("earlycon_kgdboc", earlycon_kgdboc_init);
> #endif /* CONFIG_KGDB_SERIAL_CONSOLE */
>
> module_init(init_kgdboc);
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 8f178239856d..1b5435c6d92a 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
> /**
> * kgdb_register_io_module - register KGDB IO module
> * @new_dbg_io_ops: the io ops vector
> + * @replace: If true it's OK if there were old ops. This is used
> + * to transition from early kgdb to normal kgdb. It's
> + * assumed these are the same device so kgdb can continue.
> *
> * Register it with the KGDB core.
> */
> -int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> +int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)

As I said I'm not a big fan of the extra argument. It makes the call
sites harder to read.

Could earlycon_kgdboc be registered with a boolean flag set so that
a subsequent register will automatically replace the old one
(maybe "is_replaceable" or "is_temporary")?

For bonus marks the core could also enforce that a replaceable io ops
table must have init set to null (because there is no deinit).


Daniel.
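
The flag-based alternative could look roughly like the user-space sketch below. It is an illustration only: `struct kgdb_io_sketch` is a cut-down stand-in for struct kgdb_io, `is_temporary` is one of the two names floated above, and `register_io_module_sketch()` together with its error codes is an assumption rather than the real kgdb_register_io_module().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for struct kgdb_io; is_temporary is hypothetical. */
struct kgdb_io_sketch {
	const char *name;
	int (*init)(void);
	bool is_temporary;	/* may be replaced by a later registration */
};

static struct kgdb_io_sketch *dbg_io_ops;	/* currently registered ops */

static int register_io_module_sketch(struct kgdb_io_sketch *new_ops)
{
	/* "Bonus marks": temporary ops must not have init (no deinit). */
	if (new_ops->is_temporary && new_ops->init)
		return -EINVAL;

	/* Existing ops may only be displaced if they were marked temporary. */
	if (dbg_io_ops && !dbg_io_ops->is_temporary)
		return -EBUSY;

	dbg_io_ops = new_ops;
	return 0;
}
```

With this shape the call sites keep a single argument, and the decision about replaceability travels with the ops table that needs it.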

2020-04-27 16:48:48

by Daniel Thompson

Subject: Re: [PATCH v2 7/9] Documentation: kgdboc: Document new earlycon_kgdboc parameter

On Tue, Apr 21, 2020 at 02:14:45PM -0700, Douglas Anderson wrote:
> The recent patch ("kgdboc: Add earlycon_kgdboc to support early kgdb
> using boot consoles") adds a new kernel command line parameter.
> Document it.
>
> Note that the patch adding the feature does some comparing/contrasting
> of "earlycon_kgdboc" vs. the existing "ekgdboc". See that patch for
> more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
> and just makes "kgdboc" do its normal initialization early (only works
> if your tty driver is already ready). The new "earlycon_kgdboc" works
> in combination with "kgdboc" and is backed by boot consoles.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v2: None
>
> .../admin-guide/kernel-parameters.txt | 20 +++++++++++++++++++
> Documentation/dev-tools/kgdb.rst | 14 +++++++++++++
> 2 files changed, 34 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f2a93c8679e8..588625ec2993 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1132,6 +1132,22 @@
> address must be provided, and the serial port must
> already be setup and configured.
>
> + earlycon_kgdboc= [KGDB,HW]
> + If the boot console provides the ability to read
> + characters and can work in polling mode, you can use
> + this parameter to tell kgdb to use it as a backend
> + until the normal console is registered. Intended to
> + be used together with the kgdboc parameter which
> + specifies the normal console to transition to.
> +
> + The name of the early console should be specified
> + as the value of this parameter. Note that the name of
> + the early console might be different than the tty
> + name passed to kgdboc. If only one boot console with
> + a read() function is enabled it's OK to leave the
> + value blank and the first boot console that implements
> + read() will be picked.

There's no need for the "If only one boot console with a read()
function is enabled" here.

Seeing this in alphabetic order in this patch it also crosses my mind
that kgdboc_earlycon might be a better name so that it sorts closer
to the other kgdb options. This is a kgdboc feature that uses earlycon
not an earlycon feature that uses kgdboc.


> +
> earlyprintk= [X86,SH,ARM,M68k,S390]
> earlyprintk=vga
> earlyprintk=sclp
> @@ -1190,6 +1206,10 @@
> This is designed to be used in conjunction with
> the boot argument: earlyprintk=vga
>
> + This parameter works in place of the kgdboc parameter
> + but can only be used if the backing tty is available
> + very early in the boot process.
> +

I wonder if pragmatic advice is more useful:

For early debugging via a serial port see earlycon_kgdboc instead.

> edd= [EDD]
> Format: {"off" | "on" | "skip[mbr]"}
>
> diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
> index d38be58f872a..c0b321403d9a 100644
> --- a/Documentation/dev-tools/kgdb.rst
> +++ b/Documentation/dev-tools/kgdb.rst
> @@ -274,6 +274,20 @@ don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as
> on the initial connect, or to use a debugger proxy that allows an
> unmodified gdb to do the debugging.
>
> +Kernel parameter: ``earlycon_kgdboc``
> +-------------------------------------
> +
> +If you specify the kernel parameter ``earlycon_kgdboc`` and your serial
> +driver registers a boot console that supports polling (doesn't need
> +interrupts and implements a nonblocking read() function) kgdb will attempt
> +to work using the boot console until it can transition to the regular
> +tty driver specified by the ``kgdboc`` parameter.
> +
> +Normally there is only one boot console (especially that implements the
> +read() function) so just adding ``earlycon_kgdboc`` on its own is
> +sufficient to make this work. If you have more than one boot console you
> +can add the boot console's name to differentiate.
> +

I think we need an example here. The example in the patch header for
the previous patch was useful (at least for me).


Daniel.
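
As a companion to the "earlycon_kgdboc=qcom_geni kgdboc=ttyMSM0" example from the previous patch header, the name-matching rule can be sketched in user-space C. The struct and `pick_earlycon()` are hypothetical; only the blank-or-exact-name test is copied from the loop in the kgdboc patch, and the console list in any caller is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for struct console; fields are illustrative. */
struct boot_console_sketch {
	const char *name;
	bool is_boot;	/* stands in for the CON_BOOT flag */
	bool has_read;	/* stands in for a non-NULL read() pointer */
};

/*
 * Mirror the matching rule in the patch: a blank (or missing) value
 * selects the first boot console with read(); otherwise the name must
 * match exactly. Returns NULL if nothing qualifies.
 */
static const struct boot_console_sketch *
pick_earlycon(const struct boot_console_sketch *cons, size_t n, const char *opt)
{
	for (size_t i = 0; i < n; i++) {
		if (cons[i].is_boot && cons[i].has_read &&
		    (!opt || !opt[0] || strcmp(cons[i].name, opt) == 0))
			return &cons[i];
	}
	return NULL;
}
```

So a bare "earlycon_kgdboc" corresponds to an empty `opt`, while "earlycon_kgdboc=qcom_geni" corresponds to `opt` being "qcom_geni".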

2020-04-27 16:52:36

by Daniel Thompson

Subject: Re: [PATCH v2 8/9] serial: qcom_geni_serial: Support earlycon_kgdboc

On Tue, Apr 21, 2020 at 02:14:46PM -0700, Douglas Anderson wrote:
> Implement the read() function in the early console driver. With
> recent kgdb patches this allows you to use kgdb to debug fairly early
> into the system boot.
>
> We only bother implementing this if polling is enabled since kgdb
> can't be enabled without that.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v2: None
>
> drivers/tty/serial/qcom_geni_serial.c | 32 +++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> index 6119090ce045..4563d152b39e 100644
> --- a/drivers/tty/serial/qcom_geni_serial.c
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -1090,6 +1090,36 @@ static void qcom_geni_serial_earlycon_write(struct console *con,
> __qcom_geni_serial_console_write(&dev->port, s, n);
> }
>
> +#ifdef CONFIG_CONSOLE_POLL
> +static int qcom_geni_serial_earlycon_read(struct console *con,
> + char *s, unsigned int n)
> +{
> + struct earlycon_device *dev = con->data;
> + struct uart_port *uport = &dev->port;
> + int num_read = 0;
> + int ch;
> +
> + while (num_read < n) {
> + ch = qcom_geni_serial_get_char(uport);
> + if (ch == NO_POLL_CHAR)
> + break;
> + s[num_read++] = ch;
> + }
> +
> + return num_read;
> +}
> +
> +static void __init qcom_geni_serial_enable_early_read(struct geni_se *se,
> + struct console *con)
> +{
> + geni_se_setup_s_cmd(se, UART_START_READ, 0);
> + con->read = qcom_geni_serial_earlycon_read;
> +}
> +#else
> +static inline void qcom_geni_serial_enable_early_read(struct geni_se *se,
> + struct console *con) { ; }

This is pure nitpicking but since I was passing... why the ; ?


Daniel.
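
For anyone wanting to try the non-blocking read contract outside the kernel, here is a user-space sketch of the same loop. `fake_fifo` and `fake_get_char()` are hypothetical stand-ins for the UART and for qcom_geni_serial_get_char(), and the NO_POLL_CHAR value is defined locally for the sketch.

```c
#include <stddef.h>

#define NO_POLL_CHAR 0x00ff0000	/* sentinel: nothing to read right now */

/* Hypothetical fake UART: a fixed string standing in for the RX FIFO. */
static const char *fake_fifo = "";

static int fake_get_char(void)
{
	if (!*fake_fifo)
		return NO_POLL_CHAR;	/* nothing pending, do not block */
	return (unsigned char)*fake_fifo++;
}

/* Same shape as the earlycon read() in the patch: fill s, stop early. */
static int earlycon_read_sketch(char *s, unsigned int n)
{
	unsigned int num_read = 0;

	while (num_read < n) {
		int ch = fake_get_char();

		if (ch == NO_POLL_CHAR)
			break;
		s[num_read++] = (char)ch;
	}
	return (int)num_read;
}
```

Returning 0 when nothing is pending, instead of waiting, is what lets kgdb poll the boot console without interrupts.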

2020-04-27 16:52:50

by Daniel Thompson

Subject: Re: [PATCH v2 9/9] serial: 8250_early: Support earlycon_kgdboc

On Tue, Apr 21, 2020 at 02:14:47PM -0700, Douglas Anderson wrote:
> Implement the read() function in the early console driver. With
> recent kgdb patches this allows you to use kgdb to debug fairly early
> into the system boot.
>
> We only bother implementing this if polling is enabled since kgdb
> can't be enabled without that.
>
> Signed-off-by: Douglas Anderson <[email protected]>

Reviewed-by: Daniel Thompson <[email protected]>

2020-04-27 16:55:18

by Daniel Thompson

Subject: Re: [PATCH v2 5/9] arm64: Add call_break_hook() to early_brk64() for early kgdb

On Tue, Apr 21, 2020 at 02:14:43PM -0700, Douglas Anderson wrote:
> In order to make early kgdb work properly we need early_brk64() to be
> able to call into it. This is as easy as adding a call into
> call_break_hook() just like we do later in the normal brk_handler().
>
> Once we do this we can let kgdb know that it can break into the
> debugger a little earlier (specifically when parsing early_param's).
>
> NOTE: without this patch it turns out that arm64 can't do breakpoints
> even at dbg_late_init(), so if we decide something about this patch is
> wrong we might need to move dbg_late_init() a little later.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>

I haven't done any testing at this point (I'd hope to enable tests
for this in the test suite), however FWIW and just so you know I didn't
forget about this patch:

Reviewed-by: Daniel Thompson <[email protected]>


Daniel.
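
Since the diff itself is not quoted here, the following user-space sketch only illustrates the idea in the commit message: the early handler consults the same hook dispatcher the normal handler uses. All names ending in `_sketch`, the return codes and the immediate value 0x400 are assumptions made for the illustration.

```c
#include <stddef.h>

#define HOOK_HANDLED 0
#define HOOK_ERROR   1

/* Cut-down break hook: match on the immediate encoded in the BRK. */
struct break_hook_sketch {
	unsigned int imm;
	int (*fn)(unsigned int imm);
};

static int kgdb_hits;	/* counts how often the kgdb hook ran */

static int kgdb_brk_fn_sketch(unsigned int imm)
{
	(void)imm;
	kgdb_hits++;
	return HOOK_HANDLED;
}

/* Illustrative hook table with a single, made-up kgdb entry. */
static const struct break_hook_sketch hooks[] = {
	{ 0x400, kgdb_brk_fn_sketch },
};

/* Shared dispatcher: find a hook for this immediate and run it. */
static int call_break_hook_sketch(unsigned int imm)
{
	for (size_t i = 0; i < sizeof(hooks) / sizeof(hooks[0]); i++)
		if (hooks[i].imm == imm)
			return hooks[i].fn(imm);
	return HOOK_ERROR;
}

/* The early handler tries the hooks first, then falls back. */
static int early_brk_sketch(unsigned int imm)
{
	if (call_break_hook_sketch(imm) == HOOK_HANDLED)
		return 0;
	return -1;	/* stand-in for the pre-existing early handling */
}
```

Breakpoints that no hook claims still take the old path, so the early handler's existing behaviour is unchanged for everything except kgdb.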

2020-04-28 21:35:49

by Doug Anderson

Subject: Re: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles

Hi,

On Mon, Apr 27, 2020 at 9:36 AM Daniel Thompson
<[email protected]> wrote:
>
> On Tue, Apr 21, 2020 at 02:14:44PM -0700, Douglas Anderson wrote:
> > We want to enable kgdb to debug the early parts of the kernel.
> > Unfortunately kgdb normally is a client of the tty API in the kernel
> > and serial drivers don't register to the tty layer until fairly late
> > in the boot process.
> >
> > Serial drivers do, however, commonly register a boot console. Let's
> > enable the kgdboc driver to work with boot consoles to provide early
> > debugging.
> >
> > This change co-opts the existing read() function pointer that's part
> > of "struct console". It's assumed that if a boot console (with the
> > flag CON_BOOT) has implemented read() that both the read() and write()
> > function are polling functions. That means they work without
> > interrupts and read() will return immediately (with 0 bytes read) if
> > there's nothing to read. This should be a safe assumption since it
> > appears that no current boot consoles implement read() right now and
> > there seems no reason to do so unless they wanted to support
> > "earlycon_kgdboc".
> >
> > The console API isn't really intended to have clients work with it
> > like we're doing. Specifically there doesn't appear to be any way for
> > clients to be notified about a boot console being unregistered. We'll
> > work around this by checking that our console is still valid before
> > using it. We'll also try to transition off of the boot console and
> > onto the "tty" API as quickly as possible.
> >
> > The normal/expected way to make all this work is to use
> > "earlycon_kgdboc" and "kgdboc" together. You should point them both
> > to the same physical serial connection. At boot time, as the system
> > transitions from the boot console to the normal console, kgdb will
> > switch over. If you don't use things in the normal/expected way it's
> > a bit of a buyer-beware situation. Things thought about:
> >
> > - If you specify only "earlycon_kgdboc" but not "kgdboc" you still
> > might end up dropping into kgdb upon a crash/sysrq but you may not
> > be able to type.
> > - If you use "keep_bootcon" (which is already a bit of a buyer-beware
> > option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
> > trying to use your boot console for kgdb.
> > - If your "earlycon_kgdboc" and "kgdboc" devices are not the same
> > device things should work OK, but it'll be your job to switch over
> > which device you're monitoring (including figuring out how to switch
> > over gdb in-flight if you're using it).
> >
> > When trying to enable "earlycon_kgdboc" it should be noted that the
> > names that are registered through the boot console layer and the tty
> > layer are not the same for the same port. For example when debugging
> > on one board I'd need to pass "earlycon_kgdboc=qcom_geni
> > kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
> > console name is a pain and there will rarely be more than one boot
> > console enabled, you can provide the "earlycon_kgdboc" parameter
> > without specifying the name of the boot console. In this case we'll
> > just pick the first boot console that implements read() that we find.
> >
> > This new "earlycon_kgdboc" parameter should be contrasted to the
> > existing "ekgdboc" parameter. While both provide a way to debug very
> > early, the usage and mechanisms are quite different. Specifically
> > "earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
> > there is a transition from one to the other. The "ekgdboc" parameter,
> > on the other hand, replaces the "kgdboc" parameter. It runs the same
> > logic as the "kgdboc" parameter but just relies on your TTY driver
> > being present super early. The only known usage of the old "ekgdboc"
> > parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
> > be noted that "kbd" has special treatment allowing it to init early as
> > a tty device.
> >
> > Signed-off-by: Douglas Anderson <[email protected]>
>
> Again, very happy with the overall approach, just a few quibbles.
>
>
> > ---
> > This patch touches files in several different subsystems, but it
> > touches a single line and that line is related to kgdb. I'm assuming
> > this can all go through the kgdb tree, but if needed I can always
> > introduce a new API call instead of modifying the old one and then
> > just have the old API call be a thin wrapper on the new one.
>
> Funny you should say that!
>
> I don't really like that extra argument although it is nothing to do
> with simplifying merges...
>
>
> > diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> > index 519d8cfbfbed..2f526f2d2bea 100644
> > --- a/drivers/tty/serial/kgdboc.c
> > +++ b/drivers/tty/serial/kgdboc.c
> > @@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
> > }
> >
> > early_param("ekgdboc", kgdboc_early_init);
> > +
> > +static int earlycon_kgdboc_get_char(void)
> > +{
> > + char c;
> > +
> > + if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
> > + return NO_POLL_CHAR;
> > +
> > + return c;
> > +}
> > +
> > +static void earlycon_kgdboc_put_char(u8 chr)
> > +{
> > + if (!earlycon_neutered)
> > + earlycon->write(earlycon, &chr, 1);
> > +}
> > +
> > +static void earlycon_kgdboc_pre_exp_handler(void)
> > +{
> > + /*
> > + * We don't get notified when the boot console is unregistered.
> > + * Double-check when we enter the debugger. Unfortunately we
> > + * can't really unregister ourselves now, but at least don't crash.
> > + */
> > + if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
> > + pr_warn("Neutering kgdb since boot console vanished\n");
> > + earlycon_neutered = true;
>
> This is, IMHO, too subtle.
>
> I don't think this is merely a warning with a gentle message about
> neutering. IIUC the system is (or will shortly be) dead in the water.
> After diligently stopping all the CPUs the debug-core will then start
> waiting for a character that cannot possibly come!
>
> I think this might be one of those vanishingly rare places where
> panicking might actually be the right thing to do... although only after
> "neutering" the kgdb panic handler first ;-).

OK. I ended up adding a patch that makes our general re-entry
handling better and then relying on that since there's no other great
way to neuter the kgdb panic handler. Then I just called panic().
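
To make the decision above concrete, here is a rough userspace sketch of
the check made on debugger entry. It is not the actual patch: the struct,
the enum and both function names are made up for illustration, and where
the real handler would call panic() this model returns ENTRY_PANIC so it
can be exercised outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered console; not struct console. */
struct boot_con {
	const char *name;
	struct boot_con *next;
};

/* What the debugger entry path should do (illustrative only). */
enum entry_action { ENTRY_OK, ENTRY_PANIC };

/* Walk the registered list and report whether 'con' is still on it. */
static bool console_still_registered(const struct boot_con *head,
				     const struct boot_con *con)
{
	for (; head; head = head->next)
		if (head == con)
			return true;
	return false;
}

/*
 * Called on debugger entry. If the boot console we depend on has been
 * unregistered there is no way to read a character, so waiting would
 * hang forever; report that the caller should panic instead.
 */
static enum entry_action check_on_debugger_entry(const struct boot_con *head,
						 const struct boot_con *earlycon)
{
	if (earlycon && !console_still_registered(head, earlycon))
		return ENTRY_PANIC;
	return ENTRY_OK;
}

/* Sample data: con_a -> con_b are registered, con_gone is not. */
static struct boot_con con_b = { "b", NULL };
static struct boot_con con_a = { "a", &con_b };
static struct boot_con con_gone = { "gone", NULL };
```

Returning a value rather than panicking directly is only a convenience
for the sketch; in the kernel the point is precisely to stop before the
debug core starts polling a console that no longer exists.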

NOTE: it's actually quite hard to reproduce this. If you specify
"earlycon_kgdboc" but not "kgdboc" it'll notice at configure_kgdboc()
that the boot console vanished. I could reproduce this by hacking
configure_kgdboc() not to do this, but otherwise it was hard.

...oh, but I did realize that there's a window where the boot console
has vanished and our init function hasn't yet been called. That's a
pretty small window on the systems I tested, probably owing to the
fact that kgdboc itself is listed in the serial drivers and is listed
last, so it'll typically probe right after serial drivers do. ...and
if I hit deferred probing again I should run after the deferred probe
of the serial driver I needed. It's slightly fragile but maybe it'll
do for now. I guess if people start hitting this panic we'll have to
figure out what to do. If we don't want to add hooks in for the
kernel to tell us about this event we could always do something hacky
like poll every millisecond and it'd probably work. For now I'll just
document that people should use "keep_bootcon" if they end up in this
situation.


> > + }
> > +}
> > +
> > +static struct kgdb_io earlycon_kgdboc_io_ops = {
> > + .name = "earlycon_kgdboc",
> > + .read_char = earlycon_kgdboc_get_char,
> > + .write_char = earlycon_kgdboc_put_char,
> > + .pre_exception = earlycon_kgdboc_pre_exp_handler,
> > + .is_console = true,
> > +};
> > +
> > +static int __init earlycon_kgdboc_init(char *opt)
> > +{
> > + struct console *con;
> > +
> > + kdb_init(KDB_INIT_EARLY);
>
> This is normally taken care of by debug-core.c . Could this be
> integrated into kgdb_register_io_module() ?

Unfortunately it's not totally trivial. At least one problem that
feels difficult to solve is that kdb_init() (and all its
sub-functions) are marked "__init" but kgdb_register_io_module() isn't
(and can't be).

One possible solution: I could totally remove this call and things
will work fine, but only if you do "kgdbwait" or if you make sure your
code doesn't crash or hit any hardcoded kgdb_breakpoint() until
dbg_late_init() is called. That's not totally ideal. I'm going to
assume it's OK for me to leave the kdb_init() here.
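
For what it's worth, leaving the early call in place is harmless as long
as the init routine is guarded by a level, so that a later call from the
debug core at the same or a lower level does nothing. The sketch below
models that idea in userspace. The enum, the counters and debugger_init()
are hypothetical names; the assumption that kdb_init() is guarded in
this way is mine and should be checked against the tree.

```c
/* Hypothetical init levels, ordered from least to most complete. */
enum init_level { NOT_INITIALIZED, INIT_EARLY, INIT_FULL };

static enum init_level init_lvl = NOT_INITIALIZED;
static int early_steps_run;	/* times the early setup actually ran */
static int full_steps_run;	/* times the late setup actually ran */

/*
 * Level-guarded init: a request at or below the level already reached
 * is a no-op, so calling it early from one place and again later from
 * another does not repeat any work.
 */
static void debugger_init(enum init_level lvl)
{
	if (lvl <= init_lvl)
		return;

	if (init_lvl < INIT_EARLY)
		early_steps_run++;	/* early setup, done at most once */
	if (lvl == INIT_FULL)
		full_steps_run++;	/* late setup, done at most once */

	init_lvl = lvl;
}
```

With a guard like this, the extra early call only changes when the early
setup happens, not how many times it happens.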

NOTE: I believe that the existing "ekgdboc" has the same issues but
I'm not set up to use "ekgdboc" and so I haven't tested. If you can
reproduce the "ekgdboc" issue that is there (in theory) I can also
post up a patch that'll fix that the same way...


> > +
> > + /*
> > + * Look for a matching console, or if the name was left blank just
> > + * pick the first one we find.
> > + */
> > + console_lock();
> > + for_each_console(con) {
> > + if (con->write && con->read &&
> > + (con->flags & (CON_BOOT | CON_ENABLED)) &&
> > + (!opt || !opt[0] || strcmp(con->name, opt) == 0))
> > + break;
> > + }
> > + console_unlock();
> > +
> > + if (!con) {
> > + pr_info("Couldn't find kgdb earlycon\n");
> > + return 0;
> > + }
> > +
> > + earlycon = con;
> > + pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
> > + if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
> > + earlycon = NULL;
> > + pr_info("Failed to register kgdb with earlycon\n");
> > + return 0;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +early_param("earlycon_kgdboc", earlycon_kgdboc_init);
> > #endif /* CONFIG_KGDB_SERIAL_CONSOLE */
> >
> > module_init(init_kgdboc);
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index 8f178239856d..1b5435c6d92a 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
> > /**
> > * kgdb_register_io_module - register KGDB IO module
> > * @new_dbg_io_ops: the io ops vector
> > + * @replace: If true it's OK if there were old ops. This is used
> > + * to transition from early kgdb to normal kgdb. It's
> > + * assumed these are the same device so kgdb can continue.
> > *
> > * Register it with the KGDB core.
> > */
> > -int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> > +int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)
>
> As I said I'm not a big fan of the extra argument. It makes the call
> sites harder to read.
>
> Could earlycon_kgdboc be registered with a boolean flag set so that
> a subsequent register will automatically replace the old one
> (maybe "is_replaceable" or "is_temporary")?
>
> For bonus marks the core could also enforce that a replaceable io ops
> table must have init set to null (because there is no deinit).

OK. I ended up adding a "deinit" function call and using that as an
indication that the ops are replaceable. This cleaned up some of the
earlycon_kgdb code and seemed sane.
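
Roughly, the idea is the following. This is a userspace model, not the
code from the next version of the series: struct io_ops,
register_io_ops() and the sample tables are made-up names, and returning
-EBUSY for the refused case is an assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an I/O ops table; not the kernel's struct kgdb_io. */
struct io_ops {
	const char *name;
	void (*deinit)(void);	/* non-NULL marks the ops as replaceable */
};

static struct io_ops *current_ops;

/*
 * Register new ops. Already-installed ops are only replaced when they
 * provided deinit(), which is called before the switch. Ops without
 * deinit() are treated as permanent and the new registration is refused.
 */
static int register_io_ops(struct io_ops *new_ops)
{
	struct io_ops *old = current_ops;

	if (old) {
		if (!old->deinit)
			return -EBUSY;
		old->deinit();
	}
	current_ops = new_ops;
	return 0;
}

/* Sample tables: a replaceable early backend and a permanent one. */
static int early_deinit_calls;

static void early_deinit(void)
{
	early_deinit_calls++;
}

static struct io_ops early_ops = { .name = "early", .deinit = early_deinit };
static struct io_ops normal_ops = { .name = "normal" };
```

Compared with a "replace" argument, the call sites stay unchanged and
the property lives with the ops table that actually has it.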


-Doug

2020-04-28 21:42:26

by Doug Anderson

Subject: Re: [PATCH v2 7/9] Documentation: kgdboc: Document new earlycon_kgdboc parameter

Hi,

On Mon, Apr 27, 2020 at 9:46 AM Daniel Thompson
<[email protected]> wrote:
>
> On Tue, Apr 21, 2020 at 02:14:45PM -0700, Douglas Anderson wrote:
> > The recent patch ("kgdboc: Add earlycon_kgdboc to support early kgdb
> > using boot consoles") adds a new kernel command line parameter.
> > Document it.
> >
> > Note that the patch adding the feature does some comparing/contrasting
> > of "earlycon_kgdboc" vs. the existing "ekgdboc". See that patch for
> > more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
> > and just makes "kgdboc" do its normal initialization early (only works
> > if your tty driver is already ready). The new "earlycon_kgdboc" works
> > in combination with "kgdboc" and is backed by boot consoles.
> >
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > Changes in v2: None
> >
> > .../admin-guide/kernel-parameters.txt | 20 +++++++++++++++++++
> > Documentation/dev-tools/kgdb.rst | 14 +++++++++++++
> > 2 files changed, 34 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index f2a93c8679e8..588625ec2993 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -1132,6 +1132,22 @@
> > address must be provided, and the serial port must
> > already be setup and configured.
> >
> > + earlycon_kgdboc= [KGDB,HW]
> > + If the boot console provides the ability to read
> > + characters and can work in polling mode, you can use
> > + this parameter to tell kgdb to use it as a backend
> > + until the normal console is registered. Intended to
> > + be used together with the kgdboc parameter which
> > + specifies the normal console to transition to.
> > +
> > + The name of the early console should be specified
> > + as the value of this parameter. Note that the name of
> > + the early console might be different than the tty
> > + name passed to kgdboc. If only one boot console with
> > + a read() function is enabled it's OK to leave the
> > + value blank and the first boot console that implements
> > + read() will be picked.
>
> There's no need for the "If only one boot console with a read()
> function is enabled" here.
>
> Seeing this in alphabetic order in this patch it also crosses my mind
> that kgdboc_earlycon might be a better name so that it sorts closer
> to the other kgdb options. This is a kgdboc feature that uses earlycon
> not an earlycon feature that uses kgdboc.

OK. 'git format-patch', sed, and 'git am' for the win.
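
Independent of the name, the selection rule described in the quoted text
(use the named boot console, or the first one that implements read() if
the value is blank) can be sketched in userspace as below. The struct,
flag values, console names and pick_boot_console() are stand-ins invented
for the example. Note also that this sketch requires both flags to be
set, which is stricter than a plain "flags & (CON_BOOT | CON_ENABLED)"
test, since that expression is true when either flag is set.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's console flags. */
#define CON_ENABLED	(1u << 0)
#define CON_BOOT	(1u << 1)

/* Hypothetical model of a console list entry; not struct console. */
struct console_model {
	const char *name;
	unsigned int flags;
	int has_read;		/* models a non-NULL read() callback */
	struct console_model *next;
};

/*
 * Return the first enabled boot console that can read and whose name
 * matches 'opt'. A NULL or empty 'opt' matches any name.
 */
static struct console_model *pick_boot_console(struct console_model *head,
					       const char *opt)
{
	const unsigned int want = CON_BOOT | CON_ENABLED;
	struct console_model *con;

	for (con = head; con; con = con->next) {
		if (!con->has_read)
			continue;	/* cannot poll for input */
		if ((con->flags & want) != want)
			continue;	/* not an enabled boot console */
		if (!opt || !opt[0] || strcmp(con->name, opt) == 0)
			return con;
	}
	return NULL;
}

/* Sample list: write-only boot console, regular console, usable one. */
static struct console_model con_geni = {
	"qcom_geni", CON_BOOT | CON_ENABLED, 1, NULL
};
static struct console_model con_tty = {
	"ttyMSM", CON_ENABLED, 1, &con_geni
};
static struct console_model con_wo = {
	"writeonly", CON_BOOT | CON_ENABLED, 0, &con_tty
};
```

With that list, a blank value and "qcom_geni" both select the same
console, which is why leaving the value empty is usually enough.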


> > +
> > earlyprintk= [X86,SH,ARM,M68k,S390]
> > earlyprintk=vga
> > earlyprintk=sclp
> > @@ -1190,6 +1206,10 @@
> > This is designed to be used in conjunction with
> > the boot argument: earlyprintk=vga
> >
> > + This parameter works in place of the kgdboc parameter
> > + but can only be used if the backing tty is available
> > + very early in the boot process.
> > +
>
> I wonder if pragmatic advice is more useful:
>
> For early debugging via a serial port see earlycon_kgdboc instead.

Done.


> > edd= [EDD]
> > Format: {"off" | "on" | "skip[mbr]"}
> >
> > diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
> > index d38be58f872a..c0b321403d9a 100644
> > --- a/Documentation/dev-tools/kgdb.rst
> > +++ b/Documentation/dev-tools/kgdb.rst
> > @@ -274,6 +274,20 @@ don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as
> > on the initial connect, or to use a debugger proxy that allows an
> > unmodified gdb to do the debugging.
> >
> > +Kernel parameter: ``earlycon_kgdboc``
> > +-------------------------------------
> > +
> > +If you specify the kernel parameter ``earlycon_kgdboc`` and your serial
> > +driver registers a boot console that supports polling (doesn't need
> > +interrupts and implements a nonblocking read() function) kgdb will attempt
> > +to work using the boot console until it can transition to the regular
> > +tty driver specified by the ``kgdboc`` parameter.
> > +
> > +Normally there is only one boot console (especially that implements the
> > +read() function) so just adding ``earlycon_kgdboc`` on its own is
> > +sufficient to make this work. If you have more than one boot console you
> > +can add the boot console's name to differentiate.
> > +
>
> I think we need an example here. The example in the patch header for
> the previous patch was useful (at least for me).

Done.