This whole pile of patches was motivated by me trying to get kgdb to
work properly on a platform where my serial driver ended up being hit
by the -EPROBE_DEFER virus (it wasn't practicing social distancing
from other drivers). Specifically my serial driver's parent device
depended on a resource that wasn't available when its probe was first
called. It returned -EPROBE_DEFER which meant that when "kgdboc"
tried to run its setup the serial driver wasn't there. Unfortunately
"kgdboc" never tried again, so that meant that kgdb was disabled until
I manually enabled it via sysfs.
While I could try to figure out how to get around the -EPROBE_DEFER
somehow, the above problems could happen to anyone and -EPROBE_DEFER
is generally considered something you just have to live with. In any
case the current "kgdboc" setup is a bit of a race waiting to happen.
I _think_ I saw during early testing that even adding a msleep() in
the typical serial driver's probe() is enough to trigger similar
issues.
I decided that for the above race the best attitude to get kgdb to
register at boot was probably "if you can't beat 'em, join 'em".
Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
driver uses it it's no longer a virus). It does so a little awkwardly
because "kgdboc" hasn't normally had a "struct device" associated with
it, but it's really not _that_ ugly to make a platform device and
seems less ugly than alternatives.
Unfortunately now on my system the debugger is one of the last things
to register at boot. That's OK for debugging problems that show up
significantly after boot, but isn't so hot for all the boot problems
that I end up debugging. This motivated me to try to get something
working a little earlier.
My first attempt was to try to get the existing "ekgdboc" to work
earlier. I tried that for a bit until I realized that it needed to
work at the tty layer and I couldn't find any serial drivers that
managed to register themselves to the tty layer super early at boot.
The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
of a special snowflake. Trying to get my serial driver and all its
dependencies to probe normally and register the tty driver super early
at boot seemed like a bad way to go. In fact, all the complexity
needed to do something like this is why the system already has a
special concept of a "boot console" that lives only long enough to
transition to the normal console.
Leveraging the boot console seemed like a good way to go and that's
what this series does. I found that consoles could have a read()
function, though I couldn't find anyone who implemented it. I
implemented it for two serial drivers for the devices I had easy
access to, on the assumption that a boot console's read() and write()
are polling-compatible (seems sane I think).
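As a rough illustration of what "polling-compatible" means for read(), here is a minimal userspace sketch. It is not code from this series: struct fake_uart, fake_uart_getc(), fake_early_read() and FAKE_NO_POLL_CHAR are hypothetical stand-ins for a UART receive FIFO, a driver's get-char helper, a boot console read() and NO_POLL_CHAR. The point is only that read() never blocks: it drains whatever is already there and returns the count, which may be 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NO_POLL_CHAR. */
#define FAKE_NO_POLL_CHAR (-1)

/* Hypothetical model of a UART receive FIFO; not a kernel structure. */
struct fake_uart {
	const char *rx;		/* bytes sitting in the FIFO */
	size_t rx_len;		/* how many bytes the FIFO holds */
	size_t rx_pos;		/* next byte to hand out */
};

/* Return one byte if available, else FAKE_NO_POLL_CHAR. Never waits. */
static int fake_uart_getc(struct fake_uart *uart)
{
	if (uart->rx_pos >= uart->rx_len)
		return FAKE_NO_POLL_CHAR;
	return (unsigned char)uart->rx[uart->rx_pos++];
}

/* Read up to n bytes without blocking; returns the number of bytes read. */
static unsigned int fake_early_read(struct fake_uart *uart, char *s,
				    unsigned int n)
{
	unsigned int num_read = 0;

	while (num_read < n) {
		int ch = fake_uart_getc(uart);

		if (ch == FAKE_NO_POLL_CHAR)
			break;
		s[num_read++] = (char)ch;
	}

	return num_read;
}
```

A caller that gets 0 back simply polls again later; nothing here depends on interrupts.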
Now anyone who makes a small change to their serial driver can easily
enable early kgdb debugging!
The devices I had for testing were:
- arm32: rk3288-veyron-jerry
- arm64: rk3399-gru-kevin
- arm64: qcom-sc7180-trogdor (not mainline yet)
On these devices I tried to test
various combinations of enabling/disabling various options and I
hopefully caught the corner cases, but I'd appreciate any extra
testing people can do. Notably I didn't test on x86, but (I think) I
didn't touch much there so I shouldn't have broken anything.
When testing I found a few problems with actually dropping into the
debugger super early on arm and arm64 devices. Patches in this series
should help with this. For arm I just avoid dropping into the
debugger until a little later and for arm64 I actually enable
debugging super early.
I realize that bits of this series might feel a little hacky, though
I've tried to do things in the cleanest way I could without overly
interfering with the rest of the kernel. If you hate the way I
solved a problem I would love it if you could provide guidance on how
you think I could solve the problem better.
This series (and my comments / documentation / commit messages) are
now long enough that my eyes glaze over when I try to read it all over
to double-check. I've nonetheless tried to double-check it, but I'm
pretty sure I did something stupid. Thank you ahead of time for
pointing it out to me so I can fix it in v4. If somehow I managed to
not do anything stupid (really?) then thank you for double-checking me
anyway.
NOTE: v3 of the patch series tacks on support for an extra serial
driver from Sumit Garg. I've piled it onto the end of my series at
his request just to keep everything together.
Changes in v3:
- ("kgdb: Prevent infinite recursive entries to the debugger") new for v3.
- ("serial: amba-pl011: Support kgdboc_earlycon") pulled into my v3.
- Add deinit() to I/O ops to know a driver can be replaced.
- Added example in kgdb.rst
- Change boolean weak function to KConfig.
- Don't just neuter input, panic if earlycon vanishes.
- No extra param to kgdb_register_io_module().
- Removed unneeded sentence in kernel-parameters doc.
- Renamed earlycon_kgdboc to kgdboc_earlycon.
- Simplify earlycon_kgdb deinit by using the deinit() function.
- Suggest people use kgdboc_earlycon instead of ekgdboc.
- { ; } ==> { }
Changes in v2:
- ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
- ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
- Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
- Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
Douglas Anderson (10):
kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
Revert "kgdboc: disable the console lock when in kgdb"
kgdboc: Use a platform device to handle tty drivers showing up late
kgdb: Delay "kgdbwait" to dbg_late_init() by default
arm64: Add call_break_hook() to early_brk64() for early kgdb
kgdb: Prevent infinite recursive entries to the debugger
kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles
Documentation: kgdboc: Document new kgdboc_earlycon parameter
serial: qcom_geni_serial: Support kgdboc_earlycon
serial: 8250_early: Support kgdboc_earlycon
Sumit Garg (1):
serial: amba-pl011: Support kgdboc_earlycon
.../admin-guide/kernel-parameters.txt | 20 ++
Documentation/dev-tools/kgdb.rst | 24 ++
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/debug-monitors.h | 2 +
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/traps.c | 3 +
arch/x86/Kconfig | 1 +
drivers/tty/serial/8250/8250_early.c | 23 ++
drivers/tty/serial/amba-pl011.c | 32 +++
drivers/tty/serial/kgdboc.c | 266 ++++++++++++++++--
drivers/tty/serial/qcom_geni_serial.c | 32 +++
include/linux/kgdb.h | 4 +
kernel/debug/debug_core.c | 53 +++-
lib/Kconfig.kgdb | 18 ++
14 files changed, 437 insertions(+), 44 deletions(-)
--
2.26.2.303.gf8c07b1a785-goog
In order to make early kgdb work properly we need early_brk64() to be
able to call into it. This is as easy as adding a call into
call_break_hook() just like we do later in the normal brk_handler().
Once we do this we can let kgdb know that it can break into the
debugger a little earlier (specifically when parsing early_param's).
NOTE: without this patch it turns out that arm64 can't do breakpoints
even at dbg_late_init(), so if we decide something about this patch is
wrong we might need to move dbg_late_init() a little later.
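The dispatch order this patch sets up in early_brk64() can be modeled in a few lines of userspace C. This is a simplified sketch, not the arm64 code: every fake_* name and both FAKE_*_IMM values are hypothetical, and the real call_break_hook() matches hooks against the ESR rather than calling each one in turn. It only shows that registered break hooks get first refusal and that the BUG handler remains the fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the "handled" / "not handled" return values. */
enum { FAKE_HOOK_HANDLED = 0, FAKE_HOOK_ERROR = 1 };

/* Made-up breakpoint immediates, for illustration only. */
#define FAKE_KGDB_IMM 0x400u
#define FAKE_BUG_IMM  0x800u

typedef int (*fake_hook_fn)(unsigned int imm);

static fake_hook_fn fake_hooks[4];
static size_t fake_num_hooks;

static void fake_register_hook(fake_hook_fn fn)
{
	if (fake_num_hooks < sizeof(fake_hooks) / sizeof(fake_hooks[0]))
		fake_hooks[fake_num_hooks++] = fn;
}

/* A debugger hook that only claims its own breakpoint immediate. */
static int fake_kgdb_hook(unsigned int imm)
{
	return imm == FAKE_KGDB_IMM ? FAKE_HOOK_HANDLED : FAKE_HOOK_ERROR;
}

static int fake_bug_handler(unsigned int imm)
{
	return imm == FAKE_BUG_IMM ? FAKE_HOOK_HANDLED : FAKE_HOOK_ERROR;
}

/* Offer the breakpoint to each registered hook in turn. */
static int fake_call_break_hook(unsigned int imm)
{
	size_t i;

	for (i = 0; i < fake_num_hooks; i++)
		if (fake_hooks[i](imm) == FAKE_HOOK_HANDLED)
			return FAKE_HOOK_HANDLED;
	return FAKE_HOOK_ERROR;
}

/* Early handler: returns 0 when something handled the breakpoint. */
static int fake_early_brk(unsigned int imm)
{
	if (fake_call_break_hook(imm) == FAKE_HOOK_HANDLED)
		return 0;

	return fake_bug_handler(imm) != FAKE_HOOK_HANDLED;
}
```

Before any hook is registered, a debugger breakpoint falls through to the BUG handler and is reported as unhandled, which is the situation the commit message describes.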
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
Changes in v3:
- Change boolean weak function to KConfig.
Changes in v2: None
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/debug-monitors.h | 2 ++
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/traps.c | 3 +++
4 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 40fb05d96c60..08a736175d2d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -13,6 +13,7 @@ config ARM64
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
+ select ARCH_HAS_EARLY_DEBUG
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 7619f473155f..2d82a0314d29 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -97,6 +97,8 @@ void unregister_user_break_hook(struct break_hook *hook);
void register_kernel_break_hook(struct break_hook *hook);
void unregister_kernel_break_hook(struct break_hook *hook);
+int call_break_hook(struct pt_regs *regs, unsigned int esr);
+
u8 debug_monitors_arch(void);
enum dbg_active_el {
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 48222a4760c2..59c353dfc8e9 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
unregister_debug_hook(&hook->node);
}
-static int call_break_hook(struct pt_regs *regs, unsigned int esr)
+int call_break_hook(struct pt_regs *regs, unsigned int esr)
{
struct break_hook *hook;
struct list_head *list;
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index cf402be5c573..a8173f0c1774 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
#endif
+ if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
+ return 0;
+
return bug_handler(regs, esr) != DBG_HOOK_HANDLED;
}
--
2.26.2.303.gf8c07b1a785-goog
From: Sumit Garg <[email protected]>
Implement the read() function in the early console driver. With the
recently added kgdboc_earlycon feature, this allows you to use kgdb
to debug fairly early into the system boot.
We only bother implementing this if polling is enabled since kgdb can't
be enabled without that.
Signed-off-by: Sumit Garg <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v3:
- ("serial: amba-pl011: Support kgdboc_earlycon") pulled into my v3.
- Renamed earlycon_kgdboc to kgdboc_earlycon.
Changes in v2: None
drivers/tty/serial/amba-pl011.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 2296bb0f9578..c010f639298d 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -2435,6 +2435,37 @@ static void pl011_early_write(struct console *con, const char *s, unsigned n)
uart_console_write(&dev->port, s, n, pl011_putc);
}
+#ifdef CONFIG_CONSOLE_POLL
+static int pl011_getc(struct uart_port *port)
+{
+ if (readl(port->membase + UART01x_FR) & UART01x_FR_RXFE)
+ return NO_POLL_CHAR;
+
+ if (port->iotype == UPIO_MEM32)
+ return readl(port->membase + UART01x_DR);
+ else
+ return readb(port->membase + UART01x_DR);
+}
+
+static int pl011_early_read(struct console *con, char *s, unsigned int n)
+{
+ struct earlycon_device *dev = con->data;
+ int ch, num_read = 0;
+
+ while (num_read < n) {
+ ch = pl011_getc(&dev->port);
+ if (ch == NO_POLL_CHAR)
+ break;
+
+ s[num_read++] = ch;
+ }
+
+ return num_read;
+}
+#else
+#define pl011_early_read NULL
+#endif
+
/*
* On non-ACPI systems, earlycon is enabled by specifying
* "earlycon=pl011,<address>" on the kernel command line.
@@ -2454,6 +2485,7 @@ static int __init pl011_early_console_setup(struct earlycon_device *device,
return -ENODEV;
device->con->write = pl011_early_write;
+ device->con->read = pl011_early_read;
return 0;
}
--
2.26.2.303.gf8c07b1a785-goog
The recent patch ("kgdboc: Add kgdboc_earlycon to support early kgdb
using boot consoles") adds a new kernel command line parameter.
Document it.
Note that the patch adding the feature does some comparing/contrasting
of "kgdboc_earlycon" vs. the existing "ekgdboc". See that patch for
more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
and just makes "kgdboc" do its normal initialization early (only works
if your tty driver is already ready). The new "kgdboc_earlycon" works
in combination with "kgdboc" and is backed by boot consoles.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
---
Changes in v3:
- Added example in kgdb.rst
- Removed unneeded sentence in kernel-parameters doc.
- Renamed earlycon_kgdboc to kgdboc_earlycon.
- Suggest people use kgdboc_earlycon instead of ekgdboc.
Changes in v2: None
.../admin-guide/kernel-parameters.txt | 20 ++++++++++++++++
Documentation/dev-tools/kgdb.rst | 24 +++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7bc83f3d9bdf..3b5ae06a98aa 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1190,6 +1190,11 @@
This is designed to be used in conjunction with
the boot argument: earlyprintk=vga
+ This parameter works in place of the kgdboc parameter
+ but can only be used if the backing tty is available
+ very early in the boot process. For early debugging
+ via a serial port see kgdboc_earlycon instead.
+
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
@@ -2105,6 +2110,21 @@
kms, kbd format: kms,kbd
kms, kbd and serial format: kms,kbd,<ser_dev>[,baud]
+ kgdboc_earlycon= [KGDB,HW]
+ If the boot console provides the ability to read
+ characters and can work in polling mode, you can use
+ this parameter to tell kgdb to use it as a backend
+ until the normal console is registered. Intended to
+ be used together with the kgdboc parameter which
+ specifies the normal console to transition to.
+
+ The name of the early console should be specified
+ as the value of this parameter. Note that the name of
+ the early console might be different than the tty
+ name passed to kgdboc. It's OK to leave the value
+ blank and the first boot console that implements
+ read() will be picked.
+
kgdbwait [KGDB] Stop kernel execution and enter the
kernel debugger at the earliest opportunity.
diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
index d38be58f872a..61293f40bc6e 100644
--- a/Documentation/dev-tools/kgdb.rst
+++ b/Documentation/dev-tools/kgdb.rst
@@ -274,6 +274,30 @@ don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as
on the initial connect, or to use a debugger proxy that allows an
unmodified gdb to do the debugging.
+Kernel parameter: ``kgdboc_earlycon``
+-------------------------------------
+
+If you specify the kernel parameter ``kgdboc_earlycon`` and your serial
+driver registers a boot console that supports polling (doesn't need
+interrupts and implements a nonblocking read() function), kgdb will attempt
+to work using the boot console until it can transition to the regular
+tty driver specified by the ``kgdboc`` parameter.
+
+Normally there is only one boot console (especially one that implements the
+read() function) so just adding ``kgdboc_earlycon`` on its own is
+sufficient to make this work. If you have more than one boot console you
+can add the boot console's name to differentiate. Note that names that
+are registered through the boot console layer and the tty layer are not
+the same for the same port.
+
+For instance, on one board to be explicit you might do::
+
+ kgdboc_earlycon=qcom_geni kgdboc=ttyMSM0
+
+If the only boot console on the device was "qcom_geni", you could simplify::
+
+ kgdboc_earlycon kgdboc=ttyMSM0
+
Kernel parameter: ``kgdbwait``
------------------------------
--
2.26.2.303.gf8c07b1a785-goog
Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.
We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
Changes in v3:
- Renamed earlycon_kgdboc to kgdboc_earlycon.
Changes in v2: None
drivers/tty/serial/8250/8250_early.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/tty/serial/8250/8250_early.c b/drivers/tty/serial/8250/8250_early.c
index 5cd8c36c8fcc..70d7826788f5 100644
--- a/drivers/tty/serial/8250/8250_early.c
+++ b/drivers/tty/serial/8250/8250_early.c
@@ -109,6 +109,28 @@ static void early_serial8250_write(struct console *console,
uart_console_write(port, s, count, serial_putc);
}
+#ifdef CONFIG_CONSOLE_POLL
+static int early_serial8250_read(struct console *console,
+ char *s, unsigned int count)
+{
+ struct earlycon_device *device = console->data;
+ struct uart_port *port = &device->port;
+ unsigned int status;
+ int num_read = 0;
+
+ while (num_read < count) {
+ status = serial8250_early_in(port, UART_LSR);
+ if (!(status & UART_LSR_DR))
+ break;
+ s[num_read++] = serial8250_early_in(port, UART_RX);
+ }
+
+ return num_read;
+}
+#else
+#define early_serial8250_read NULL
+#endif
+
static void __init init_port(struct earlycon_device *device)
{
struct uart_port *port = &device->port;
@@ -149,6 +171,7 @@ int __init early_serial8250_setup(struct earlycon_device *device,
init_port(device);
device->con->write = early_serial8250_write;
+ device->con->read = early_serial8250_read;
return 0;
}
EARLYCON_DECLARE(uart8250, early_serial8250_setup);
--
2.26.2.303.gf8c07b1a785-goog
We want to enable kgdb to debug the early parts of the kernel.
Unfortunately kgdb normally is a client of the tty API in the kernel
and serial drivers don't register to the tty layer until fairly late
in the boot process.
Serial drivers do, however, commonly register a boot console. Let's
enable the kgdboc driver to work with boot consoles to provide early
debugging.
This change co-opts the existing read() function pointer that's part
of "struct console". It's assumed that if a boot console (with the
flag CON_BOOT) has implemented read() that both the read() and write()
functions are polling functions. That means they work without
interrupts and read() will return immediately (with 0 bytes read) if
there's nothing to read. This should be a safe assumption since it
appears that no current boot consoles implement read() right now and
there seems no reason to do so unless they wanted to support
"kgdboc_earlycon".
The console API isn't really intended to have clients work with it
like we're doing. Specifically there doesn't appear to be any way for
clients to be notified about a boot console being unregistered. We'll
work around this by checking that our console is still valid before
using it. We'll also try to transition off of the boot console and
onto the "tty" API as quickly as possible.
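The "check that our console is still valid" workaround boils down to walking the registered-console list and looking for our saved pointer. Below is a small userspace model of that idea. It is a sketch under assumptions: struct fake_console and the fake_* helpers are hypothetical, and the locking the real console list needs is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down console with just a list link. */
struct fake_console {
	const char *name;
	struct fake_console *next;
};

static struct fake_console *fake_console_list;

static void fake_register_console(struct fake_console *con)
{
	con->next = fake_console_list;
	fake_console_list = con;
}

static void fake_unregister_console(struct fake_console *con)
{
	struct fake_console **p;

	for (p = &fake_console_list; *p; p = &(*p)->next) {
		if (*p == con) {
			*p = con->next;
			con->next = NULL;
			return;
		}
	}
}

/*
 * No notification arrives when a console is unregistered, so the only
 * way to know whether the saved pointer is still usable is to look for
 * it in the list each time.
 */
static bool fake_earlycon_still_valid(const struct fake_console *earlycon)
{
	const struct fake_console *con;

	for (con = fake_console_list; con; con = con->next)
		if (con == earlycon)
			return true;
	return false;
}
```

The check is a pointer comparison, not a name comparison, so a different console that happens to reuse the name would not count as "still valid".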
The normal/expected way to make all this work is to use
"kgdboc_earlycon" and "kgdboc" together. You should point them both
to the same physical serial connection. At boot time, as the system
transitions from the boot console to the normal console, kgdb will
switch over. If you don't use things in the normal/expected way it's
a bit of a buyer-beware situation. Things thought about:
- If you specify only "kgdboc_earlycon" but not "kgdboc" and the boot
console vanishes at a weird time we'll panic if someone tries to
drop into kgdb.
- If you use "keep_bootcon" (which is already a bit of a buyer-beware
option) and specify "kgdboc_earlycon" but not "kgdboc" we'll keep
trying to use your boot console for kgdb.
- If your "kgdboc_earlycon" and "kgdboc" devices are not the same
device things should work OK, but it'll be your job to switch over
which device you're monitoring (including figuring out how to switch
over gdb in-flight if you're using it).
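The switch from the boot console to the normal console relies on the rule that an I/O driver may only be replaced if it provides deinit(). A minimal userspace model of that rule follows. It is only a sketch: struct fake_io, fake_register_io() and the two driver instances are hypothetical, and the locking, init() handling and logging of the real registration path are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cut-down I/O driver: only the fields the rule needs. */
struct fake_io {
	const char *name;
	void (*deinit)(void);	/* non-NULL means "temporary, replaceable" */
};

static struct fake_io *fake_current_io;
static int fake_early_deinit_calls;

static void fake_early_deinit(void)
{
	fake_early_deinit_calls++;
}

/* The early driver is temporary; the tty-backed one is permanent. */
static struct fake_io fake_early_io = {
	.name = "early",
	.deinit = fake_early_deinit,
};

static struct fake_io fake_tty_io = {
	.name = "tty",
	.deinit = NULL,
};

/* Register new_io, replacing the current driver only if it allows it. */
static int fake_register_io(struct fake_io *new_io)
{
	struct fake_io *old_io = fake_current_io;

	if (old_io) {
		if (!old_io->deinit)
			return -EBUSY;	/* permanent driver stays put */
		old_io->deinit();
	}

	fake_current_io = new_io;
	return 0;
}
```

In this model the early driver can be displaced by the tty driver, but not the other way around, which matches the one-way transition described above.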
When trying to enable "kgdboc_earlycon" it should be noted that the
names that are registered through the boot console layer and the tty
layer are not the same for the same port. For example when debugging
on one board I'd need to pass "kgdboc_earlycon=qcom_geni
kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
console name is a pain and there will rarely be more than one boot
console enabled, you can provide the "kgdboc_earlycon" parameter
without specifying the name of the boot console. In this case we'll
just pick the first boot console that implements read() that we find.
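The selection rule (match by name, or take the first suitable boot console when the name is blank) can be sketched in userspace C as follows. All names here are hypothetical stand-ins. Note one deliberate assumption: this sketch requires a console to be both a boot console and enabled, which is my reading of the intent rather than a statement about what the flag test in the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; the values are arbitrary. */
#define FAKE_CON_ENABLED 0x1u
#define FAKE_CON_BOOT    0x2u

/* Hypothetical console descriptor. */
struct fake_con {
	const char *name;
	unsigned int flags;
	bool has_read;		/* stands in for "con->read != NULL" */
};

/* Example table: one regular console and two boot consoles. */
static const struct fake_con fake_cons[] = {
	{ "ttyMSM",    FAKE_CON_ENABLED,                 false },
	{ "uart8250",  FAKE_CON_BOOT | FAKE_CON_ENABLED, false },
	{ "qcom_geni", FAKE_CON_BOOT | FAKE_CON_ENABLED, true  },
};

/*
 * Return the first enabled boot console that implements read() and,
 * if opt is non-empty, whose name matches opt. NULL if none qualifies.
 */
static const struct fake_con *fake_pick_earlycon(const struct fake_con *cons,
						 size_t count,
						 const char *opt)
{
	const unsigned int want = FAKE_CON_BOOT | FAKE_CON_ENABLED;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct fake_con *con = &cons[i];

		if (!con->has_read)
			continue;
		if ((con->flags & want) != want)
			continue;
		if (!opt || !opt[0] || strcmp(con->name, opt) == 0)
			return con;
	}

	return NULL;
}
```

With the table above, a blank name and "qcom_geni" both select the same console, while the tty-style name "ttyMSM0" selects nothing, which is why the boot console name and the tty name have to be given separately.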
This new "kgdboc_earlycon" parameter should be contrasted to the
existing "ekgdboc" parameter. While both provide a way to debug very
early, the usage and mechanisms are quite different. Specifically
"kgdboc_earlycon" is meant to be used in tandem with "kgdboc" and
there is a transition from one to the other. The "ekgdboc" parameter,
on the other hand, replaces the "kgdboc" parameter. It runs the same
logic as the "kgdboc" parameter but just relies on your TTY driver
being present super early. The only known usage of the old "ekgdboc"
parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
be noted that "kbd" has special treatment allowing it to init early as
a tty device.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Tested-by: Sumit Garg <[email protected]>
---
I have kept Greg's Reviewed-by and Sumit's Tested-by tags on this
commit despite changes that aren't totally trivial. Please yell if
you disagree with this. Reasons:
- Greg's Reviewed-by seemed more an overall acknowledgment that the
series wasn't totally insane rather than a detailed review. I don't
think the changes from v2 to v3 change that.
- Sumit's Tested-by seemed useful as confirmation that someone else
made this work on a machine that wasn't mine. I don't believe that
the changes from v2 to v3 should affect anything here.
Changes in v3:
- Add deinit() to I/O ops to know a driver can be replaced.
- Don't just neuter input, panic if earlycon vanishes.
- No extra param to kgdb_register_io_module().
- Renamed earlycon_kgdboc to kgdboc_earlycon.
- Simplify earlycon_kgdb deinit by using the deinit() function.
Changes in v2:
- Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
- Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
drivers/tty/serial/kgdboc.c | 136 ++++++++++++++++++++++++++++++++++++
include/linux/kgdb.h | 4 ++
kernel/debug/debug_core.c | 23 ++++--
3 files changed, 159 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 519d8cfbfbed..7aca0a67fc0b 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -21,6 +21,7 @@
#include <linux/input.h>
#include <linux/module.h>
#include <linux/platform_device.h>
+#include <linux/serial_core.h>
#define MAX_CONFIG_LEN 40
@@ -42,6 +43,13 @@ static int kgdb_tty_line;
static struct platform_device *kgdboc_pdev;
+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+static struct kgdb_io kgdboc_earlycon_io_ops;
+struct console *earlycon;
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+#define earlycon NULL
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -135,8 +143,45 @@ static void kgdboc_unregister_kbd(void)
#define kgdboc_restore_input()
#endif /* ! CONFIG_KDB_KEYBOARD */
+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+
+static void cleanup_earlycon(void)
+{
+ if (earlycon)
+ kgdb_unregister_io_module(&kgdboc_earlycon_io_ops);
+}
+
+static bool is_earlycon_still_valid(void)
+{
+ struct console *con;
+
+ for_each_console(con)
+ if (con == earlycon)
+ return true;
+ return false;
+}
+
+static void cleanup_earlycon_if_invalid(void)
+{
+ console_lock();
+ if (earlycon && !is_earlycon_still_valid()) {
+ pr_warn("earlycon vanished; unregistering\n");
+ cleanup_earlycon();
+ }
+ console_unlock();
+}
+
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
+static inline void cleanup_earlycon(void) { }
+static inline void cleanup_earlycon_if_invalid(void) { }
+
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
static void cleanup_kgdboc(void)
{
+ cleanup_earlycon();
+
if (configured != 1)
return;
@@ -206,6 +251,14 @@ static int configure_kgdboc(void)
kgdboc_unregister_kbd();
configured = 0;
+ /*
+ * Each time we run configure_kgdboc() but don't find a console, use
+ * that as a chance to validate that our earlycon didn't vanish on
+ * us. If it vanished we should unregister which will disable kgdb
+ * if we're the last I/O module.
+ */
+ cleanup_earlycon_if_invalid();
+
return err;
}
@@ -409,6 +462,89 @@ static int __init kgdboc_early_init(char *opt)
}
early_param("ekgdboc", kgdboc_early_init);
+
+static int kgdboc_earlycon_get_char(void)
+{
+ char c;
+
+ if (!earlycon->read(earlycon, &c, 1))
+ return NO_POLL_CHAR;
+
+ return c;
+}
+
+static void kgdboc_earlycon_put_char(u8 chr)
+{
+ earlycon->write(earlycon, &chr, 1);
+}
+
+static void kgdboc_earlycon_pre_exp_handler(void)
+{
+ /*
+ * We don't get notified when the boot console is unregistered.
+ * Double-check when we enter the debugger. Unfortunately we
+ * can't really unregister ourselves now, so we panic. We rely
+ * on kgdb's ability to detect re-entrancy to make the panic
+ * take effect.
+ *
+ * NOTE: if you're here in the lull when the real console has
+ * replaced the boot console but our init hasn't run yet it's
+ * possible that the "keep_bootcon" argument may help.
+ */
+ if (earlycon && !is_earlycon_still_valid())
+ panic("KGDB earlycon vanished and nothing replaced it\n");
+}
+
+static void kgdboc_earlycon_deinit(void)
+{
+ earlycon = NULL;
+}
+
+static struct kgdb_io kgdboc_earlycon_io_ops = {
+ .name = "kgdboc_earlycon",
+ .read_char = kgdboc_earlycon_get_char,
+ .write_char = kgdboc_earlycon_put_char,
+ .pre_exception = kgdboc_earlycon_pre_exp_handler,
+ .deinit = kgdboc_earlycon_deinit,
+ .is_console = true,
+};
+
+static int __init kgdboc_earlycon_init(char *opt)
+{
+ struct console *con;
+
+ kdb_init(KDB_INIT_EARLY);
+
+ /*
+ * Look for a matching console, or if the name was left blank just
+ * pick the first one we find.
+ */
+ console_lock();
+ for_each_console(con) {
+ if (con->write && con->read &&
+ (con->flags & (CON_BOOT | CON_ENABLED)) &&
+ (!opt || !opt[0] || strcmp(con->name, opt) == 0))
+ break;
+ }
+ console_unlock();
+
+ if (!con) {
+ pr_info("Couldn't find kgdb earlycon\n");
+ return 0;
+ }
+
+ earlycon = con;
+ pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
+ if (kgdb_register_io_module(&kgdboc_earlycon_io_ops) != 0) {
+ earlycon = NULL;
+ pr_info("Failed to register kgdb with earlycon\n");
+ return 0;
+ }
+
+ return 0;
+}
+
+early_param("kgdboc_earlycon", kgdboc_earlycon_init);
#endif /* CONFIG_KGDB_SERIAL_CONSOLE */
module_init(init_kgdboc);
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index b072aeb1fd78..77a3c519478a 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -269,6 +269,9 @@ struct kgdb_arch {
* @write_char: Pointer to a function that will write one char.
* @flush: Pointer to a function that will flush any pending writes.
* @init: Pointer to a function that will initialize the device.
+ * @deinit: Pointer to a function that will deinit the device. Implies that
+ * this I/O driver is temporary and expects to be replaced. Called when
+ * an I/O driver is replaced or explicitly unregistered.
* @pre_exception: Pointer to a function that will do any prep work for
* the I/O driver.
* @post_exception: Pointer to a function that will do any cleanup work
@@ -282,6 +285,7 @@ struct kgdb_io {
void (*write_char) (u8);
void (*flush) (void);
int (*init) (void);
+ void (*deinit) (void);
void (*pre_exception) (void);
void (*post_exception) (void);
int is_console;
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index faf5bd4c34ee..2d74dcbca477 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -1075,15 +1075,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
*/
int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
{
+ struct kgdb_io *old_dbg_io_ops;
int err;
spin_lock(&kgdb_registration_lock);
- if (dbg_io_ops) {
- spin_unlock(&kgdb_registration_lock);
+ old_dbg_io_ops = dbg_io_ops;
+ if (old_dbg_io_ops) {
+ if (!old_dbg_io_ops->deinit) {
+ spin_unlock(&kgdb_registration_lock);
- pr_err("Another I/O driver is already registered with KGDB\n");
- return -EBUSY;
+ pr_err("KGDB I/O driver %s can't replace %s.\n",
+ new_dbg_io_ops->name, old_dbg_io_ops->name);
+ return -EBUSY;
+ }
+ old_dbg_io_ops->deinit();
}
if (new_dbg_io_ops->init) {
@@ -1098,6 +1104,12 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
spin_unlock(&kgdb_registration_lock);
+ if (old_dbg_io_ops) {
+ pr_info("Replaced I/O driver %s with %s\n",
+ old_dbg_io_ops->name, new_dbg_io_ops->name);
+ return 0;
+ }
+
pr_info("Registered I/O driver %s\n", new_dbg_io_ops->name);
/* Arm KGDB now. */
@@ -1134,6 +1146,9 @@ void kgdb_unregister_io_module(struct kgdb_io *old_dbg_io_ops)
spin_unlock(&kgdb_registration_lock);
+ if (old_dbg_io_ops->deinit)
+ old_dbg_io_ops->deinit();
+
pr_info("Unregistered I/O driver %s, debugger disabled\n",
old_dbg_io_ops->name);
}
--
2.26.2.303.gf8c07b1a785-goog
This reverts commit 81eaadcae81b4c1bf01649a3053d1f54e2d81cf1.
Commit 81eaadcae81b ("kgdboc: disable the console lock when in kgdb")
is no longer needed now that we have the patch ("kgdb: Disable
WARN_CONSOLE_UNLOCKED for all kgdb"). Revert it.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
Changes in v3: None
Changes in v2:
- ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
drivers/tty/serial/kgdboc.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index c9f94fa82be4..8a1a4d1b6768 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -275,14 +275,10 @@ static void kgdboc_pre_exp_handler(void)
/* Increment the module count when the debugger is active */
if (!kgdb_connected)
try_module_get(THIS_MODULE);
-
- atomic_inc(&ignore_console_lock_warning);
}
static void kgdboc_post_exp_handler(void)
{
- atomic_dec(&ignore_console_lock_warning);
-
/* decrement the module count when the debugger detaches */
if (!kgdb_connected)
module_put(THIS_MODULE);
--
2.26.2.303.gf8c07b1a785-goog
If we detect that we recursively entered the debugger we should mark
our I/O module as unregistered (kgdb_io_module_registered = false) so
that the panic() in the next line won't actually cause another
recursion into the debugger. The first line of kgdb_panic() will
check this and return.
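The guard can be modeled in userspace as a flag that the panic path checks before trying to enter the debugger. This is only a sketch of the idea: the fake_* functions and counters are hypothetical, and the real panic() does not return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state. */
static bool fake_io_module_registered = true;
static int fake_debugger_entries;
static int fake_panic_calls;

/* Panic-time hook: enters the debugger only if an I/O module is live. */
static void fake_kgdb_panic(void)
{
	if (!fake_io_module_registered)
		return;

	fake_debugger_entries++;	/* would recurse into the debugger */
}

/* Model of panic(): it gives the debugger a chance to run first. */
static void fake_panic(void)
{
	fake_panic_calls++;
	fake_kgdb_panic();
}

/*
 * Re-entry check: on a nested entry, clear the flag *before* panicking
 * so that the panic path cannot bounce back into the debugger.
 * Returns true if a recursive entry was detected.
 */
static bool fake_reenter_check(int exception_level)
{
	if (exception_level > 1) {
		fake_io_module_registered = false;
		fake_panic();
		return true;
	}

	return false;
}
```

Without the flag being cleared first, fake_panic() would call straight back into the debugger, which is the loop the patch is meant to break.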
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v3:
- ("kgdb: Prevent infinite recursive entries to the debugger") new for v3.
Changes in v2: None
kernel/debug/debug_core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 503c1630ca76..faf5bd4c34ee 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -532,6 +532,7 @@ static int kgdb_reenter_check(struct kgdb_state *ks)
if (exception_level > 1) {
dump_stack();
+ kgdb_io_module_registered = false;
panic("Recursive entry to debugger");
}
--
2.26.2.303.gf8c07b1a785-goog
Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.
We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
---
Changes in v3:
- Renamed earlycon_kgdboc to kgdboc_earlycon.
- { ; } ==> { }
Changes in v2: None
drivers/tty/serial/qcom_geni_serial.c | 32 +++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 6119090ce045..6bace1c6bb09 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1090,6 +1090,36 @@ static void qcom_geni_serial_earlycon_write(struct console *con,
__qcom_geni_serial_console_write(&dev->port, s, n);
}
+#ifdef CONFIG_CONSOLE_POLL
+static int qcom_geni_serial_earlycon_read(struct console *con,
+ char *s, unsigned int n)
+{
+ struct earlycon_device *dev = con->data;
+ struct uart_port *uport = &dev->port;
+ int num_read = 0;
+ int ch;
+
+ while (num_read < n) {
+ ch = qcom_geni_serial_get_char(uport);
+ if (ch == NO_POLL_CHAR)
+ break;
+ s[num_read++] = ch;
+ }
+
+ return num_read;
+}
+
+static void __init qcom_geni_serial_enable_early_read(struct geni_se *se,
+ struct console *con)
+{
+ geni_se_setup_s_cmd(se, UART_START_READ, 0);
+ con->read = qcom_geni_serial_earlycon_read;
+}
+#else
+static inline void qcom_geni_serial_enable_early_read(struct geni_se *se,
+ struct console *con) { }
+#endif
+
static int __init qcom_geni_serial_earlycon_setup(struct earlycon_device *dev,
const char *opt)
{
@@ -1136,6 +1166,8 @@ static int __init qcom_geni_serial_earlycon_setup(struct earlycon_device *dev,
dev->con->write = qcom_geni_serial_earlycon_write;
dev->con->setup = NULL;
+ qcom_geni_serial_enable_early_read(&se, dev->con);
+
return 0;
}
OF_EARLYCON_DECLARE(qcom_geni, "qcom,geni-debug-uart",
--
2.26.2.303.gf8c07b1a785-goog
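The polling read added above can be tried outside the kernel with a small model. This is only a sketch: struct fake_fifo, fake_get_char() and the value chosen for NO_POLL_CHAR here are assumptions made for illustration, standing in for the UART and qcom_geni_serial_get_char().

```c
#include <stddef.h>

/* Sentinel meaning "nothing to read right now" (value is illustrative only). */
#define NO_POLL_CHAR (-1)

/* Hypothetical stand-in for the hardware receive FIFO. */
struct fake_fifo {
	const char *data;
	size_t len;
	size_t pos;
};

/* Returns the next pending character, or NO_POLL_CHAR without blocking. */
static int fake_get_char(struct fake_fifo *f)
{
	if (f->pos >= f->len)
		return NO_POLL_CHAR;
	return (unsigned char)f->data[f->pos++];
}

/* Same shape as the earlycon read(): copy up to n chars, stop when idle. */
static int poll_read(struct fake_fifo *f, char *s, unsigned int n)
{
	unsigned int num_read = 0;
	int ch;

	while (num_read < n) {
		ch = fake_get_char(f);
		if (ch == NO_POLL_CHAR)
			break;
		s[num_read++] = (char)ch;
	}

	return (int)num_read;
}
```

The key property is that poll_read() never waits: it returns however many characters were available, possibly zero, which is what a polling client such as kgdb needs.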
In commit 81eaadcae81b ("kgdboc: disable the console lock when in
kgdb") we avoided the WARN_CONSOLE_UNLOCKED() yell when we were in
kgdboc. That still works fine, but it turns out that we get a similar
yell when using other I/O drivers. One example is the "I/O driver"
for the kgdb test suite (kgdbts). When I enabled that I again got the
same yells.
Even though "kgdbts" doesn't actually interact with the user over the
console, using it still causes kgdb to print to the consoles. That
trips the same warning:
con_is_visible+0x60/0x68
con_scroll+0x110/0x1b8
lf+0x4c/0xc8
vt_console_print+0x1b8/0x348
vkdb_printf+0x320/0x89c
kdb_printf+0x68/0x90
kdb_main_loop+0x190/0x860
kdb_stub+0x2cc/0x3ec
kgdb_cpu_enter+0x268/0x744
kgdb_handle_exception+0x1a4/0x200
kgdb_compiled_brk_fn+0x34/0x44
brk_handler+0x7c/0xb8
do_debug_exception+0x1b4/0x228
Let's increment/decrement the "ignore_console_lock_warning" variable
all the time when we enter the debugger.
This will allow us to later revert commit 81eaadcae81b ("kgdboc:
disable the console lock when in kgdb").
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
Changes in v3: None
Changes in v2:
- ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
kernel/debug/debug_core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 2b7c9b67931d..950dc667c823 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -668,6 +668,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
if (kgdb_skipexception(ks->ex_vector, ks->linux_regs))
goto kgdb_restore;
+ atomic_inc(&ignore_console_lock_warning);
+
/* Call the I/O driver's pre_exception routine */
if (dbg_io_ops->pre_exception)
dbg_io_ops->pre_exception();
@@ -740,6 +742,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
if (dbg_io_ops->post_exception)
dbg_io_ops->post_exception();
+ atomic_dec(&ignore_console_lock_warning);
+
if (!kgdb_single_step) {
raw_spin_unlock(&dbg_slave_lock);
/* Wait till all the CPUs have quit from the debugger. */
--
2.26.2.303.gf8c07b1a785-goog
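The effect of moving the counter into the core can be modelled in user space with C11 atomics. This is a sketch under assumptions: struct model_io_ops, model_cpu_enter() and console_lock_warning_enabled() are hypothetical names, and the real kgdb_cpu_enter() does far more than this.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Non-zero means "don't warn about the console lock" in this model. */
static atomic_int ignore_console_lock_warning;

static bool console_lock_warning_enabled(void)
{
	return atomic_load(&ignore_console_lock_warning) == 0;
}

/* Hypothetical, reduced stand-in for the I/O driver callbacks. */
struct model_io_ops {
	void (*pre_exception)(void);
	void (*post_exception)(void);
};

/*
 * Models debugger entry. The counter is bumped by the core itself, so the
 * warning is suppressed no matter which I/O driver (if any hooks) is used.
 * Returns whether the warning would have fired while inside the debugger.
 */
static bool model_cpu_enter(const struct model_io_ops *ops)
{
	bool would_warn;

	atomic_fetch_add(&ignore_console_lock_warning, 1);
	if (ops->pre_exception)
		ops->pre_exception();

	would_warn = console_lock_warning_enabled();

	if (ops->post_exception)
		ops->post_exception();
	atomic_fetch_sub(&ignore_console_lock_warning, 1);

	return would_warn;
}
```

Even an I/O driver with no pre/post hooks at all gets the warning suppressed in this model, which is the point of doing the increment and decrement in the core rather than in kgdboc.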
Using kgdb requires at least some level of architecture-level
initialization. If nothing else, it relies on the architecture to
pass breakpoints / crashes onto kgdb.
On some architectures this all works super early, specifically it
starts working at some point in time before Linux parses
early_param's. On other architectures it doesn't. A survey of a few
platforms:
a) x86: Presumably it all works early since "ekgdboc" is documented to
work here.
b) arm64: Catching crashes works; with a simple patch breakpoints can
also be made to work.
c) arm: Nothing in kgdb works until
paging_init() -> devicemaps_init() -> early_trap_init()
Let's be conservative and, by default, process "kgdbwait" (which tells
the kernel to drop into the debugger ASAP at boot) a bit later at
dbg_late_init() time. If an architecture has tested it and wants to
re-enable super early debugging, they can select the
ARCH_HAS_EARLY_DEBUG KConfig option. We'll do this for x86 to start.
It should be noted that dbg_late_init() is still called quite early in
the system.
Note that this patch doesn't affect when kgdb runs its init. If kgdb
is set to initialize early it will still initialize when parsing
early_param's. This patch _only_ inhibits the initial breakpoint from
"kgdbwait". This means:
* Without any extra patches arm64 platforms will at least catch
crashes after kgdb inits.
* arm platforms will catch crashes (and could handle a hardcoded
kgdb_breakpoint()) any time after early_trap_init() runs, even
before dbg_late_init().
Signed-off-by: Douglas Anderson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
---
Changes in v3:
- Change boolean weak function to KConfig.
Changes in v2: None
arch/x86/Kconfig | 1 +
kernel/debug/debug_core.c | 25 +++++++++++++++----------
lib/Kconfig.kgdb | 18 ++++++++++++++++++
3 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1197b5596d5a..4b8c4c15a59d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -60,6 +60,7 @@ config X86
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
+ select ARCH_HAS_EARLY_DEBUG
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FILTER_PGPROT
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 950dc667c823..503c1630ca76 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -950,6 +950,14 @@ void kgdb_panic(const char *msg)
kgdb_breakpoint();
}
+static void kgdb_initial_breakpoint(void)
+{
+ kgdb_break_asap = 0;
+
+ pr_crit("Waiting for connection from remote gdb...\n");
+ kgdb_breakpoint();
+}
+
void __weak kgdb_arch_late(void)
{
}
@@ -960,6 +968,9 @@ void __init dbg_late_init(void)
if (kgdb_io_module_registered)
kgdb_arch_late();
kdb_init(KDB_INIT_FULL);
+
+ if (kgdb_io_module_registered && kgdb_break_asap)
+ kgdb_initial_breakpoint();
}
static int
@@ -1055,14 +1066,6 @@ void kgdb_schedule_breakpoint(void)
}
EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
-static void kgdb_initial_breakpoint(void)
-{
- kgdb_break_asap = 0;
-
- pr_crit("Waiting for connection from remote gdb...\n");
- kgdb_breakpoint();
-}
-
/**
* kgdb_register_io_module - register KGDB IO module
* @new_dbg_io_ops: the io ops vector
@@ -1099,7 +1102,8 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
/* Arm KGDB now. */
kgdb_register_callbacks();
- if (kgdb_break_asap)
+ if (kgdb_break_asap &&
+ (!dbg_is_early || IS_ENABLED(CONFIG_ARCH_HAS_EARLY_DEBUG)))
kgdb_initial_breakpoint();
return 0;
@@ -1169,7 +1173,8 @@ static int __init opt_kgdb_wait(char *str)
kgdb_break_asap = 1;
kdb_init(KDB_INIT_EARLY);
- if (kgdb_io_module_registered)
+ if (kgdb_io_module_registered &&
+ IS_ENABLED(CONFIG_ARCH_HAS_EARLY_DEBUG))
kgdb_initial_breakpoint();
return 0;
diff --git a/lib/Kconfig.kgdb b/lib/Kconfig.kgdb
index 933680b59e2d..ffa7a76de086 100644
--- a/lib/Kconfig.kgdb
+++ b/lib/Kconfig.kgdb
@@ -124,4 +124,22 @@ config KDB_CONTINUE_CATASTROPHIC
CONFIG_KDB_CONTINUE_CATASTROPHIC == 2. KDB forces a reboot.
If you are not sure, say 0.
+config ARCH_HAS_EARLY_DEBUG
+ bool
+ default n
+ help
+ If an architecture can definitely handle entering the debugger
+ when early_param's are parsed then it can select this config.
+ Otherwise, if "kgdbwait" is passed on the kernel command line it
+ won't actually be processed until dbg_late_init() just after the
+ call to kgdb_arch_late() is made.
+
+ NOTE: Even if this isn't selected by an architecture we will
+ still try to register kgdb to handle breakpoints and crashes
+ when early_param's are parsed, we just won't act on the
+ "kgdbwait" parameter until dbg_late_init(). If you get a
+ crash and try to drop into kgdb somewhere between these two
+ places you might or might not end up being able to use kgdb
+ depending on exactly how far along the architecture has initted.
+
endif # KGDB
--
2.26.2.303.gf8c07b1a785-goog
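The three places where the patch above decides whether to act on "kgdbwait" can be summarized as one predicate. This is a sketch only: enum model_stage and model_should_break() are hypothetical names, and the conditions are transcribed from the hunks in opt_kgdb_wait(), kgdb_register_io_module() and dbg_late_init().

```c
#include <stdbool.h>

/* The three call sites that may trigger the initial breakpoint. */
enum model_stage {
	MODEL_EARLY_PARAM,	/* opt_kgdb_wait() */
	MODEL_IO_REGISTER,	/* kgdb_register_io_module() */
	MODEL_LATE_INIT,	/* dbg_late_init() */
};

/* Returns true if the initial "kgdbwait" breakpoint would fire here. */
static bool model_should_break(enum model_stage stage, bool break_asap,
			       bool io_registered, bool dbg_is_early,
			       bool arch_has_early_debug)
{
	if (!break_asap)
		return false;

	switch (stage) {
	case MODEL_EARLY_PARAM:
		return io_registered && arch_has_early_debug;
	case MODEL_IO_REGISTER:
		return !dbg_is_early || arch_has_early_debug;
	case MODEL_LATE_INIT:
		return io_registered;
	}

	return false;
}
```

On an architecture that does not select ARCH_HAS_EARLY_DEBUG, the two early call sites decline and the breakpoint is left for dbg_late_init().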
If you build CONFIG_KGDB_SERIAL_CONSOLE into the kernel then you
should be able to have KGDB init itself at bootup by specifying the
"kgdboc=..." kernel command line parameter. This has worked OK for me
for many years, but on a new device I switched to it stopped working.
The problem is that on this new device the serial driver gets its
probe deferred. Now when kgdb initializes it can't find the tty
driver and when it gives up it never tries again.
We could try to find ways to move up the initialization of the serial
driver and such a thing might be worthwhile, but it's nice to be
robust against serial drivers that load late. We could move kgdb to
init itself later but that penalizes our ability to debug early boot
code on systems where the driver inits early. We could roll our own
system of detecting when new tty drivers get loaded and then use that
to figure out when kgdb can init, but that's ugly.
Instead, let's jump on the -EPROBE_DEFER bandwagon. We'll create a
singleton instance of a "kgdboc" platform device. If we can't find
our tty device when the singleton "kgdboc" probes we'll return
-EPROBE_DEFER which means that the system will call us back later to
try again when the tty device might be there.
We won't fully transition all of the kgdboc to a platform device
because early kgdb initialization (via the "ekgdboc" kernel command
line parameter) still runs before the platform device has been
created. The kgdb platform device is merely used as a convenient way
to hook into the system's normal probe deferral mechanisms.
As part of this, we'll ever-so-slightly change how the "kgdboc=..."
kernel command line parameter works. Previously if you booted up and
kgdb couldn't find the tty driver then later reading
'/sys/module/kgdboc/parameters/kgdboc' would return a blank string.
Now kgdb will keep track of the string that came as part of the
command line and give it back to you. It's expected that this should
be an OK change.
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
Changes in v3: None
Changes in v2: None
drivers/tty/serial/kgdboc.c | 126 +++++++++++++++++++++++++++++-------
1 file changed, 101 insertions(+), 25 deletions(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 8a1a4d1b6768..519d8cfbfbed 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -20,6 +20,7 @@
#include <linux/vt_kern.h>
#include <linux/input.h>
#include <linux/module.h>
+#include <linux/platform_device.h>
#define MAX_CONFIG_LEN 40
@@ -27,6 +28,7 @@ static struct kgdb_io kgdboc_io_ops;
/* -1 = init not run yet, 0 = unconfigured, 1 = configured. */
static int configured = -1;
+DEFINE_MUTEX(config_mutex);
static char config[MAX_CONFIG_LEN];
static struct kparam_string kps = {
@@ -38,6 +40,8 @@ static int kgdboc_use_kms; /* 1 if we use kernel mode switching */
static struct tty_driver *kgdb_tty_driver;
static int kgdb_tty_line;
+static struct platform_device *kgdboc_pdev;
+
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -133,11 +137,13 @@ static void kgdboc_unregister_kbd(void)
static void cleanup_kgdboc(void)
{
+ if (configured != 1)
+ return;
+
if (kgdb_unregister_nmi_console())
return;
kgdboc_unregister_kbd();
- if (configured == 1)
- kgdb_unregister_io_module(&kgdboc_io_ops);
+ kgdb_unregister_io_module(&kgdboc_io_ops);
}
static int configure_kgdboc(void)
@@ -198,20 +204,79 @@ static int configure_kgdboc(void)
kgdb_unregister_io_module(&kgdboc_io_ops);
noconfig:
kgdboc_unregister_kbd();
- config[0] = 0;
configured = 0;
- cleanup_kgdboc();
return err;
}
+static int kgdboc_probe(struct platform_device *pdev)
+{
+ int ret = 0;
+
+ mutex_lock(&config_mutex);
+ if (configured != 1) {
+ ret = configure_kgdboc();
+
+ /* Convert "no device" to "defer" so we'll keep trying */
+ if (ret == -ENODEV)
+ ret = -EPROBE_DEFER;
+ }
+ mutex_unlock(&config_mutex);
+
+ return ret;
+}
+
+static struct platform_driver kgdboc_platform_driver = {
+ .probe = kgdboc_probe,
+ .driver = {
+ .name = "kgdboc",
+ .suppress_bind_attrs = true,
+ },
+};
+
static int __init init_kgdboc(void)
{
- /* Already configured? */
- if (configured == 1)
+ int ret;
+
+ /*
+ * kgdboc is a little bit of an odd "platform_driver". It can be
+ * up and running long before the platform_driver object is
+ * created and thus doesn't actually store anything in it. There's
+ * only one instance of kgdb so anything is stored as global state.
+ * The platform_driver is only created so that we can leverage the
+ * kernel's mechanisms (like -EPROBE_DEFER) to call us when our
+ * underlying tty is ready. Here we init our platform driver and
+ * then create the single kgdboc instance.
+ */
+ ret = platform_driver_register(&kgdboc_platform_driver);
+ if (ret)
+ return ret;
+
+ kgdboc_pdev = platform_device_alloc("kgdboc", PLATFORM_DEVID_NONE);
+ if (!kgdboc_pdev) {
+ ret = -ENOMEM;
+ goto err_did_register;
+ }
+
+ ret = platform_device_add(kgdboc_pdev);
+ if (!ret)
return 0;
- return configure_kgdboc();
+ platform_device_put(kgdboc_pdev);
+
+err_did_register:
+ platform_driver_unregister(&kgdboc_platform_driver);
+ return ret;
+}
+
+static void exit_kgdboc(void)
+{
+ mutex_lock(&config_mutex);
+ cleanup_kgdboc();
+ mutex_unlock(&config_mutex);
+
+ platform_device_unregister(kgdboc_pdev);
+ platform_driver_unregister(&kgdboc_platform_driver);
}
static int kgdboc_get_char(void)
@@ -234,24 +299,20 @@ static int param_set_kgdboc_var(const char *kmessage,
const struct kernel_param *kp)
{
size_t len = strlen(kmessage);
+ int ret = 0;
if (len >= MAX_CONFIG_LEN) {
pr_err("config string too long\n");
return -ENOSPC;
}
- /* Only copy in the string if the init function has not run yet */
- if (configured < 0) {
- strcpy(config, kmessage);
- return 0;
- }
-
if (kgdb_connected) {
pr_err("Cannot reconfigure while KGDB is connected.\n");
-
return -EBUSY;
}
+ mutex_lock(&config_mutex);
+
strcpy(config, kmessage);
/* Chop out \n char as a result of echo */
if (len && config[len - 1] == '\n')
@@ -260,8 +321,30 @@ static int param_set_kgdboc_var(const char *kmessage,
if (configured == 1)
cleanup_kgdboc();
- /* Go and configure with the new params. */
- return configure_kgdboc();
+ /*
+ * Configure with the new params as long as init already ran.
+ * Note that we can get called before init if someone loads us
+ * with "modprobe kgdboc kgdboc=..." or if they happen to use the
+ * the odd syntax of "kgdboc.kgdboc=..." on the kernel command.
+ */
+ if (configured >= 0)
+ ret = configure_kgdboc();
+
+ /*
+ * If we couldn't configure then clear out the config. Note that
+ * specifying an invalid config on the kernel command line vs.
+ * through sysfs have slightly different behaviors. If we fail
+ * to configure what was specified on the kernel command line
+ * we'll leave it in the 'config' and return -EPROBE_DEFER from
+ * our probe. When specified through sysfs userspace is
+ * responsible for loading the tty driver before setting up.
+ */
+ if (ret)
+ config[0] = '\0';
+
+ mutex_unlock(&config_mutex);
+
+ return ret;
}
static int dbg_restore_graphics;
@@ -320,15 +403,8 @@ __setup("kgdboc=", kgdboc_option_setup);
/* This is only available if kgdboc is a built in for early debugging */
static int __init kgdboc_early_init(char *opt)
{
- /* save the first character of the config string because the
- * init routine can destroy it.
- */
- char save_ch;
-
kgdboc_option_setup(opt);
- save_ch = config[0];
- init_kgdboc();
- config[0] = save_ch;
+ configure_kgdboc();
return 0;
}
@@ -336,7 +412,7 @@ early_param("ekgdboc", kgdboc_early_init);
#endif /* CONFIG_KGDB_SERIAL_CONSOLE */
module_init(init_kgdboc);
-module_exit(cleanup_kgdboc);
+module_exit(exit_kgdboc);
module_param_call(kgdboc, param_set_kgdboc_var, param_get_string, &kps, 0644);
MODULE_PARM_DESC(kgdboc, "<serial_device>[,baud]");
MODULE_DESCRIPTION("KGDB Console TTY Driver");
--
2.26.2.303.gf8c07b1a785-goog
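The deferral behaviour that the patch above relies on can be modelled in a few lines of user-space C. This is a sketch, not the driver core: model_boot() is a hypothetical loop standing in for the kernel retrying deferred probes, and MODEL_EPROBE_DEFER is defined locally because the kernel-internal EPROBE_DEFER value (517) is not exported through the user-space errno.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Local copy of the kernel-internal errno value, for this model only. */
#define MODEL_EPROBE_DEFER 517

static bool tty_driver_present;
static int configured;	/* 0 = unconfigured, 1 = configured */

/* Stand-in for configure_kgdboc(): fails until the tty driver shows up. */
static int model_configure(void)
{
	if (!tty_driver_present)
		return -ENODEV;
	configured = 1;
	return 0;
}

/* Same shape as kgdboc_probe(): turn "no device" into "try again later". */
static int model_probe(void)
{
	int ret = 0;

	if (configured != 1) {
		ret = model_configure();
		if (ret == -ENODEV)
			ret = -MODEL_EPROBE_DEFER;
	}

	return ret;
}

/*
 * Hypothetical boot loop: re-run the probe each round while it defers.
 * The tty driver appears at round tty_ready_at. Returns the round in which
 * the probe stopped deferring, or -1 if it never did.
 */
static int model_boot(int tty_ready_at, int max_rounds)
{
	int round;

	for (round = 0; round < max_rounds; round++) {
		if (round == tty_ready_at)
			tty_driver_present = true;
		if (model_probe() != -MODEL_EPROBE_DEFER)
			return round;
	}

	return -1;
}
```

With the old behaviour the first failed attempt would have been final. In the model, the probe simply succeeds in whichever round the tty driver becomes available.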
On Tue, Apr 28, 2020 at 02:13:44PM -0700, Douglas Anderson wrote:
> Using kgdb requires at least some level of architecture-level
> initialization. If nothing else, it relies on the architecture to
> pass breakpoints / crashes onto kgdb.
>
> On some architectures this all works super early, specifically it
> starts working at some point in time before Linux parses
> early_param's. On other architectures it doesn't. A survey of a few
> platforms:
>
> a) x86: Presumably it all works early since "ekgdboc" is documented to
> work here.
> b) arm64: Catching crashes works; with a simple patch breakpoints can
> also be made to work.
> c) arm: Nothing in kgdb works until
> paging_init() -> devicemaps_init() -> early_trap_init()
>
> Let's be conservative and, by default, process "kgdbwait" (which tells
> the kernel to drop into the debugger ASAP at boot) a bit later at
> dbg_late_init() time. If an architecture has tested it and wants to
> re-enable super early debugging, they can select the
> ARCH_HAS_EARLY_DEBUG KConfig option. We'll do this for x86 to start.
> It should be noted that dbg_late_init() is still called quite early in
> the system.
>
> Note that this patch doesn't affect when kgdb runs its init. If kgdb
> is set to initialize early it will still initialize when parsing
> early_param's. This patch _only_ inhibits the initial breakpoint from
> "kgdbwait". This means:
>
> * Without any extra patches arm64 platforms will at least catch
> crashes after kgdb inits.
> * arm platforms will catch crashes (and could handle a hardcoded
> kgdb_breakpoint()) any time after early_trap_init() runs, even
> before dbg_late_init().
>
> Signed-off-by: Douglas Anderson <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Reviewed-by: Greg Kroah-Hartman <[email protected]>
It looks like this patch is triggering some warnings from the existing
defconfigs (both x86 and arm64). It looks like this:
---
wychelm$ make defconfig
GEN Makefile
*** Default configuration is based on 'x86_64_defconfig'
WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
Depends on [n]: KGDB [=n]
Selected by [y]:
- X86 [=y]
WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
Depends on [n]: KGDB [=n]
Selected by [y]:
- X86 [=y]
#
# No change to .config
#
---
Daniel.
Hi,
On Thu, Apr 30, 2020 at 8:49 AM Daniel Thompson
<[email protected]> wrote:
>
> On Tue, Apr 28, 2020 at 02:13:44PM -0700, Douglas Anderson wrote:
> > Using kgdb requires at least some level of architecture-level
> > initialization. If nothing else, it relies on the architecture to
> > pass breakpoints / crashes onto kgdb.
> >
> > On some architectures this all works super early, specifically it
> > starts working at some point in time before Linux parses
> > early_param's. On other architectures it doesn't. A survey of a few
> > platforms:
> >
> > a) x86: Presumably it all works early since "ekgdboc" is documented to
> > work here.
> > b) arm64: Catching crashes works; with a simple patch breakpoints can
> > also be made to work.
> > c) arm: Nothing in kgdb works until
> > paging_init() -> devicemaps_init() -> early_trap_init()
> >
> > Let's be conservative and, by default, process "kgdbwait" (which tells
> > the kernel to drop into the debugger ASAP at boot) a bit later at
> > dbg_late_init() time. If an architecture has tested it and wants to
> > re-enable super early debugging, they can select the
> > ARCH_HAS_EARLY_DEBUG KConfig option. We'll do this for x86 to start.
> > It should be noted that dbg_late_init() is still called quite early in
> > the system.
> >
> > Note that this patch doesn't affect when kgdb runs its init. If kgdb
> > is set to initialize early it will still initialize when parsing
> > early_param's. This patch _only_ inhibits the initial breakpoint from
> > "kgdbwait". This means:
> >
> > * Without any extra patches arm64 platforms will at least catch
> > crashes after kgdb inits.
> > * arm platforms will catch crashes (and could handle a hardcoded
> > kgdb_breakpoint()) any time after early_trap_init() runs, even
> > before dbg_late_init().
> >
> > Signed-off-by: Douglas Anderson <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Reviewed-by: Greg Kroah-Hartman <[email protected]>
>
> It looks like this patch is triggering some warnings from the existing
> defconfigs (both x86 and arm64). It looks like this:
>
> ---
> wychelm$ make defconfig
> GEN Makefile
> *** Default configuration is based on 'x86_64_defconfig'
>
> WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
> Depends on [n]: KGDB [=n]
> Selected by [y]:
> - X86 [=y]
>
> WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
> Depends on [n]: KGDB [=n]
> Selected by [y]:
> - X86 [=y]
Ah, thanks! I hadn't noticed those. I think it'd be easy to change
the relevant patches to "select ARCH_HAS_EARLY_DEBUG if KGDB". If you
agree that's a good fix and are willing, I'd be happy if you added it
to the relevant patches when applying. If not, I can post a v4.
-Doug
On Tue, Apr 28, 2020 at 02:13:46PM -0700, Douglas Anderson wrote:
> If we detect that we recursively entered the debugger we should hack
> our I/O ops to NULL so that the panic() in the next line won't
> actually cause another recursion into the debugger. The first line of
> kgdb_panic() will check this and return.
>
> Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
On Tue, Apr 28, 2020 at 02:13:47PM -0700, Douglas Anderson wrote:
> We want to enable kgdb to debug the early parts of the kernel.
> Unfortunately kgdb normally is a client of the tty API in the kernel
> and serial drivers don't register to the tty layer until fairly late
> in the boot process.
>
> Serial drivers do, however, commonly register a boot console. Let's
> enable the kgdboc driver to work with boot consoles to provide early
> debugging.
>
> This change co-opts the existing read() function pointer that's part
> of "struct console". It's assumed that if a boot console (with the
> flag CON_BOOT) has implemented read() that both the read() and write()
> function are polling functions. That means they work without
> interrupts and read() will return immediately (with 0 bytes read) if
> there's nothing to read. This should be a safe assumption since it
> appears that no current boot consoles implement read() right now and
> there seems no reason to do so unless they wanted to support
> "kgdboc_earlycon".
>
> The console API isn't really intended to have clients work with it
> like we're doing. Specifically there doesn't appear to be any way for
> clients to be notified about a boot console being unregistered. We'll
> work around this by checking that our console is still valid before
> using it. We'll also try to transition off of the boot console and
> onto the "tty" API as quickly as possible.
>
> The normal/expected way to make all this work is to use
> "kgdboc_earlycon" and "kgdboc" together. You should point them both
> to the same physical serial connection. At boot time, as the system
> transitions from the boot console to the normal console, kgdb will
> switch over. If you don't use things in the normal/expected way it's
> a bit of a buyer-beware situation. Things thought about:
>
> - If you specify only "kgdboc_earlycon" but not "kgdboc" and the boot
> console vanishes at a weird time we'll panic if someone tries to
> drop into kgdb.
> - If you use "keep_bootcon" (which is already a bit of a buyer-beware
> option) and specify "kgdboc_earlycon" but not "kgdboc" we'll keep
> trying to use your boot console for kgdb.
> - If your "kgdboc_earlycon" and "kgdboc" devices are not the same
> device things should work OK, but it'll be your job to switch over
> which device you're monitoring (including figuring out how to switch
> over gdb in-flight if you're using it).
As mentioned in other threads, if we are changing the way we manage the
lifetime of the consoles I think it would be good to squash that change
down and simplify some of these cases.
> When trying to enable "kgdboc_earlycon" it should be noted that the
> names that are registered through the boot console layer and the tty
> layer are not the same for the same port. For example when debugging
> on one board I'd need to pass "kgdboc_earlycon=qcom_geni
> kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
> console name is a pain and there will rarely be more than one boot
> console enabled, you can provide the "kgdboc_earlycon" parameter
> without specifying the name of the boot console. In this case we'll
> just pick the first boot console that implements read() that we find.
>
> This new "kgdboc_earlycon" parameter should be contrasted to the
> existing "ekgdboc" parameter. While both provide a way to debug very
> early, the usage and mechanisms are quite different. Specifically
> "kgdboc_earlycon" is meant to be used in tandem with "kgdboc" and
> there is a transition from one to the other. The "ekgdboc" parameter,
> on the other hand, replaces the "kgdboc" parameter. It runs the same
> logic as the "kgdboc" parameter but just relies on your TTY driver
> being present super early. The only known usage of the old "ekgdboc"
> parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
> be noted that "kbd" has special treatment allowing it to init early as
> a tty device.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> Reviewed-by: Greg Kroah-Hartman <[email protected]>
> Tested-by: Sumit Garg <[email protected]>
> ---
> I have kept Greg's Reviewed-by and Sumit's Tested-by tags on this
> commit despite changes that aren't totally trivial. Please yell if
> you disagree with this. Reasons:
> - Greg's Reviewed-by seemed more an overall acknowledgment that the
> series wasn't totally insane rather than a detailed review. I don't
> think the changes from v2 to v3 change that.
> - Sumit's Tested-by seemed useful as confirmation that someone else
> made this work on a machine that wasn't mine. I don't believe that
> the changes from v2 to v3 should affect anything here.
>
> Changes in v3:
> - Add deinit() to I/O ops to know a driver can be replaced.
> - Don't just neuter input, panic if earlycon vanishes.
> - No extra param to kgdb_register_io_module().
> - Renamed earlycon_kgdboc to kgdboc_earlycon.
> - Simplify earlycon_kgdb deinit by using the deinit() function.
>
> Changes in v2:
> - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
>
> drivers/tty/serial/kgdboc.c | 136 ++++++++++++++++++++++++++++++++++++
> include/linux/kgdb.h | 4 ++
> kernel/debug/debug_core.c | 23 ++++--
> 3 files changed, 159 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> index 519d8cfbfbed..7aca0a67fc0b 100644
> --- a/drivers/tty/serial/kgdboc.c
> +++ b/drivers/tty/serial/kgdboc.c
> @@ -21,6 +21,7 @@
> #include <linux/input.h>
> #include <linux/module.h>
> #include <linux/platform_device.h>
> +#include <linux/serial_core.h>
>
> #define MAX_CONFIG_LEN 40
>
> @@ -42,6 +43,13 @@ static int kgdb_tty_line;
>
> static struct platform_device *kgdboc_pdev;
>
> +#ifdef CONFIG_KGDB_SERIAL_CONSOLE
Isn't this always set for this file (see Makefile)?
I think all the instances of this check (and the diligent
#else clauses) are redundant.
> +static struct kgdb_io kgdboc_earlycon_io_ops;
> +struct console *earlycon;
static?
> <snip>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index b072aeb1fd78..77a3c519478a 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -1075,15 +1075,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
> */
> int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> {
> + struct kgdb_io *old_dbg_io_ops;
> int err;
>
> spin_lock(&kgdb_registration_lock);
>
> - if (dbg_io_ops) {
> - spin_unlock(&kgdb_registration_lock);
> + old_dbg_io_ops = dbg_io_ops;
> + if (old_dbg_io_ops) {
> + if (!old_dbg_io_ops->deinit) {
> + spin_unlock(&kgdb_registration_lock);
>
> - pr_err("Another I/O driver is already registered with KGDB\n");
> - return -EBUSY;
> + pr_err("KGDB I/O driver %s can't replace %s.\n",
> + new_dbg_io_ops->name, old_dbg_io_ops->name);
> + return -EBUSY;
> + }
> + old_dbg_io_ops->deinit();
> }
>
> if (new_dbg_io_ops->init) {
> @@ -1098,6 +1104,12 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
>
> spin_unlock(&kgdb_registration_lock);
>
> + if (old_dbg_io_ops) {
> + pr_info("Replaced I/O driver %s with %s\n",
> + old_dbg_io_ops->name, new_dbg_io_ops->name);
I know this causes no trouble with the current deinit() method, but
I'd be more comfortable if the core printed this before calling deinit().
Daniel.
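For readers following the deinit() discussion, here is a user-space sketch of the replacement logic with the message printed before deinit() is called, as suggested in the review above. It is a model under assumptions: struct model_io, model_register_io() and the two driver instances are hypothetical, locking is omitted, and the message wording is illustrative.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced, hypothetical stand-in for struct kgdb_io. */
struct model_io {
	const char *name;
	void (*deinit)(void);	/* non-NULL means "I can be replaced" */
};

static struct model_io *cur_io;
static int earlycon_deinit_calls;

static void model_earlycon_deinit(void)
{
	earlycon_deinit_calls++;
}

/* Two example drivers: a replaceable early one and a permanent one. */
static struct model_io model_earlycon = { "earlycon", model_earlycon_deinit };
static struct model_io model_tty = { "tty", NULL };

/* Registers new_io, replacing the current driver only if it has deinit(). */
static int model_register_io(struct model_io *new_io)
{
	struct model_io *old_io = cur_io;

	if (old_io) {
		if (!old_io->deinit) {
			fprintf(stderr, "I/O driver %s can't replace %s.\n",
				new_io->name, old_io->name);
			return -EBUSY;
		}
		/* Log first, so the message still goes out via the old driver. */
		printf("Replacing I/O driver %s with %s\n",
		       old_io->name, new_io->name);
		old_io->deinit();
	}

	cur_io = new_io;
	return 0;
}
```

Registering model_earlycon and then model_tty succeeds and calls the early driver's deinit() once. A later attempt to register over model_tty fails with -EBUSY because it provides no deinit().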
On Tue, Apr 28, 2020 at 02:13:48PM -0700, Douglas Anderson wrote:
> The recent patch ("kgdboc: Add kgdboc_earlycon to support early kgdb
> using boot consoles") adds a new kernel command line parameter.
> Document it.
>
> Note that the patch adding the feature does some comparing/contrasting
> of "kgdboc_earlycon" vs. the existing "ekgdboc". See that patch for
> more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
> and just makes "kgdboc" do its normal initialization early (only works
> if your tty driver is already ready). The new "kgdboc_earlycon" works
> in combination with "kgdboc" and is backed by boot consoles.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> Reviewed-by: Greg Kroah-Hartman <[email protected]>
> ---
>
> Changes in v3:
> - Added example in kgdb.rst
> - Removed unneeded sentence in kernel-parameters doc.
> - Renamed earlycon_kgdboc to kgdboc_earlycon.
> - Suggest people use kgdboc_earlycon instead of ekgdboc.
>
> Changes in v2: None
>
> .../admin-guide/kernel-parameters.txt | 20 ++++++++++++++++
> Documentation/dev-tools/kgdb.rst | 24 +++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 7bc83f3d9bdf..3b5ae06a98aa 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1190,6 +1190,11 @@
> This is designed to be used in conjunction with
> the boot argument: earlyprintk=vga
>
> + This parameter works in place of the kgdboc parameter
> + but can only be used if the backing tty is available
> + very early in the boot process. For early debugging
> + via a serial port see kgdboc_earlycon instead.
> +
> edd= [EDD]
> Format: {"off" | "on" | "skip[mbr]"}
>
> @@ -2105,6 +2110,21 @@
> kms, kbd format: kms,kbd
> kms, kbd and serial format: kms,kbd,<ser_dev>[,baud]
>
> + kgdboc_earlycon= [KGDB,HW]
> + If the boot console provides the ability to read
> + characters and can work in polling mode, you can use
> + this parameter to tell kgdb to use it as a backend
> + until the normal console is registered. Intended to
> + be used together with the kgdboc parameter which
> + specifies the normal console to transition to.
> +
> + The the name of the early console should be specified
s/The the/The/
Other than that:
Reviewed-by: Daniel Thompson <[email protected]>
On Thu, Apr 30, 2020 at 09:35:30AM -0700, Doug Anderson wrote:
> Hi,
>
> On Thu, Apr 30, 2020 at 8:49 AM Daniel Thompson
> <[email protected]> wrote:
> >
> > On Tue, Apr 28, 2020 at 02:13:44PM -0700, Douglas Anderson wrote:
> > > Using kgdb requires at least some level of architecture-level
> > > initialization. If nothing else, it relies on the architecture to
> > > pass breakpoints / crashes onto kgdb.
> > >
> > > On some architectures this all works super early, specifically it
> > > starts working at some point in time before Linux parses
> > > early_param's. On other architectures it doesn't. A survey of a few
> > > platforms:
> > >
> > > a) x86: Presumably it all works early since "ekgdboc" is documented to
> > > work here.
> > > b) arm64: Catching crashes works; with a simple patch breakpoints can
> > > also be made to work.
> > > c) arm: Nothing in kgdb works until
> > > paging_init() -> devicemaps_init() -> early_trap_init()
> > >
> > > Let's be conservative and, by default, process "kgdbwait" (which tells
> > > the kernel to drop into the debugger ASAP at boot) a bit later at
> > > dbg_late_init() time. If an architecture has tested it and wants to
> > > re-enable super early debugging, they can select the
> > > ARCH_HAS_EARLY_DEBUG KConfig option. We'll do this for x86 to start.
> > > It should be noted that dbg_late_init() is still called quite early in
> > > the system.
> > >
> > > Note that this patch doesn't affect when kgdb runs its init. If kgdb
> > > is set to initialize early it will still initialize when parsing
> > > early_param's. This patch _only_ inhibits the initial breakpoint from
> > > "kgdbwait". This means:
> > >
> > > * Without any extra patches arm64 platforms will at least catch
> > > crashes after kgdb inits.
> > > * arm platforms will catch crashes (and could handle a hardcoded
> > > kgdb_breakpoint()) any time after early_trap_init() runs, even
> > > before dbg_late_init().
> > >
> > > Signed-off-by: Douglas Anderson <[email protected]>
> > > Cc: Thomas Gleixner <[email protected]>
> > > Cc: Ingo Molnar <[email protected]>
> > > Cc: Borislav Petkov <[email protected]>
> > > Reviewed-by: Greg Kroah-Hartman <[email protected]>
> >
> > It looks like this patch is triggering some warnings from the existing
> > defconfigs (both x86 and arm64). It looks like this:
> >
> > ---
> > wychelm$ make defconfig
> > GEN Makefile
> > *** Default configuration is based on 'x86_64_defconfig'
> >
> > WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
> > Depends on [n]: KGDB [=n]
> > Selected by [y]:
> > - X86 [=y]
> >
> > WARNING: unmet direct dependencies detected for ARCH_HAS_EARLY_DEBUG
> > Depends on [n]: KGDB [=n]
> > Selected by [y]:
> > - X86 [=y]
>
> Ah, thanks! I hadn't noticed those. I think it'd be easy to just
> change the relevant patches to just "select ARCH_HAS_EARLY_DEBUG if
> KGDB". If you agree that's a good fix and are willing, I'd be happy
> if you just added it to the relevant patches when applying. If not, I
> can post a v4.
Happy with the approach to fix this.
Given the follow-on discussion from the end of last week I suspect there
probably needs to be a v4 anyway, so perhaps the last question, whether to
apply a fixup, is moot at this point?
Daniel.
Hi Doug,
On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> index 48222a4760c2..59c353dfc8e9 100644
> --- a/arch/arm64/kernel/debug-monitors.c
> +++ b/arch/arm64/kernel/debug-monitors.c
> @@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
> unregister_debug_hook(&hook->node);
> }
>
> -static int call_break_hook(struct pt_regs *regs, unsigned int esr)
> +int call_break_hook(struct pt_regs *regs, unsigned int esr)
> {
> struct break_hook *hook;
> struct list_head *list;
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index cf402be5c573..a8173f0c1774 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> #endif
> + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> + return 0;
I think this just means we're not running debug_traps_init() early enough,
and actually the KASAN early handler is unnecessary too.
If we call debug_traps_init() directly from setup_arch() and drop the
arch_initcall(), can we then drop early_brk64 entirely?
Will
Hi,
On Mon, May 11, 2020 at 7:59 AM Will Deacon <[email protected]> wrote:
>
> Hi Doug,
>
> On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> > diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> > index 48222a4760c2..59c353dfc8e9 100644
> > --- a/arch/arm64/kernel/debug-monitors.c
> > +++ b/arch/arm64/kernel/debug-monitors.c
> > @@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
> > unregister_debug_hook(&hook->node);
> > }
> >
> > -static int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > +int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > {
> > struct break_hook *hook;
> > struct list_head *list;
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index cf402be5c573..a8173f0c1774 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> > if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> > return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> > #endif
> > + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> > + return 0;
>
> I think this just means we're not running debug_traps_init() early enough,
> and actually the KASAN early handler is unnecessary too.
>
> If we call debug_traps_init() directly from setup_arch() and drop the
> arch_initcall(), can we then drop early_brk64 entirely?
It seems to work in my testing. ...but the worry I have is the
comment right before trap_init(). It says:
/* This registration must happen early, before debug_traps_init(). */
By moving debug_traps_init() early we're violating that comment. Do I
just remove that comment, or was there a good reason for it? ...or am
I reading it wrong and I should have read it as if it said:
/* NOTE: this registration happens early, before debug_traps_init(). */
...then removing it is fine. Maybe that's right?
I coded this up and put it on the Chrome OS gerrit at
<https://crrev.com/c/2195061>. I'm happy to post this on the list as
a standalone patch to replace this one or spin the whole series depending
on what people want.
-Doug
On Mon, May 11, 2020 at 03:45:02PM -0700, Doug Anderson wrote:
> On Mon, May 11, 2020 at 7:59 AM Will Deacon <[email protected]> wrote:
> > On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> > > diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> > > index 48222a4760c2..59c353dfc8e9 100644
> > > --- a/arch/arm64/kernel/debug-monitors.c
> > > +++ b/arch/arm64/kernel/debug-monitors.c
> > > @@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
> > > unregister_debug_hook(&hook->node);
> > > }
> > >
> > > -static int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > > +int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > > {
> > > struct break_hook *hook;
> > > struct list_head *list;
> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index cf402be5c573..a8173f0c1774 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> > > if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> > > return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> > > #endif
> > > + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> > > + return 0;
> >
> > I think this just means we're not running debug_traps_init() early enough,
> > and actually the KASAN early handler is unnecessary too.
> >
> > If we call debug_traps_init() directly from setup_arch() and drop the
> > arch_initcall(), can we then drop early_brk64 entirely?
>
> It seems to work in my testing. ...but the worry I have is the
> comment right before trap_init(). It says:
>
> /* This registration must happen early, before debug_traps_init(). */
I /think/ the reason for this is because debug_traps_init() replaces the
BRK vector, so if that runs before the break hooks have been registered
for e.g. BUG() then BUG() won't work during that window. Hmm, so dropping
early_brk64 is problematic after all. Damn.
Is trap_init() early enough for you? If so, we could call debug_traps_init()
from trap_init() after registering the break hooks.
Will
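
A small user-space sketch may help picture the window described above.
This is not kernel code: brk_vector, early_brk(), hooked_brk() and the
model_*() helpers are hypothetical names, and the model assumes (as the
message above only tentatively suggests) that debug_traps_init() swaps
out the early BRK handler for one that consults the registered break
hooks. It checks whether BUG() stays handled after each init step, for
both orderings.

```c
#include <stdbool.h>

/* Hypothetical BRK immediates, for illustration only. */
enum brk_imm { BRK_BUG = 1, BRK_KGDB = 2 };

#define MAX_HOOKS 4
static int hooks[MAX_HOOKS];
static int nr_hooks;

/* Stand-in for the early handler: recognises BUG() without any hook list. */
static bool early_brk(int imm)
{
	return imm == BRK_BUG;
}

/* Stand-in for the later handler: only handles registered hooks. */
static bool hooked_brk(int imm)
{
	for (int i = 0; i < nr_hooks; i++)
		if (hooks[i] == imm)
			return true;
	return false;
}

/* The "vector": which handler a BRK currently reaches. */
static bool (*brk_vector)(int) = early_brk;

static void model_reset(void)
{
	nr_hooks = 0;
	brk_vector = early_brk;
}

static void model_register_hook(int imm)
{
	if (nr_hooks < MAX_HOOKS)
		hooks[nr_hooks++] = imm;
}

/* Assumed effect of debug_traps_init(): replace the early handler. */
static void model_debug_traps_init(void)
{
	brk_vector = hooked_brk;
}

static bool model_brk(int imm)
{
	return brk_vector(imm);
}

/* Run both init steps in the given order; true if BUG() worked throughout. */
static bool bug_survives_boot(bool hooks_first)
{
	bool ok;

	model_reset();
	ok = model_brk(BRK_BUG);	/* before any init: early handler */

	if (hooks_first) {
		model_register_hook(BRK_BUG);
		ok = ok && model_brk(BRK_BUG);
		model_debug_traps_init();
		ok = ok && model_brk(BRK_BUG);
	} else {
		model_debug_traps_init();
		ok = ok && model_brk(BRK_BUG);	/* window: no hook yet */
		model_register_hook(BRK_BUG);
		ok = ok && model_brk(BRK_BUG);
	}
	return ok;
}
```

In this model bug_survives_boot(true) holds and bug_survives_boot(false)
does not, which is the ordering constraint the comment before
trap_init() appears to be protecting.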
Hi,
On Tue, May 12, 2020 at 12:36 AM Will Deacon <[email protected]> wrote:
>
> On Mon, May 11, 2020 at 03:45:02PM -0700, Doug Anderson wrote:
> > On Mon, May 11, 2020 at 7:59 AM Will Deacon <[email protected]> wrote:
> > > On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> > > > diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> > > > index 48222a4760c2..59c353dfc8e9 100644
> > > > --- a/arch/arm64/kernel/debug-monitors.c
> > > > +++ b/arch/arm64/kernel/debug-monitors.c
> > > > @@ -297,7 +297,7 @@ void unregister_kernel_break_hook(struct break_hook *hook)
> > > > unregister_debug_hook(&hook->node);
> > > > }
> > > >
> > > > -static int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > > > +int call_break_hook(struct pt_regs *regs, unsigned int esr)
> > > > {
> > > > struct break_hook *hook;
> > > > struct list_head *list;
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index cf402be5c573..a8173f0c1774 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> > > > if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> > > > return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> > > > #endif
> > > > + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> > > > + return 0;
> > >
> > > I think this just means we're not running debug_traps_init() early enough,
> > > and actually the KASAN early handler is unnecessary too.
> > >
> > > If we call debug_traps_init() directly from setup_arch() and drop the
> > > arch_initcall(), can we then drop early_brk64 entirely?
> >
> > It seems to work in my testing. ...but the worry I have is the
> > comment right before trap_init(). It says:
> >
> > /* This registration must happen early, before debug_traps_init(). */
>
> I /think/ the reason for this is because debug_traps_init() replaces the
> BRK vector, so if that runs before the break hooks have been registered
> for e.g. BUG() then BUG() won't work during that window. Hmm, so dropping
> early_brk64 is problematic after all. Damn.
>
> Is trap_init() early enough for you? If so, we could call debug_traps_init()
> from trap_init() after registering the break hooks.
"Early enough" is a subjective term, of course. The earlier we can
init, the earlier we can drop into the debugger. ...but, of course,
everyone thinks their feature is the most important and should be
first, so let's see...
Certainly if we waited until trap_init() it wouldn't be early enough
to set "ARCH_HAS_EARLY_DEBUG". Setting that means that debugging is
ready when early params are parsed and those happen at the start of
setup_arch(). The call to trap_init() happens a bit later.
If we decide that we just don't care about getting
"ARCH_HAS_EARLY_DEBUG" to work then the earliest we'll be able to
break into the debugger (via kgdbwait) is dbg_late_init(). That
_does_ happen after trap_init() so your solution would work.
As a person who spends most of his time in driver land, it wouldn't be
the end of the world to wait for dbg_late_init(). That's still much
earlier than most code I'd ever debug. ...and, bonus points is that
if we hit a crash any time after earlyparams we _will_ still drop into
the debugger. It's only breakpoints that won't be available until
dbg_late_init().
tl;dr:
* If we care about "kgdbwait" and breakpoints working as early as
possible then we need my patch.
* If we are OK w/ a slightly later "kgdbwait" then I think we can move
debug_traps_init() to trap_init() and get rid of the early version.
Please let me know which way you'd like to proceed.
-Doug
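
The trade-off in the tl;dr can be written down as a tiny ordering model.
This is only a sketch under the assumptions stated in the message above:
early params are parsed first, trap_init() runs later, and
dbg_late_init() runs after that. The enum and both helper functions are
hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Boot milestones in the order described in the thread. */
enum boot_stage {
	STAGE_EARLY_PARAMS,	/* start of setup_arch() */
	STAGE_TRAP_INIT,
	STAGE_DBG_LATE_INIT,
};

/* When the initial "kgdbwait" breakpoint is taken. */
static enum boot_stage kgdbwait_stage(bool arch_has_early_debug)
{
	return arch_has_early_debug ? STAGE_EARLY_PARAMS
				    : STAGE_DBG_LATE_INIT;
}

/*
 * True if a breakpoint handler installed at brk_handler_stage is in
 * place by the time "kgdbwait" wants to use it.
 */
static bool brk_ready_for_kgdbwait(enum boot_stage brk_handler_stage,
				   bool arch_has_early_debug)
{
	return brk_handler_stage <= kgdbwait_stage(arch_has_early_debug);
}
```

With the handler installed at trap_init(), the check passes only when
ARCH_HAS_EARLY_DEBUG is not selected, matching the second bullet of the
tl;dr.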
Hey Doug,
On Tue, May 12, 2020 at 08:27:50AM -0700, Doug Anderson wrote:
> On Tue, May 12, 2020 at 12:36 AM Will Deacon <[email protected]> wrote:
> > On Mon, May 11, 2020 at 03:45:02PM -0700, Doug Anderson wrote:
> > > On Mon, May 11, 2020 at 7:59 AM Will Deacon <[email protected]> wrote:
> > > > On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > > index cf402be5c573..a8173f0c1774 100644
> > > > > --- a/arch/arm64/kernel/traps.c
> > > > > +++ b/arch/arm64/kernel/traps.c
> > > > > @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> > > > > if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> > > > > return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> > > > > #endif
> > > > > + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> > > > > + return 0;
> > > >
> > > > I think this just means we're not running debug_traps_init() early enough,
> > > > and actually the KASAN early handler is unnecessary too.
> > > >
> > > > If we call debug_traps_init() directly from setup_arch() and drop the
> > > > arch_initcall(), can we then drop early_brk64 entirely?
> > >
> > > It seems to work in my testing. ...but the worry I have is the
> > > comment right before trap_init(). It says:
> > >
> > > /* This registration must happen early, before debug_traps_init(). */
> >
> > I /think/ the reason for this is because debug_traps_init() replaces the
> > BRK vector, so if that runs before the break hooks have been registered
> > for e.g. BUG() then BUG() won't work during that window. Hmm, so dropping
> > early_brk64 is problematic after all. Damn.
> >
> > Is trap_init() early enough for you? If so, we could call debug_traps_init()
> > from trap_init() after registering the break hooks.
>
> "Early enough" is a subjective term, of course. The earlier we can
> init, the earlier we can drop into the debugger. ...but, of course,
> everyone thinks their feature is the most important and should be
> first, so let's see...
>
> Certainly if we waited until trap_init() it wouldn't be early enough
> to set "ARCH_HAS_EARLY_DEBUG". Setting that means that debugging is
> ready when early params are parsed and those happen at the start of
> setup_arch(). The call to trap_init() happens a bit later.
>
> If we decide that we just don't care about getting
> "ARCH_HAS_EARLY_DEBUG" to work then the earliest we'll be able to
> break into the debugger (via kgdbwait) is dbg_late_init(). That
> _does_ happen after trap_init() so your solution would work.
>
> As a person who spends most of his time in driver land, it wouldn't be
> the end of the world to wait for dbg_late_init(). That's still much
> earlier than most code I'd ever debug. ...and, bonus points is that
> if we hit a crash any time after earlyparams we _will_ still drop into
> the debugger. It's only breakpoints that won't be available until
> dbg_late_init().
>
>
> tl;dr:
>
> * If we care about "kgdbwait" and breakpoints working as early as
> possible then we need my patch.
>
> * If we are OK w/ a slightly later "kgdbwait" then I think we can move
> debug_traps_init() to trap_init() and get rid of the early version.
>
>
> Please let me know which way you'd like to proceed.
Let's go with the trap_init() approach for now, and we can revisit it later
if somebody has a compelling reason to initialise things earlier. However,
I don't think you can remove early_brk64(), as it's needed for BUG() to
work correctly.
Will
Hi,
On Tue, May 12, 2020 at 11:17 PM Will Deacon <[email protected]> wrote:
>
> Hey Doug,
>
> On Tue, May 12, 2020 at 08:27:50AM -0700, Doug Anderson wrote:
> > On Tue, May 12, 2020 at 12:36 AM Will Deacon <[email protected]> wrote:
> > > On Mon, May 11, 2020 at 03:45:02PM -0700, Doug Anderson wrote:
> > > > On Mon, May 11, 2020 at 7:59 AM Will Deacon <[email protected]> wrote:
> > > > > On Tue, Apr 28, 2020 at 02:13:45PM -0700, Douglas Anderson wrote:
> > > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > > > index cf402be5c573..a8173f0c1774 100644
> > > > > > --- a/arch/arm64/kernel/traps.c
> > > > > > +++ b/arch/arm64/kernel/traps.c
> > > > > > @@ -1044,6 +1044,9 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
> > > > > > if ((comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
> > > > > > return kasan_handler(regs, esr) != DBG_HOOK_HANDLED;
> > > > > > #endif
> > > > > > + if (call_break_hook(regs, esr) == DBG_HOOK_HANDLED)
> > > > > > + return 0;
> > > > >
> > > > > I think this just means we're not running debug_traps_init() early enough,
> > > > > and actually the KASAN early handler is unnecessary too.
> > > > >
> > > > > If we call debug_traps_init() directly from setup_arch() and drop the
> > > > > arch_initcall(), can we then drop early_brk64 entirely?
> > > >
> > > > It seems to work in my testing. ...but the worry I have is the
> > > > comment right before trap_init(). It says:
> > > >
> > > > /* This registration must happen early, before debug_traps_init(). */
> > >
> > > I /think/ the reason for this is because debug_traps_init() replaces the
> > > BRK vector, so if that runs before the break hooks have been registered
> > > for e.g. BUG() then BUG() won't work during that window. Hmm, so dropping
> > > early_brk64 is problematic after all. Damn.
> > >
> > > Is trap_init() early enough for you? If so, we could call debug_traps_init()
> > > from trap_init() after registering the break hooks.
> >
> > "Early enough" is a subjective term, of course. The earlier we can
> > init, the earlier we can drop into the debugger. ...but, of course,
> > everyone thinks their feature is the most important and should be
> > first, so let's see...
> >
> > Certainly if we waited until trap_init() it wouldn't be early enough
> > to set "ARCH_HAS_EARLY_DEBUG". Setting that means that debugging is
> > ready when early params are parsed and those happen at the start of
> > setup_arch(). The call to trap_init() happens a bit later.
> >
> > If we decide that we just don't care about getting
> > "ARCH_HAS_EARLY_DEBUG" to work then the earliest we'll be able to
> > break into the debugger (via kgdbwait) is dbg_late_init(). That
> > _does_ happen after trap_init() so your solution would work.
> >
> > As a person who spends most of his time in driver land, it wouldn't be
> > the end of the world to wait for dbg_late_init(). That's still much
> > earlier than most code I'd ever debug. ...and, bonus points is that
> > if we hit a crash any time after earlyparams we _will_ still drop into
> > the debugger. It's only breakpoints that won't be available until
> > dbg_late_init().
> >
> >
> > tl;dr:
> >
> > * If we care about "kgdbwait" and breakpoints working as early as
> > possible then we need my patch.
> >
> > * If we are OK w/ a slightly later "kgdbwait" then I think we can move
> > debug_traps_init() to trap_init() and get rid of the early version.
> >
> >
> > Please let me know which way you'd like to proceed.
>
> Let's go with the trap_init() approach for now, and we can revisit it later
> if somebody has a compelling reason to initialise things earlier. However,
> I don't think you can remove early_brk64(), as it's needed for BUG() to
> work correctly.
Posted at:
https://lore.kernel.org/r/20200513160501.1.I0b5edf030cc6ebef6ab4829f8867cdaea42485d8@changeid
I'll also reply to the v4 version of this patch to point at it.
-Doug