This patch-set is aimed to improve console handling especially when kdb
operates in NMI context.
Brief description of enhancements:
- Add status check for console prior to invoking corresponding handler.
- Fixup to avoid possible deadlock in NMI context due to usage of locks
in the console handlers.
- Prefer usage of polling I/O driver mode (lockless APIs) over invocation
of console handlers.
Changes in v4:
- Use dbg_io_ops->write_char() directly instead of passing it as a
function pointer.
- Use "struct console" rather than "struct tty_driver" for comparison.
- Make commit descriptions and comments more informative.
- Add review tag for patch #2.
Changes in v3:
- Split patch to have separate patch for console status check.
- New patch to re-factor kdb message emit code.
- New patch to prefer polling I/O over console mode.
- Add code comments to describe usage of oops_in_progress.
Changes in v2:
- Use oops_in_progress directly instead of bust_spinlocks().
Sumit Garg (4):
kdb: Re-factor kdb_printf() message write code
kdb: Check status of console prior to invoking handlers
kdb: Make kdb_printf() console handling more robust
kdb: Switch to use safer dbg_io_ops over console APIs
drivers/tty/serial/kgdb_nmi.c | 2 +-
drivers/tty/serial/kgdboc.c | 4 +--
drivers/usb/early/ehci-dbgp.c | 3 +-
include/linux/kgdb.h | 5 ++-
kernel/debug/kdb/kdb_io.c | 76 ++++++++++++++++++++++++++-----------------
5 files changed, 54 insertions(+), 36 deletions(-)
--
2.7.4
Re-factor kdb_printf() message write code in order to avoid duplication
of code and thereby increase readability.
Signed-off-by: Sumit Garg <[email protected]>
---
kernel/debug/kdb/kdb_io.c | 61 +++++++++++++++++++++++++----------------------
1 file changed, 32 insertions(+), 29 deletions(-)
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 924bc92..e46f33e 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -542,6 +542,33 @@ static int kdb_search_string(char *searched, char *searchfor)
return 0;
}
+static void kdb_io_write(char *cp, int len)
+{
+ if (len == 0)
+ return;
+
+ while (len--) {
+ dbg_io_ops->write_char(*cp);
+ cp++;
+ }
+}
+
+static void kdb_msg_write(char *msg, int msg_len)
+{
+ struct console *c;
+
+ if (msg_len == 0)
+ return;
+
+ if (dbg_io_ops && !dbg_io_ops->is_console)
+ kdb_io_write(msg, msg_len);
+
+ for_each_console(c) {
+ c->write(c, msg, msg_len);
+ touch_nmi_watchdog();
+ }
+}
+
int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
{
int diag;
@@ -553,7 +580,6 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
int this_cpu, old_cpu;
char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
char *moreprompt = "more> ";
- struct console *c;
unsigned long uninitialized_var(flags);
/* Serialize kdb_printf if multiple cpus try to write at once.
@@ -687,22 +713,11 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
*/
retlen = strlen(kdb_buffer);
cp = (char *) printk_skip_headers(kdb_buffer);
- if (!dbg_kdb_mode && kgdb_connected) {
+ if (!dbg_kdb_mode && kgdb_connected)
gdbstub_msg_write(cp, retlen - (cp - kdb_buffer));
- } else {
- if (dbg_io_ops && !dbg_io_ops->is_console) {
- len = retlen - (cp - kdb_buffer);
- cp2 = cp;
- while (len--) {
- dbg_io_ops->write_char(*cp2);
- cp2++;
- }
- }
- for_each_console(c) {
- c->write(c, cp, retlen - (cp - kdb_buffer));
- touch_nmi_watchdog();
- }
- }
+ else
+ kdb_msg_write(cp, retlen - (cp - kdb_buffer));
+
if (logging) {
saved_loglevel = console_loglevel;
console_loglevel = CONSOLE_LOGLEVEL_SILENT;
@@ -751,19 +766,7 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
moreprompt = "more> ";
kdb_input_flush();
-
- if (dbg_io_ops && !dbg_io_ops->is_console) {
- len = strlen(moreprompt);
- cp = moreprompt;
- while (len--) {
- dbg_io_ops->write_char(*cp);
- cp++;
- }
- }
- for_each_console(c) {
- c->write(c, moreprompt, strlen(moreprompt));
- touch_nmi_watchdog();
- }
+ kdb_msg_write(moreprompt, strlen(moreprompt));
if (logging)
printk("%s", moreprompt);
--
2.7.4
Check if a console is enabled prior to invoking corresponding write
handler.
Suggested-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Sumit Garg <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
---
kernel/debug/kdb/kdb_io.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index e46f33e..fad38eb 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -564,6 +564,8 @@ static void kdb_msg_write(char *msg, int msg_len)
kdb_io_write(msg, msg_len);
for_each_console(c) {
+ if (!(c->flags & CON_ENABLED))
+ continue;
c->write(c, msg, msg_len);
touch_nmi_watchdog();
}
--
2.7.4
In kgdb context, calling console handlers aren't safe due to locks used
in those handlers which could in turn lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).
Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.
In order to achieve this we need to reverse the order of preference to
use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
store "struct console" that represents debugger I/O in dbg_io_ops and
while emitting kdb messages, skip console that matches dbg_io_ops
console in order to avoid duplicate messages. After this change,
"is_console" param becomes redundant and hence removed.
Suggested-by: Daniel Thompson <[email protected]>
Signed-off-by: Sumit Garg <[email protected]>
---
drivers/tty/serial/kgdb_nmi.c | 2 +-
drivers/tty/serial/kgdboc.c | 4 ++--
drivers/usb/early/ehci-dbgp.c | 3 ++-
include/linux/kgdb.h | 5 ++---
kernel/debug/kdb/kdb_io.c | 4 +++-
5 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/tty/serial/kgdb_nmi.c b/drivers/tty/serial/kgdb_nmi.c
index 5022447..6004c0c 100644
--- a/drivers/tty/serial/kgdb_nmi.c
+++ b/drivers/tty/serial/kgdb_nmi.c
@@ -50,7 +50,7 @@ static int kgdb_nmi_console_setup(struct console *co, char *options)
* I/O utilities that messages sent to the console will automatically
* be displayed on the dbg_io.
*/
- dbg_io_ops->is_console = true;
+ dbg_io_ops->cons = co;
return 0;
}
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index c9f94fa..c7ffa87 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -153,7 +153,7 @@ static int configure_kgdboc(void)
goto noconfig;
}
- kgdboc_io_ops.is_console = 0;
+ kgdboc_io_ops.cons = NULL;
kgdb_tty_driver = NULL;
kgdboc_use_kms = 0;
@@ -173,7 +173,7 @@ static int configure_kgdboc(void)
int idx;
if (cons->device && cons->device(cons, &idx) == p &&
idx == tty_line) {
- kgdboc_io_ops.is_console = 1;
+ kgdboc_io_ops.cons = cons;
break;
}
}
diff --git a/drivers/usb/early/ehci-dbgp.c b/drivers/usb/early/ehci-dbgp.c
index ea0d531..8c32d1a 100644
--- a/drivers/usb/early/ehci-dbgp.c
+++ b/drivers/usb/early/ehci-dbgp.c
@@ -1058,7 +1058,8 @@ static int __init kgdbdbgp_parse_config(char *str)
kgdbdbgp_wait_time = simple_strtoul(ptr, &ptr, 10);
}
kgdb_register_io_module(&kgdbdbgp_io_ops);
- kgdbdbgp_io_ops.is_console = early_dbgp_console.index != -1;
+ if (early_dbgp_console.index != -1)
+ kgdbdbgp_io_ops.cons = early_dbgp_console;
return 0;
}
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index b072aeb..bc0face3 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -273,8 +273,7 @@ struct kgdb_arch {
* the I/O driver.
* @post_exception: Pointer to a function that will do any cleanup work
* for the I/O driver.
- * @is_console: 1 if the end device is a console 0 if the I/O device is
- * not a console
+ * @cons: valid if the I/O device is a console.
*/
struct kgdb_io {
const char *name;
@@ -284,7 +283,7 @@ struct kgdb_io {
int (*init) (void);
void (*pre_exception) (void);
void (*post_exception) (void);
- int is_console;
+ struct console *cons;
};
extern const struct kgdb_arch arch_kgdb_ops;
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 9e5a40d..5e00bc8 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
if (msg_len == 0)
return;
- if (dbg_io_ops && !dbg_io_ops->is_console)
+ if (dbg_io_ops)
kdb_io_write(msg, msg_len);
for_each_console(c) {
if (!(c->flags & CON_ENABLED))
continue;
+ if (c == dbg_io_ops->cons)
+ continue;
/*
* Set oops_in_progress to encourage the console drivers to
* disregard their internal spin locks: in the current calling
--
2.7.4
While rounding up CPUs via NMIs, its possible that a rounded up CPU
maybe holding a console port lock leading to kgdb master CPU stuck in
a deadlock during invocation of console write operations. A similar
deadlock could also be possible while using synchronous breakpoints.
So in order to avoid such a deadlock, set oops_in_progress to encourage
the console drivers to disregard their internal spin locks: in the
current calling context the risk of deadlock is a bigger problem than
risks due to re-entering the console driver. We operate directly on
oops_in_progress rather than using bust_spinlocks() because the calls
bust_spinlocks() makes on exit are not appropriate for this calling
context.
Suggested-by: Petr Mladek <[email protected]>
Signed-off-by: Sumit Garg <[email protected]>
---
kernel/debug/kdb/kdb_io.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index fad38eb..9e5a40d 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -566,7 +566,18 @@ static void kdb_msg_write(char *msg, int msg_len)
for_each_console(c) {
if (!(c->flags & CON_ENABLED))
continue;
+ /*
+ * Set oops_in_progress to encourage the console drivers to
+ * disregard their internal spin locks: in the current calling
+ * context the risk of deadlock is a bigger problem than risks
+ * due to re-entering the console driver. We operate directly on
+ * oops_in_progress rather than using bust_spinlocks() because
+ * the calls bust_spinlocks() makes on exit are not appropriate
+ * for this calling context.
+ */
+ ++oops_in_progress;
c->write(c, msg, msg_len);
+ --oops_in_progress;
touch_nmi_watchdog();
}
}
--
2.7.4
Hi Sumit,
I love your patch! Yet something to improve:
[auto build test ERROR on tty/tty-testing]
[also build test ERROR on usb/usb-testing v5.7-rc7 next-20200529]
[cannot apply to kgdb/kgdb-next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Sumit-Garg/kdb-Improve-console-handling/20200531-075431
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 2388a096e7865c043e83ece4e26654bd3d1a20d5)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> drivers/usb/early/ehci-dbgp.c:1062:24: error: assigning to 'struct console *' from incompatible type 'struct console'; take the address with &
kgdbdbgp_io_ops.cons = early_dbgp_console;
^ ~~~~~~~~~~~~~~~~~~
&
1 error generated.
vim +1062 drivers/usb/early/ehci-dbgp.c
1046
1047 static int __init kgdbdbgp_parse_config(char *str)
1048 {
1049 char *ptr;
1050
1051 if (!ehci_debug) {
1052 if (early_dbgp_init(str))
1053 return -1;
1054 }
1055 ptr = strchr(str, ',');
1056 if (ptr) {
1057 ptr++;
1058 kgdbdbgp_wait_time = simple_strtoul(ptr, &ptr, 10);
1059 }
1060 kgdb_register_io_module(&kgdbdbgp_io_ops);
1061 if (early_dbgp_console.index != -1)
> 1062 kgdbdbgp_io_ops.cons = early_dbgp_console;
1063
1064 return 0;
1065 }
1066 early_param("kgdbdbgp", kgdbdbgp_parse_config);
1067
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
On Sun, 31 May 2020 at 10:58, kbuild test robot <[email protected]> wrote:
>
> Hi Sumit,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on tty/tty-testing]
> [also build test ERROR on usb/usb-testing v5.7-rc7 next-20200529]
> [cannot apply to kgdb/kgdb-next]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>
> url: https://github.com/0day-ci/linux/commits/Sumit-Garg/kdb-Improve-console-handling/20200531-075431
> base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing
> config: x86_64-allyesconfig (attached as .config)
> compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 2388a096e7865c043e83ece4e26654bd3d1a20d5)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install x86_64 cross compiling tool for clang build
> # apt-get install binutils-x86-64-linux-gnu
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kbuild test robot <[email protected]>
>
> All errors (new ones prefixed by >>, old ones prefixed by <<):
>
> >> drivers/usb/early/ehci-dbgp.c:1062:24: error: assigning to 'struct console *' from incompatible type 'struct console'; take the address with &
> kgdbdbgp_io_ops.cons = early_dbgp_console;
> ^ ~~~~~~~~~~~~~~~~~~
> &
> 1 error generated.
>
Ah, my bad. Will fix it up in the next version.
-Sumit
> vim +1062 drivers/usb/early/ehci-dbgp.c
>
> 1046
> 1047 static int __init kgdbdbgp_parse_config(char *str)
> 1048 {
> 1049 char *ptr;
> 1050
> 1051 if (!ehci_debug) {
> 1052 if (early_dbgp_init(str))
> 1053 return -1;
> 1054 }
> 1055 ptr = strchr(str, ',');
> 1056 if (ptr) {
> 1057 ptr++;
> 1058 kgdbdbgp_wait_time = simple_strtoul(ptr, &ptr, 10);
> 1059 }
> 1060 kgdb_register_io_module(&kgdbdbgp_io_ops);
> 1061 if (early_dbgp_console.index != -1)
> > 1062 kgdbdbgp_io_ops.cons = early_dbgp_console;
> 1063
> 1064 return 0;
> 1065 }
> 1066 early_param("kgdbdbgp", kgdbdbgp_parse_config);
> 1067
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]
On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote:
> In kgdb context, calling console handlers aren't safe due to locks used
> in those handlers which could in turn lead to a deadlock. Although, using
> oops_in_progress increases the chance to bypass locks in most console
> handlers but it might not be sufficient enough in case a console uses
> more locks (VT/TTY is good example).
>
> Currently when a driver provides both polling I/O and a console then kdb
> will output using the console. We can increase robustness by using the
> currently active polling I/O driver (which should be lockless) instead
> of the corresponding console. For several common cases (e.g. an
> embedded system with a single serial port that is used both for console
> output and debugger I/O) this will result in no console handler being
> used.
>
> In order to achieve this we need to reverse the order of preference to
> use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> store "struct console" that represents debugger I/O in dbg_io_ops and
> while emitting kdb messages, skip console that matches dbg_io_ops
> console in order to avoid duplicate messages. After this change,
> "is_console" param becomes redundant and hence removed.
>
> Suggested-by: Daniel Thompson <[email protected]>
> Signed-off-by: Sumit Garg <[email protected]>
Looking good, only one minor comment left on my side (including the
three patches prior).
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 9e5a40d..5e00bc8 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
> if (msg_len == 0)
> return;
>
> - if (dbg_io_ops && !dbg_io_ops->is_console)
> + if (dbg_io_ops)
> kdb_io_write(msg, msg_len);
Since this now slots on so cleanly and there are not multiple calls
to kdb_io_write() then I think perhaps factoring this out into its
own function (in patch 1) is no long necessary. The character write
loop can go directly into this function.
Daniel.
On Tue, 2 Jun 2020 at 19:16, Daniel Thompson <[email protected]> wrote:
>
> On Fri, May 29, 2020 at 04:56:47PM +0530, Sumit Garg wrote:
> > In kgdb context, calling console handlers aren't safe due to locks used
> > in those handlers which could in turn lead to a deadlock. Although, using
> > oops_in_progress increases the chance to bypass locks in most console
> > handlers but it might not be sufficient enough in case a console uses
> > more locks (VT/TTY is good example).
> >
> > Currently when a driver provides both polling I/O and a console then kdb
> > will output using the console. We can increase robustness by using the
> > currently active polling I/O driver (which should be lockless) instead
> > of the corresponding console. For several common cases (e.g. an
> > embedded system with a single serial port that is used both for console
> > output and debugger I/O) this will result in no console handler being
> > used.
> >
> > In order to achieve this we need to reverse the order of preference to
> > use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> > store "struct console" that represents debugger I/O in dbg_io_ops and
> > while emitting kdb messages, skip console that matches dbg_io_ops
> > console in order to avoid duplicate messages. After this change,
> > "is_console" param becomes redundant and hence removed.
> >
> > Suggested-by: Daniel Thompson <[email protected]>
> > Signed-off-by: Sumit Garg <[email protected]>
>
> Looking good, only one minor comment left on my side (including the
> three patches prior).
>
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 9e5a40d..5e00bc8 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -560,12 +560,14 @@ static void kdb_msg_write(char *msg, int msg_len)
> > if (msg_len == 0)
> > return;
> >
> > - if (dbg_io_ops && !dbg_io_ops->is_console)
> > + if (dbg_io_ops)
> > kdb_io_write(msg, msg_len);
>
> Since this now slots on so cleanly and there are not multiple calls
> to kdb_io_write() then I think perhaps factoring this out into its
> own function (in patch 1) is no long necessary. The character write
> loop can go directly into this function.
>
Okay, will update it in the next version.
-Sumit
>
> Daniel.
Hi,
On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
>
> Re-factor kdb_printf() message write code in order to avoid duplication
> of code and thereby increase readability.
>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> kernel/debug/kdb/kdb_io.c | 61 +++++++++++++++++++++++++----------------------
> 1 file changed, 32 insertions(+), 29 deletions(-)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 924bc92..e46f33e 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -542,6 +542,33 @@ static int kdb_search_string(char *searched, char *searchfor)
> return 0;
> }
>
> +static void kdb_io_write(char *cp, int len)
nit: "const char *" just to make it obvious that we don't modify the string?
> +{
> + if (len == 0)
> + return;
Remove the above check. It's double-overkill. Not only did you just
check in kdb_msg_write() but also the while loop below will do a
"no-op" just fine even without your check.
> +
> + while (len--) {
> + dbg_io_ops->write_char(*cp);
> + cp++;
> + }
> +}
> +
> +static void kdb_msg_write(char *msg, int msg_len)
nit: "const char *" just to make it obvious that we don't modify the string?
Other than those small things, this looks nice to me. Feel free to
add my Reviewed-by tag once small things are fixed.
-Doug
Hi,
On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
>
> Check if a console is enabled prior to invoking corresponding write
> handler.
>
> Suggested-by: Sergey Senozhatsky <[email protected]>
> Signed-off-by: Sumit Garg <[email protected]>
> Reviewed-by: Daniel Thompson <[email protected]>
> ---
> kernel/debug/kdb/kdb_io.c | 2 ++
> 1 file changed, 2 insertions(+)
Reviewed-by: Douglas Anderson <[email protected]>
Hi,
On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
>
> While rounding up CPUs via NMIs, its possible that a rounded up CPU
> maybe holding a console port lock leading to kgdb master CPU stuck in
> a deadlock during invocation of console write operations. A similar
> deadlock could also be possible while using synchronous breakpoints.
>
> So in order to avoid such a deadlock, set oops_in_progress to encourage
> the console drivers to disregard their internal spin locks: in the
> current calling context the risk of deadlock is a bigger problem than
> risks due to re-entering the console driver. We operate directly on
> oops_in_progress rather than using bust_spinlocks() because the calls
> bust_spinlocks() makes on exit are not appropriate for this calling
> context.
>
> Suggested-by: Petr Mladek <[email protected]>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> kernel/debug/kdb/kdb_io.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index fad38eb..9e5a40d 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -566,7 +566,18 @@ static void kdb_msg_write(char *msg, int msg_len)
> for_each_console(c) {
> if (!(c->flags & CON_ENABLED))
> continue;
> + /*
> + * Set oops_in_progress to encourage the console drivers to
> + * disregard their internal spin locks: in the current calling
> + * context the risk of deadlock is a bigger problem than risks
> + * due to re-entering the console driver. We operate directly on
> + * oops_in_progress rather than using bust_spinlocks() because
> + * the calls bust_spinlocks() makes on exit are not appropriate
> + * for this calling context.
> + */
> + ++oops_in_progress;
> c->write(c, msg, msg_len);
> + --oops_in_progress;
This seems sane to me, especially when combined with your next patch
that tries really hard not to run this flow. ;-)
Reviewed-by: Douglas Anderson <[email protected]>
Hi,
On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
>
> In kgdb context, calling console handlers aren't safe due to locks used
> in those handlers which could in turn lead to a deadlock. Although, using
> oops_in_progress increases the chance to bypass locks in most console
> handlers but it might not be sufficient enough in case a console uses
> more locks (VT/TTY is good example).
>
> Currently when a driver provides both polling I/O and a console then kdb
> will output using the console. We can increase robustness by using the
> currently active polling I/O driver (which should be lockless) instead
> of the corresponding console. For several common cases (e.g. an
> embedded system with a single serial port that is used both for console
> output and debugger I/O) this will result in no console handler being
> used.
>
> In order to achieve this we need to reverse the order of preference to
> use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> store "struct console" that represents debugger I/O in dbg_io_ops and
> while emitting kdb messages, skip console that matches dbg_io_ops
> console in order to avoid duplicate messages. After this change,
> "is_console" param becomes redundant and hence removed.
>
> Suggested-by: Daniel Thompson <[email protected]>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> drivers/tty/serial/kgdb_nmi.c | 2 +-
> drivers/tty/serial/kgdboc.c | 4 ++--
I don't think this will compile against the "kgdboc_earlycon" patches
that landed, will it? Specifically when I apply your patch I still
see "is_console" in:
static struct kgdb_io kgdboc_earlycon_io_ops = {
.name = "kgdboc_earlycon",
.read_char = kgdboc_earlycon_get_char,
.write_char = kgdboc_earlycon_put_char,
.pre_exception = kgdboc_earlycon_pre_exp_handler,
.deinit = kgdboc_earlycon_deinit,
.is_console = true,
};
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index b072aeb..bc0face3 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -273,8 +273,7 @@ struct kgdb_arch {
> * the I/O driver.
> * @post_exception: Pointer to a function that will do any cleanup work
> * for the I/O driver.
> - * @is_console: 1 if the end device is a console 0 if the I/O device is
> - * not a console
> + * @cons: valid if the I/O device is a console.
optional nit: add "; else NULL"
Other than that this looks great. Feel free to add my Reviewed-by:
tag once you've fixed the error that the bot found and resolved with
kgdb_earlycon.
-Doug
On Wed, 3 Jun 2020 at 03:02, Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
> >
> > Re-factor kdb_printf() message write code in order to avoid duplication
> > of code and thereby increase readability.
> >
> > Signed-off-by: Sumit Garg <[email protected]>
> > ---
> > kernel/debug/kdb/kdb_io.c | 61 +++++++++++++++++++++++++----------------------
> > 1 file changed, 32 insertions(+), 29 deletions(-)
> >
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 924bc92..e46f33e 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -542,6 +542,33 @@ static int kdb_search_string(char *searched, char *searchfor)
> > return 0;
> > }
> >
> > +static void kdb_io_write(char *cp, int len)
>
> nit: "const char *" just to make it obvious that we don't modify the string?
>
>
> > +{
> > + if (len == 0)
> > + return;
>
> Remove the above check. It's double-overkill. Not only did you just
> check in kdb_msg_write() but also the while loop below will do a
> "no-op" just fine even without your check.
>
I will get rid of kdb_io_write() as per Daniel's comment on patch #4.
>
> > +
> > + while (len--) {
> > + dbg_io_ops->write_char(*cp);
> > + cp++;
> > + }
> > +}
> > +
> > +static void kdb_msg_write(char *msg, int msg_len)
>
> nit: "const char *" just to make it obvious that we don't modify the string?
>
Okay.
>
> Other than those small things, this looks nice to me. Feel free to
> add my Reviewed-by tag once small things are fixed.
>
Thanks.
-Sumit
>
> -Doug
On Wed, 3 Jun 2020 at 03:02, Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 29, 2020 at 4:27 AM Sumit Garg <[email protected]> wrote:
> >
> > In kgdb context, calling console handlers aren't safe due to locks used
> > in those handlers which could in turn lead to a deadlock. Although, using
> > oops_in_progress increases the chance to bypass locks in most console
> > handlers but it might not be sufficient enough in case a console uses
> > more locks (VT/TTY is good example).
> >
> > Currently when a driver provides both polling I/O and a console then kdb
> > will output using the console. We can increase robustness by using the
> > currently active polling I/O driver (which should be lockless) instead
> > of the corresponding console. For several common cases (e.g. an
> > embedded system with a single serial port that is used both for console
> > output and debugger I/O) this will result in no console handler being
> > used.
> >
> > In order to achieve this we need to reverse the order of preference to
> > use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
> > store "struct console" that represents debugger I/O in dbg_io_ops and
> > while emitting kdb messages, skip console that matches dbg_io_ops
> > console in order to avoid duplicate messages. After this change,
> > "is_console" param becomes redundant and hence removed.
> >
> > Suggested-by: Daniel Thompson <[email protected]>
> > Signed-off-by: Sumit Garg <[email protected]>
> > ---
> > drivers/tty/serial/kgdb_nmi.c | 2 +-
> > drivers/tty/serial/kgdboc.c | 4 ++--
>
> I don't think this will compile against the "kgdboc_earlycon" patches
> that landed, will it? Specifically when I apply your patch I still
> see "is_console" in:
Agree will fix this and rebase this patch-set onto Daniel's tree.
>
> static struct kgdb_io kgdboc_earlycon_io_ops = {
> .name = "kgdboc_earlycon",
> .read_char = kgdboc_earlycon_get_char,
> .write_char = kgdboc_earlycon_put_char,
> .pre_exception = kgdboc_earlycon_pre_exp_handler,
> .deinit = kgdboc_earlycon_deinit,
> .is_console = true,
> };
>
>
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index b072aeb..bc0face3 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -273,8 +273,7 @@ struct kgdb_arch {
> > * the I/O driver.
> > * @post_exception: Pointer to a function that will do any cleanup work
> > * for the I/O driver.
> > - * @is_console: 1 if the end device is a console 0 if the I/O device is
> > - * not a console
> > + * @cons: valid if the I/O device is a console.
>
> optional nit: add "; else NULL"
>
Okay.
>
> Other than that this looks great. Feel free to add my Reviewed-by:
> tag once you've fixed the error that the bot found and resolved with
> kgdb_earlycon.
>
Thanks.
-Sumit
>
> -Doug