In kgdb NMI context, polling driver APIs are more safer to use instead
of console APIs since the polling drivers know they will execute from
all sorts of crazy places. And for the most common use cases this would
also result in no console handler ever being called. So switch to use
polling driver APIs in case a particular console supports polling mode.
Suggested-by: Daniel Thompson <[email protected]>
Signed-off-by: Sumit Garg <[email protected]>
---
kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 3a5a068..8e0d581 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -24,6 +24,7 @@
#include <linux/kgdb.h>
#include <linux/kdb.h>
#include <linux/kallsyms.h>
+#include <linux/tty_driver.h>
#include "kdb_private.h"
#define CMD_BUFLEN 256
@@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
}
}
for_each_console(c) {
+ int line;
+ struct tty_driver *p;
+
if (!(c->flags & CON_ENABLED))
continue;
- ++oops_in_progress;
- c->write(c, cp, retlen - (cp - kdb_buffer));
- --oops_in_progress;
+ p = c->device ? c->device(c, &line) : NULL;
+ if (p && p->ops && p->ops->poll_put_char) {
+ len = retlen - (cp - kdb_buffer);
+ cp2 = cp;
+ while (len--) {
+ p->ops->poll_put_char(p, line, *cp2);
+ cp2++;
+ }
+ } else {
+ ++oops_in_progress;
+ c->write(c, cp, retlen - (cp - kdb_buffer));
+ --oops_in_progress;
+ }
touch_nmi_watchdog();
}
}
@@ -765,11 +779,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
}
}
for_each_console(c) {
+ int line;
+ struct tty_driver *p;
+
if (!(c->flags & CON_ENABLED))
continue;
- ++oops_in_progress;
- c->write(c, moreprompt, strlen(moreprompt));
- --oops_in_progress;
+ p = c->device ? c->device(c, &line) : NULL;
+ if (p && p->ops && p->ops->poll_put_char) {
+ len = strlen(moreprompt);
+ cp = moreprompt;
+ while (len--) {
+ p->ops->poll_put_char(p, line, *cp);
+ cp++;
+ }
+ } else {
+ ++oops_in_progress;
+ c->write(c, moreprompt, strlen(moreprompt));
+ --oops_in_progress;
+ }
touch_nmi_watchdog();
}
--
2.7.4
On Fri, May 22, 2020 at 08:04:31PM +0530, Sumit Garg wrote:
> In kgdb NMI context, polling driver APIs are more safer to use instead
> of console APIs since the polling drivers know they will execute from
> all sorts of crazy places. And for the most common use cases this would
> also result in no console handler ever being called. So switch to use
> polling driver APIs in case a particular console supports polling mode.
This comment seems rather half hearted, not least because it doesn't
explain what the current problem is nor why using the polling API is
safer.
Compare the above against the advice in
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
and I think it comes up short. Perhaps also consider Ingo Molnar's much
more concise suggestion on describing changes:
: Please use the customary changelog style we use in the kernel:
: " Current code does (A), this has a problem when (B).
: We can improve this doing (C), because (D)."
-- http://lkml.iu.edu/hypermail//linux/kernel/1311.1/01157.html
> Suggested-by: Daniel Thompson <[email protected]>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
> 1 file changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 3a5a068..8e0d581 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -24,6 +24,7 @@
> #include <linux/kgdb.h>
> #include <linux/kdb.h>
> #include <linux/kallsyms.h>
> +#include <linux/tty_driver.h>
> #include "kdb_private.h"
>
> #define CMD_BUFLEN 256
> @@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> }
> }
> for_each_console(c) {
> + int line;
> + struct tty_driver *p;
> +
> if (!(c->flags & CON_ENABLED))
> continue;
> - ++oops_in_progress;
> - c->write(c, cp, retlen - (cp - kdb_buffer));
> - --oops_in_progress;
> + p = c->device ? c->device(c, &line) : NULL;
> + if (p && p->ops && p->ops->poll_put_char) {
What prevents this logic from matching an active console that hasn't
been selected as the polling driver?
> + len = retlen - (cp - kdb_buffer);
> + cp2 = cp;
> + while (len--) {
> + p->ops->poll_put_char(p, line, *cp2);
> + cp2++;
> + }
Assuming it is possible to identify the console that matches the
currently selected polling driver can't we just drop the
is_console test and get rid of this branch entirely.
The only reason for the is_console test is to avoid issuing messages
twice so if we are able to suppress the c->write() for the same UART
then is_console check becomes pointless and can go.
> + } else {
> + ++oops_in_progress;
> + c->write(c, cp, retlen - (cp - kdb_buffer));
> + --oops_in_progress;
> + }
> touch_nmi_watchdog();
> }
> }
> @@ -765,11 +779,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> }
> }
> for_each_console(c) {
> + int line;
> + struct tty_driver *p;
> +
> if (!(c->flags & CON_ENABLED))
> continue;
> - ++oops_in_progress;
> - c->write(c, moreprompt, strlen(moreprompt));
> - --oops_in_progress;
> + p = c->device ? c->device(c, &line) : NULL;
> + if (p && p->ops && p->ops->poll_put_char) {
> + len = strlen(moreprompt);
> + cp = moreprompt;
> + while (len--) {
> + p->ops->poll_put_char(p, line, *cp);
> + cp++;
> + }
> + } else {
> + ++oops_in_progress;
> + c->write(c, moreprompt, strlen(moreprompt));
> + --oops_in_progress;
> + }
Maybe also consider pulling the string emit to a separate function.
Daniel.
On Tue, May 26, 2020 at 01:16:17PM +0530, Sumit Garg wrote:
> On Fri, 22 May 2020 at 21:33, Daniel Thompson
> <[email protected]> wrote:
> >
> > On Fri, May 22, 2020 at 08:04:31PM +0530, Sumit Garg wrote:
> > > In kgdb NMI context, polling driver APIs are more safer to use instead
> > > of console APIs since the polling drivers know they will execute from
> > > all sorts of crazy places. And for the most common use cases this would
> > > also result in no console handler ever being called. So switch to use
> > > polling driver APIs in case a particular console supports polling mode.
> >
> > This comment seems rather half hearted, not least because it doesn't
> > explain what the current problem is nor why using the polling API is
> > safer.
> >
>
> TBH, some sentences in the above comment were borrowed from your
> suggestion here [1]. But I agree that it doesn't portray the complete
> picture. So how about:
>
> ====
> In kgdb NMI context, calling console handlers isn't safe due to locks
> used in those handlers which could lead to a deadlock. Although, using
> oops_in_progress increases the chance to bypass locks in most console
> handlers but it might not be sufficient enough in case a console uses
> more locks (VT/TTY is good example).
>
> So instead switch to use lockless polling driver APIs in case a
> particular console supports polling mode which is common for most kdb
> use-cases and would result in no console handler ever being called.
> ====
Better, although the later paragraph still seems rather vague to me.
Compare to:
Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.
> [1] https://lkml.org/lkml/2020/5/20/356
>
> > Compare the above against the advice in
> > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> > and I think it comes up short. Perhaps also consider Ingo Molnar's much
> > more concise suggestion on describing changes:
> >
> > : Please use the customary changelog style we use in the kernel:
> > : " Current code does (A), this has a problem when (B).
> > : We can improve this doing (C), because (D)."
> > -- http://lkml.iu.edu/hypermail//linux/kernel/1311.1/01157.html
>
> Thanks for the pointers.
>
> >
> >
> > > Suggested-by: Daniel Thompson <[email protected]>
> > > Signed-off-by: Sumit Garg <[email protected]>
> > > ---
> > > kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
> > > 1 file changed, 33 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > > index 3a5a068..8e0d581 100644
> > > --- a/kernel/debug/kdb/kdb_io.c
> > > +++ b/kernel/debug/kdb/kdb_io.c
> > > @@ -24,6 +24,7 @@
> > > #include <linux/kgdb.h>
> > > #include <linux/kdb.h>
> > > #include <linux/kallsyms.h>
> > > +#include <linux/tty_driver.h>
> > > #include "kdb_private.h"
> > >
> > > #define CMD_BUFLEN 256
> > > @@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> > > }
> > > }
> > > for_each_console(c) {
> > > + int line;
> > > + struct tty_driver *p;
> > > +
> > > if (!(c->flags & CON_ENABLED))
> > > continue;
> > > - ++oops_in_progress;
> > > - c->write(c, cp, retlen - (cp - kdb_buffer));
> > > - --oops_in_progress;
> > > + p = c->device ? c->device(c, &line) : NULL;
> > > + if (p && p->ops && p->ops->poll_put_char) {
> >
> > What prevents this logic from matching an active console that hasn't
> > been selected as the polling driver?
>
> Yes you are correct and it could lead to invoking poll_put_char()
> without poll_init(). And we couldn't invoke poll_init() here as that
> still comes with locks and could sleep. So one way to overcome this
> would be to pass selected polling driver via dbg_io_ops and use
> polling APIs only if the underlying console driver matches that
> polling driver.
Agree.
Note that this is all I ever expected to look at when I commented about
before.
> > > + len = retlen - (cp - kdb_buffer);
> > > + cp2 = cp;
> > > + while (len--) {
> > > + p->ops->poll_put_char(p, line, *cp2);
> > > + cp2++;
> > > + }
> >
> > Assuming it is possible to identify the console that matches the
> > currently selected polling driver can't we just drop the
> > is_console test and get rid of this branch entirely.
>
> Have a look at my suggested approach above.
>
> >
> > The only reason for the is_console test is to avoid issuing messages
> > twice so if we are able to suppress the c->write() for the same UART
> > then is_console check becomes pointless and can go.
>
> I did consider removing is_console check but it looks like it's not
> only limited to polling drivers but also used at other places (see [1]
> [2]) as well.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/early/ehci-dbgp.c#n1061
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/kgdb_nmi.c#n48
IIUC you mean that the logic to match devices only works for tty drivers
and there examples are not tty drivers.
This could probably be solved but no need to get too tied in knots. It's
fine to keep the is_console check for now.
However rather than replicate the polled I/O write code a third and
fourth time lets get the I/O logic pulled out into proper functions.
Daniel.
On Fri, 22 May 2020 at 21:33, Daniel Thompson
<[email protected]> wrote:
>
> On Fri, May 22, 2020 at 08:04:31PM +0530, Sumit Garg wrote:
> > In kgdb NMI context, polling driver APIs are more safer to use instead
> > of console APIs since the polling drivers know they will execute from
> > all sorts of crazy places. And for the most common use cases this would
> > also result in no console handler ever being called. So switch to use
> > polling driver APIs in case a particular console supports polling mode.
>
> This comment seems rather half hearted, not least because it doesn't
> explain what the current problem is nor why using the polling API is
> safer.
>
TBH, some sentences in the above comment were borrowed from your
suggestion here [1]. But I agree that it doesn't portray the complete
picture. So how about:
====
In kgdb NMI context, calling console handlers isn't safe due to locks
used in those handlers which could lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).
So instead switch to use lockless polling driver APIs in case a
particular console supports polling mode which is common for most kdb
use-cases and would result in no console handler ever being called.
====
[1] https://lkml.org/lkml/2020/5/20/356
> Compare the above against the advice in
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> and I think it comes up short. Perhaps also consider Ingo Molnar's much
> more concise suggestion on describing changes:
>
> : Please use the customary changelog style we use in the kernel:
> : " Current code does (A), this has a problem when (B).
> : We can improve this doing (C), because (D)."
> -- http://lkml.iu.edu/hypermail//linux/kernel/1311.1/01157.html
Thanks for the pointers.
>
>
> > Suggested-by: Daniel Thompson <[email protected]>
> > Signed-off-by: Sumit Garg <[email protected]>
> > ---
> > kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
> > 1 file changed, 33 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 3a5a068..8e0d581 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -24,6 +24,7 @@
> > #include <linux/kgdb.h>
> > #include <linux/kdb.h>
> > #include <linux/kallsyms.h>
> > +#include <linux/tty_driver.h>
> > #include "kdb_private.h"
> >
> > #define CMD_BUFLEN 256
> > @@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> > }
> > }
> > for_each_console(c) {
> > + int line;
> > + struct tty_driver *p;
> > +
> > if (!(c->flags & CON_ENABLED))
> > continue;
> > - ++oops_in_progress;
> > - c->write(c, cp, retlen - (cp - kdb_buffer));
> > - --oops_in_progress;
> > + p = c->device ? c->device(c, &line) : NULL;
> > + if (p && p->ops && p->ops->poll_put_char) {
>
> What prevents this logic from matching an active console that hasn't
> been selected as the polling driver?
Yes you are correct and it could lead to invoking poll_put_char()
without poll_init(). And we couldn't invoke poll_init() here as that
still comes with locks and could sleep. So one way to overcome this
would be to pass selected polling driver via dbg_io_ops and use
polling APIs only if the underlying console driver matches that
polling driver.
>
>
> > + len = retlen - (cp - kdb_buffer);
> > + cp2 = cp;
> > + while (len--) {
> > + p->ops->poll_put_char(p, line, *cp2);
> > + cp2++;
> > + }
>
> Assuming it is possible to identify the console that matches the
> currently selected polling driver can't we just drop the
> is_console test and get rid of this branch entirely.
Have a look at my suggested approach above.
>
> The only reason for the is_console test is to avoid issuing messages
> twice so if we are able to suppress the c->write() for the same UART
> then is_console check becomes pointless and can go.
I did consider removing is_console check but it looks like it's not
only limited to polling drivers but also used at other places (see [1]
[2]) as well.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/early/ehci-dbgp.c#n1061
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/kgdb_nmi.c#n48
>
>
> > + } else {
> > + ++oops_in_progress;
> > + c->write(c, cp, retlen - (cp - kdb_buffer));
> > + --oops_in_progress;
> > + }
> > touch_nmi_watchdog();
> > }
> > }
> > @@ -765,11 +779,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> > }
> > }
> > for_each_console(c) {
> > + int line;
> > + struct tty_driver *p;
> > +
> > if (!(c->flags & CON_ENABLED))
> > continue;
> > - ++oops_in_progress;
> > - c->write(c, moreprompt, strlen(moreprompt));
> > - --oops_in_progress;
> > + p = c->device ? c->device(c, &line) : NULL;
> > + if (p && p->ops && p->ops->poll_put_char) {
> > + len = strlen(moreprompt);
> > + cp = moreprompt;
> > + while (len--) {
> > + p->ops->poll_put_char(p, line, *cp);
> > + cp++;
> > + }
> > + } else {
> > + ++oops_in_progress;
> > + c->write(c, moreprompt, strlen(moreprompt));
> > + --oops_in_progress;
> > + }
>
> Maybe also consider pulling the string emit to a separate function.
>
Okay.
-Sumit
>
> Daniel.
On Tue, 26 May 2020 at 16:40, Daniel Thompson
<[email protected]> wrote:
>
> On Tue, May 26, 2020 at 01:16:17PM +0530, Sumit Garg wrote:
> > On Fri, 22 May 2020 at 21:33, Daniel Thompson
> > <[email protected]> wrote:
> > >
> > > On Fri, May 22, 2020 at 08:04:31PM +0530, Sumit Garg wrote:
> > > > In kgdb NMI context, polling driver APIs are more safer to use instead
> > > > of console APIs since the polling drivers know they will execute from
> > > > all sorts of crazy places. And for the most common use cases this would
> > > > also result in no console handler ever being called. So switch to use
> > > > polling driver APIs in case a particular console supports polling mode.
> > >
> > > This comment seems rather half hearted, not least because it doesn't
> > > explain what the current problem is nor why using the polling API is
> > > safer.
> > >
> >
> > TBH, some sentences in the above comment were borrowed from your
> > suggestion here [1]. But I agree that it doesn't portray the complete
> > picture. So how about:
> >
> > ====
> > In kgdb NMI context, calling console handlers isn't safe due to locks
> > used in those handlers which could lead to a deadlock. Although, using
> > oops_in_progress increases the chance to bypass locks in most console
> > handlers but it might not be sufficient enough in case a console uses
> > more locks (VT/TTY is good example).
> >
> > So instead switch to use lockless polling driver APIs in case a
> > particular console supports polling mode which is common for most kdb
> > use-cases and would result in no console handler ever being called.
> > ====
>
> Better, although the later paragraph still seems rather vague to me.
> Compare to:
>
> Currently when a driver provides both polling I/O and a console then kdb
> will output using the console. We can increase robustness by using the
> currently active polling I/O driver (which should be lockless) instead
> of the corresponding console. For several common cases (e.g. an
> embedded system with a single serial port that is used both for console
> output and debugger I/O) this will result in no console handler being
> used.
>
Looks good, will use it instead.
>
> > [1] https://lkml.org/lkml/2020/5/20/356
> >
> > > Compare the above against the advice in
> > > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> > > and I think it comes up short. Perhaps also consider Ingo Molnar's much
> > > more concise suggestion on describing changes:
> > >
> > > : Please use the customary changelog style we use in the kernel:
> > > : " Current code does (A), this has a problem when (B).
> > > : We can improve this doing (C), because (D)."
> > > -- http://lkml.iu.edu/hypermail//linux/kernel/1311.1/01157.html
> >
> > Thanks for the pointers.
> >
> > >
> > >
> > > > Suggested-by: Daniel Thompson <[email protected]>
> > > > Signed-off-by: Sumit Garg <[email protected]>
> > > > ---
> > > > kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
> > > > 1 file changed, 33 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > > > index 3a5a068..8e0d581 100644
> > > > --- a/kernel/debug/kdb/kdb_io.c
> > > > +++ b/kernel/debug/kdb/kdb_io.c
> > > > @@ -24,6 +24,7 @@
> > > > #include <linux/kgdb.h>
> > > > #include <linux/kdb.h>
> > > > #include <linux/kallsyms.h>
> > > > +#include <linux/tty_driver.h>
> > > > #include "kdb_private.h"
> > > >
> > > > #define CMD_BUFLEN 256
> > > > @@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> > > > }
> > > > }
> > > > for_each_console(c) {
> > > > + int line;
> > > > + struct tty_driver *p;
> > > > +
> > > > if (!(c->flags & CON_ENABLED))
> > > > continue;
> > > > - ++oops_in_progress;
> > > > - c->write(c, cp, retlen - (cp - kdb_buffer));
> > > > - --oops_in_progress;
> > > > + p = c->device ? c->device(c, &line) : NULL;
> > > > + if (p && p->ops && p->ops->poll_put_char) {
> > >
> > > What prevents this logic from matching an active console that hasn't
> > > been selected as the polling driver?
> >
> > Yes you are correct and it could lead to invoking poll_put_char()
> > without poll_init(). And we couldn't invoke poll_init() here as that
> > still comes with locks and could sleep. So one way to overcome this
> > would be to pass selected polling driver via dbg_io_ops and use
> > polling APIs only if the underlying console driver matches that
> > polling driver.
>
> Agree.
>
> Note that this is all I ever expected to look at when I commented about
> before.
Okay.
>
>
> > > > + len = retlen - (cp - kdb_buffer);
> > > > + cp2 = cp;
> > > > + while (len--) {
> > > > + p->ops->poll_put_char(p, line, *cp2);
> > > > + cp2++;
> > > > + }
> > >
> > > Assuming it is possible to identify the console that matches the
> > > currently selected polling driver can't we just drop the
> > > is_console test and get rid of this branch entirely.
> >
> > Have a look at my suggested approach above.
> >
> > >
> > > The only reason for the is_console test is to avoid issuing messages
> > > twice so if we are able to suppress the c->write() for the same UART
> > > then is_console check becomes pointless and can go.
> >
> > I did consider removing is_console check but it looks like it's not
> > only limited to polling drivers but also used at other places (see [1]
> > [2]) as well.
> >
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/early/ehci-dbgp.c#n1061
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/kgdb_nmi.c#n48
>
> IIUC you mean that the logic to match devices only works for tty drivers
> and there examples are not tty drivers.
>
> This could probably be solved but no need to get too tied in knots. It's
> fine to keep the is_console check for now.
>
Okay.
> However rather than replicate the polled I/O write code a third and
> fourth time lets get the I/O logic pulled out into proper functions.
>
Sure, will do the refactoring.
-Sumit
>
>
> Daniel.