2019-05-31 07:43:36

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 0/3] livepatch: Clean up of reliable stacktrace warnings

I believe that all the changes make sense. I would like to get them
merged if nobody see a fundamental problem.

Changes against v2:

+ Put back the patch removing WARN_ONCE in the weak
save_stack_trace_tsk_reliable(). It is related.
+ Simplified patch removing the duplicate warning from klp_check_stack()
+ Update commit message for 3rd patch [Josh]

Petr Mladek (3):
stacktrace: Remove superfluous WARN_ONCE() from
save_stack_trace_tsk_reliable()
livepatch: Remove duplicate warning about missing reliable stacktrace
support
livepatch: Use static buffer for debugging messages under rq lock

kernel/livepatch/transition.c | 4 +---
kernel/stacktrace.c | 1 -
2 files changed, 1 insertion(+), 4 deletions(-)

--
2.16.4


2019-05-31 07:43:41

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 2/3] livepatch: Remove duplicate warning about missing reliable stacktrace support

WARN_ON_ONCE() could not be called safely under rq lock because
of console deadlock issues.

It can be simply removed. A better descriptive message is written
in klp_enable_patch() when klp_have_reliable_stack() fails.
The remaining debug message is good enough.

Signed-off-by: Petr Mladek <[email protected]>
---
kernel/livepatch/transition.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index abb2a4a2cbb2..1bf362df76e1 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -247,7 +247,6 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
int ret, nr_entries;

ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
- WARN_ON_ONCE(ret == -ENOSYS);
if (ret < 0) {
snprintf(err_buf, STACK_ERR_BUF_SIZE,
"%s: %s:%d has an unreliable stack\n",
--
2.16.4

2019-05-31 07:44:43

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 1/3] stacktrace: Remove superfluous WARN_ONCE() from save_stack_trace_tsk_reliable()

WARN_ONCE() in the generic save_stack_trace_tsk_reliable() is superfluous.

The information is passed also via the return value. The only current
user klp_check_stack() writes its own warning when the reliable stack
traces are not supported. Other eventual users might want its own error
handling as well.

Signed-off-by: Petr Mladek <[email protected]>
Acked-by: Miroslav Benes <[email protected]>
Reviewed-by: Kamalesh Babulal <[email protected]>
---
kernel/stacktrace.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index 5667f1da3ede..8d088408928d 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -259,7 +259,6 @@ __weak int
save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
{
- WARN_ONCE(1, KERN_INFO "save_stack_tsk_reliable() not implemented yet.\n");
return -ENOSYS;
}

--
2.16.4

2019-05-31 07:44:46

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 3/3] livepatch: Use static buffer for debugging messages under rq lock

The err_buf array uses 128 bytes of stack space. Move it off the stack
by making it static. It's safe to use a shared buffer because
klp_try_switch_task() is called under klp_mutex.

Signed-off-by: Petr Mladek <[email protected]>
Acked-by: Miroslav Benes <[email protected]>
Reviewed-by: Kamalesh Babulal <[email protected]>
---
kernel/livepatch/transition.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 1bf362df76e1..5c4f0c1f826e 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -280,11 +280,11 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
*/
static bool klp_try_switch_task(struct task_struct *task)
{
+ static char err_buf[STACK_ERR_BUF_SIZE];
struct rq *rq;
struct rq_flags flags;
int ret;
bool success = false;
- char err_buf[STACK_ERR_BUF_SIZE];

err_buf[0] = '\0';

@@ -327,7 +327,6 @@ static bool klp_try_switch_task(struct task_struct *task)
pr_debug("%s", err_buf);

return success;
-
}

/*
--
2.16.4

2019-05-31 12:27:55

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 1/3] stacktrace: Remove superfluous WARN_ONCE() from save_stack_trace_tsk_reliable()

On Fri, 31 May 2019, Petr Mladek wrote:

> WARN_ONCE() in the generic save_stack_trace_tsk_reliable() is superfluous.
>
> The information is passed also via the return value. The only current
> user klp_check_stack() writes its own warning when the reliable stack
> traces are not supported. Other eventual users might want its own error
> handling as well.
>
> Signed-off-by: Petr Mladek <[email protected]>
> Acked-by: Miroslav Benes <[email protected]>
> Reviewed-by: Kamalesh Babulal <[email protected]>
> ---
> kernel/stacktrace.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
> index 5667f1da3ede..8d088408928d 100644
> --- a/kernel/stacktrace.c
> +++ b/kernel/stacktrace.c
> @@ -259,7 +259,6 @@ __weak int
> save_stack_trace_tsk_reliable(struct task_struct *tsk,
> struct stack_trace *trace)
> {
> - WARN_ONCE(1, KERN_INFO "save_stack_tsk_reliable() not implemented yet.\n");
> return -ENOSYS;
> }

Do we even need the weak function now after Thomas' changes to
kernel/stacktrace.c?

- livepatch is the only user and it calls stack_trace_save_tsk_reliable()
- x86 defines CONFIG_ARCH_STACKWALK and CONFIG_HAVE_RELIABLE_STACKTRACE,
so it has stack_trace_save_tsk_reliable() implemented and it calls
arch_stack_walk_reliable()
- powerpc defines CONFIG_HAVE_RELIABLE_STACKTRACE and does not have
CONFIG_ARCH_STACKWALK. It also has stack_trace_save_tsk_reliable()
implemented and it calls save_stack_trace_tsk_reliable(), which is
implemented in arch/powerpc/
- all other archs do not have CONFIG_HAVE_RELIABLE_STACKTRACE and there is
stack_trace_save_tsk_reliable() returning ENOSYS for these cases in
include/linux/stacktrace.c

Miroslav

2019-05-31 12:34:11

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 2/3] livepatch: Remove duplicate warning about missing reliable stacktrace support

On Fri, 31 May 2019, Petr Mladek wrote:

> WARN_ON_ONCE() could not be called safely under rq lock because
> of console deadlock issues.
>
> It can be simply removed. A better descriptive message is written
> in klp_enable_patch() when klp_have_reliable_stack() fails.
> The remaining debug message is good enough.
>
> Signed-off-by: Petr Mladek <[email protected]>
> ---
> kernel/livepatch/transition.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index abb2a4a2cbb2..1bf362df76e1 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -247,7 +247,6 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> int ret, nr_entries;
>
> ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
> - WARN_ON_ONCE(ret == -ENOSYS);
> if (ret < 0) {
> snprintf(err_buf, STACK_ERR_BUF_SIZE,
> "%s: %s:%d has an unreliable stack\n",

The current situation is not the best, but I think the patch improves it
only slightly. I see two possible solutions.

1. we either revert commit 1d98a69e5cef ("livepatch: Remove reliable
stacktrace check in klp_try_switch_task()"), so that klp_check_stack()
returns right away.

2. or we test ret from stack_trace_save_tsk_reliable() for ENOSYS and
return.

In my opinion either of them is better than what we have now (and what we
would have with the patch), because klp_check_stack() returns, but it
prints out that a task has an unreliable stack. Yes, it is pr_debug() only
in the end, but still.

I don't have a preference and my understanding is that Petr does not want
to do v4. I can prepare a patch, but it would be nice to choose now. Josh?
Anyone else?

Miroslav

2019-05-31 12:39:20

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 3/3] livepatch: Use static buffer for debugging messages under rq lock

On Fri, 31 May 2019, Petr Mladek wrote:

> The err_buf array uses 128 bytes of stack space. Move it off the stack
> by making it static. It's safe to use a shared buffer because
> klp_try_switch_task() is called under klp_mutex.
>
> Signed-off-by: Petr Mladek <[email protected]>
> Acked-by: Miroslav Benes <[email protected]>
> Reviewed-by: Kamalesh Babulal <[email protected]>
> ---
> kernel/livepatch/transition.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 1bf362df76e1..5c4f0c1f826e 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -280,11 +280,11 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> */
> static bool klp_try_switch_task(struct task_struct *task)
> {
> + static char err_buf[STACK_ERR_BUF_SIZE];
> struct rq *rq;
> struct rq_flags flags;
> int ret;
> bool success = false;
> - char err_buf[STACK_ERR_BUF_SIZE];
>
> err_buf[0] = '\0';
>
> @@ -327,7 +327,6 @@ static bool klp_try_switch_task(struct task_struct *task)
> pr_debug("%s", err_buf);
>
> return success;
> -
> }

This could go in separately as it is not connected to the rest of the
series.

Miroslav

2019-05-31 13:20:38

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH 2/3] livepatch: Remove duplicate warning about missing reliable stacktrace support

On Fri 2019-05-31 14:32:34, Miroslav Benes wrote:
> On Fri, 31 May 2019, Petr Mladek wrote:
>
> > WARN_ON_ONCE() could not be called safely under rq lock because
> > of console deadlock issues.
> >
> > It can be simply removed. A better descriptive message is written
> > in klp_enable_patch() when klp_have_reliable_stack() fails.
> > The remaining debug message is good enough.
> >
> > Signed-off-by: Petr Mladek <[email protected]>
> > ---
> > kernel/livepatch/transition.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index abb2a4a2cbb2..1bf362df76e1 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -247,7 +247,6 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > int ret, nr_entries;
> >
> > ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
> > - WARN_ON_ONCE(ret == -ENOSYS);
> > if (ret < 0) {
> > snprintf(err_buf, STACK_ERR_BUF_SIZE,
> > "%s: %s:%d has an unreliable stack\n",
>
> The current situation is not the best, but I think the patch improves it
> only slightly. I see two possible solutions.
>
> 1. we either revert commit 1d98a69e5cef ("livepatch: Remove reliable
> stacktrace check in klp_try_switch_task()"), so that klp_check_stack()
> returns right away.
>
> 2. or we test ret from stack_trace_save_tsk_reliable() for ENOSYS and
> return.
>
> In my opinion either of them is better than what we have now (and what we
> would have with the patch), because klp_check_stack() returns, but it
> prints out that a task has an unreliable stack. Yes, it is pr_debug() only
> in the end, but still.

IMHO, any extra check will not improve the situation much. Quiet
return is as useless as the misleading pr_debug() that will
not normally get printed anyway.

Adding a warning is complicated because we would need to pass it via
the temporary buffer. It is not worth the complexity.

IMHO, it does not make sense to complicate the code because of
a corner case situation that would hit only people adding
livepatch support for new architecture.


> I don't have a preference and my understanding is that Petr does not want
> to do v4. I can prepare a patch, but it would be nice to choose now. Josh?
> Anyone else?

Exactly. Fell free to take over the patches.

I do not want to spend more time on endless discussions about this
corner case.

I made the 1st and 2nd patch only because of ideas mentioned when switching
klp_have_reliable_stack() from error to warning, namely:

https://lkml.kernel.org/r/20190418135858.n3lzq2oxkj52m6bi@treble
https://lkml.kernel.org/r/[email protected]

I have never expected that it would need several more respins.

Note that there is no good solution. We would need to choose between
a complicated code or an imperfect error handling. Anything in between
is just a compromise. Choosing between all the variants looks like
a bikeshedding to me.

I sent what I preferred and what looked acceptable. I am not going
to block other approaches unless they make the code too complicated.

Best Regards,
Petr

2019-05-31 13:32:26

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 2/3] livepatch: Remove duplicate warning about missing reliable stacktrace support

On Fri, 31 May 2019, Petr Mladek wrote:

> On Fri 2019-05-31 14:32:34, Miroslav Benes wrote:
> > On Fri, 31 May 2019, Petr Mladek wrote:
> >
> > > WARN_ON_ONCE() could not be called safely under rq lock because
> > > of console deadlock issues.
> > >
> > > It can be simply removed. A better descriptive message is written
> > > in klp_enable_patch() when klp_have_reliable_stack() fails.
> > > The remaining debug message is good enough.
> > >
> > > Signed-off-by: Petr Mladek <[email protected]>
> > > ---
> > > kernel/livepatch/transition.c | 1 -
> > > 1 file changed, 1 deletion(-)
> > >
> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > index abb2a4a2cbb2..1bf362df76e1 100644
> > > --- a/kernel/livepatch/transition.c
> > > +++ b/kernel/livepatch/transition.c
> > > @@ -247,7 +247,6 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > > int ret, nr_entries;
> > >
> > > ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
> > > - WARN_ON_ONCE(ret == -ENOSYS);
> > > if (ret < 0) {
> > > snprintf(err_buf, STACK_ERR_BUF_SIZE,
> > > "%s: %s:%d has an unreliable stack\n",
> >
> > The current situation is not the best, but I think the patch improves it
> > only slightly. I see two possible solutions.
> >
> > 1. we either revert commit 1d98a69e5cef ("livepatch: Remove reliable
> > stacktrace check in klp_try_switch_task()"), so that klp_check_stack()
> > returns right away.
> >
> > 2. or we test ret from stack_trace_save_tsk_reliable() for ENOSYS and
> > return.
> >
> > In my opinion either of them is better than what we have now (and what we
> > would have with the patch), because klp_check_stack() returns, but it
> > prints out that a task has an unreliable stack. Yes, it is pr_debug() only
> > in the end, but still.
>
> IMHO, any extra check will not improve the situation much. Quiet
> return is as useless as the misleading pr_debug() that will
> not normally get printed anyway.

I disagree here. I think the silent return would be perfectly fine. The
user was warned in klp_enable_patch() already.

Miroslav

2019-05-31 19:30:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 1/3] stacktrace: Remove superfluous WARN_ONCE() from save_stack_trace_tsk_reliable()

On Fri, May 31, 2019 at 02:25:15PM +0200, Miroslav Benes wrote:
> On Fri, 31 May 2019, Petr Mladek wrote:
>
> > WARN_ONCE() in the generic save_stack_trace_tsk_reliable() is superfluous.
> >
> > The information is passed also via the return value. The only current
> > user klp_check_stack() writes its own warning when the reliable stack
> > traces are not supported. Other eventual users might want its own error
> > handling as well.
> >
> > Signed-off-by: Petr Mladek <[email protected]>
> > Acked-by: Miroslav Benes <[email protected]>
> > Reviewed-by: Kamalesh Babulal <[email protected]>
> > ---
> > kernel/stacktrace.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
> > index 5667f1da3ede..8d088408928d 100644
> > --- a/kernel/stacktrace.c
> > +++ b/kernel/stacktrace.c
> > @@ -259,7 +259,6 @@ __weak int
> > save_stack_trace_tsk_reliable(struct task_struct *tsk,
> > struct stack_trace *trace)
> > {
> > - WARN_ONCE(1, KERN_INFO "save_stack_tsk_reliable() not implemented yet.\n");
> > return -ENOSYS;
> > }
>
> Do we even need the weak function now after Thomas' changes to
> kernel/stacktrace.c?
>
> - livepatch is the only user and it calls stack_trace_save_tsk_reliable()
> - x86 defines CONFIG_ARCH_STACKWALK and CONFIG_HAVE_RELIABLE_STACKTRACE,
> so it has stack_trace_save_tsk_reliable() implemented and it calls
> arch_stack_walk_reliable()
> - powerpc defines CONFIG_HAVE_RELIABLE_STACKTRACE and does not have
> CONFIG_ARCH_STACKWALK. It also has stack_trace_save_tsk_reliable()
> implemented and it calls save_stack_trace_tsk_reliable(), which is
> implemented in arch/powerpc/
> - all other archs do not have CONFIG_HAVE_RELIABLE_STACKTRACE and there is
> stack_trace_save_tsk_reliable() returning ENOSYS for these cases in
> include/linux/stacktrace.c

I think you're right. stack_trace_save_tsk_reliable() in stacktrace.h
returning -ENOSYS serves the same purpose as the old weak version of
save_stack_trace_tsk_reliable() which is no longer called directly.

--
Josh

2019-05-31 19:39:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 2/3] livepatch: Remove duplicate warning about missing reliable stacktrace support

On Fri, May 31, 2019 at 02:32:34PM +0200, Miroslav Benes wrote:
> On Fri, 31 May 2019, Petr Mladek wrote:
>
> > WARN_ON_ONCE() could not be called safely under rq lock because
> > of console deadlock issues.
> >
> > It can be simply removed. A better descriptive message is written
> > in klp_enable_patch() when klp_have_reliable_stack() fails.
> > The remaining debug message is good enough.
> >
> > Signed-off-by: Petr Mladek <[email protected]>
> > ---
> > kernel/livepatch/transition.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index abb2a4a2cbb2..1bf362df76e1 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -247,7 +247,6 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > int ret, nr_entries;
> >
> > ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
> > - WARN_ON_ONCE(ret == -ENOSYS);
> > if (ret < 0) {
> > snprintf(err_buf, STACK_ERR_BUF_SIZE,
> > "%s: %s:%d has an unreliable stack\n",
>
> The current situation is not the best, but I think the patch improves it
> only slightly. I see two possible solutions.
>
> 1. we either revert commit 1d98a69e5cef ("livepatch: Remove reliable
> stacktrace check in klp_try_switch_task()"), so that klp_check_stack()
> returns right away.
>
> 2. or we test ret from stack_trace_save_tsk_reliable() for ENOSYS and
> return.
>
> In my opinion either of them is better than what we have now (and what we
> would have with the patch), because klp_check_stack() returns, but it
> prints out that a task has an unreliable stack. Yes, it is pr_debug() only
> in the end, but still.
>
> I don't have a preference and my understanding is that Petr does not want
> to do v4. I can prepare a patch, but it would be nice to choose now. Josh?
> Anyone else?

My vote would be #1 -- just revert 1d98a69e5cef.

--
Josh

2019-05-31 19:40:43

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 3/3] livepatch: Use static buffer for debugging messages under rq lock

On Fri, May 31, 2019 at 09:41:47AM +0200, Petr Mladek wrote:
> The err_buf array uses 128 bytes of stack space. Move it off the stack
> by making it static. It's safe to use a shared buffer because
> klp_try_switch_task() is called under klp_mutex.
>
> Signed-off-by: Petr Mladek <[email protected]>
> Acked-by: Miroslav Benes <[email protected]>
> Reviewed-by: Kamalesh Babulal <[email protected]>

Acked-by: Josh Poimboeuf <[email protected]>

--
Josh

2019-06-03 10:05:33

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH 3/3] livepatch: Use static buffer for debugging messages under rq lock

On Fri 2019-05-31 14:37:52, Miroslav Benes wrote:
> On Fri, 31 May 2019, Petr Mladek wrote:
>
> > The err_buf array uses 128 bytes of stack space. Move it off the stack
> > by making it static. It's safe to use a shared buffer because
> > klp_try_switch_task() is called under klp_mutex.
> >
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index 1bf362df76e1..5c4f0c1f826e 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -327,7 +327,6 @@ static bool klp_try_switch_task(struct task_struct *task)
> > pr_debug("%s", err_buf);
> >
> > return success;
> > -
> > }
>
> This could go in separately as it is not connected to the rest of the
> series.

I have never seen a standalone commit just removing an empty line.
It is usually done when one touches the code around.

If you resist, we could omit this hunk from the patch and leave
the code as is.

Best Regards,
Petr

2019-06-06 13:30:40

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH 3/3] livepatch: Use static buffer for debugging messages under rq lock

On Fri 2019-05-31 14:39:08, Josh Poimboeuf wrote:
> On Fri, May 31, 2019 at 09:41:47AM +0200, Petr Mladek wrote:
> > The err_buf array uses 128 bytes of stack space. Move it off the stack
> > by making it static. It's safe to use a shared buffer because
> > klp_try_switch_task() is called under klp_mutex.
> >
> > Signed-off-by: Petr Mladek <[email protected]>
> > Acked-by: Miroslav Benes <[email protected]>
> > Reviewed-by: Kamalesh Babulal <[email protected]>
>
> Acked-by: Josh Poimboeuf <[email protected]>

The patch is committed into for-5.3/core branch.

Note that the branch is based on the last merge from livepatch.git.
As a result, the sefttest fails because of the regression in
the reliable stacktrace code.

You might want to base your development on the for-next branch.
Or you chould cherry pick the commit 7eaf51a2e094229b75cc0c31
("[PATCH] stacktrace: Unbreak stack_trace_save_tsk_reliable()").

Best Regards,
Petr

PS: I am leaving the fate of the other two patches into Miroslav's
hands ;-)