2019-07-29 01:10:29

by Eiichi Tsukata

Subject: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

If context tracking is enabled, a page fault taken in the preemptirq
irq_enable or irq_disable events triggers the following RCU EQS warning.

Reproducer:

// CONFIG_PREEMPTIRQ_EVENTS=y
// CONFIG_CONTEXT_TRACKING=y
// CONFIG_RCU_EQS_DEBUG=y
# echo 1 > events/preemptirq/irq_disable/enable
# echo 1 > options/userstacktrace

WARNING: CPU: 0 PID: 2574 at kernel/rcu/tree.c:262 rcu_dynticks_eqs_exit+0x48/0x50
CPU: 0 PID: 2574 Comm: sh Not tainted 5.3.0-rc1+ #105
RIP: 0010:rcu_dynticks_eqs_exit+0x48/0x50
Call Trace:
rcu_eqs_exit+0x4e/0xd0
rcu_user_exit+0x13/0x20
__context_tracking_exit.part.0+0x74/0x120
context_tracking_exit.part.0+0x28/0x50
context_tracking_exit+0x1d/0x20
do_page_fault+0xab/0x1b0
do_async_page_fault+0x35/0xb0
async_page_fault+0x3e/0x50
RIP: 0010:arch_stack_walk_user+0x8e/0x100
stack_trace_save_user+0x7d/0xa9
trace_buffer_unlock_commit_regs+0x178/0x220
trace_event_buffer_commit+0x6b/0x200
trace_event_raw_event_preemptirq_template+0x7b/0xc0
trace_hardirqs_off_caller+0xb3/0xf0
trace_hardirqs_off_thunk+0x1a/0x20
entry_SYSCALL_64_after_hwframe+0x3e/0xbe

Details of call trace and RCU EQS/Context:

entry_SYSCALL_64_after_hwframe()           EQS: IN,  CTX: USER
 trace_irq_disable_rcuidle()
  rcu_irq_enter_irqson()
   rcu_dynticks_eqs_exit()                 EQS: IN => OUT
 stack_trace_save_user()                   EQS: OUT, CTX: USER
  page_fault()
   do_page_fault()
    exception_enter()                      EQS: OUT, CTX: USER
     context_tracking_exit()
      rcu_eqs_exit()
       rcu_dynticks_eqs_exit()             EQS: OUT => OUT? (warning)

trace_irq_disable/enable_rcuidle() are called from user mode in the entry
code and call rcu_irq_enter_irqson() in __DO_TRACE(). This can result in
the state "RCU EQS: OUT but CTX: USER". Then stack_trace_save_user() can
cause a page fault, which calls rcu_dynticks_eqs_exit() again and hits
the EQS validation warning if CONFIG_RCU_EQS_DEBUG is enabled.

Fix it by calling exception_enter/exit() around
trace_irq_disable/enable_rcuidle() to enter CONTEXT_KERNEL before the
tracing code can cause a page fault. Also move the state change to
CONTEXT_KERNEL earlier to prevent the tracing code from calling
context_tracking_exit() recursively.

Ideally, the problem could be fixed by calling enter_from_user_mode() before
TRACE_IRQS_OFF in the entry code (then we would need to tell lockdep that
IRQs are off earlier) and calling prepare_exit_to_usermode() after
TRACE_IRQS_ON. But this patch is much simpler and limits most of the change
to the tracing code.

Fixes: 865e63b04e9b ("tracing: Add back in rcu_irq_enter/exit_irqson() for rcuidle tracepoints")
Signed-off-by: Eiichi Tsukata <[email protected]>
---
kernel/context_tracking.c | 6 +++++-
kernel/trace/trace_preemptirq.c | 15 +++++++++++++--
2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index be01a4d627c9..860eaf9780e5 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -148,6 +148,11 @@ void __context_tracking_exit(enum ctx_state state)
return;

if (__this_cpu_read(context_tracking.state) == state) {
+ /*
+ * Change state before executing codes which can trigger
+ * page fault leading unnecessary re-entrance.
+ */
+ __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
if (__this_cpu_read(context_tracking.active)) {
/*
* We are going to run code that may use RCU. Inform
@@ -159,7 +164,6 @@ void __context_tracking_exit(enum ctx_state state)
trace_user_exit(0);
}
}
- __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
}
context_tracking_recursion_exit();
}
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 4d8e99fdbbbe..031b51cb94d0 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -10,6 +10,7 @@
#include <linux/module.h>
#include <linux/ftrace.h>
#include <linux/kprobes.h>
+#include <linux/context_tracking.h>
#include "trace.h"

#define CREATE_TRACE_POINTS
@@ -49,9 +50,14 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);

__visible void trace_hardirqs_on_caller(unsigned long caller_addr)
{
+ enum ctx_state prev_state;
+
if (this_cpu_read(tracing_irq_cpu)) {
- if (!in_nmi())
+ if (!in_nmi()) {
+ prev_state = exception_enter();
trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
+ exception_exit(prev_state);
+ }
tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
this_cpu_write(tracing_irq_cpu, 0);
}
@@ -63,11 +69,16 @@ NOKPROBE_SYMBOL(trace_hardirqs_on_caller);

__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
+ enum ctx_state prev_state;
+
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
- if (!in_nmi())
+ if (!in_nmi()) {
+ prev_state = exception_enter();
trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
+ exception_exit(prev_state);
+ }
}

lockdep_hardirqs_off(CALLER_ADDR0);
--
2.21.0



2019-07-29 04:44:09

by Andy Lutomirski

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Sun, Jul 28, 2019 at 6:08 PM Eiichi Tsukata <[email protected]> wrote:
>
> If context tracking is enabled, causing page fault in preemptirq
> irq_enable or irq_disable events triggers the following RCU EQS warning.
>

Yuck.

> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index be01a4d627c9..860eaf9780e5 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -148,6 +148,11 @@ void __context_tracking_exit(enum ctx_state state)
> return;
>
> if (__this_cpu_read(context_tracking.state) == state) {
> + /*
> + * Change state before executing codes which can trigger
> + * page fault leading unnecessary re-entrance.
> + */
> + __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);

Seems reasonable.

> if (__this_cpu_read(context_tracking.active)) {
> /*
> * We are going to run code that may use RCU. Inform
> @@ -159,7 +164,6 @@ void __context_tracking_exit(enum ctx_state state)
> trace_user_exit(0);
> }
> }
> - __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
> }
> context_tracking_recursion_exit();
> }
> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> index 4d8e99fdbbbe..031b51cb94d0 100644
> --- a/kernel/trace/trace_preemptirq.c
> +++ b/kernel/trace/trace_preemptirq.c
> @@ -10,6 +10,7 @@
> #include <linux/module.h>
> #include <linux/ftrace.h>
> #include <linux/kprobes.h>
> +#include <linux/context_tracking.h>
> #include "trace.h"
>
> #define CREATE_TRACE_POINTS
> @@ -49,9 +50,14 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
>
> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> {
> + enum ctx_state prev_state;
> +
> if (this_cpu_read(tracing_irq_cpu)) {
> - if (!in_nmi())
> + if (!in_nmi()) {
> + prev_state = exception_enter();
> trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
> + exception_exit(prev_state);
> + }
> tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
> this_cpu_write(tracing_irq_cpu, 0);
> }

This seems a bit distressing. Now we're going to do a whole bunch of
context tracking transitions for each kernel entry. Would a better
fix be to change trace_hardirqs_on_caller to skip the trace event if
the previous state was already IRQs on and, more importantly, to skip
tracing IRQs off if IRQs were already off? The x86 code is very
careful to avoid ever having IRQs on and CONTEXT_USER at the same
time.

--Andy

2019-07-29 10:31:48

by Peter Zijlstra

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Sun, Jul 28, 2019 at 09:25:58PM -0700, Andy Lutomirski wrote:
> On Sun, Jul 28, 2019 at 6:08 PM Eiichi Tsukata <[email protected]> wrote:

> > diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> > index 4d8e99fdbbbe..031b51cb94d0 100644
> > --- a/kernel/trace/trace_preemptirq.c
> > +++ b/kernel/trace/trace_preemptirq.c
> > @@ -10,6 +10,7 @@
> > #include <linux/module.h>
> > #include <linux/ftrace.h>
> > #include <linux/kprobes.h>
> > +#include <linux/context_tracking.h>
> > #include "trace.h"
> >
> > #define CREATE_TRACE_POINTS
> > @@ -49,9 +50,14 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
> >
> > __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> > {
> > + enum ctx_state prev_state;
> > +
> > if (this_cpu_read(tracing_irq_cpu)) {
> > - if (!in_nmi())
> > + if (!in_nmi()) {
> > + prev_state = exception_enter();
> > trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
> > + exception_exit(prev_state);
> > + }
> > tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
> > this_cpu_write(tracing_irq_cpu, 0);
> > }
>
> This seems a bit distressing. Now we're going to do a whole bunch of
> context tracking transitions for each kernel entry. Would a better
> fix me to change trace_hardirqs_on_caller to skip the trace event if
> the previous state was already IRQs on and, more importantly, to skip
> tracing IRQs off if IRQs were already off? The x86 code is very
> careful to avoid ever having IRQs on and CONTEXT_USER at the same
> time.

I think they already (try to) do that; see 'tracing_irq_cpu'.

2019-07-29 15:22:24

by Steven Rostedt

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Mon, 29 Jul 2019 10:07:34 +0900
Eiichi Tsukata <[email protected]> wrote:

> If context tracking is enabled, causing page fault in preemptirq
> irq_enable or irq_disable events triggers the following RCU EQS warning.
>
> Reproducer:
>
> // CONFIG_PREEMPTIRQ_EVENTS=y
> // CONFIG_CONTEXT_TRACKING=y
> // CONFIG_RCU_EQS_DEBUG=y
> # echo 1 > events/preemptirq/irq_disable/enable
> # echo 1 > options/userstacktrace

So the problem is only with userstacktrace enabled?


> }
> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> index 4d8e99fdbbbe..031b51cb94d0 100644
> --- a/kernel/trace/trace_preemptirq.c
> +++ b/kernel/trace/trace_preemptirq.c
> @@ -10,6 +10,7 @@
> #include <linux/module.h>
> #include <linux/ftrace.h>
> #include <linux/kprobes.h>
> +#include <linux/context_tracking.h>
> #include "trace.h"
>
> #define CREATE_TRACE_POINTS
> @@ -49,9 +50,14 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
>
> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> {
> + enum ctx_state prev_state;
> +
> if (this_cpu_read(tracing_irq_cpu)) {
> - if (!in_nmi())
> + if (!in_nmi()) {

This is a very hot fast path (for tracing irqs off and such). Instead
of adding a check here for a case that is seldom used (userstacktrace
and tracing irqs on/off), move this to surround the userstack trace
code.

-- Steve


> + prev_state = exception_enter();
> trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
> + exception_exit(prev_state);
> + }
> tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
> this_cpu_write(tracing_irq_cpu, 0);
> }
> @@ -63,11 +69,16 @@ NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
>
> __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
> {
> + enum ctx_state prev_state;
> +
> if (!this_cpu_read(tracing_irq_cpu)) {
> this_cpu_write(tracing_irq_cpu, 1);
> tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
> - if (!in_nmi())
> + if (!in_nmi()) {
> + prev_state = exception_enter();
> trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
> + exception_exit(prev_state);
> + }
> }
>
> lockdep_hardirqs_off(CALLER_ADDR0);

2019-07-30 04:30:37

by Eiichi Tsukata

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

Thanks for comments.

On 2019/07/29 19:29, Peter Zijlstra wrote:
> On Sun, Jul 28, 2019 at 09:25:58PM -0700, Andy Lutomirski wrote:
>> On Sun, Jul 28, 2019 at 6:08 PM Eiichi Tsukata <[email protected]> wrote:
>
>>> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
>>> index 4d8e99fdbbbe..031b51cb94d0 100644
>>> --- a/kernel/trace/trace_preemptirq.c
>>> +++ b/kernel/trace/trace_preemptirq.c
>>> @@ -10,6 +10,7 @@
>>> #include <linux/module.h>
>>> #include <linux/ftrace.h>
>>> #include <linux/kprobes.h>
>>> +#include <linux/context_tracking.h>
>>> #include "trace.h"
>>>
>>> #define CREATE_TRACE_POINTS
>>> @@ -49,9 +50,14 @@ NOKPROBE_SYMBOL(trace_hardirqs_off);
>>>
>>> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
>>> {
>>> + enum ctx_state prev_state;
>>> +
>>> if (this_cpu_read(tracing_irq_cpu)) {
>>> - if (!in_nmi())
>>> + if (!in_nmi()) {
>>> + prev_state = exception_enter();
>>> trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
>>> + exception_exit(prev_state);
>>> + }
>>> tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
>>> this_cpu_write(tracing_irq_cpu, 0);
>>> }
>>
>> This seems a bit distressing. Now we're going to do a whole bunch of
>> context tracking transitions for each kernel entry. Would a better
>> fix be to change trace_hardirqs_on_caller to skip the trace event if
>> the previous state was already IRQs on and, more importantly, to skip
>> tracing IRQs off if IRQs were already off? The x86 code is very
>> careful to avoid ever having IRQs on and CONTEXT_USER at the same
>> time.
>
> I think they already (try to) do that; see 'tracing_irq_cpu'.
>

Or you mean something like this?
As for trace_hardirqs_off_caller:

diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 4d8e99fdbbbe..d39478bcf0f2 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -66,7 +66,7 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
- if (!in_nmi())
+ if (!in_nmi() && !irqs_disabled())
trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
}

Or

diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 4d8e99fdbbbe..e08c5c6ff2b3 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -66,8 +66,6 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
- if (!in_nmi())
- trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
}


As for trace_hardirqs_on_caller, it is called with IRQs off and CONTEXT_USER.
So even if we skip the trace event when the previous state was already IRQs on,
we will fall into the same situation.

2019-07-30 04:39:06

by Steven Rostedt

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Tue, 30 Jul 2019 10:50:36 +0900
Eiichi Tsukata <[email protected]> wrote:

> >
> > I think they already (try to) do that; see 'tracing_irq_cpu'.
> >
>
> Or you mean something like this?
> As for trace_hardirqs_off_caller:

You missed what Peter said.

>
> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> index 4d8e99fdbbbe..d39478bcf0f2 100644
> --- a/kernel/trace/trace_preemptirq.c
> +++ b/kernel/trace/trace_preemptirq.c
> @@ -66,7 +66,7 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
> if (!this_cpu_read(tracing_irq_cpu)) {
^^^^^^^^^^^^^^^
The above makes this called only the first time we disable interrupts.

> this_cpu_write(tracing_irq_cpu, 1);
> tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
> - if (!in_nmi())
> + if (!in_nmi() && !irqs_disabled())

This would always be false. This function is always called with irqs_disabled()!

So no, this is not what is meant.

> trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
> }
>
> Or
>
> diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
> index 4d8e99fdbbbe..e08c5c6ff2b3 100644
> --- a/kernel/trace/trace_preemptirq.c
> +++ b/kernel/trace/trace_preemptirq.c
> @@ -66,8 +66,6 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
> if (!this_cpu_read(tracing_irq_cpu)) {
> this_cpu_write(tracing_irq_cpu, 1);
> tracer_hardirqs_off(CALLER_ADDR0, caller_addr);
> - if (!in_nmi())
> - trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);

And this just removes the tracepoint completely?

-- Steve

> }
>
>
> As for trace_hardirqs_on_caller, it is called when IRQs off and CONTEXT_USER.
> So even though we skipped the trace event if the previous state was already IRQs on,
> we will fall into the same situation.

2019-07-30 04:43:32

by Eiichi Tsukata

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

Thanks for comments.

On 2019/07/30 0:21, Steven Rostedt wrote:
> On Mon, 29 Jul 2019 10:07:34 +0900
> Eiichi Tsukata <[email protected]> wrote:
>
>> If context tracking is enabled, causing page fault in preemptirq
>> irq_enable or irq_disable events triggers the following RCU EQS warning.
>>
>> Reproducer:
>>
>> // CONFIG_PREEMPTIRQ_EVENTS=y
>> // CONFIG_CONTEXT_TRACKING=y
>> // CONFIG_RCU_EQS_DEBUG=y
>> # echo 1 > events/preemptirq/irq_disable/enable
>> # echo 1 > options/userstacktrace
>
> So the problem is only with userstacktrace enabled?

It can happen whenever the tracing code causes a page fault in the
preemptirq events. For example, the following perf command also hits the warning:

# perf record -e 'preemptirq:irq_enable' -g ls


>>
>> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
>> {
>> + enum ctx_state prev_state;
>> +
>> if (this_cpu_read(tracing_irq_cpu)) {
>> - if (!in_nmi())
>> + if (!in_nmi()) {
>
> This is a very high fast path (for tracing irqs off and such). Instead
> of adding a check here for a case that is seldom used (userstacktrace
> and tracing irqs on/off). Move this to surround the userstack trace
> code.
>
> -- Steve

If the problem were only with userstacktrace, it would be reasonable to
surround only the userstack unwinder. But the situation is similar to
the previous "tracing vs CR2" case. As Peter taught me in
https://lore.kernel.org/lkml/[email protected]/
there is some other code likely to do user access.
So I surrounded the preemptirq events earlier.

2019-07-30 04:45:05

by Steven Rostedt

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Tue, 30 Jul 2019 11:00:42 +0900
Eiichi Tsukata <[email protected]> wrote:

> Thanks for comments.
>
> On 2019/07/30 0:21, Steven Rostedt wrote:
> > On Mon, 29 Jul 2019 10:07:34 +0900
> > Eiichi Tsukata <[email protected]> wrote:
> >
> >> If context tracking is enabled, causing page fault in preemptirq
> >> irq_enable or irq_disable events triggers the following RCU EQS warning.
> >>
> >> Reproducer:
> >>
> >> // CONFIG_PREEMPTIRQ_EVENTS=y
> >> // CONFIG_CONTEXT_TRACKING=y
> >> // CONFIG_RCU_EQS_DEBUG=y
> >> # echo 1 > events/preemptirq/irq_disable/enable
> >> # echo 1 > options/userstacktrace
> >
> > So the problem is only with userstacktrace enabled?
>
> It can happen when tracing code causes page fault in preemptirq events.
> For example, the following perf command also hit the warning:
>
> # perf record -e 'preemptirq:irq_enable' -g ls

Again,

That's not a irq trace event issue, that's a stack trace issue.

>
>
> >>
> >> __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> >> {
> >> + enum ctx_state prev_state;
> >> +
> >> if (this_cpu_read(tracing_irq_cpu)) {
> >> - if (!in_nmi())
> >> + if (!in_nmi()) {
> >
> > This is a very high fast path (for tracing irqs off and such). Instead
> > of adding a check here for a case that is seldom used (userstacktrace
> > and tracing irqs on/off). Move this to surround the userstack trace
> > code.
> >
> > -- Steve
>
> If the problem was only with userstacktrace, it will be reasonable to
> surround only the userstack unwinder. But the situation is similar to
> the previous "tracing vs CR2" case. As Peter taught me in
> https://lore.kernel.org/lkml/[email protected]/
> there are some other codes likely to to user access.
> So I surround preemptirq events earlier.

I disagree. The issue is with the attached callbacks that call
something (a stack unwinder) that can fault.

This is called whenever irqs are disabled. I say we surround the
culprit (the stack unwinder or stack trace) and not punish the irq
enable/disable events.

So NAK on this patch.

-- Steve

2019-07-30 10:05:50

by Joel Fernandes

Subject: Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

On Mon, Jul 29, 2019 at 10:15 PM Steven Rostedt <[email protected]> wrote:
[snip]
> > If the problem was only with userstacktrace, it will be reasonable to
> > surround only the userstack unwinder. But the situation is similar to
> > the previous "tracing vs CR2" case. As Peter taught me in
> > https://lore.kernel.org/lkml/[email protected]/
> > there are some other codes likely to to user access.
> > So I surround preemptirq events earlier.
>
> I disagree. The issue is with the attached callbacks that call
> something (a stack unwinder) that can fault.
>
> This is called whenever irqs are disabled. I say we surround the
> culprit (the stack unwinder or stack trace) and not punish the irq
> enable/disable events.

I agree with everything Steve said.

thanks,

- Joel