2015-05-13 16:08:33

by Shreyas B. Prabhu

[permalink] [raw]
Subject: [PATCH] tracing: Add comments explaining cpu online filter for trace events

trace_mm_page_pcpu_drain, trace_kmem_cache_free, trace_mm_page_free
and trace_tlb_flush can be potentially called from an offlined cpu.
Since trace points use RCU and RCU should not be used from offlined
cpus, we have checks to filter out such calls. Add comments to explain
this.

Signed-off-by: Shreyas B. Prabhu <[email protected]>
---
This applies on top of patches posted here:
https://lkml.org/lkml/2015/5/8/527

include/trace/events/kmem.h | 15 +++++++++++++++
include/trace/events/tlb.h | 5 +++++
2 files changed, 20 insertions(+)

diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 6cd975f..9883f2f 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -146,6 +146,11 @@ DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,

TP_ARGS(call_site, ptr),

+ /*
+ * This trace can be potentially called from an offlined cpu.
+ * Since trace points use RCU and RCU should not be used from
+ * offline cpus, filter such calls out.
+ */
TP_CONDITION(cpu_online(smp_processor_id()))
);

@@ -155,6 +160,11 @@ TRACE_EVENT_CONDITION(mm_page_free,

TP_ARGS(page, order),

+ /*
+ * This trace can be potentially called from an offlined cpu.
+ * Since trace points use RCU and RCU should not be used from
+ * offline cpus, filter such calls out.
+ */
TP_CONDITION(cpu_online(smp_processor_id())),

TP_STRUCT__entry(
@@ -263,6 +273,11 @@ TRACE_EVENT_CONDITION(mm_page_pcpu_drain,

TP_ARGS(page, order, migratetype),

+ /*
+ * This trace can be potentially called from an offlined cpu.
+ * Since trace points use RCU and RCU should not be used from
+ * offline cpus, filter such calls out.
+ */
TP_CONDITION(cpu_online(smp_processor_id())),

TP_STRUCT__entry(
diff --git a/include/trace/events/tlb.h b/include/trace/events/tlb.h
index 4250f36..2afc3ed 100644
--- a/include/trace/events/tlb.h
+++ b/include/trace/events/tlb.h
@@ -38,6 +38,11 @@ TRACE_EVENT_CONDITION(tlb_flush,
TP_PROTO(int reason, unsigned long pages),
TP_ARGS(reason, pages),

+ /*
+ * This trace can be potentially called from an offlined cpu.
+ * Since trace points use RCU and RCU should not be used from
+ * offline cpus, filter such calls out.
+ */
TP_CONDITION(cpu_online(smp_processor_id())),

TP_STRUCT__entry(
--
1.9.3


2015-05-13 16:21:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing: Add comments explaining cpu online filter for trace events

On Wed, 13 May 2015 21:37:43 +0530
"Shreyas B. Prabhu" <[email protected]> wrote:

> trace_mm_page_pcpu_drain, trace_kmem_cache_free, trace_mm_page_free
> and trace_tlb_flush can be potentially called from an offlined cpu.
> Since trace points use RCU and RCU should not be used from offlined
> cpus, we have checks to filter out such calls. Add comments to explain
> this.
>
> Signed-off-by: Shreyas B. Prabhu <[email protected]>
> ---
> This applies on top of patches posted here:
> https://lkml.org/lkml/2015/5/8/527
>
> include/trace/events/kmem.h | 15 +++++++++++++++
> include/trace/events/tlb.h | 5 +++++
> 2 files changed, 20 insertions(+)
>
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index 6cd975f..9883f2f 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -146,6 +146,11 @@ DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,
>
> TP_ARGS(call_site, ptr),
>
> + /*
> + * This trace can be potentially called from an offlined cpu.
> + * Since trace points use RCU and RCU should not be used from
> + * offline cpus, filter such calls out.
> + */
> TP_CONDITION(cpu_online(smp_processor_id()))
> );
>

Thanks for the comments, but can't these still be called with
preemption enabled. What happens when CONFIG_DEBUG_PREEMPT is set and
you enable these tracepoints. Wont it trigger a warning about
smp_processor_id() being used in preemptible code?

-- Steve

2015-05-14 14:17:12

by Shreyas B. Prabhu

[permalink] [raw]
Subject: Re: [PATCH] tracing: Add comments explaining cpu online filter for trace events



On Wednesday 13 May 2015 09:51 PM, Steven Rostedt wrote:
> On Wed, 13 May 2015 21:37:43 +0530
> "Shreyas B. Prabhu" <[email protected]> wrote:
>
>> trace_mm_page_pcpu_drain, trace_kmem_cache_free, trace_mm_page_free
>> and trace_tlb_flush can be potentially called from an offlined cpu.
>> Since trace points use RCU and RCU should not be used from offlined
>> cpus, we have checks to filter out such calls. Add comments to explain
>> this.
>>
>> Signed-off-by: Shreyas B. Prabhu <[email protected]>
>> ---
>> This applies on top of patches posted here:
>> https://lkml.org/lkml/2015/5/8/527
>>
>> include/trace/events/kmem.h | 15 +++++++++++++++
>> include/trace/events/tlb.h | 5 +++++
>> 2 files changed, 20 insertions(+)
>>
>> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
>> index 6cd975f..9883f2f 100644
>> --- a/include/trace/events/kmem.h
>> +++ b/include/trace/events/kmem.h
>> @@ -146,6 +146,11 @@ DEFINE_EVENT_CONDITION(kmem_free, kmem_cache_free,
>>
>> TP_ARGS(call_site, ptr),
>>
>> + /*
>> + * This trace can be potentially called from an offlined cpu.
>> + * Since trace points use RCU and RCU should not be used from
>> + * offline cpus, filter such calls out.
>> + */
>> TP_CONDITION(cpu_online(smp_processor_id()))
>> );
>>
>
> Thanks for the comments, but can't these still be called with
> preemption enabled. What happens when CONFIG_DEBUG_PREEMPT is set and
> you enable these tracepoints. Wont it trigger a warning about
> smp_processor_id() being used in preemptible code?
>
Yes. It does trigger "using smp_processor_id() in preemptible code"
warnings. But as you mentioned in the previous comments, we should be
safe even if the trace call happens from a preemptible section. Let me
play out the scenarios here again-

The task gets migrated after the smp_processor_id()
1. From an online cpu to another online cpu - No impact
2. From an online cpu to an offline cpu - Should never happen
3. From an offline cpu to an online cpu - IIUC, once a cpu has been
offlined it returns to cpu_idle_loop, discovers its offline and calls
arch_cpu_idle_dead. All this happens with preemption disabled. So this
scenario too should never happen.

So I don't see any downside to changing smp_processor_id() to
raw_smp_processor_id() which will suppress the warnings. If you agree
I'll send a patch doing this.

Another alternative which is perhaps worth considering is to change
__DO_TRACE itself to check for offline cpu, without a trace event
specifying the check. This will prevent any currently uncaught and any
future tracepoints from using RCU on offline cpus. But I guess it's
little extreme considering only a low fraction of tracepoints have
potential of being called from offline cpus.

Thanks,
Shreyas

2015-05-14 14:56:45

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing: Add comments explaining cpu online filter for trace events

On Thu, 14 May 2015 19:46:11 +0530
Shreyas B Prabhu <[email protected]> wrote:

> > Thanks for the comments, but can't these still be called with
> > preemption enabled. What happens when CONFIG_DEBUG_PREEMPT is set and
> > you enable these tracepoints. Wont it trigger a warning about
> > smp_processor_id() being used in preemptible code?
> >
> Yes. It does trigger "using smp_processor_id() in preemptible code"
> warnings. But as you mentioned in the previous comments, we should be
> safe even if the trace call happens from a preemptible section. Let me
> play out the scenarios here again-
>
> The task gets migrated after the smp_processor_id()
> 1. From an online cpu to another online cpu - No impact
> 2. From an online cpu to an offline cpu - Should never happen
> 3. From an offline cpu to an online cpu - IIUC, once a cpu has been
> offlined it returns to cpu_idle_loop, discovers its offline and calls
> arch_cpu_idle_dead. All this happens with preemption disabled. So this
> scenario too should never happen.
>
> So I don't see any downside to changing smp_processor_id() to
> raw_smp_processor_id() which will suppress the warnings. If you agree
> I'll send a patch doing this.

Yes, please use the raw_smp_processor_id(), and you can add the above
description about why it is safe to do so (in the comments).

>
> Another alternative which is perhaps worth considering is to change
> __DO_TRACE itself to check for offline cpu, without a trace event
> specifying the check. This will prevent any currently uncaught and any
> future tracepoints from using RCU on offline cpus. But I guess it's
> little extreme considering only a low fraction of tracepoints have
> potential of being called from offline cpus.

I think that's a bit extreme, as it would cause an impact to the speed
of tracepoints in the hot path.

-- Steve