When the bts tracer is removed while the traced task is running,
the write that clears the bts tracer pointer races with the context
switch code.

Read the tracer pointer once during a context switch.

When a new tracer is installed, the bts tracer is set in the ds context
before the tracer is initialized, in order to claim the context for that
tracer. If scheduling timestamps have been requested, this may result in
write accesses that use an uninitialized trace configuration.

Store the active tracing flags separately and set them only after
the tracing configuration has been initialized.

Signed-off-by: Markus Metzger <[email protected]>
---
Index: git-tip/arch/x86/kernel/ds.c
===================================================================
--- git-tip.orig/arch/x86/kernel/ds.c 2009-03-30 17:47:21.000000000 +0200
+++ git-tip/arch/x86/kernel/ds.c 2009-03-30 17:48:31.000000000 +0200
@@ -89,6 +89,9 @@ struct bts_tracer {
/* Buffer overflow notification function: */
bts_ovfl_callback_t ovfl;
+
+ /* Active flags affecting trace collection. */
+ unsigned int flags;
};
struct pebs_tracer {
@@ -799,6 +802,8 @@ void ds_suspend_bts(struct bts_tracer *t
if (!tracer)
return;
+ tracer->flags = 0;
+
task = tracer->ds.context->task;
if (!task || (task == current))
@@ -820,6 +825,8 @@ void ds_resume_bts(struct bts_tracer *tr
if (!tracer)
return;
+ tracer->flags = tracer->trace.ds.flags;
+
task = tracer->ds.context->task;
control = ds_cfg.ctl[dsf_bts];
@@ -1044,36 +1051,39 @@ void ds_switch_to(struct task_struct *pr
{
struct ds_context *prev_ctx = prev->thread.ds_ctx;
struct ds_context *next_ctx = next->thread.ds_ctx;
+ unsigned long debugctlmsr = next->thread.debugctlmsr;
if (prev_ctx) {
+ struct bts_tracer *tracer = prev_ctx->bts_master;
+
update_debugctlmsr(0);
- if (prev_ctx->bts_master &&
- (prev_ctx->bts_master->trace.ds.flags & BTS_TIMESTAMPS)) {
+ if (tracer && (tracer->flags & BTS_TIMESTAMPS)) {
struct bts_struct ts = {
.qualifier = bts_task_departs,
.variant.timestamp.jiffies = jiffies_64,
.variant.timestamp.pid = prev->pid
};
- bts_write(prev_ctx->bts_master, &ts);
+ bts_write(tracer, &ts);
}
}
if (next_ctx) {
- if (next_ctx->bts_master &&
- (next_ctx->bts_master->trace.ds.flags & BTS_TIMESTAMPS)) {
+ struct bts_tracer *tracer = next_ctx->bts_master;
+
+ if (tracer && (tracer->flags & BTS_TIMESTAMPS)) {
struct bts_struct ts = {
.qualifier = bts_task_arrives,
.variant.timestamp.jiffies = jiffies_64,
.variant.timestamp.pid = next->pid
};
- bts_write(next_ctx->bts_master, &ts);
+ bts_write(tracer, &ts);
}
wrmsrl(MSR_IA32_DS_AREA, (unsigned long)next_ctx->ds);
}
- update_debugctlmsr(next->thread.debugctlmsr);
+ update_debugctlmsr(debugctlmsr);
}
void ds_copy_thread(struct task_struct *tsk, struct task_struct *father)
---------------------------------------------------------------------
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen Germany
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
Registergericht: Muenchen HRB 47456 Ust.-IdNr.
VAT Registration No.: DE129385895
Citibank Frankfurt (BLZ 502 109 00) 600119052
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
On 03/31, Markus Metzger wrote:
>
> Read the tracer once during a context switch.
> ...
> @@ -1044,36 +1051,39 @@ void ds_switch_to(struct task_struct *pr
> {
> struct ds_context *prev_ctx = prev->thread.ds_ctx;
> struct ds_context *next_ctx = next->thread.ds_ctx;
> + unsigned long debugctlmsr = next->thread.debugctlmsr;
>
> if (prev_ctx) {
> + struct bts_tracer *tracer = prev_ctx->bts_master;
> +
> update_debugctlmsr(0);
>
> - if (prev_ctx->bts_master &&
> - (prev_ctx->bts_master->trace.ds.flags & BTS_TIMESTAMPS)) {
> + if (tracer && (tracer->flags & BTS_TIMESTAMPS)) {

In theory, we need barrier() after reading ->bts_master.

(Actually, I have seen bug reports where the compiler read the pointer
twice with code like the above.)
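
As a sketch of the barrier() point: the pointer is copied into a local once, and a compiler barrier after the load keeps the compiler from dropping the local and re-reading the shared field. This is a minimal userspace illustration assuming GCC or Clang. The structures are reduced stand-ins for those in ds.c, the BTS_TIMESTAMPS value and the timestamp_tracer() helper are hypothetical, and barrier() is defined locally to mirror the kernel macro.

```c
#include <stddef.h>

/* Compiler-only barrier, defined locally to mirror the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Stand-in value for illustration; not taken from the kernel headers. */
#define BTS_TIMESTAMPS 0x1u

/* Reduced, hypothetical stand-ins for the structures used in ds.c. */
struct bts_tracer {
	unsigned int flags;		/* active flags */
};

struct ds_context {
	struct bts_tracer *bts_master;	/* may be cleared concurrently */
};

/*
 * Hypothetical helper: return the tracer if timestamps are active,
 * otherwise NULL.
 *
 * ctx->bts_master is loaded into a local exactly once. The barrier()
 * tells the compiler that memory may have changed, so the value used
 * for the NULL check and for the flags test is the one loaded before
 * the barrier, not a second load of ctx->bts_master.
 */
static struct bts_tracer *timestamp_tracer(struct ds_context *ctx)
{
	struct bts_tracer *tracer = ctx->bts_master;

	barrier();

	if (tracer && (tracer->flags & BTS_TIMESTAMPS))
		return tracer;

	return NULL;
}
```

Note that this only constrains the compiler. It does not by itself order the accesses against another CPU that clears the pointer.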

Off-topic, but afaics, modulo bts_task_departs/bts_task_arrives, we
have identical code for prev_ctx/next_ctx; perhaps it makes sense to
add a helper which calls bts_write().

To clarify: even _if_ I am right and _if_ you agree, we can do this
later. I am not suggesting changing this patch right now.
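
One possible shape for such a helper, as a hedged userspace sketch rather than a proposed kernel change. The name write_switch_timestamp() is hypothetical, the structures are reduced stand-ins, and bts_write() is replaced by a mock that only records the last entry. In ds_switch_to() the timestamp would come from jiffies_64 and the pid from the task, which are passed in as plain parameters here.

```c
#include <stddef.h>

/* Stand-in value for illustration; not taken from the kernel headers. */
#define BTS_TIMESTAMPS 0x1u

/* Reduced stand-in for the qualifiers used in ds_switch_to(). */
enum bts_qualifier {
	bts_task_arrives,
	bts_task_departs
};

/* Reduced stand-in for the timestamp variant of struct bts_struct. */
struct bts_struct {
	enum bts_qualifier qualifier;
	struct {
		unsigned long long jiffies;
		int pid;
	} timestamp;
};

/* Reduced stand-in tracer; the last two fields exist only for the mock. */
struct bts_tracer {
	unsigned int flags;		/* active flags */
	struct bts_struct last;		/* mock: last record written */
	int nr_written;			/* mock: number of records written */
};

/* Mock of bts_write(): remembers the record instead of tracing it. */
static void bts_write(struct bts_tracer *tracer, const struct bts_struct *rec)
{
	tracer->last = *rec;
	tracer->nr_written++;
}

/*
 * Hypothetical helper shared by the prev and next paths. Only the
 * qualifier and the pid differ between the two call sites.
 */
static void write_switch_timestamp(struct bts_tracer *tracer,
				   enum bts_qualifier qualifier,
				   int pid, unsigned long long now)
{
	struct bts_struct ts = {
		.qualifier = qualifier,
		.timestamp.jiffies = now,
		.timestamp.pid = pid
	};

	/* Nothing to do without a tracer or without active timestamps. */
	if (!tracer || !(tracer->flags & BTS_TIMESTAMPS))
		return;

	bts_write(tracer, &ts);
}
```

With a helper along these lines, the prev path would pass bts_task_departs with prev->pid and the next path bts_task_arrives with next->pid, each using its once-read tracer pointer.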
Oleg.
>-----Original Message-----
>From: Oleg Nesterov [mailto:[email protected]]
>Sent: Wednesday, April 01, 2009 1:48 AM
>To: Metzger, Markus T
>Cc: [email protected]; [email protected]; [email protected]; [email protected];
>[email protected]; [email protected]; [email protected]; Villacis, Juan;
>[email protected]
>Subject: Re: [patch 1/21] x86, bts: fix race when bts tracer is removed
>
>On 03/31, Markus Metzger wrote:
>>
>> Read the tracer once during a context switch.
>> ...
>> @@ -1044,36 +1051,39 @@ void ds_switch_to(struct task_struct *pr
>> {
>> struct ds_context *prev_ctx = prev->thread.ds_ctx;
>> struct ds_context *next_ctx = next->thread.ds_ctx;
>> + unsigned long debugctlmsr = next->thread.debugctlmsr;
>>
>> if (prev_ctx) {
>> + struct bts_tracer *tracer = prev_ctx->bts_master;
>> +
>> update_debugctlmsr(0);
>>
>> - if (prev_ctx->bts_master &&
>> - (prev_ctx->bts_master->trace.ds.flags & BTS_TIMESTAMPS)) {
>> + if (tracer && (tracer->flags & BTS_TIMESTAMPS)) {
>
>In theory, we need barrier() after reading ->bts_master.
>
>(actually, I did see the bug reports when the compiler read the pointer
> twice with the code like above).
I guess the same is true for prev_ctx, next_ctx, and debugctlmsr, then.
Ingo,
would it be OK to resend this one patch with the barrier()s added?
>Off-topic, but afaics modulo bts_task_departs/bts_task_arrives we
>have the identical code for prev_ctx/next_ctx, perhaps it makes
>sense to make a helper which calls bts_write().
Agreed.
>To clarify, even _if_ I am right and _if_ you agree, we can do this
>later, I am not suggesting to change this patch right now.
Agreed, as well.
thanks and regards,
markus.
* Metzger, Markus T <[email protected]> wrote:
> I guess the same is true for prev_ctx, next_ctx, and debugctlmsr, then.
>
> Ingo,
> would it be OK to resend this one patch with the barrier()s added?

Sure - but note that I have put the series on hold until you get
broad Acks from Oleg for the ptrace bits. Please address the review
feedback from Oleg and propagate his acks into the commit logs as
well. Oleg is finding bugs we missed in the past, so his review work
is very valuable.

Also - minor patch submission technicality observation: currently
each of your mails goes into a separate discussion thread, making it
hard to review them as a group.
The preferred way to send such series is to use "git format-patch" +
"git send-email". (That will give a nice 0/21 mail and a properly
threaded discussion with proper References header lines.)
Thanks,
Ingo