2007-05-25 15:00:36

by Chuck Ebbert

[permalink] [raw]
Subject: Multiple free during oprofile unload

After applying:

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6c977aad03a18019015035958c65b6729cd0574c

i386: Fix K8/core2 oprofile on multiple CPUs

We are seeing multiple free of the same object during oprofile shutdown.

May 25 07:02:56 vt kernel: oprofile: using NMI interrupt.
May 25 07:06:41 vt kernel: slab error in verify_redzone_free(): cache `size-32':
double free detected
May 25 07:06:41 vt kernel:
May 25 07:06:41 vt kernel: Call Trace:
May 25 07:06:41 vt kernel: [<ffffffff802d93e9>] __slab_error+0x24/0x26
May 25 07:06:41 vt kernel: [<ffffffff80231fcc>] cache_free_debugcheck+0x110/0x222
May 25 07:06:41 vt kernel: [<ffffffff8845dbd5>] :oprofile:free_msrs+0x28/0x66
May 25 07:06:41 vt kernel: [<ffffffff8020b28d>] kfree+0xd7/0x27e
May 25 07:06:41 vt kernel: [<ffffffff8845dbd5>] :oprofile:free_msrs+0x28/0x66
May 25 07:06:41 vt kernel: [<ffffffff8845dc47>] :oprofile:nmi_shutdown+0x34/0x36
May 25 07:06:41 vt kernel: [<ffffffff8845c070>]
:oprofile:oprofile_shutdown+0x23/0x46
May 25 07:06:41 vt kernel: [<ffffffff8845cdf6>]
:oprofile:event_buffer_release+0x16/0x46
May 25 07:06:41 vt kernel: [<ffffffff8021237d>] __fput+0xca/0x190
May 25 07:06:41 vt kernel: [<ffffffff8022d9a7>] fput+0x14/0x16
May 25 07:06:41 vt kernel: [<ffffffff80224656>] filp_close+0x66/0x71
May 25 07:06:41 vt kernel: [<ffffffff80239632>] put_files_struct+0x6d/0xc0
May 25 07:06:41 vt kernel: [<ffffffff80215126>] do_exit+0x296/0x80d
May 25 07:06:41 vt kernel: [<ffffffff80248600>] debug_mutex_init+0x0/0x45
May 25 07:06:41 vt kernel: [<ffffffff8024c115>] sys_exit_group+0x12/0x14
May 25 07:06:41 vt kernel: [<ffffffff8025c11e>] system_call+0x7e/0x83

for_each_possible_cpu(i) {
kfree(cpu_msrs[i].counters);
cpu_msrs[i].counters = NULL;
kfree(cpu_msrs[i].controls);
cpu_msrs[i].controls = NULL;
}

Seems the patch makes all the cpu_msrs[] point to the same area?


2007-05-25 16:03:16

by Andi Kleen

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload


> for_each_possible_cpu(i) {
> kfree(cpu_msrs[i].counters);
> cpu_msrs[i].counters = NULL;
> kfree(cpu_msrs[i].controls);
> cpu_msrs[i].controls = NULL;
> }
>
> Seems the patch makes all the cpu_msrs[] point to the same area?

Yes I forgot to fix the free path. Will submit a patch.

BTW that is why I'm not so happy with you being so trigger happy with stable
submissions.

-Andi



>


2007-05-25 16:19:00

by Dave Jones

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload

On Fri, May 25, 2007 at 05:42:35PM +0200, Andi Kleen wrote:
>
> > for_each_possible_cpu(i) {
> > kfree(cpu_msrs[i].counters);
> > cpu_msrs[i].counters = NULL;
> > kfree(cpu_msrs[i].controls);
> > cpu_msrs[i].controls = NULL;
> > }
> >
> > Seems the patch makes all the cpu_msrs[] point to the same area?
>
> Yes I forgot to fix the free path. Will submit a patch.
>
> BTW that is why I'm not so happy with you being so trigger happy with stable
> submissions.

So what, the alternative is leave oprofile busted in 2.6.21.x even longer?

Dave

--
http://www.codemonkey.org.uk

2007-05-25 17:14:09

by Andi Kleen

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload


>
> So what, the alternative is leave oprofile busted in 2.6.21.x even longer?

stable is about backporting _well tested_ patches from mainline.

If the stable request comes in from someone 5 minutes after the mainline submission
I don't see how that testing should happen. A grace period of a few days
at least is usually appropiate for normal patches (unless it's a security
bug)

-Andi

2007-05-25 17:38:30

by Dave Jones

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload

On Fri, May 25, 2007 at 07:13:52PM +0200, Andi Kleen wrote:
>
> >
> > So what, the alternative is leave oprofile busted in 2.6.21.x even longer?
>
> stable is about backporting _well tested_ patches from mainline.
>
> If the stable request comes in from someone 5 minutes after the mainline submission
> I don't see how that testing should happen.

And how many people have picked up on the fact that the fix in 2.6.22rc is busted
in the last week ?
Ie, we'd be waiting probably until after .22 goes out before we found out it's
still horked, and leaving .21.x busted for another three weeks.

If the functionality is completely busted in -stable, adding a patch that's
isolated to that component can't really make things much worse.
All we've done here is move the breakage.

Dave

--
http://www.codemonkey.org.uk

2007-05-25 18:03:16

by Andi Kleen

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload


> If the functionality is completely busted in -stable,

It's not; e.g. it works great on my P4 testbox with 2.6.21

> adding a patch that's
> isolated to that component can't really make things much worse.
> All we've done here is move the breakage.

I don't think stable is the right place for development.

-Andi

2007-05-25 18:33:18

by Alan

[permalink] [raw]
Subject: Re: Multiple free during oprofile unload

On Fri, 25 May 2007 20:02:58 +0200
Andi Kleen <[email protected]> wrote:

>
> > If the functionality is completely busted in -stable,
>
> It's not; e.g. it works great on my P4 testbox with 2.6.21
>
> > adding a patch that's
> > isolated to that component can't really make things much worse.
> > All we've done here is move the breakage.
>
> I don't think stable is the right place for development.

I'd agree entirely with Dave - if you are applying a fix to something
that is currently totally broken which may make it work and which doesn't
affect any other bit of code then it goes into the stable tree.

Absolutes like "no xyz in the stable tree" don't work in the real world
with real users. Intelligence does.

2007-05-26 02:28:18

by Chris Wright

[permalink] [raw]
Subject: [PATCH] x86: fix oprofile double free (was Re: Multiple free during oprofile unload)

* Alan Cox ([email protected]) wrote:
> I'd agree entirely with Dave - if you are applying a fix to something
> that is currently totally broken which may make it work and which doesn't
> affect any other bit of code then it goes into the stable tree.

And, in this case we're in luck. It's not released in any -stable tree
yet (it's queued for the next release). So there's plenty of time to
fix it up before next -stable release.

Something like below should fix it.

thanks,
-chris
--

Subject: [PATCH] x86: fix oprofile double free

From: Chris Wright <[email protected]>

Chuck reports that the recent fix from Andi to oprofile
6c977aad03a18019015035958c65b6729cd0574c introduces a double
free. Each cpu's cpu_msrs is setup to point to cpu 0's, which
causes free_msrs to free cpu 0's pointers for_each_possible_cpu.
Rather than copy the pointers, do a deep copy instead.

Signed-off-by: Chris Wright <[email protected]>
---

arch/i386/oprofile/nmi_int.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/i386/oprofile/nmi_int.c b/arch/i386/oprofile/nmi_int.c
index a7c0783..0c39443 100644
--- a/arch/i386/oprofile/nmi_int.c
+++ b/arch/i386/oprofile/nmi_int.c
@@ -211,8 +211,14 @@ static int nmi_setup(void)
/* Assume saved/restored counters are the same on all CPUs */
model->fill_in_addresses(&cpu_msrs[0]);
for_each_possible_cpu (cpu) {
- if (cpu != 0)
- cpu_msrs[cpu] = cpu_msrs[0];
+ if (cpu != 0) {
+ memcpy(cpu_msrs[cpu].counters, cpu_msrs[0].counters,
+ sizeof(struct op_msr) * model->num_counters);
+
+ memcpy(cpu_msrs[cpu].controls, cpu_msrs[0].controls,
+ sizeof(struct op_msr) * model->num_controls);
+ }
+
}
on_each_cpu(nmi_save_registers, NULL, 0, 1);
on_each_cpu(nmi_cpu_setup, NULL, 0, 1);

2007-05-26 07:43:49

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] x86: fix oprofile double free (was Re: Multiple free during oprofile unload)


> And, in this case we're in luck. It's not released in any -stable tree
> yet (it's queued for the next release). So there's plenty of time to
> fix it up before next -stable release.
>
> Something like below should fix it.

I already got a similar patch, but not tested yet.

-Andi

2007-05-26 14:05:29

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH] x86: fix oprofile double free (was Re: Multiple free during oprofile unload)

* Andi Kleen ([email protected]) wrote:
>
> > And, in this case we're in luck. It's not released in any -stable tree
> > yet (it's queued for the next release). So there's plenty of time to
> > fix it up before next -stable release.
> >
> > Something like below should fix it.
>
> I already got a similar patch, but not tested yet.

OK. I tested the one I sent, on Opteron.