2008-08-19 19:51:55

by Vegard Nossum

Subject: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Hi,

With latest -git (1fca25427482387689fa27594c992a961d98768f), I got
this on reading from /dev/cpu/*/* while hot-unplugging cpu1.

------------[ cut here ]------------
WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/arch/x86/kernel/ipi.c:123
send_IPI_mask_bitmask+0xc3/0xe0()
Pid: 3881, comm: cat Not tainted 2.6.27-rc3-00464-g1fca254 #12
[<c013591f>] warn_on_slowpath+0x4f/0x80
[<c010a300>] ? native_sched_clock+0x80/0x110
[<c010a335>] ? native_sched_clock+0xb5/0x110
[<c015ae5a>] ? __lock_acquire+0x27a/0xa00
[<c015635b>] ? trace_hardirqs_off+0xb/0x10
[<c010a335>] ? native_sched_clock+0xb5/0x110
[<c01563bd>] ? put_lock_stats+0xd/0x30
[<c0118a43>] send_IPI_mask_bitmask+0xc3/0xe0
[<c01017c8>] send_IPI_mask+0x8/0x10
[<c0118307>] native_send_call_func_single_ipi+0x27/0x30
[<c0160a2b>] generic_exec_single+0x7b/0x80
[<c0160adf>] smp_call_function_single+0x5f/0x110
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a597>] _rdmsr_on_cpu+0x27/0x60
[<c037a5ea>] rdmsr_safe_on_cpu+0x1a/0x20
[<c011733e>] msr_read+0x6e/0xa0
[<c01a87b4>] vfs_read+0x94/0x130
[<c01172d0>] ? msr_read+0x0/0xa0
[<c01a8b5d>] sys_read+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================
---[ end trace fe4338948cb73be2 ]---
BUG: soft lockup - CPU#0 stuck for 61s! [cat:3881]
irq event stamp: 14632440
hardirqs last enabled at (14632439): [<c015968b>] trace_hardirqs_on+0xb/0x10
hardirqs last disabled at (14632440): [<c015635b>] trace_hardirqs_off+0xb/0x10
softirqs last enabled at (14632434): [<c013a4d1>] __do_softirq+0xe1/0x100
softirqs last disabled at (14632427): [<c013a595>] do_softirq+0xa5/0xb0
Pid: 3881, comm: cat Tainted: G W (2.6.27-rc3-00464-g1fca254 #12)
EIP: 0060:[<c0160952>] EFLAGS: 00200202 CPU: 0
EIP is at csd_flag_wait+0x12/0x20
EAX: f5f31ef0 EBX: c215dc60 ECX: ffffb300 EDX: 000008fa
ESI: 00200292 EDI: c215dc68 EBP: f5f31ec0 ESP: f5f31ec0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: 087d0a5c CR3: 33c36000 CR4: 000006d0
DR0: c0ebd43c DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c0160a15>] generic_exec_single+0x65/0x80
[<c0160adf>] smp_call_function_single+0x5f/0x110
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a597>] _rdmsr_on_cpu+0x27/0x60
[<c037a5ea>] rdmsr_safe_on_cpu+0x1a/0x20
[<c011733e>] msr_read+0x6e/0xa0
[<c01a87b4>] vfs_read+0x94/0x130
[<c01172d0>] ? msr_read+0x0/0xa0
[<c01a8b5d>] sys_read+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================

At least SSH is not usable after this, but I guess SysRq and such
would work (the "CPU stuck" message still showed up after the apparent
freeze).


Vegard

PS: This is probably not a regression.

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036


2008-08-20 01:37:49

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Tue, Aug 19, 2008 at 09:51:44PM +0200, Vegard Nossum wrote:
> Hi,
>
> With latest -git (1fca25427482387689fa27594c992a961d98768f), I got
> this on reading from /dev/cpu/*/* while hot-unplugging cpu1.

It's generally known the oprofile doesn't support CPU hotplug well.
Someone needs to make a project out of fixing it properly. Right now
it's just a "don't do that when it hurts"

-Andi

2008-08-20 06:26:29

by Vegard Nossum

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Wed, Aug 20, 2008 at 3:39 AM, Andi Kleen <[email protected]> wrote:
> On Tue, Aug 19, 2008 at 09:51:44PM +0200, Vegard Nossum wrote:
>> Hi,
>>
>> With latest -git (1fca25427482387689fa27594c992a961d98768f), I got
>> this on reading from /dev/cpu/*/* while hot-unplugging cpu1.
>
> It's generally known the oprofile doesn't support CPU hotplug well.
> Someone needs to make a project out of fixing it properly. Right now
> it's just a "don't do that when it hurts"

Hm. What you say is true, but this one in particular has nothing to do
with oprofile! It has something to do with reading /dev/cpu/*/msr
while hot-unplugging cpu1:

[<c011733e>] msr_read+0x6e/0xa0
[<c01a87b4>] vfs_read+0x94/0x130

I wasn't using oprofile when this happened. So I think it should also
be considered a separate issue. Though yes -- CPU hotplug in general
tends to break a lot of things.


Vegard


2008-08-22 00:40:09

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Wed, Aug 20, 2008 at 08:26:19AM +0200, Vegard Nossum wrote:
> On Wed, Aug 20, 2008 at 3:39 AM, Andi Kleen <[email protected]> wrote:
> > On Tue, Aug 19, 2008 at 09:51:44PM +0200, Vegard Nossum wrote:
> >> Hi,
> >>
> >> With latest -git (1fca25427482387689fa27594c992a961d98768f), I got
> >> this on reading from /dev/cpu/*/* while hot-unplugging cpu1.
> >
> > It's generally known the oprofile doesn't support CPU hotplug well.
> > Someone needs to make a project out of fixing it properly. Right now
> > it's just a "don't do that when it hurts"
>
> Hm. What you say is true, but this one in particular has nothing to do
> with oprofile! It has something to do with reading /dev/cpu/*/msr
> while hot-unplugging cpu1:
>
> [<c011733e>] msr_read+0x6e/0xa0
> [<c01a87b4>] vfs_read+0x94/0x130
>
> I wasn't using oprofile when this happened. So I think it should also
> be considered a separate issue. Though yes -- CPU hotplug in general
> tends to break a lot of things.

From my reading of the msr code, we check that the cpu is online in ->open,
but we never check it again, and also, we make no guarantees that it
won't go away before we ->read or even ->close it.

Would adding a get_cpu/put_cpu across the open/close solve this?
Peter?

Dave

--
http://www.codemonkey.org.uk

2008-08-22 02:14:16

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Dave Jones wrote:
> >
> > Hm. What you say is true, but this one in particular has nothing to do
> > with oprofile! It has something to do with reading /dev/cpu/*/msr
> > while hot-unplugging cpu1:
> >
> > [<c011733e>] msr_read+0x6e/0xa0
> > [<c01a87b4>] vfs_read+0x94/0x130
> >
> > I wasn't using oprofile when this happened. So I think it should also
> > be considered a separate issue. Though yes -- CPU hotplug in general
> > tends to break a lot of things.
>
> From my reading of the msr code, we check that the cpu is online in ->open,
> but we never check it again, and also, we make no guarantees that it
> won't go away before we ->read or even ->close it.
>
> Would adding a get_cpu/put_cpu across the open/close solve this?
> Peter?
>

A get_cpu/put_cpu across the whole open..close sequence would seem to
be, ahem, rude, since userspace could hold it for an arbitrary amount of
time (plus, there is no guarantee that they are invoked on the same CPU.)

The cpuid driver has the same problem, obviously.

get_online_cpus() and put_online_cpus() around the call to
{rd,wr}msr_safe_on_cpu() should work; and the CPU hotplug documentation
seems to claim that we can just disable preemption around those calls,
which is exactly what get_cpu()..put_cpu() does, so I guess
get_cpu()..put_cpu() here is fine. Now, the big question is: should
this really be done in the MSR/CPUID drivers, or should it be done in
smp_call_function_single(), which is the generic code invoked by this?

It seems to me that doing it in smp_call_function_single() would be more
correct as it's already protected by get_cpu()..put_cpu() and a
cpu_online() test in there should not be expensive in comparison to the
whole rest of the code.
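
To make the idea above concrete, here is a rough user-space sketch in
Python. It is a toy model only: it is not the attached patch and not
kernel code. The names ToyMachine, call_function_single and fake_rdmsr
are hypothetical, and the choice of -ENXIO as the error value is an
assumption made for illustration.

```python
import errno


class ToyMachine:
    """User-space stand-in for the set of online CPUs (hypothetical model)."""

    def __init__(self, nr_cpus):
        self.online = set(range(nr_cpus))

    def cpu_down(self, cpu):
        # Model a hot-unplug: the CPU simply leaves the online set.
        self.online.discard(cpu)

    def call_function_single(self, cpu, func, *args):
        """Run func "on" cpu, refusing if that CPU is not online.

        Mirrors the idea of a cpu_online() test inside
        smp_call_function_single(). Returns (error, result).
        The -ENXIO value is an assumption of this sketch.
        """
        if cpu not in self.online:
            return -errno.ENXIO, None
        return 0, func(cpu, *args)


def fake_rdmsr(cpu, reg):
    # Placeholder for a real MSR read; returns a recognisable value.
    return (cpu << 32) | reg


machine = ToyMachine(nr_cpus=2)
# "open" time: cpu1 is online, so an open-time check would pass.
assert 1 in machine.online
# cpu1 is unplugged between open and read.
machine.cpu_down(1)
err, value = machine.call_function_single(1, fake_rdmsr, 0x10)
print(err, value)
```

In this model the read on the unplugged CPU comes back with an error
instead of an IPI being sent to a CPU that is no longer there; whether
the intervening layers pass that error on is a separate question.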

You may want to see if this patch fixes the problem; it does *NOT* have
the correct error behaviour (some of the intervening layers don't
propagate errors), but it should make the fault go away.

-hpa


Attachments:
smp-unplug-fault.patch (1.13 kB)

2008-08-22 02:27:00

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

> You may want to see if this patch fixes the problem; it does *NOT* have
> the correct error behaviour (some of the intervening layers don't
> propagate errors), but it should make the fault go away.

The alternative would be to just take out those msr_on_cpu()
interfaces again. Right now they are useless in the kernel,
but still cause problems.

They were only added for OpenVZ's vCPUs which they back then
promised me would hit mainline soon. But that was some time
ago and there wasn't much progress on this.

-Andi

2008-08-22 06:25:42

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Andi Kleen wrote:
>> You may want to see if this patch fixes the problem; it does *NOT* have
>> the correct error behaviour (some of the intervening layers don't
>> propagate errors), but it should make the fault go away.
>
> The alternative would be to just take out those msr_on_cpu()
> interfaces again. Right now they are useless in the kernel,
> but still cause problems.
>
> They were only added for OpenVZ's vCPUs which they back then
> promised me would hit mainline soon. But that was some time
> ago and there wasn't much progress on this.
>
> -Andi

We still need the equivalent functionality, though. The midlayer
(msr_on_cpu) may be pointless, but that doesn't change the fact that
putting this functionality in the lower layer (smp_call_function_single)
makes more sense.

-hpa

2008-08-22 09:33:30

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

> We still need the equivalent functionality, though. The midlayer
> (msr_on_cpu) may be pointless, but that doesn't change the fact that
> putting this functionality in the lower layer (smp_call_function_single)
> makes more sense.

Assuming you can actually have interrupts enabled at this point
and be otherwise ready to do smp_call_function_single (e.g. cpu hotplug
locking etc.)

For a lot of MSR accesses in more complicated subsystems like cpufreq
that requires complications. I would think for many circumstances it's
better to simply set affinity of the thread before at a higher level.

In hindsight I think it was my mistake to ever merge that.
I admit I never liked it, but just merged it because I wasn't able
to come up with a strong enough counter argument back then.

-Andi

2008-08-22 11:11:49

by Alexey Dobriyan

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Fri, Aug 22, 2008 at 04:28:41AM +0200, Andi Kleen wrote:
> > You may want to see if this patch fixes the problem; it does *NOT* have
> > the correct error behaviour (some of the intervening layers don't
> > propagate errors), but it should make the fault go away.
>
> The alternative would be to just take out those msr_on_cpu()
> interfaces again. Right now they are useless in the kernel,
> but still cause problems.
>
> They were only added for OpenVZ's vCPUs which they back then
> promised me would hit mainline soon.

There were no such promises made. Reread thread.

> But that was some time ago and there wasn't much progress on this.

2008-08-22 16:42:31

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Andi Kleen wrote:
>> We still need the equivalent functionality, though. The midlayer
>> (msr_on_cpu) may be pointless, but that doesn't change the fact that
>> putting this functionality in the lower layer (smp_call_function_single)
>> makes more sense.
>
> Assuming you can actually have interrupts enabled at this point
> and be otherwise ready to do smp_call_function_single (e.g. cpu hotplug
> locking etc.)
>
> For a lot of MSR accesses in more complicated subsystems like cpufreq
> that requires complications. I would think for many circumstances it's
> better to simply set affinity of the thread before at a higher level.
>
> In hindsight I think it was my mistake to ever merge that.
> I admit I never liked it, but just merged it because I wasn't able
> to come up with a strong enough counter argument back then.

Well, smp_call_function_single already does all necessary locking; it
makes more sense for it to check that what it's about to call still
exists while inside the lock, instead of requiring the higher layers to
guarantee that cannot happen on it. This is simply a matter of the cost
of checking at this point being quite low.

-hpa

2008-08-23 06:42:33

by Jeremy Fitzhardinge

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

H. Peter Anvin wrote:
> Well, smp_call_function_single already does all necessary locking; it
> makes more sense for it to check that what it's about to call still
> exists while inside the lock, instead of requiring the higher layers
> to guarantee that cannot happen on it. This is simply a matter of the
> cost of checking at this point being quite low.

It does already, doesn't it? Hm, smp_call_function_mask() ands the
provided mask with the online mask, but it doesn't look like
smp_call_function_single() does the equivalent.
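
The difference can be sketched with plain bitmasks in Python. This is
only an illustration of the masking idea, not the kernel code; the
helper names online_targets and single_target_ok are hypothetical.

```python
def online_targets(requested_mask, online_mask):
    """Mask variant: CPUs that are not online are silently dropped."""
    return requested_mask & online_mask


def single_target_ok(cpu, online_mask):
    """Single-CPU equivalent of the same test: is this one CPU online?"""
    return bool((1 << cpu) & online_mask)


online = 0b01  # only cpu0 online, cpu1 has been unplugged
print(bin(online_targets(0b11, online)))
print(single_target_ok(1, online))
```

With a mask, an offline CPU just falls out of the target set. With a
single target there is nothing left to call, so the caller needs to be
told about it somehow.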

J

2008-08-23 06:47:54

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Jeremy Fitzhardinge wrote:
> H. Peter Anvin wrote:
>> Well, smp_call_function_single already does all necessary locking; it
>> makes more sense for it to check that what it's about to call still
>> exists while inside the lock, instead of requiring the higher layers
>> to guarantee that cannot happen on it. This is simply a matter of the
>> cost of checking at this point being quite low.
>
> It does already, doesn't it? Hm, smp_call_function_mask() ands the
> provided mask with the online mask, but it doesn't look like
> smp_call_function_single() does the equivalent.

It doesn't, and that's how this bug was introduced. It's a trivial add
(see test patch already posted) and should hardly matter in terms of
execution time.

I'll write up a clean patch with all the error propagation tomorrow or
Sunday.

-hpa

2008-08-24 17:59:45

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Vegard Nossum wrote:
>
> Okay, now I used cpufreq-selector to change to "ondemand" governor,
> and MHz goes back to 3000. Weird. Why would "performance" governor put
> my machine to a constant 375?
>

That would be a problem... I presume this problem is independent of the
patch, though?

-hpa

2008-08-24 18:16:41

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Sun, Aug 24, 2008 at 07:45:48PM +0200, Vegard Nossum wrote:
> Removing acpi=off helps with the CPU detection problem. The kernel is
> still really slow, though. From /proc/cpuinfo:
>
> processor : 1
> vendor_id : GenuineIntel
> cpu family : 15
> model : 6
> model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
> stepping : 5
> cpu MHz : 375.000
> cache size : 2048 KB
>
> Why is MHz on 375!? I tried cpufreq-selector, but nothing changed. Maybe
>
> calling acpi_cpufreq_init+0x0/0x90
> initcall acpi_cpufreq_init+0x0/0x90 returned -19 after 0 msecs

-ENODEV. Because you don't have frequency scaling capable CPU.

> Okay, now I used cpufreq-selector to change to "ondemand" governor,
> and MHz goes back to 3000. Weird. Why would "performance" governor put
> my machine to a constant 375?

Probably because you're using p4-clockmod, and it's crap.

Dave

--
http://www.codemonkey.org.uk

2008-08-24 17:17:44

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Vegard Nossum wrote:
>
> Hm.
>
> Kernel fails to detect cpu1 at all.
>
> I am currently unsure of whether it's your patch or not. But it's the
> same config that I've been booting for ages (and I copy it over for
> each new kernel version I check out).
>
> Processor #0 (Bootup-CPU)
> I/O APIC #2 Version 32 at 0xFEC00000.
> Enabling APIC mode: Flat. Using 1 I/O APICs
> Processors: 1
> SMP: Allowing 1 CPUs, 0 hotplug CPUs
> mapped APIC to ffffb000 (fee00000)
> mapped IOAPIC to ffffa000 (fec00000)
> Allocating PCI resources starting at 50000000 (gap: 40000000:bee00000)
> PERCPU: Allocating 1221764 bytes of per cpu data
> NR_CPUS: 7, nr_cpu_ids: 1, nr_node_ids 1
>
> I really don't get it. Is this something that can be caused by your
> patch _at all_ ?
>

Could you try this patch? It should (hopefully) tell us if there are any
such invocations and what the call trace looks like.

-hpa


Attachments:
smp-unplug-fault2.patch (1.14 kB)

2008-08-24 17:22:45

by Vegard Nossum

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Sun, Aug 24, 2008 at 7:17 PM, H. Peter Anvin <[email protected]> wrote:
> Vegard Nossum wrote:
>>
>> Hm.
>>
>> Kernel fails to detect cpu1 at all.
>>
>> I am currently unsure of whether it's your patch or not. But it's the
>> same config that I've been booting for ages (and I copy it over for
>> each new kernel version I check out).
>>
>> Processor #0 (Bootup-CPU)
>> I/O APIC #2 Version 32 at 0xFEC00000.
>> Enabling APIC mode: Flat. Using 1 I/O APICs
>> Processors: 1
>> SMP: Allowing 1 CPUs, 0 hotplug CPUs
>> mapped APIC to ffffb000 (fee00000)
>> mapped IOAPIC to ffffa000 (fec00000)
>> Allocating PCI resources starting at 50000000 (gap: 40000000:bee00000)
>> PERCPU: Allocating 1221764 bytes of per cpu data
>> NR_CPUS: 7, nr_cpu_ids: 1, nr_node_ids 1
>>
>> I really don't get it. Is this something that can be caused by your
>> patch _at all_ ?
>>
>
> Could you try this patch? It should (hopefully) tell us if there are any
> such invocations and what the call trace looks like.

I'm sorry, I _just_ reverted your patch and tested the bare kernel...
but it still only detects cpu0 :-(

Apart from that, it's also incredibly slow and I get some
"end_request: I/O error, dev fd0, sector 0" messages. Start-up (init 3
on a F7) takes closer to 10 minutes. Will now take a closer look at my
config.

Oh. I _just_ noticed a completely different change -- I added acpi=off
to my boot line *blush*

Will now remove it and retry your original patch.


Vegard


2008-08-24 16:43:46

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Vegard Nossum wrote:
>
> Hm.
>
> Kernel fails to detect cpu1 at all.
>
> I am currently unsure of whether it's your patch or not. But it's the
> same config that I've been booting for ages (and I copy it over for
> each new kernel version I check out).
>
> Processor #0 (Bootup-CPU)
> I/O APIC #2 Version 32 at 0xFEC00000.
> Enabling APIC mode: Flat. Using 1 I/O APICs
> Processors: 1
> SMP: Allowing 1 CPUs, 0 hotplug CPUs
> mapped APIC to ffffb000 (fee00000)
> mapped IOAPIC to ffffa000 (fec00000)
> Allocating PCI resources starting at 50000000 (gap: 40000000:bee00000)
> PERCPU: Allocating 1221764 bytes of per cpu data
> NR_CPUS: 7, nr_cpu_ids: 1, nr_node_ids 1
>
> I really don't get it. Is this something that can be caused by your
> patch _at all_ ?
>

Well, if smp_call_function_single() is called during the CPU up
sequence, without the CPU having been added to the online mask, then
yes, it could. The most likely place would be from a notifier.

That makes it ugly. Need to track down the reason.

-hpa

2008-08-24 09:21:06

by Vegard Nossum

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Fri, Aug 22, 2008 at 4:13 AM, H. Peter Anvin <[email protected]> wrote:
> It seems to me that doing it in smp_call_function_single() would be more
> correct as it's already protected by get_cpu()..put_cpu() and a cpu_online()
> test in there should not be expensive in comparison to the whole rest of the
> code.
>
> You may want to see if this patch fixes the problem; it does *NOT* have the
> correct error behaviour (some of the intervening layers don't propagate
> errors), but it should make the fault go away.

Hm.

Kernel fails to detect cpu1 at all.

I am currently unsure of whether it's your patch or not. But it's the
same config that I've been booting for ages (and I copy it over for
each new kernel version I check out).

Processor #0 (Bootup-CPU)
I/O APIC #2 Version 32 at 0xFEC00000.
Enabling APIC mode: Flat. Using 1 I/O APICs
Processors: 1
SMP: Allowing 1 CPUs, 0 hotplug CPUs
mapped APIC to ffffb000 (fee00000)
mapped IOAPIC to ffffa000 (fec00000)
Allocating PCI resources starting at 50000000 (gap: 40000000:bee00000)
PERCPU: Allocating 1221764 bytes of per cpu data
NR_CPUS: 7, nr_cpu_ids: 1, nr_node_ids 1

I really don't get it. Is this something that can be caused by your
patch _at all_ ?


Vegard


2008-08-24 17:45:58

by Vegard Nossum

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Sun, Aug 24, 2008 at 7:22 PM, Vegard Nossum <[email protected]> wrote:
>>> Kernel fails to detect cpu1 at all.
> I'm sorry, I _just_ reverted your patch and tested the bare kernel...
> but it still only detects cpu0 :-(
>
> Apart from that, it's also incredibly slow and I get some
> "end_request: I/O error, dev fd0, sector 0" messages. Start-up (init 3
> on a F7) takes closer to 10 minutes. Will now take a closer look at my
> config.
>
> Oh. I _just_ noticed a completely different change -- I added acpi=off
> to my boot line *blush*

Removing acpi=off helps with the CPU detection problem. The kernel is
still really slow, though. From /proc/cpuinfo:

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 5
cpu MHz : 375.000
cache size : 2048 KB

Why is MHz on 375!? I tried cpufreq-selector, but nothing changed. Maybe

calling acpi_cpufreq_init+0x0/0x90
initcall acpi_cpufreq_init+0x0/0x90 returned -19 after 0 msecs

There's also this:

SMP: Allowing 2 CPUs, 0 hotplug CPUs

(but CPU hotplug still works, is the line above about something
different, like physical hotplug?)

Apart from that, with your patch applied, hotplug seems to work OK (no
warnings).

Okay, now I used cpufreq-selector to change to "ondemand" governor,
and MHz goes back to 3000. Weird. Why would "performance" governor put
my machine to a constant 375?

Thanks,


Vegard


2008-08-25 18:31:19

by Vegard Nossum

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Sun, Aug 24, 2008 at 8:13 PM, Dave Jones <[email protected]> wrote:
> On Sun, Aug 24, 2008 at 07:45:48PM +0200, Vegard Nossum wrote:
> > Why is MHz on 375!? I tried cpufreq-selector, but nothing changed. Maybe
> >
> > calling acpi_cpufreq_init+0x0/0x90
> > initcall acpi_cpufreq_init+0x0/0x90 returned -19 after 0 msecs
>
> -ENODEV. Because you don't have frequency scaling capable CPU.
>
> > Okay, now I used cpufreq-selector to change to "ondemand" governor,
> > and MHz goes back to 3000. Weird. Why would "performance" governor put
> > my machine to a constant 375?
>
> Probably because you're using p4-clockmod, and it's crap.

On Sun, Aug 24, 2008 at 7:59 PM, H. Peter Anvin <[email protected]> wrote:
> That would be a problem... I presume this problem is independent of the
> patch, though?

I sorted it -- thanks! It turned out to be pretty obscure; my tty
setting for the receiving end of the serial console was set to echo.
So when the machine booted, it was echoing lots of characters into the
Fedora 7 init, which would prompt for the starting of cpuspeed
initscript. Turning off echo for the tty was what triggered the
slowness; removing cpuspeed from the runlevel entirely solved the
problem.

Don't know why cpuspeed would select a governor which runs the CPU at
a constant 300 MHz, though.


Vegard


2008-08-25 18:34:01

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

> Probably because you're using p4-clockmod, and it's crap.

We really should bite the bullet and just remove it. People
run into this all the time and I bet you can count the people who
actually use it consciously and usefully on one hand.

Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
option explicitly.

-Andi

2008-08-25 18:41:28

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 08:31:04PM +0200, Vegard Nossum wrote:

> Fedora 7 init, which would prompt for the starting of cpuspeed
> initscript. Turning off echo for the tty was what triggered the
> slowness; removing cpuspeed from the runlevel entirely solved the
> problem.
>
> Don't know why cpuspeed would select a governor which runs the CPU at
> a constant 300 MHz, though.

p4-clockmod is the only cpufreq driver that can run on your hardware.
There's nothing better. A while back, Fedora stopped loading
(and even building) p4-clockmod, because it sucks so bad.
I can't remember when we made that change, but it sounds like it must
have been a post F7 thing.

Dave

--
http://www.codemonkey.org.uk

2008-08-25 18:58:25

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 08:36:11PM +0200, Andi Kleen wrote:
> > Probably because you're using p4-clockmod, and it's crap.
>
> We really should bite the bullet and just remove it. People
> run into this all the time and I bet you can count the people who
> actually use it consciously and usefully on one hand.
>
> Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
> option explicitly.

We can't really remove it until ACPI processor driver has a better
response than 'thermal event, argh!, shut down'.

When that happens, I'll be glad to see it go.

Dave

--
http://www.codemonkey.org.uk

2008-08-25 19:09:19

by H. Peter Anvin

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

Andi Kleen wrote:
>> Probably because you're using p4-clockmod, and it's crap.
>
> We really should bite the bullet and just remove it. People
> run into this all the time and I bet you can count the people who
> actually use it consciously and usefully on one hand.
>
> Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
> option explicitly.
>

CONFIG_BROKEN?

-hpa

2008-08-25 19:17:07

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 12:08:23PM -0700, H. Peter Anvin wrote:
> Andi Kleen wrote:
> >> Probably because you're using p4-clockmod, and it's crap.
> >
> > We really should bite the bullet and just remove it. People
> > run into this all the time and I bet you can count the people who
> > actually use it consciously and usefully on one hand.
> >
> > Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
> > option explicitly.
>
> CONFIG_BROKEN?

It's not really broken (at least in the CONFIG_BROKEN sense), it just sucks
when used in the wrong situations. (Which is 99% of the use-cases people
try to use it for).

Dave

--
http://www.codemonkey.org.uk

2008-08-25 19:37:20

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 02:54:51PM -0400, Dave Jones wrote:
> On Mon, Aug 25, 2008 at 08:36:11PM +0200, Andi Kleen wrote:
> > > Probably because you're using p4-clockmod, and it's crap.
> >
> > We really should bite the bullet and just remove it. People
> > run into this all the time and I bet you can count the people who
> > actually use it consciously and usefully on one hand.
> >
> > Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
> > option explicitly.
>
> We can't really remove it until ACPI processor driver has a better
> response than 'thermal event, argh!, shut down'.

It only does that when the critical trip point is reached (which
basically means that the BIOS tells it -- "I'm on fire"). What else should
it do in your opinion when this happens?

-Andi

2008-08-25 19:53:53

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 09:39:26PM +0200, Andi Kleen wrote:
> On Mon, Aug 25, 2008 at 02:54:51PM -0400, Dave Jones wrote:
> > On Mon, Aug 25, 2008 at 08:36:11PM +0200, Andi Kleen wrote:
> > > > Probably because you're using p4-clockmod, and it's crap.
> > >
> > > We really should bite the bullet and just remove it. People
> > > run into this all the time and I bet you can count the people who
> > > actually use it consciously and usefully on one hand.
> > >
> > > Or at least only make it run when the user sets a "I_REALLY_KNOW_WHAT_I_AM_DOING"
> > > option explicitly.
> >
> > We can't really remove it until ACPI processor driver has a better
> > response than 'thermal event, argh!, shut down'.
>
> It only does that when the critical trip point is reached (which
> basically means that the BIOS tells it -- "I'm on fire"). What else should
> it do in your opinion when this happens?

On some systems (for which there aren't BIOS updates) the trip points are
set too low. If we get a thermal event that was caused by temporary
increased workload, temperature will drop off again when that workload
is complete.

For sustained workloads we'd get additional thermal events, at which
time we make a decision "ok, we've throttled as far as we can, and
things are still going badly, power off".

In the event of a failed fan or similar, shutting down is obviously
the right thing to do, and we'd get further thermal events after
throttling which would allow us to do so.
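
As a rough sketch of that policy in Python (a toy model only, not
anything the ACPI driver or p4-clockmod actually does; the names
ThermalPolicy, Action and max_throttle_steps are hypothetical, and the
step count is an arbitrary assumption):

```python
from enum import Enum


class Action(Enum):
    THROTTLE = "throttle"
    POWER_OFF = "power_off"


class ThermalPolicy:
    """Throttle on each thermal event; power off once throttling is exhausted."""

    def __init__(self, max_throttle_steps=3):
        self.max_throttle_steps = max_throttle_steps
        self.throttle_level = 0

    def on_thermal_event(self):
        # Still room to slow down: throttle one more step.
        if self.throttle_level < self.max_throttle_steps:
            self.throttle_level += 1
            return Action.THROTTLE
        # Throttled as far as possible and events keep coming: give up.
        return Action.POWER_OFF

    def on_cooled_down(self):
        # Temperature dropped again (e.g. the workload finished).
        self.throttle_level = 0


policy = ThermalPolicy(max_throttle_steps=2)
for _ in range(3):
    print(policy.on_thermal_event().value)
```

A temporary spike costs a throttle step or two and then resets, while
a failed fan keeps generating events until the power-off branch is
reached.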

Dave

--
http://www.codemonkey.org.uk

2008-08-25 20:34:36

by Andi Kleen

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

> On some systems (for which there aren't BIOS updates) the trip points are
> set too low.

There were patches floating to make this configurable. I was always
a little sceptical of them, but they exist.

> If we get a thermal event that was caused by temporary
> increased workload, temperature will drop off again when that workload
> is complete.

But none of the cpufreq governors do this. They only care about
load, not about temperature.

> For sustained workloads we'd get additional thermal events, at which
> time we make a decision "ok, we've throttled as far as we can, and
> things are still going badly, power off".

That is what the ACPI driver does when the trip point is reached.

> In the event of a failed fan or similar, shutting down is obviously
> the right thing to do, and we'd get further thermal events after
> throttling which would allow us to do so.

So you're saying processor_thermal should let the system cook
for some time first before really taking action?

-Andi

2008-08-25 20:50:30

by Dave Jones

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, Aug 25, 2008 at 10:36:49PM +0200, Andi Kleen wrote:

> > If we get a thermal event that was caused by temporary
> > increased workload, temperature will drop off again when that workload
> > is complete.
>
> But none of the cpufreq governors do this. They only care about
> load, not about temperature.

Which is good enough to stop p4 laptops from shutting down as
soon as they've finished booting up.

> > For sustained workloads we'd get additional thermal events, at which
> > time we make a decision "ok, we've throttled as far as we can, and
> > things are still going badly, power off".
>
> That is what the ACPI driver does when the trip point is reached.

yes, except for that "we've throttled" part.

Dave

--
http://www.codemonkey.org.uk

2008-08-25 21:24:37

by Arjan van de Ven

Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

On Mon, 25 Aug 2008 16:47:02 -0400
Dave Jones <[email protected]> wrote:

> On Mon, Aug 25, 2008 at 10:36:49PM +0200, Andi Kleen wrote:
>
> > > If we get a thermal event that was caused by temporary
> > > increased workload, temperature will drop off again when that
> > > workload is complete.
> >
> > But none of the cpufreq governors do this. They only care about
> > load, not about temperature.
>
> Which is good enough to stop p4 laptops from shutting down as
> soon as they've finished booting up.

that's such an enormous gamble it's not funny.


really; if your bios has broken trippoints we should use the kernel
commandline to disable them (and a dmi blacklist if the amount of
bioses that have it wrong is low.. maybe combined with a date based
threshold).

Just praying that p4clockmod keeps it kinda low enough is not the
answer.



--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org