2007-06-28 03:20:28

by Fernando Lopez-Lezcano

Subject: 2.6.21.5-rt17 on lenovo t61, some BUG's (lukewarm IQ?)

Hi Ingo, this is happening in a brand new laptop, a Lenovo t61 with a
7700 processor and the Santa Rosa chipset.

Lukewarm IQ detected in hotplug locking
BUG: at kernel/cpu.c:44 lock_cpu_hotplug()
[<c0405f88>] dump_trace+0x64/0x105
[<c0406041>] show_trace_log_lvl+0x18/0x2c
[<c040664e>] show_trace+0xf/0x11
[<c04066cf>] dump_stack+0x12/0x14
[<c0447519>] lock_cpu_hotplug+0x51/0x74
[<c0426251>] sched_setaffinity+0x13/0xe5
[<c045bee7>] __synchronize_sched+0x35/0x5a
[<c04244f1>] arch_reinit_sched_domains+0x13/0x29
[<c0424543>] sched_power_savings_store+0x3c/0x49
[<c0554680>] sysdev_class_store+0x1e/0x22
[<c04b32f7>] sysfs_write_file+0xa3/0xc6
[<c047beb3>] vfs_write+0xa8/0x154
[<c047c4ce>] sys_write+0x41/0x67
[<c0404f7c>] syscall_call+0x7/0xb
[<b7fb3410>] 0xb7fb3410
=======================
Lukewarm IQ detected in hotplug locking
BUG: at kernel/cpu.c:44 lock_cpu_hotplug()
[<c0405f88>] dump_trace+0x64/0x105
[<c0406041>] show_trace_log_lvl+0x18/0x2c
[<c040664e>] show_trace+0xf/0x11
[<c04066cf>] dump_stack+0x12/0x14
[<c0447519>] lock_cpu_hotplug+0x51/0x74
[<c0426251>] sched_setaffinity+0x13/0xe5
[<c045bf09>] __synchronize_sched+0x57/0x5a
[<c04244f1>] arch_reinit_sched_domains+0x13/0x29
[<c0424543>] sched_power_savings_store+0x3c/0x49
[<c0554680>] sysdev_class_store+0x1e/0x22
[<c04b32f7>] sysfs_write_file+0xa3/0xc6
[<c047beb3>] vfs_write+0xa8/0x154
[<c047c4ce>] sys_write+0x41/0x67
[<c0404f7c>] syscall_call+0x7/0xb
[<b7fb3410>] 0xb7fb3410
=======================

I'm attaching the full output of dmesg FYI, let me know if you need
something else to make sense of this.

[BTW, I tried unsuccessfully to boot rt18 today on one of my CCRMA
machines, but the boot hung when trying to start the acpi daemon - this
was on FC6, I'll try to find out more tomorrow. We are seeing some hangs
with rt17 that I have not tried to diagnose yet]

-- Fernando


Attachments:
dmesg.1 (44.44 kB)

2007-06-30 19:24:24

by Ingo Molnar

Subject: Re: 2.6.21.5-rt17 on lenovo t61, some BUG's (lukewarm IQ?)


* Fernando Lopez-Lezcano wrote:

> Hi Ingo, this is happening in a brand new laptop, a Lenovo t61 with a
> 7700 processor and the Santa Rosa chipset.
>
> Lukewarm IQ detected in hotplug locking
> BUG: at kernel/cpu.c:44 lock_cpu_hotplug()

hm, that's an upstream kernel message. Cpu-hotplug locking is ... a bit
messy still. Does it otherwise work? (it should only affect sw-suspend
on SMP)

> [BTW, I tried unsuccessfully to boot rt18 today on one of my CCRMA
> machines, but the boot hung when trying to start the acpi daemon - this
> was on FC6, I'll try to find out more tomorrow. We are seeing some
> hangs with rt17 that I have not tried to diagnose yet]

do you still see this?

Ingo

2007-06-30 19:42:21

by Fernando Lopez-Lezcano

Subject: Re: 2.6.21.5-rt17 on lenovo t61, some BUG's (lukewarm IQ?)

On Sat, 2007-06-30 at 21:24 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano wrote:
>
> > Hi Ingo, this is happening in a brand new laptop, a Lenovo t61 with a
> > 7700 processor and the Santa Rosa chipset.
> >
> > Lukewarm IQ detected in hotplug locking
> > BUG: at kernel/cpu.c:44 lock_cpu_hotplug()
>
> hm, that's an upstream kernel message. Cpu-hotplug locking is ... a bit
> messy still. Does it otherwise work? (it should only affect sw-suspend
> on SMP)

It seems to work (I'm typing this on that machine running rt17). But I'm
still testing things and I have certainly not gotten to the point of
really trying suspend. It is a brand new laptop with a newish chipset and
stuff, so lotsa small details to fix :-)

> > [BTW, I tried unsuccessfully to boot rt18 today on one of my CCRMA
> > machines, but the boot hung when trying to start the acpi daemon - this
> > was on FC6, I'll try to find out more tomorrow. We are seeing some
> > hangs with rt17 that I have not tried to diagnose yet]
>
> do you still see this?

Yes I do. My previous report was not very precise.

Booting rt18 (this is on FC6) seems more sluggish than rt17 (very
subjective). Eventually it gets to "starting udev" and hangs there. I
presume it is trying to start a device driver unsuccessfully, but I
don't get any more information and can't get to single user mode to
find out more.

Yesterday I noticed that _sometimes_ "starting udev" in rt17 takes some
time. So this may be something that got worse in rt18 as opposed to
something completely new in rt18.

Sorry for the vagueness, and let me know if there's something I can do to
try to find out what's going on (looking at rc.sysinit I found there are
some kernel boot options that allow for more debugging but I did not
have time to try them).

As for the hangs in rt17, I have not been able to find anything out. They
are very sporadic (surprise!) and so far I have found no clues or
commonality in what was happening when the hangs occurred.

-- Fernando