2007-11-09 08:57:32

by SANGOI DINO LEONARDO

Subject: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop

Hi,

My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
2.6.24.rc2.
It works fine with 2.6.23 and older.

I first saw this bug while running Fedora rawhide, so you can find hardware
info and boot logs at https://bugzilla.redhat.com/show_bug.cgi?id=312201.

I did a git bisect, and got this:

$ git bisect bad
4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
commit 4f86d3a8e297205780cca027e974fd5f81064780
Author: Len Brown <[email protected]>
Date: Wed Oct 3 18:58:00 2007 -0400

cpuidle: consolidate 2.6.22 cpuidle branch into one patch
[SNIP full commit log]

:040000 040000 fadedf003c64838a73d172d6b7c0046d88dedd5e ebb8a32b3bc49d731c13f2812148ae553bc1a533 M arch
:040000 040000 039a15fe07324bb0481eb1006571f6523c56c254 e3251f5abcc19417472488f523da968e37698ddd M drivers
:040000 040000 89a350e5adc6dfd82adbb9c2f327557cd7a95334 14c738510d6c772e9a8db4bc494ce8fb3434a5fb M include
:040000 040000 e1d33c4a2558da0fb68f7e98145abf69e885ba94 7987a9110d0749aa7442a3f4c8c6c7d7a3df9426 M kernel

$ git bisect log
git-bisect start
# bad: [b4f555081fdd27d13e6ff39d455d5aefae9d2c0c] Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
git-bisect bad b4f555081fdd27d13e6ff39d455d5aefae9d2c0c
# good: [bbf25010f1a6b761914430f5fca081ec8c7accd1] Linux 2.6.23
git-bisect good bbf25010f1a6b761914430f5fca081ec8c7accd1
# good: [1f06862e11f23ebc99438c592be9c92560d78548] rt2x00: Add new rt73usb USB ID
git-bisect good 1f06862e11f23ebc99438c592be9c92560d78548
# good: [2c6221483169ddd4c04797cd7296ed4fe52fcdd7] Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday().
git-bisect good 2c6221483169ddd4c04797cd7296ed4fe52fcdd7
# bad: [56d61a0e26c5a61c66d1ac259a59960295939da9] Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
git-bisect bad 56d61a0e26c5a61c66d1ac259a59960295939da9
# good: [ec2626815bf9a9922e49820b03e670e833f3ca3c] Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
git-bisect good ec2626815bf9a9922e49820b03e670e833f3ca3c
# bad: [c00046c279a2521075250fad682ca0acc10d4fd7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
git-bisect bad c00046c279a2521075250fad682ca0acc10d4fd7
# bad: [e9a404580ccaeb31dd2a976f9929c4f9eb6f3540] nfs: Fix build break with CONFIG_NFS_V4=n
git-bisect bad e9a404580ccaeb31dd2a976f9929c4f9eb6f3540
# bad: [4800be295c34268fd3211d49828bfaa6bf62867f] Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
git-bisect bad 4800be295c34268fd3211d49828bfaa6bf62867f
# good: [a2883dfa2e4a94b24109b2bfe735561e50cc44b4] Pull thermal into release branch
git-bisect good a2883dfa2e4a94b24109b2bfe735561e50cc44b4
# good: [731aa5fd9971a5163845fbe55de63d686a11da0a] Pull bugzilla-8709 into release branch
git-bisect good 731aa5fd9971a5163845fbe55de63d686a11da0a
# good: [910b40468a9ce3f2f5d48c5d260329c27d45adb5] kbuild: introduce cc-cross-prefix
git-bisect good 910b40468a9ce3f2f5d48c5d260329c27d45adb5
# bad: [00a2b433557f10736e8a02de619b3e9052556c12] Pull acpica into test branch
git-bisect bad 00a2b433557f10736e8a02de619b3e9052556c12
# bad: [de85871a9a53c00cae4c3a70849b5eaad0eb38b2] Pull cpuidle into test branch
git-bisect bad de85871a9a53c00cae4c3a70849b5eaad0eb38b2
# bad: [e196441bdf2dbf0526b28a6829c39557c236d611] ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle
git-bisect bad e196441bdf2dbf0526b28a6829c39557c236d611
# bad: [4f86d3a8e297205780cca027e974fd5f81064780] cpuidle: consolidate 2.6.22 cpuidle branch into one patch
git-bisect bad 4f86d3a8e297205780cca027e974fd5f81064780
--------

Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
full config is needed).

If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
Setting CONFIG_CPU_IDLE=n also gives me a working kernel.

Ask me if more info is needed (please CC me).

Thanks,

Dino


2007-11-09 10:03:44

by Andrew Morton

Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop


(cc's added)

On Fri, 9 Nov 2007 09:47:02 +0100 SANGOI DINO LEONARDO <[email protected]> wrote:

> Hi,
>
> My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
> 2.6.24.rc2.
> It works fine with 2.6.23 and older.
>
> I first saw this bug while running Fedora rawhide, so you can find hardware
> info and boot logs at https://bugzilla.redhat.com/show_bug.cgi?id=312201.
>
> I did a git bisect, and got this:
>
> $ git bisect bad
> 4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
> commit 4f86d3a8e297205780cca027e974fd5f81064780
> Author: Len Brown <[email protected]>
> Date: Wed Oct 3 18:58:00 2007 -0400
>
> cpuidle: consolidate 2.6.22 cpuidle branch into one patch
> [SNIP full commit log]
>
> :040000 040000 fadedf003c64838a73d172d6b7c0046d88dedd5e
> ebb8a32b3bc49d731c13f2812148ae553bc1a533 M arch
> :040000 040000 039a15fe07324bb0481eb1006571f6523c56c254
> e3251f5abcc19417472488f523da968e37698ddd M drivers
> :040000 040000 89a350e5adc6dfd82adbb9c2f327557cd7a95334
> 14c738510d6c772e9a8db4bc494ce8fb3434a5fb M include
> :040000 040000 e1d33c4a2558da0fb68f7e98145abf69e885ba94
> 7987a9110d0749aa7442a3f4c8c6c7d7a3df9426 M kernel
>
> $ git bisect log
> git-bisect start
> # bad: [b4f555081fdd27d13e6ff39d455d5aefae9d2c0c] Merge branch 'for-linus'
> of git://git.kernel.dk/linux-2.6-block
> git-bisect bad b4f555081fdd27d13e6ff39d455d5aefae9d2c0c
> # good: [bbf25010f1a6b761914430f5fca081ec8c7accd1] Linux 2.6.23
> git-bisect good bbf25010f1a6b761914430f5fca081ec8c7accd1
> # good: [1f06862e11f23ebc99438c592be9c92560d78548] rt2x00: Add new rt73usb
> USB ID
> git-bisect good 1f06862e11f23ebc99438c592be9c92560d78548
> # good: [2c6221483169ddd4c04797cd7296ed4fe52fcdd7] Fix discrepancy between
> VDSO based gettimeofday() and sys_gettimeofday().
> git-bisect good 2c6221483169ddd4c04797cd7296ed4fe52fcdd7
> # bad: [56d61a0e26c5a61c66d1ac259a59960295939da9] Merge branch 'for-linus'
> of git://git390.osdl.marist.edu/pub/scm/linux-2.6
> git-bisect bad 56d61a0e26c5a61c66d1ac259a59960295939da9
> # good: [ec2626815bf9a9922e49820b03e670e833f3ca3c] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
> git-bisect good ec2626815bf9a9922e49820b03e670e833f3ca3c
> # bad: [c00046c279a2521075250fad682ca0acc10d4fd7] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
> git-bisect bad c00046c279a2521075250fad682ca0acc10d4fd7
> # bad: [e9a404580ccaeb31dd2a976f9929c4f9eb6f3540] nfs: Fix build break with
> CONFIG_NFS_V4=n
> git-bisect bad e9a404580ccaeb31dd2a976f9929c4f9eb6f3540
> # bad: [4800be295c34268fd3211d49828bfaa6bf62867f] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
> git-bisect bad 4800be295c34268fd3211d49828bfaa6bf62867f
> # good: [a2883dfa2e4a94b24109b2bfe735561e50cc44b4] Pull thermal into release
> branch
> git-bisect good a2883dfa2e4a94b24109b2bfe735561e50cc44b4
> # good: [731aa5fd9971a5163845fbe55de63d686a11da0a] Pull bugzilla-8709 into
> release branch
> git-bisect good 731aa5fd9971a5163845fbe55de63d686a11da0a
> # good: [910b40468a9ce3f2f5d48c5d260329c27d45adb5] kbuild: introduce
> cc-cross-prefix
> git-bisect good 910b40468a9ce3f2f5d48c5d260329c27d45adb5
> # bad: [00a2b433557f10736e8a02de619b3e9052556c12] Pull acpica into test
> branch
> git-bisect bad 00a2b433557f10736e8a02de619b3e9052556c12
> # bad: [de85871a9a53c00cae4c3a70849b5eaad0eb38b2] Pull cpuidle into test
> branch
> git-bisect bad de85871a9a53c00cae4c3a70849b5eaad0eb38b2
> # bad: [e196441bdf2dbf0526b28a6829c39557c236d611] ACPI: cpuidle: port idle
> timer suspend/resume workaround to cpuidle
> git-bisect bad e196441bdf2dbf0526b28a6829c39557c236d611
> # bad: [4f86d3a8e297205780cca027e974fd5f81064780] cpuidle: consolidate
> 2.6.22 cpuidle branch into one patch
> git-bisect bad 4f86d3a8e297205780cca027e974fd5f81064780
> --------
>
> Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
> full config is needed).
>
> If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
> Setting CONFIG_CPU_IDLE=n also gives me a working kernel.
>
> Ask me if more info is needed (please CC me).
>
> Thanks,
>
> Dino

Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop

> On Fri, 9 Nov 2007 09:47:02 +0100 SANGOI DINO LEONARDO <[email protected]> wrote:
> > Hi,
> >
> > My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
> > 2.6.24.rc2.
> > It works fine with 2.6.23 and older.
> >
> > I first saw this bug while running Fedora rawhide, so you can find hardware
> > info and boot logs at https://bugzilla.redhat.com/show_bug.cgi?id=312201.
> >
> > I did a git bisect, and got this:
> >
> > $ git bisect bad
> > 4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
> > commit 4f86d3a8e297205780cca027e974fd5f81064780
> > Author: Len Brown <[email protected]>
> > Date: Wed Oct 3 18:58:00 2007 -0400
> >
> > cpuidle: consolidate 2.6.22 cpuidle branch into one patch

... snip ...

> > Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
> > full config is needed).
> >
> > If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
> > Setting CONFIG_CPU_IDLE=n also gives me a working kernel.

I am just thinking aloud ...

Boot log in the bugzilla shows:

CPU0: AMD Turion(tm) 64 Mobile ML-34 stepping 02

So it seems that the hardware just dislikes CONFIG_CPU_IDLE.

I haven't dealt with the cpuidle code in the past.
Now I am wondering on which platforms that code was verified.
And yes, I know the code was in -mm for a while.
But maybe the test coverage on AMD platforms was not that high?

What about making CONFIG_CPU_IDLE dependent on EXPERIMENTAL for
the time being and removing EXPERIMENTAL when some more testing has
been done?


Regards,

Andreas

--
Operating | AMD Saxony Limited Liability Company & Co. KG,
System | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Research | Register Court Dresden: HRA 4896, General Partner authorized
Center | to represent: AMD Saxony LLC (Wilmington, Delaware, US)
(OSRC) | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



2007-11-09 18:10:55

by Pallipadi, Venkatesh

Subject: RE: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop



>-----Original Message-----
>From: Andrew Morton [mailto:[email protected]]
>Sent: Friday, November 09, 2007 2:03 AM
>To: SANGOI DINO LEONARDO
>Cc: [email protected]; Rafael J. Wysocki; Brown,
>Len; Pallipadi, Venkatesh; [email protected]
>Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running
>udev on my laptop
>
>
>(cc's added)
>
>On Fri, 9 Nov 2007 09:47:02 +0100 SANGOI DINO LEONARDO
><[email protected]> wrote:
>
>> Hi,
>>
>> My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
>> 2.6.24.rc2.
>> It works fine with 2.6.23 and older.
>>
>> I first saw this bug while running Fedora rawhide, so you can find hardware
>> info and boot logs at https://bugzilla.redhat.com/show_bug.cgi?id=312201.
>>
>> I did a git bisect, and got this:
>>
>> $ git bisect bad
>> 4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
>> commit 4f86d3a8e297205780cca027e974fd5f81064780
>> Author: Len Brown <[email protected]>
>> Date: Wed Oct 3 18:58:00 2007 -0400
>>
>> cpuidle: consolidate 2.6.22 cpuidle branch into one patch
>> [SNIP full commit log]
>>
<snip>
> --------
>>
>> Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
>> full config is needed).
>>
>> If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
>> Setting CONFIG_CPU_IDLE=n also gives me a working kernel.
>>
>> Ask me if more info is needed (please CC me).
>>
>> Thanks,
>>
>> Dino


Dino,

Thanks for all the dumps and information in bugzilla. I am looking at it
to find the root cause of the failure and I should have more updates later
today. Can you also post the full config you are using with rc1 and rc2?

Thanks,
Venki

2007-11-10 00:56:43

by Pallipadi, Venkatesh

Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop

On Fri, Nov 09, 2007 at 10:10:43AM -0800, Pallipadi, Venkatesh wrote:
>
>
> >-----Original Message-----
> >From: Andrew Morton [mailto:[email protected]]
> >Sent: Friday, November 09, 2007 2:03 AM
> >To: SANGOI DINO LEONARDO
> >Cc: [email protected]; Rafael J. Wysocki; Brown,
> >Len; Pallipadi, Venkatesh; [email protected]
> >Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running
> >udev on my laptop
> >
> >
> >(cc's added)
> >
> >On Fri, 9 Nov 2007 09:47:02 +0100 SANGOI DINO LEONARDO
> ><[email protected]> wrote:
> >
> >> Hi,
> >>
> >> My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
> >> 2.6.24.rc2.
> >> It works fine with 2.6.23 and older.
> >>
> >> I first saw this bug while running Fedora rawhide, so you can find hardware
> >> info and boot logs at https://bugzilla.redhat.com/show_bug.cgi?id=312201.
> >>
> >> I did a git bisect, and got this:
> >>
> >> $ git bisect bad
> >> 4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
> >> commit 4f86d3a8e297205780cca027e974fd5f81064780
> >> Author: Len Brown <[email protected]>
> >> Date: Wed Oct 3 18:58:00 2007 -0400
> >>
> >> cpuidle: consolidate 2.6.22 cpuidle branch into one patch
> >> [SNIP full commit log]
> >>
> <snip>
> > --------
> >>
> >> Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
> >> full config is needed).
> >>
> >> If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
> >> Setting CONFIG_CPU_IDLE=n also gives me a working kernel.
> >>
> >> Ask me if more info is needed (please CC me).
> >>
> >> Thanks,
> >>
> >> Dino
>
>

Dino,

Can you try the patch below on top of rc2 and see whether it fixes the problem?
Looking at the code, it should. If it does not, can you send me the output of
acpidump from your system? That will help me look further into this. You can
get acpidump from the latest pmtools package here:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

Thanks,
Venki


Test patch for the bug report at
https://bugzilla.redhat.com/show_bug.cgi?id=312201

Signed-off-by: Venki Pallipadi <[email protected]>

Index: linux-2.6.24-rc/drivers/acpi/processor_idle.c
===================================================================
--- linux-2.6.24-rc.orig/drivers/acpi/processor_idle.c
+++ linux-2.6.24-rc/drivers/acpi/processor_idle.c
@@ -1502,23 +1502,28 @@ static int acpi_idle_enter_bm(struct cpu
 	} else {
 		acpi_idle_update_bm_rld(pr, cx);
 
-		spin_lock(&c3_lock);
-		c3_cpu_count++;
-		/* Disable bus master arbitration when all CPUs are in C3 */
-		if (c3_cpu_count == num_online_cpus())
-			acpi_set_register(ACPI_BITREG_ARB_DISABLE, 1);
-		spin_unlock(&c3_lock);
+		if (pr->flags.bm_check && pr->flags.bm_control) {
+			spin_lock(&c3_lock);
+			c3_cpu_count++;
+			/* Disable bus master arbitration when all CPUs are in C3 */
+			if (c3_cpu_count == num_online_cpus())
+				acpi_set_register(ACPI_BITREG_ARB_DISABLE, 1);
+			spin_unlock(&c3_lock);
+		} else if (!pr->flags.bm_check) {
+			ACPI_FLUSH_CPU_CACHE();
+		}
 
 		t1 = inl(acpi_gbl_FADT.xpm_timer_block.address);
 		acpi_idle_do_entry(cx);
 		t2 = inl(acpi_gbl_FADT.xpm_timer_block.address);
 
-		spin_lock(&c3_lock);
 		/* Re-enable bus master arbitration */
-		if (c3_cpu_count == num_online_cpus())
+		if (pr->flags.bm_check && pr->flags.bm_control) {
+			spin_lock(&c3_lock);
 			acpi_set_register(ACPI_BITREG_ARB_DISABLE, 0);
-		c3_cpu_count--;
-		spin_unlock(&c3_lock);
+			c3_cpu_count--;
+			spin_unlock(&c3_lock);
+		}
 	}
 
 #if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC)
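
For anyone reading along, the branch logic in the hunk above can be modelled
outside the kernel. What follows is only a user-space sketch, not kernel code:
struct bm_flags, enum c3_prep, c3_prep_for() and c3_needs_arb_reenable() are
hypothetical names invented for illustration. Only the two flag tests are
taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pr->flags bits tested by the patch. */
struct bm_flags {
	bool bm_check;   /* mirrors pr->flags.bm_check */
	bool bm_control; /* mirrors pr->flags.bm_control */
};

/* What the patched code does just before acpi_idle_do_entry(). */
enum c3_prep {
	C3_PREP_ARB_DISABLE, /* take c3_lock, count CPUs, maybe set ARB_DISABLE */
	C3_PREP_FLUSH_CACHE, /* call ACPI_FLUSH_CPU_CACHE() instead */
	C3_PREP_NONE         /* neither step; enter the idle state directly */
};

/* Same if / else-if chain as the first "+" block of the hunk. */
static enum c3_prep c3_prep_for(struct bm_flags f)
{
	if (f.bm_check && f.bm_control)
		return C3_PREP_ARB_DISABLE;
	else if (!f.bm_check)
		return C3_PREP_FLUSH_CACHE;
	return C3_PREP_NONE;
}

/*
 * Second "+" block: ARB_DISABLE is cleared and c3_cpu_count decremented
 * only when the arbitration path was taken on the way in.
 */
static bool c3_needs_arb_reenable(enum c3_prep prep)
{
	return prep == C3_PREP_ARB_DISABLE;
}
```

Before the patch the ARB_DISABLE path ran unconditionally. With it, a system
reporting bm_check=1 and bm_control=0 gets C3_PREP_NONE, and any system with
bm_check=0 gets the cache flush instead of the register write.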