2008-03-03 02:17:57

by Rafael J. Wysocki

Subject: 2.6.25-rc3-git3: Reported regressions from 2.6.24

This message contains a list of some regressions from 2.6.24 reported since
2.6.25-rc1 was released, for which there are no fixes in the mainline I know
of. If any of them have been fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.24, please let me know
as well and I'll add them to the list. Also, please let me know if any of the
entries below are invalid.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2008-03-03      115       65          49
  2008-02-25       90       51          39
  2008-02-17       61       45          37


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9938
Subject : building fails on lguest
Submitter : Cedric OLLIVIER <[email protected]>
Date : 2008-02-11 15:34


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9954
Subject : iwl3945: not only it periodically dies, it also BUG()s
Submitter : Pavel Machek <[email protected]>
Date : 2008-02-05 22:44
References : http://lkml.org/lkml/2008/2/5/453
Handled-By : Chatre, Reinette <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9958
Subject : parisc compile error
Submitter : Adrian Bunk <[email protected]>
Date : 2008-02-08 01:12
References : http://lkml.org/lkml/2008/2/7/572
Handled-By : Kyle McMartin <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9962
Subject : mount: could not find filesystem
Submitter : Kamalesh Babulal <[email protected]>
Date : 2008-02-12 14:34
References : http://lkml.org/lkml/2008/2/12/91
Handled-By : Bartlomiej Zolnierkiewicz <[email protected]>
             Yinghai Lu <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9966
Subject : kernel BUG at kernel/power/snapshot.c:464!
Submitter : Jeff Mahoney <[email protected]>
Date : 2008-02-08 20:03
References : http://lkml.org/lkml/2008/2/8/331
Handled-By : Rafael J. Wysocki <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9976
Subject : BUG: 2.6.25-rc1: iptables postrouting setup causes oops
Submitter : Ben Nizette <[email protected]>
Date : 2008-02-12 12:46
References : http://lkml.org/lkml/2008/2/12/148
Handled-By : Haavard Skinnemoen <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
Subject : 2.6.25-rc1: volanoMark 45% regression
Submitter : Zhang, Yanmin <[email protected]>
Date : 2008-02-13 10:30
References : http://lkml.org/lkml/2008/2/13/128
Handled-By : Srivatsa Vaddagiri <[email protected]>
             Balbir Singh <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9980
Subject : 2.6.25-rc1 on Sun Ultra 40
Submitter : Jasper Bryant-Greene <[email protected]>
Date : 2008-02-13 12:25
References : http://lkml.org/lkml/2008/2/13/181
Handled-By : Yinghai Lu <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9983
Subject : PROBLEM: 2.6.25-rc1-git2 freezes when accessing external USB hard disk (ehci-hcd)
Submitter : Linas Žvirblis <[email protected]>
Date : 2008-02-13 22:38
References : http://lkml.org/lkml/2008/2/13/566


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9984
Subject : problem with starting 2.6.25-rc1 and latest git
Submitter : Mariusz Kozlowski <[email protected]>
Date : 2008-02-13 23:16
References : http://lkml.org/lkml/2008/2/13/587


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9992
Subject : 2.6.24-git: kmap_atomic() WARN_ON()
Submitter : Thomas Gleixner <[email protected]>
Date : 2008-02-07 00:58
References : http://lkml.org/lkml/2008/2/6/451
             http://lkml.org/lkml/2007/1/14/38


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9995
Subject : 2.6.25-rc1 regression - backlight controlls do not work - ThinkPad T61
Submitter : Lukas Hejtmanek <[email protected]>
Date : 2008-02-15 04:51
Handled-By : Zhang Rui <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10003
Subject : hda_intel: balance control does not work correctly
Submitter : Frans Pop <[email protected]>
Date : 2008-02-16 09:58


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10011
Subject : The computer is blocked when X is started
Submitter : François Valenduc <[email protected]>
Date : 2008-02-17 06:28
Handled-By : Thomas Gleixner <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10012
Subject : 2.6.24-git4+ regression
Submitter : Lukas Hejtmanek <[email protected]>
Date : 2008-01-30 14:56
References : http://lkml.org/lkml/2008/1/30/254
Handled-By : Ingo Molnar <[email protected]>
             Srivatsa Vaddagiri <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10015
Subject : [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
Submitter : Kamalesh Babulal <[email protected]>
Date : 2008-02-16 06:44
References : http://lkml.org/lkml/2008/2/16/4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10016
Subject : cobalt_btns.c <-> struct platform_device compile error
Submitter : Adrian Bunk <[email protected]>
Date : 2008-02-17 12:12
References : http://lkml.org/lkml/2008/2/17/293


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10017
Subject : cdev removal broke cobalt_btns.c compilation
Submitter : Adrian Bunk <[email protected]>
Date : 2008-02-17 12:14
References : http://lkml.org/lkml/2008/2/17/295


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10021
Subject : Linux 2.6.25-rc2 regression: LVM cannot find volume group
Submitter : Tilman Schmidt <[email protected]>
Date : 2008-02-16 20:14
References : http://lkml.org/lkml/2008/2/16/208
Handled-By : Alan Cox <[email protected]>
             Jiri Slaby <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10022
Subject : [2.6.25-rc1] jerky mouse cursor and randoooom key repeats
Submitter : Frans Pop <[email protected]>
Date : 2008-02-13 09:41
References : http://lkml.org/lkml/2008/2/13/82
             http://lkml.org/lkml/2008/2/16/17
Handled-By : Jiri Kosina <[email protected]>
             Mike Galbraith <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10025
Subject : Current git very broken on the Dreamcast
Submitter : Adrian McMenamin <[email protected]>
Date : 2008-02-16 19:38
References : http://lkml.org/lkml/2008/2/16/196
Handled-By : Kristoffer Ericson <[email protected]>
             Magnus Damm <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10027
Subject : 2.6.25-rc[12] Video4Linux Bttv Regression
Submitter : Bongani Hlope <[email protected]>
Date : 2008-02-17 09:36
References : http://lkml.org/lkml/2008/2/17/55
Handled-By : Mauro Carvalho Chehab <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10033
Subject : mips yosemite_defconfig compile error
Submitter : Adrian Bunk <[email protected]>
Date : 2008-02-17 16:45
References : http://lkml.org/lkml/2008/2/17/383


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10041
Subject : 2.6.25-rc1/2 regression: first-time login into gnome fails
Submitter : Romano Giannetti <[email protected]>
Date : 2008-02-18 11:56
References : http://lkml.org/lkml/2008/2/18/145
Handled-By : Ray Lee <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10051
Subject : Spurious messages at boot, eventually hangs the usb subsustem
Submitter : Jean-Luc Coulon <[email protected]>
Date : 2008-02-20 09:10


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10061
Subject : Hang in md5_resync
Submitter : Steinar H. Gunderson <[email protected]>
Date : 2008-02-21 13:13


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10065
Subject : 2.6.25-rc2 regression - hang on suspend
Submitter : Soeren Sonnenburg <[email protected]>
Date : 2008-02-19 12:59
References : http://lkml.org/lkml/2008/2/19/165
             http://lkml.org/lkml/2008/2/17/381
Handled-By : Rafael J. Wysocki <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10067
Subject : TUNER_TDA8290=y, VIDEO_DEV=n build error
Submitter : Toralf Förster <[email protected]>
Date : 2008-02-22 10:36
References : http://lkml.org/lkml/2008/2/19/262


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10078
Subject : USB OOPS 2.6.25-rc2-git1
Submitter : Andre Tomt <[email protected]>
Date : 2008-02-19 16:19
References : http://lkml.org/lkml/2008/2/19/253
Handled-By : David Brownell <[email protected]>
             Alan Stern <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10080
Subject : 2.6.25-rc2: ohci1394 problem
Submitter : Thomas Meyer <[email protected]>
Date : 2008-02-20 08:47
References : http://lkml.org/lkml/2008/2/20/58
Handled-By : Stefan Richter <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10082
Subject : [BUG] 2.6.25-rc2-git4 - Regression Kernel oops while running kernbench and tbench on powerpc
Submitter : Kamalesh Babulal <[email protected]>
Date : 2008-02-20 16:01
References : http://lkml.org/lkml/2008/2/20/218
             http://lkml.org/lkml/2008/1/18/71


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10083
Subject : BUG?: "Cannot map mmconfig aperture"
Submitter : Diego Calleja <[email protected]>
Date : 2008-02-20 22:31
References : http://lkml.org/lkml/2008/2/20/551
Handled-By : Thomas Gleixner <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10084
Subject : 2.6.25-rc2-git4 BUG: sysfs_readdir
Submitter : Randy Dunlap <[email protected]>
Date : 2008-02-21 17:25
References : http://lkml.org/lkml/2008/2/21/212
Handled-By : Greg KH <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10086
Subject : 2.6.25-rc2 + smartd = hang
Submitter : Anders Eriksson <[email protected]>
Date : 2008-02-22 17:51
References : http://lkml.org/lkml/2008/2/22/239
Handled-By : Bartlomiej Zolnierkiewicz <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10093
Subject : 2.6.25-current-git hangs on boot
Submitter : Soeren Sonnenburg <[email protected]>
Date : 2008-02-23 18:55
References : http://lkml.org/lkml/2008/2/23/263
             http://marc.info/?l=linux-acpi&m=120387537018467&w=4
Handled-By : Pallipadi, Venkatesh <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10097
Subject : SMP BUG in __nf_conntrack_find
Submitter : Christian Casteyde <[email protected]>
Date : 2008-02-25 10:44


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10100
Subject : 208c70a45624400fafd7511b96bc426bf01f8f5e breaks EC init
Submitter : Michael S. Tsirkin <[email protected]>
Date : 2008-02-25 20:19
References : http://lkml.org/lkml/2008/2/25/282
Handled-By : Alexey Starikovskiy <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10102
Subject : 2.6.25-rc2 Regression Thinkpad acpi
Submitter : Lukas Hejtmanek <[email protected]>
Date : 2008-02-25 12:47
References : http://lkml.org/lkml/2008/2/25/73
Handled-By : Henrique de Moraes Holschuh <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10104
Subject : 2.6.25-rc3: WARNING: at arch/x86/mm/ioremap.c:137
Submitter : Phil Oester <[email protected]>
Date : 2008-02-25 03:09
References : http://lkml.org/lkml/2008/2/24/265


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10117
Subject : 2.6.25-current-git hangs on boot (pci=nommconf helps)
Submitter : Soeren Sonnenburg <[email protected]>
Date : 2008-02-23 18:55
References : http://lkml.org/lkml/2008/2/23/263


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10123
Subject : No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
Submitter : Guennadi Liakhovetski <[email protected]>
Date : 2008-02-27 08:15


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10133
Subject : INFO: possible circular locking in the resume
Submitter : Zdenek Kabelac <[email protected]>
Date : 2008-02-27
References : http://lkml.org/lkml/2008/2/26/479
Handled-By : Gautham R Shenoy <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10140
Subject : Kernel blocked after "ACPI: bus type pci registered"
Submitter : François Valenduc <[email protected]>
Date : 2008-03-01 08:13
References : http://lkml.org/lkml/2008/2/28/153
Handled-By : Thomas Gleixner <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10146
Subject : 2.6.25-rc: complete lockup on boot/start of X (bisected)
Submitter : Marcin Slusarz <[email protected]>
Date : 2008-03-02 20:00
References : http://lkml.org/lkml/2008/3/2/91
Handled-By : Peter Zijlstra <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10147
Subject : sh r7780mp_defconfig compile error
Submitter : Adrian Bunk <[email protected]>
Date : 2008-03-02 13:42
References : http://lkml.org/lkml/2008/3/2/141


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10148
Subject : [BUG kernel 2.6.25-rc3 IPV6] ping6 -I eth0 ff02::1 causes system hang.
Submitter : Komuro <[email protected]>
Date : 2008-03-02 13:07
References : http://lkml.org/lkml/2008/3/2/32


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10149
Subject : bisected boot regression post 2.6.25-rc3.. please revert
Submitter : Arjan van de Ven <[email protected]>
Date : 2008-03-01 10:56
References : http://lkml.org/lkml/2008/3/1/155


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10152
Subject : Clocksource tsc is always unstable with 2.6.25-* kernels and CONFIG_NO_HZ=y on my box
Submitter : Gabriel C <[email protected]>
Date : 2008-02-24 01:31
References : http://lkml.org/lkml/2008/2/23/380
             http://lkml.org/lkml/2008/2/24/281
Handled-By : Thomas Gleixner <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10155
Subject : Regression in 2.6.25-rc3: s2ram segfaults before suspending
Submitter : Klaus S. Madsen <[email protected]>
Date : 2008-02-27 23:10
References : http://lkml.org/lkml/2008/2/27/364
Handled-By : H. Peter Anvin <[email protected]>
             Ingo Molnar <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10156
Subject : KVM & Qemu crashed with infinite recursive kernel loop in the guest
Submitter : Zdenek Kabelac <[email protected]>
Date : 2008-02-28 11:25
References : http://lkml.org/lkml/2008/2/28/106


Regressions with patches
------------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9874
Subject : Undocking Lenovo ThinkPad T61 causes oops
Submitter : Lukas Hejtmanek <[email protected]>
Date : 2008-02-02 02:07
Handled-By : Len Brown <[email protected]>
Patch : http://marc.info/?l=linux-acpi&m=120389632114090&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9969
Subject : 2.6.24-git15 Keyboard Issue?
Submitter : Chris Holvenstot <[email protected]>
Date : 2008-02-06 14:02
References : http://lkml.org/lkml/2008/2/6/100
             http://lkml.org/lkml/2008/2/13/82
Handled-By : Thomas Gleixner <[email protected]>
Patch : http://lkml.org/lkml/2008/2/15/343


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10013
Subject : tbench regression in 2.6.25-rc1
Submitter : Zhang, Yanmin <[email protected]>
Date : 2008-02-15 02:52
References : http://lkml.org/lkml/2008/2/14/546
Handled-By : Eric Dumazet <[email protected]>
             David Miller <[email protected]>
Patch : http://lkml.org/lkml/2008/2/18/66
        http://lkml.org/lkml/2008/2/18/117


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10020
Subject : mips SMBUS_PSC_BASE compile errors
Submitter : Adrian Bunk <[email protected]>
Date : 2008-02-17 12:21
References : http://lkml.org/lkml/2008/2/17/299
Handled-By : Manuel Lauss <[email protected]>
Patch : http://lkml.org/lkml/2008/2/18/122


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10024
Subject : 2.6.25-rc2 regression in rt61pci wireless driver
Submitter : Chris Clayton <[email protected]>
Date : 2008-02-16 13:06
References : http://lkml.org/lkml/2008/2/16/82
Handled-By : Ivo van Doorn <[email protected]>
             Stefano Brivio <[email protected]>
Patch : http://lkml.org/lkml/2008/3/2/26


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10030
Subject : Suspend doesn't work when SD card is inserted
Submitter : Zdenek Kabelac <[email protected]>
Date : 2008-02-17 12:00
References : http://lkml.org/lkml/2008/2/17/81
Handled-By : Rafael J. Wysocki <[email protected]>
Patch : http://marc.info/?l=linux-acpi&m=120389632114090&w=2


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10031
Subject : [2.6.25-rc2] e100: Trying to free already-free IRQ 11 during suspend ...
Submitter : Andrey Borzenkov <[email protected]>
Date : 2008-02-16 13:36
References : http://lkml.org/lkml/2008/2/17/125
Handled-By : Kok, Auke <[email protected]>
Patch : http://lkml.org/lkml/2008/2/21/259


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10071
Subject : kernel hang in inet_init
Submitter : Yannick Dirou <[email protected]>
Date : 2008-02-23 00:34
Handled-By : Paul E. McKenney <[email protected]>
Patch : http://lkml.org/lkml/2008/2/2/11


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10075
Subject : regression: CD burning (k3b) went broke
Submitter : Mike Galbraith <[email protected]>
Date : 2008-02-21 09:42
References : http://lkml.org/lkml/2008/2/21/41
             http://lkml.org/lkml/2008/3/2/131
Handled-By : Jens Axboe <[email protected]>
             Tejun Heo <[email protected]>
Patch : http://lkml.org/lkml/2008/2/24/278
        http://article.gmane.org/gmane.linux.scsi/39425


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10079
Subject : 2.6.25-rc[1,2]: failed to setup dm-crypt key mapping
Submitter : Michael S. Tsirkin <[email protected]>
Date : 2008-02-20 08:43
References : http://lkml.org/lkml/2008/2/20/131
Handled-By : Herbert Xu <[email protected]>
Patch : http://lkml.org/lkml/2008/2/22/127


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10081
Subject : build #365 issue for v2.6.25-rc2-342-g5d9c4a7 in ./arch/x86/kvm/kvm.ko
Submitter : Toralf Förster <[email protected]>
Date : 2008-02-20 14:11
References : http://lkml.org/lkml/2008/2/20/167
Handled-By : Avi Kivity <[email protected]>
             Randy Dunlap <[email protected]>
Patch : http://lkml.org/lkml/2008/2/20/195
        http://lkml.org/lkml/2008/2/20/379


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10122
Subject : FIXED_PHY must depend on PHYLIB=y
Submitter : Olaf Hering <[email protected]>
Date : 2008-02-27 07:14
References : http://lkml.org/lkml/2008/2/27/90
Handled-By : Adrian Bunk <[email protected]>
Patch : http://lkml.org/lkml/2008/2/27/157


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10132
Subject : 2.6.25 git regression, oops on boot
Submitter : Jonathan McDowell <[email protected]>
Date : 2008-02-29 11:09
References : http://marc.info/?l=linux-kernel&m=120423268404812&w=2
             http://lkml.org/lkml/2008/2/28/369
Handled-By : Zhang Rui <[email protected]>
             Lin Ming <[email protected]>
Patch : http://lkml.org/lkml/2008/2/29/49


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10145
Subject : apanel: fix kconfig dependencies
Submitter : Randy Dunlap <[email protected]>
Date : 2008-02-06 16:27
References : http://lkml.org/lkml/2008/2/6/492
Handled-By : Randy Dunlap <[email protected]>
Patch : http://lkml.org/lkml/2008/2/7/333


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10153
Subject : (regression) kernel/timeconst.h bugs with HZ=128
Submitter : David Brownell <[email protected]>
Date : 2008-02-26 19:32
References : http://lkml.org/lkml/2008/2/26/294
Handled-By : H. Peter Anvin <[email protected]>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=15114&action=view
        http://bugzilla.kernel.org/attachment.cgi?id=15115&action=view


For details, please follow the links given in references.
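
Because every entry above uses the same "Field : value" layout, the list can be filtered mechanically. As a rough sketch only (the function name list_unhandled is made up for this example, and it assumes the list was saved as plain text with each field starting in the first column), the following prints the Bugzilla IDs of entries that have no Handled-By field:

```shell
# list_unhandled [FILE...]: print the IDs of entries without a Handled-By
# field. Reads standard input when no file is given. Hypothetical helper.
list_unhandled() {
    awk '
        # A new entry starts: report the previous one if nobody handles it.
        /^Bug-Entry/ {
            if (id != "" && !handled) print id
            id = $0
            sub(/.*id=/, "", id)   # keep only the number after "id="
            handled = 0
        }
        /^Handled-By/ { handled = 1 }
        # Do not forget the last entry in the input.
        END { if (id != "" && !handled) print id }
    ' "$@"
}
```

For example, "list_unhandled regressions.txt", where regressions.txt is a hypothetical file holding this message. Quoted copies of entries (lines starting with ">") are ignored, since only lines that begin with the field name are matched.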

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.24,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=9832

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael


2008-03-03 07:54:26

by Ingo Molnar

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24


* Rafael J. Wysocki <[email protected]> wrote:

> Unresolved regressions
> ----------------------
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9938
> Subject : building fails on lguest
> Submitter : Cedric OLLIVIER <[email protected]>
> Date : 2008-02-11 15:34

this should be resolved by:

commit db342d216ba9e060d8c5501eefc1d0a789c9e711
Author: Tony Breeds <[email protected]>
Date: Tue Feb 19 08:16:03 2008 +0100

lguest: fix build breakage

Ingo

2008-03-03 11:43:18

by Rafael J. Wysocki

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Monday, 3 of March 2008, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <[email protected]> wrote:
>
> > Unresolved regressions
> > ----------------------
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9938
> > Subject : building fails on lguest
> > Submitter : Cedric OLLIVIER <[email protected]>
> > Date : 2008-02-11 15:34
>
> this should be resolved by:
>
> commit db342d216ba9e060d8c5501eefc1d0a789c9e711
> Author: Tony Breeds <[email protected]>
> Date: Tue Feb 19 08:16:03 2008 +0100
>
> lguest: fix build breakage

Thanks, closed.

Rafael

2008-03-04 07:36:41

by Pekka Enberg

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

Hi,

2008/3/3 Rafael J. Wysocki <[email protected]>:
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10015
> Subject : [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench
> Submitter : Kamalesh Babulal <[email protected]>
> Date : 2008-02-16 06:44
> References : http://lkml.org/lkml/2008/2/16/4

This is confirmed to be fixed (see comments in the bugzilla issue).

2008-03-04 11:16:59

by Pavel Machek

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Mon 2008-03-03 03:16:06, Rafael J. Wysocki wrote:
> This message contains a list of some regressions from 2.6.24 reported since
> 2.6.25-rc1 was released, for which there are no fixes in the mainline I know
> of. If any of them have been fixed already, please let me know.
>
> If you know of any other unresolved regressions from 2.6.24, please let me know
> as well and I'll add them to the list. Also, please let me know if any of the
> entries below are invalid.
>
>
> Listed regressions statistics:
>
>   Date          Total  Pending  Unresolved
>   ----------------------------------------
>   2008-03-03      115       65          49
>   2008-02-25       90       51          39
>   2008-02-17       61       45          37


> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9995
> Subject : 2.6.25-rc1 regression - backlight controlls do not work - ThinkPad T61
> Submitter : Lukas Hejtmanek <[email protected]>
> Date : 2008-02-15 04:51
> Handled-By : Zhang Rui <[email protected]>

Actually, I see this one too. With 2.6.25-rc3, the brightness keys just
do not work. I can still control brightness if I load the ACPI video
module and do:

root@amd:/sys/class/backlight/acpi_video0# echo 1 > brightness
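
The manual workaround can be wrapped in a small helper. This is a sketch, not something from the report: the name set_backlight is made up, the default directory is simply the one shown in the prompt above, and it relies on the max_brightness file that the sysfs backlight class provides next to brightness.

```shell
# set_backlight VALUE [DIR]: write VALUE to DIR/brightness, clamped to the
# range 0..max_brightness. DIR defaults to the acpi_video0 device from the
# message above. Hypothetical helper for illustration.
set_backlight() {
    bl_val=$1
    bl_dir=${2:-/sys/class/backlight/acpi_video0}
    bl_max=$(cat "$bl_dir/max_brightness") || return 1
    # Clamp the request to what the driver advertises.
    if [ "$bl_val" -gt "$bl_max" ]; then
        bl_val=$bl_max
    fi
    if [ "$bl_val" -lt 0 ]; then
        bl_val=0
    fi
    echo "$bl_val" > "$bl_dir/brightness"
}
```

"set_backlight 1" then does the same as the echo above; writing to the real sysfs file still requires root.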


> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10012
> Subject : 2.6.24-git4+ regression

Better description would be "interactivity problems in -git4+".

> Submitter : Lukas Hejtmanek <[email protected]>
> Date : 2008-01-30 14:56
> References : http://lkml.org/lkml/2008/1/30/254
> Handled-By : Ingo Molnar <[email protected]>
> Srivatsa Vaddagiri <[email protected]>

And he has GROUP_SCHED enabled, that is known broken.

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10022
> Subject : [2.6.25-rc1] jerky mouse cursor and randoooom key repeats
> Submitter : Frans Pop <[email protected]>
> Date : 2008-02-13 09:41
> References : http://lkml.org/lkml/2008/2/13/82
> http://lkml.org/lkml/2008/2/16/17
> Handled-By : Jiri Kosina <[email protected]>
> Mike Galbraith <[email protected]>

This is a duplicate of 10012.

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-04 11:29:20

by Dhaval Giani

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
> Subject : 2.6.25-rc1: volanoMark 45% regression
> Submitter : Zhang, Yanmin <[email protected]>
> Date : 2008-02-13 10:30
> References : http://lkml.org/lkml/2008/2/13/128
> Handled-By : Srivatsa Vaddagiri <[email protected]>
> Balbir Singh <[email protected]>

Peter's revert of the load balance patches should fix this one. Yanmin,
could you please confirm if the patch at
http://lkml.org/lkml/2008/2/25/202 helps?

--
regards,
Dhaval

2008-03-04 14:00:17

by Ingo Molnar

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24


* Pavel Machek <[email protected]> wrote:

> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10012
> > Subject : 2.6.24-git4+ regression
>
> Better description would be "interactivity problems in -git4+".
>
> > Submitter : Lukas Hejtmanek <[email protected]>
> > Date : 2008-01-30 14:56
> > References : http://lkml.org/lkml/2008/1/30/254
> > Handled-By : Ingo Molnar <[email protected]>
> > Srivatsa Vaddagiri <[email protected]>
>
> And he has GROUP_SCHED enabled, that is known broken.

this should all be fixed in sched-devel.git (via a revert):

http://people.redhat.com/mingo/sched-devel.git/README

it's lined up for upstream pull.

Ingo

2008-03-04 23:01:45

by Rafael J. Wysocki

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Tuesday, 4 of March 2008, Ingo Molnar wrote:
>
> * Pavel Machek <[email protected]> wrote:
>
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10012
> > > Subject : 2.6.24-git4+ regression
> >
> > Better description would be "interactivity problems in -git4+".
> >
> > > Submitter : Lukas Hejtmanek <[email protected]>
> > > Date : 2008-01-30 14:56
> > > References : http://lkml.org/lkml/2008/1/30/254
> > > Handled-By : Ingo Molnar <[email protected]>
> > > Srivatsa Vaddagiri <[email protected]>
> >
> > And he has GROUP_SCHED enabled, that is known broken.
>
> this should all be fixed in sched-devel.git (via a revert):
>
> http://people.redhat.com/mingo/sched-devel.git/README
>
> it's lined up for upstream pull.

Is it:

commit 62fb185130e4d420f71a30ff59d8b16b74ef5d2b
Author: Peter Zijlstra <[email protected]>
Date: Mon Feb 25 17:34:02 2008 +0100

sched: revert load_balance_monitor() changes

2008-03-05 02:10:44

by Yanmin Zhang

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Tue, 2008-03-04 at 16:57 +0530, Dhaval Giani wrote:
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
> > Subject : 2.6.25-rc1: volanoMark 45% regression
> > Submitter : Zhang, Yanmin <[email protected]>
> > Date : 2008-02-13 10:30
> > References : http://lkml.org/lkml/2008/2/13/128
> > Handled-By : Srivatsa Vaddagiri <[email protected]>
> > Balbir Singh <[email protected]>
>
> Peter's revert of the load balance patches should fix this one. Yanmin,
> could you please confirm if the patch at
> http://lkml.org/lkml/2008/2/25/202 helps?
I tested it against 2.6.25-rc3 on my 16-core Tigerton machine. It clearly improves
the volanoMark result, although it doesn't recover all of it.
Compared with 2.6.24, volanoMark shows about a 50% regression with 2.6.25-rc3
without the patch. With the patch, the regression is about 15%.

-yanmin

2008-03-05 03:50:52

by Balbir Singh

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

Zhang, Yanmin wrote:
> On Tue, 2008-03-04 at 16:57 +0530, Dhaval Giani wrote:
>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
>>> Subject : 2.6.25-rc1: volanoMark 45% regression
>>> Submitter : Zhang, Yanmin <[email protected]>
>>> Date : 2008-02-13 10:30
>>> References : http://lkml.org/lkml/2008/2/13/128
>>> Handled-By : Srivatsa Vaddagiri <[email protected]>
>>> Balbir Singh <[email protected]>
>>
>> Peter's revert of the load balance patches should fix this one. Yanmin,
>> could you please confirm if the patch at
>> http://lkml.org/lkml/2008/2/25/202 helps?
> I tested it against 2.6.25-rc3 on my 16-core tigerton machine. It really improves
> volano result although it doesn't recover all the result.
> Comparing with 2.6.24, without the patch, volanoMark has about 50% regression
> with 2.6.25-rc3. With the patch, volanoMark has about 15% regression.
>

Have you had a chance to git-bisect the culprit after the revert?

--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

2008-03-05 05:18:35

by Yanmin Zhang

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Wed, 2008-03-05 at 09:19 +0530, Balbir Singh wrote:
> Zhang, Yanmin wrote:
> > On Tue, 2008-03-04 at 16:57 +0530, Dhaval Giani wrote:
> >>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
> >>> Subject : 2.6.25-rc1: volanoMark 45% regression
> >>> Submitter : Zhang, Yanmin <[email protected]>
> >>> Date : 2008-02-13 10:30
> >>> References : http://lkml.org/lkml/2008/2/13/128
> >>> Handled-By : Srivatsa Vaddagiri <[email protected]>
> >>> Balbir Singh <[email protected]>
> >>
> >> Peter's revert of the load balance patches should fix this one. Yanmin,
> >> could you please confirm if the patch at
> >> http://lkml.org/lkml/2008/2/25/202 helps?
> > I tested it against 2.6.25-rc3 on my 16-core tigerton machine. It really improves
> > volano result although it doesn't recover all the result.
> > Comparing with 2.6.24, without the patch, volanoMark has about 50% regression
> > with 2.6.25-rc3. With the patch, volanoMark has about 15% regression.
> >
>
> Have you had a chance to git-bisect the culprit after the revert?
How do I bisect it if the revert was submitted after the culprit patch?

-yanmin

2008-03-05 06:28:34

by Yanmin Zhang

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Wed, 2008-03-05 at 10:06 +0800, Zhang, Yanmin wrote:
> On Tue, 2008-03-04 at 16:57 +0530, Dhaval Giani wrote:
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
> > > Subject : 2.6.25-rc1: volanoMark 45% regression
> > > Submitter : Zhang, Yanmin <[email protected]>
> > > Date : 2008-02-13 10:30
> > > References : http://lkml.org/lkml/2008/2/13/128
> > > Handled-By : Srivatsa Vaddagiri <[email protected]>
> > > Balbir Singh <[email protected]>
> >
> > Peter's revert of the load balance patches should fix this one. Yanmin,
> > could you please confirm if the patch at
> > http://lkml.org/lkml/2008/2/25/202 helps?
> I tested it against 2.6.25-rc3 on my 16-core tigerton machine. It really improves
> volano result although it doesn't recover all the result.
> Comparing with 2.6.24, without the patch, volanoMark has about 50% regression
> with 2.6.25-rc3. With the patch, volanoMark has about 15% regression.
One more update on the reverted patch: compared with 2.6.24, cpu2000-fp shows about
a 4% regression with kernel 2.6.25-rc on my Madison IPF machine. As you know, cpu2000-fp
consists of many sub-tests. Most of the regression seems to come from a couple of sub-tests
in the middle of the run, but when I ran those sub-tests manually, I couldn't see any regression.
If I boot the kernel with the parameter maxcpus=1, the regression drops to 1%.

If I apply Peter's revert patch to 2.6.25-rc3, the regression also drops to 1%.

I don't know what causes the remaining 1% regression.
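
To put the two sets of numbers side by side, the share of each regression that the revert wins back can be computed from the "before" and "after" percentages quoted in this message. A small sketch (recovered_pct is a made-up name; the inputs are the approximate figures above, so the results are approximate too):

```shell
# recovered_pct BEFORE AFTER: percentage of a regression that a fix recovers,
# given the regression before and after the fix. Integer arithmetic only.
recovered_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

recovered_pct 50 15   # volanoMark: about 50% without the revert, 15% with it
recovered_pct 4 1     # cpu2000-fp: about 4% without the revert, 1% with it
```

With these inputs the revert accounts for roughly 70% of the volanoMark regression and 75% of the cpu2000-fp one.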

-yanmin

2008-03-05 06:38:21

by Balbir Singh

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

Zhang, Yanmin wrote:
> On Wed, 2008-03-05 at 09:19 +0530, Balbir Singh wrote:
>> Zhang, Yanmin wrote:
>>> On Tue, 2008-03-04 at 16:57 +0530, Dhaval Giani wrote:
>>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9978
>>>>> Subject : 2.6.25-rc1: volanoMark 45% regression
>>>>> Submitter : Zhang, Yanmin <[email protected]>
>>>>> Date : 2008-02-13 10:30
>>>>> References : http://lkml.org/lkml/2008/2/13/128
>>>>> Handled-By : Srivatsa Vaddagiri <[email protected]>
>>>>> Balbir Singh <[email protected]>
>>>>
>>>> Peter's revert of the load balance patches should fix this one. Yanmin,
>>>> could you please confirm if the patch at
>>>> http://lkml.org/lkml/2008/2/25/202 helps?
>>> I tested it against 2.6.25-rc3 on my 16-core tigerton machine. It really improves
>>> volano result although it doesn't recover all the result.
>>> Comparing with 2.6.24, without the patch, volanoMark has about 50% regression
>>> with 2.6.25-rc3. With the patch, volanoMark has about 15% regression.
>>>
>> Have you had a chance to git-bisect the culprit after the revert?
> How to bisect it if the reverted patch is submitted after the culprit patch?

Good question. What I would do is create a branch at the patches that caused the
regression, apply all patches to it except the reverted ones (resolving conflicts,
if any), and run git-bisect on that. But I suspect it is a lot of work.

--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

2008-03-05 06:57:21

by Ingo Molnar

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24


* Zhang, Yanmin <[email protected]> wrote:

> > Have you had a chance to git-bisect the culprit after the revert?
>
> How to bisect it if the reverted patch is submitted after the culprit
> patch?

I do this by using quilt on top of git-bisect.

I do something like this:

mkdir patches
echo revert.patch > patches/series
git-log -1 -p 62fb185130e4d420f > patches/revert.patch

git-bisect start
git-bisect bad v2.6.25-rc3
git-bisect good v2.6.24

quilt push # the revert is applied
[ test the kernel ]
quilt pop # revert is unapplied

git-bisect bad # if it's still bad

quilt push # apply the revert again
[ test the next kernel ]
quilt pop # undo the revert

git-bisect good # if it's good

etc. NOTE: if the "quilt push" fails, it's likely because you are at a
point in the tree that does not have the reverted commits applied yet.
In that case there's no need to push/pop, just test the bisection point.
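
The push/test/pop cycle above could be wrapped in a small helper so that the
revert is always taken off again before the next git-bisect step. This is only
a rough sketch, not something from the original mail: run_step and the
PUSH/POP variables are made-up names, and the build-and-test command passed to
it is assumed to exit 0 when the kernel at that point behaves well.

```shell
#!/bin/sh
# Sketch of one bisection step: apply the revert if it applies, run the
# test, then take the revert off again before git-bisect moves on.
# PUSH and POP default to the quilt commands used above; they are
# variables only so the logic can be dry-run without quilt installed.
: "${PUSH:=quilt push}"
: "${POP:=quilt pop}"

# run_step CMD [ARGS...]: CMD is a hypothetical build-and-test command
# that exits 0 when the kernel at this bisection point is fine.
run_step() {
    applied=no
    if $PUSH >/dev/null 2>&1; then
        applied=yes            # the revert applied at this point
    fi

    status=0
    "$@" || status=$?          # run the test and remember its result

    if [ "$applied" = yes ]; then
        # undo the revert so the tree is clean for git-bisect
        $POP >/dev/null 2>&1 || echo "warning: could not undo the revert" >&2
    fi

    [ "$status" -eq 0 ]        # success -> mark good, failure -> mark bad
}
```

Something like "run_step ./build-and-test.sh && git-bisect good || git-bisect bad"
would then cover one iteration, with ./build-and-test.sh standing in for
whatever builds, boots and benchmarks the kernel on the test machine.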

Note, since there are _two_ guilty commits here:

commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
Author: Srivatsa Vaddagiri <[email protected]>
Date: Fri Jan 25 21:08:00 2008 +0100
sched: group scheduling, change how cpu load is calculated

commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
Author: Srivatsa Vaddagiri <[email protected]>
Date: Fri Jan 25 21:08:00 2008 +0100
sched: group scheduler, fix fairness of cpu bandwidth allocation for task

make sure the bisection point is never "between" these two commits.

You can check whether a bisection point has the two guilty commits
applied, via:

git-log | grep -E '58e2d4ca581167c2a0|6b2d7700266b9402e12'

if this comes up empty, the guilty commits are not applied.
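
The grep only tells "none" apart from "at least one". A possible way to also
spot the "between" case (exactly one of the two commits present) is sketched
below. The function name classify_point is made up for illustration; it reads
full commit IDs, one per line, on stdin.

```shell
#!/bin/sh
# The two guilty commits named above.
A=58e2d4ca581167c2a079f4ee02be2f0bc52e8729
B=6b2d7700266b9402e12824e11e0099ae6a4a6a79

# classify_point: hypothetical helper, reads commit IDs (one per line)
# on stdin and prints
#   none    - neither guilty commit is in the history
#   both    - both are present, so the revert should apply
#   between - only one is present; this point should not be tested
classify_point() {
    # keep only lines that are exactly one of the two IDs, drop repeats, count
    n=$(grep -E -x "$A|$B" | sort -u | wc -l | tr -d ' ')
    case "$n" in
        0) echo none ;;
        2) echo both ;;
        *) echo between ;;
    esac
}
```

It can be fed from the current bisection point with
"git rev-list HEAD | classify_point".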

Ingo

2008-03-05 07:15:23

by Yanmin Zhang

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Wed, 2008-03-05 at 07:56 +0100, Ingo Molnar wrote:
> * Zhang, Yanmin <[email protected]> wrote:
>
> > > Have you had a chance to git-bisect the culprit after the revert?
> >
> > How to bisect it if the reverted patch is submitted after the culprit
> > patch?
>
> i do this by using quilt ontop of git-bisect.
Thanks for the information. My machines are busy testing 2.6.25-rc4.

Let me find a time slot to track this issue again.

-yanmin

>
> I do something like this:
>
> mkdir patches
> echo revert.patch > patches/series
> git-log -1 -p 62fb185130e4d420f > patches/revert.patch
>
> git-bisect start
> git-bisect bad v2.6.25-rc3
> git-bisect good v2.6.24
>
> quilt push # the revert is applied
> [ test the kernel ]
> quilt pop # revert is unapplied
>
> git-bisect bad # if it's still bad
>
> quilt push # apply the revert again
> [ test the next kernel ]
> quilt pop # undo the revert
>
> git-bisect good # if it's good
>
> etc. NOTE: if the "quilt push" fails, it's likely because you are in a
> point in the tree that does not have the reverted commits applied yet.
> In that case there's no need to push/pop, just test the bisection point.
>
> Note, since there are _two_ guilty commits here:
>
> commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
> Author: Srivatsa Vaddagiri <[email protected]>
> Date: Fri Jan 25 21:08:00 2008 +0100
> sched: group scheduling, change how cpu load is calculated
>
> commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
> Author: Srivatsa Vaddagiri <[email protected]>
> Date: Fri Jan 25 21:08:00 2008 +0100
> sched: group scheduler, fix fairness of cpu bandwidth allocation for task
>
> make sure the bisection point is never "between" these two commits.
>
> You can check whether a bisection point has the two guilty commits
> applied, via:
>
> git-log | grep -E '58e2d4ca581167c2a0|6b2d7700266b9402e12'
>
> if this comes up empty, the guilty commits are not applied.
>
> Ingo

2008-03-06 07:27:49

by Ingo Molnar

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24


* Rafael J. Wysocki <[email protected]> wrote:

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10123
> Subject : No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> Submitter : Guennadi Liakhovetski <[email protected]>
> Date : 2008-02-27 08:15

Guennadi bisected this down to:

commit fd7d1ced29e5beb88c9068801da7a362606d8273
PCI: make pci_bus a struct device

and it's suspected that Andrew's poweroff problems might be related as
well. Guennadi, Andrew, find below a manual revert of this change - does
it make any difference?

Ingo

---------------->
Subject: revert "PCI: make pci_bus a struct device"
From: Ingo Molnar <[email protected]>
Date: Thu Mar 06 08:12:56 CET 2008

revert:

commit fd7d1ced29e5beb88c9068801da7a362606d8273
PCI: make pci_bus a struct device

patch for testing purposes.
---
drivers/pci/bus.c | 19 +++-----------
drivers/pci/pci-sysfs.c | 6 ++--
drivers/pci/pci.h | 2 -
drivers/pci/probe.c | 64 +++++++++++++++++++++++++++++-------------------
drivers/pci/remove.c | 6 +++-
include/linux/pci.h | 4 +--
6 files changed, 53 insertions(+), 48 deletions(-)

Index: linux/drivers/pci/bus.c
===================================================================
--- linux.orig/drivers/pci/bus.c
+++ linux/drivers/pci/bus.c
@@ -108,7 +108,6 @@ int pci_bus_add_device(struct pci_dev *d
void pci_bus_add_devices(struct pci_bus *bus)
{
struct pci_dev *dev;
- struct pci_bus *child_bus;
int retval;

list_for_each_entry(dev, &bus->devices, bus_list) {
@@ -139,21 +138,11 @@ void pci_bus_add_devices(struct pci_bus
up_write(&pci_bus_sem);
}
pci_bus_add_devices(dev->subordinate);
-
- /* register the bus with sysfs as the parent is now
- * properly registered. */
- child_bus = dev->subordinate;
- child_bus->dev.parent = child_bus->bridge;
- retval = device_register(&child_bus->dev);
- if (retval)
- dev_err(&dev->dev, "Error registering pci_bus,"
- " continuing...\n");
- else
- retval = device_create_file(&child_bus->dev,
- &dev_attr_cpuaffinity);
+ retval = sysfs_create_link(&dev->subordinate->class_dev.kobj,
+ &dev->dev.kobj, "bridge");
if (retval)
- dev_err(&dev->dev, "Error creating cpuaffinity"
- " file, continuing...\n");
+ dev_err(&dev->dev, "Error creating sysfs "
+ "bridge symlink, continuing...\n");
}
}
}
Index: linux/drivers/pci/pci-sysfs.c
===================================================================
--- linux.orig/drivers/pci/pci-sysfs.c
+++ linux/drivers/pci/pci-sysfs.c
@@ -358,7 +358,7 @@ pci_read_legacy_io(struct kobject *kobj,
char *buf, loff_t off, size_t count)
{
struct pci_bus *bus = to_pci_bus(container_of(kobj,
- struct device,
+ struct class_device,
kobj));

/* Only support 1, 2 or 4 byte accesses */
@@ -383,7 +383,7 @@ pci_write_legacy_io(struct kobject *kobj
char *buf, loff_t off, size_t count)
{
struct pci_bus *bus = to_pci_bus(container_of(kobj,
- struct device,
+ struct class_device,
kobj));
/* Only support 1, 2 or 4 byte accesses */
if (count != 1 && count != 2 && count != 4)
@@ -407,7 +407,7 @@ pci_mmap_legacy_mem(struct kobject *kobj
struct vm_area_struct *vma)
{
struct pci_bus *bus = to_pci_bus(container_of(kobj,
- struct device,
+ struct class_device,
kobj));

return pci_mmap_legacy_page_range(bus, vma);
Index: linux/drivers/pci/pci.h
===================================================================
--- linux.orig/drivers/pci/pci.h
+++ linux/drivers/pci/pci.h
@@ -64,7 +64,7 @@ static inline int pci_no_d1d2(struct pci
}
extern int pcie_mch_quirk;
extern struct device_attribute pci_dev_attrs[];
-extern struct device_attribute dev_attr_cpuaffinity;
+extern struct class_device_attribute class_device_attr_cpuaffinity;

/**
* pci_match_one_device - Tell if a PCI device structure has a matching
Index: linux/drivers/pci/probe.c
===================================================================
--- linux.orig/drivers/pci/probe.c
+++ linux/drivers/pci/probe.c
@@ -53,7 +53,7 @@ static void pci_create_legacy_files(stru
b->legacy_io->attr.mode = S_IRUSR | S_IWUSR;
b->legacy_io->read = pci_read_legacy_io;
b->legacy_io->write = pci_write_legacy_io;
- device_create_bin_file(&b->dev, b->legacy_io);
+ class_device_create_bin_file(&b->class_dev, b->legacy_io);

/* Allocated above after the legacy_io struct */
b->legacy_mem = b->legacy_io + 1;
@@ -61,15 +61,15 @@ static void pci_create_legacy_files(stru
b->legacy_mem->size = 1024*1024;
b->legacy_mem->attr.mode = S_IRUSR | S_IWUSR;
b->legacy_mem->mmap = pci_mmap_legacy_mem;
- device_create_bin_file(&b->dev, b->legacy_mem);
+ class_device_create_bin_file(&b->class_dev, b->legacy_mem);
}
}

void pci_remove_legacy_files(struct pci_bus *b)
{
if (b->legacy_io) {
- device_remove_bin_file(&b->dev, b->legacy_io);
- device_remove_bin_file(&b->dev, b->legacy_mem);
+ class_device_remove_bin_file(&b->class_dev, b->legacy_io);
+ class_device_remove_bin_file(&b->class_dev, b->legacy_mem);
kfree(b->legacy_io); /* both are allocated here */
}
}
@@ -81,27 +81,26 @@ void pci_remove_legacy_files(struct pci_
/*
* PCI Bus Class Devices
*/
-static ssize_t pci_bus_show_cpuaffinity(struct device *dev,
- struct device_attribute *attr,
+static ssize_t pci_bus_show_cpuaffinity(struct class_device *class_dev,
char *buf)
{
int ret;
cpumask_t cpumask;

- cpumask = pcibus_to_cpumask(to_pci_bus(dev));
+ cpumask = pcibus_to_cpumask(to_pci_bus(class_dev));
ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
if (ret < PAGE_SIZE)
buf[ret++] = '\n';
return ret;
}
-DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);
+CLASS_DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);

/*
* PCI Bus Class
*/
-static void release_pcibus_dev(struct device *dev)
+static void release_pcibus_dev(struct class_device *class_dev)
{
- struct pci_bus *pci_bus = to_pci_bus(dev);
+ struct pci_bus *pci_bus = to_pci_bus(class_dev);

if (pci_bus->bridge)
put_device(pci_bus->bridge);
@@ -110,7 +109,7 @@ static void release_pcibus_dev(struct de

static struct class pcibus_class = {
.name = "pci_bus",
- .dev_release = &release_pcibus_dev,
+ .release = &release_pcibus_dev,
};

static int __init pcibus_class_init(void)
@@ -393,6 +392,7 @@ pci_alloc_child_bus(struct pci_bus *pare
{
struct pci_bus *child;
int i;
+ int retval;

/*
* Allocate a new bus, and inherit stuff from the parent..
@@ -408,12 +408,15 @@ pci_alloc_child_bus(struct pci_bus *pare
child->bus_flags = parent->bus_flags;
child->bridge = get_device(&bridge->dev);

- /* initialize some portions of the bus device, but don't register it
- * now as the parent is not properly set up yet. This device will get
- * registered later in pci_bus_add_devices()
- */
- child->dev.class = &pcibus_class;
- sprintf(child->dev.bus_id, "%04x:%02x", pci_domain_nr(child), busnr);
+ child->class_dev.class = &pcibus_class;
+ sprintf(child->class_dev.class_id, "%04x:%02x", pci_domain_nr(child), busnr);
+ retval = class_device_register(&child->class_dev);
+ if (retval)
+ goto error_register;
+ retval = class_device_create_file(&child->class_dev,
+ &class_device_attr_cpuaffinity);
+ if (retval)
+ goto error_file_create;

/*
* Set up the primary, secondary and subordinate
@@ -431,6 +434,12 @@ pci_alloc_child_bus(struct pci_bus *pare
bridge->subordinate = child;

return child;
+
+error_file_create:
+ class_device_unregister(&child->class_dev);
+error_register:
+ kfree(child);
+ return NULL;
}

struct pci_bus *__ref pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev, int busnr)
@@ -1083,27 +1092,32 @@ struct pci_bus * pci_create_bus(struct d
goto dev_reg_err;
b->bridge = get_device(dev);

- b->dev.class = &pcibus_class;
- b->dev.parent = b->bridge;
- sprintf(b->dev.bus_id, "%04x:%02x", pci_domain_nr(b), bus);
- error = device_register(&b->dev);
+ b->class_dev.class = &pcibus_class;
+ sprintf(b->class_dev.class_id, "%04x:%02x", pci_domain_nr(b), bus);
+ error = class_device_register(&b->class_dev);
if (error)
goto class_dev_reg_err;
- error = device_create_file(&b->dev, &dev_attr_cpuaffinity);
+ error = class_device_create_file(&b->class_dev, &class_device_attr_cpuaffinity);
if (error)
- goto dev_create_file_err;
+ goto class_dev_create_file_err;

/* Create legacy_io and legacy_mem files for this bus */
pci_create_legacy_files(b);

+ error = sysfs_create_link(&b->class_dev.kobj, &b->bridge->kobj, "bridge");
+ if (error)
+ goto sys_create_link_err;
+
b->number = b->secondary = bus;
b->resource[0] = &ioport_resource;
b->resource[1] = &iomem_resource;

return b;

-dev_create_file_err:
- device_unregister(&b->dev);
+sys_create_link_err:
+ class_device_remove_file(&b->class_dev, &class_device_attr_cpuaffinity);
+class_dev_create_file_err:
+ class_device_unregister(&b->class_dev);
class_dev_reg_err:
device_unregister(dev);
dev_reg_err:
Index: linux/drivers/pci/remove.c
===================================================================
--- linux.orig/drivers/pci/remove.c
+++ linux/drivers/pci/remove.c
@@ -74,8 +74,10 @@ void pci_remove_bus(struct pci_bus *pci_
list_del(&pci_bus->node);
up_write(&pci_bus_sem);
pci_remove_legacy_files(pci_bus);
- device_remove_file(&pci_bus->dev, &dev_attr_cpuaffinity);
- device_unregister(&pci_bus->dev);
+ class_device_remove_file(&pci_bus->class_dev,
+ &class_device_attr_cpuaffinity);
+ sysfs_remove_link(&pci_bus->class_dev.kobj, "bridge");
+ class_device_unregister(&pci_bus->class_dev);
}
EXPORT_SYMBOL(pci_remove_bus);

Index: linux/include/linux/pci.h
===================================================================
--- linux.orig/include/linux/pci.h
+++ linux/include/linux/pci.h
@@ -275,13 +275,13 @@ struct pci_bus {
unsigned short bridge_ctl; /* manage NO_ISA/FBB/et al behaviors */
pci_bus_flags_t bus_flags; /* Inherited by child busses */
struct device *bridge;
- struct device dev;
+ struct class_device class_dev;
struct bin_attribute *legacy_io; /* legacy I/O for this bus */
struct bin_attribute *legacy_mem; /* legacy mem */
};

#define pci_bus_b(n) list_entry(n, struct pci_bus, node)
-#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
+#define to_pci_bus(n) container_of(n, struct pci_bus, class_dev)

/*
* Error values that may be returned by PCI functions.

2008-03-06 17:57:40

by Tilman Schmidt

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On 03.03.2008 03:16 Rafael J. Wysocki wrote:

> Unresolved regressions
> ----------------------

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10021
> Subject : Linux 2.6.25-rc2 regression: LVM cannot find volume group
> Submitter : Tilman Schmidt <[email protected]>
> Date : 2008-02-16 20:14
> References : http://lkml.org/lkml/2008/2/16/208
> Handled-By : Alan Cox <[email protected]>
> Jiri Slaby <[email protected]>

Fixed in 2.6.25-rc4 by the introduction of CONFIG_SYSFS_DEPRECATED_V2.

Thanks,
Tilman

--
Tilman Schmidt E-Mail: [email protected]
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)



2008-03-06 19:55:26

by Guennadi Liakhovetski

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008, Ingo Molnar wrote:

>
> * Rafael J. Wysocki <[email protected]> wrote:
>
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > Subject : No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> > Submitter : Guennadi Liakhovetski <[email protected]>
> > Date : 2008-02-27 08:15
>
> Guennadi bisected this down to:
>
> commit fd7d1ced29e5beb88c9068801da7a362606d8273
> PCI: make pci_bus a struct device
>
> and it's suspected that Andrew's poweroff problems might be related as
> well. Guennadi, Andrew, find below a manual revert of this change - does
> it make any difference?

Yes, this patch fixes both startup warnings and lets the system reboot and
power off again!

Thanks
Guennadi

>
> Ingo
>
> ---------------->
> Subject: revert "PCI: make pci_bus a struct device"
> From: Ingo Molnar <[email protected]>
> Date: Thu Mar 06 08:12:56 CET 2008
>
> revert:
>
> commit fd7d1ced29e5beb88c9068801da7a362606d8273
> PCI: make pci_bus a struct device
>
> patch for testing purposes.
> ---
> drivers/pci/bus.c | 19 +++-----------
> drivers/pci/pci-sysfs.c | 6 ++--
> drivers/pci/pci.h | 2 -
> drivers/pci/probe.c | 64 +++++++++++++++++++++++++++++-------------------
> drivers/pci/remove.c | 6 +++-
> include/linux/pci.h | 4 +--
> 6 files changed, 53 insertions(+), 48 deletions(-)
>
> Index: linux/drivers/pci/bus.c
> ===================================================================
> --- linux.orig/drivers/pci/bus.c
> +++ linux/drivers/pci/bus.c
> @@ -108,7 +108,6 @@ int pci_bus_add_device(struct pci_dev *d
> void pci_bus_add_devices(struct pci_bus *bus)
> {
> struct pci_dev *dev;
> - struct pci_bus *child_bus;
> int retval;
>
> list_for_each_entry(dev, &bus->devices, bus_list) {
> @@ -139,21 +138,11 @@ void pci_bus_add_devices(struct pci_bus
> up_write(&pci_bus_sem);
> }
> pci_bus_add_devices(dev->subordinate);
> -
> - /* register the bus with sysfs as the parent is now
> - * properly registered. */
> - child_bus = dev->subordinate;
> - child_bus->dev.parent = child_bus->bridge;
> - retval = device_register(&child_bus->dev);
> - if (retval)
> - dev_err(&dev->dev, "Error registering pci_bus,"
> - " continuing...\n");
> - else
> - retval = device_create_file(&child_bus->dev,
> - &dev_attr_cpuaffinity);
> + retval = sysfs_create_link(&dev->subordinate->class_dev.kobj,
> + &dev->dev.kobj, "bridge");
> if (retval)
> - dev_err(&dev->dev, "Error creating cpuaffinity"
> - " file, continuing...\n");
> + dev_err(&dev->dev, "Error creating sysfs "
> + "bridge symlink, continuing...\n");
> }
> }
> }
> Index: linux/drivers/pci/pci-sysfs.c
> ===================================================================
> --- linux.orig/drivers/pci/pci-sysfs.c
> +++ linux/drivers/pci/pci-sysfs.c
> @@ -358,7 +358,7 @@ pci_read_legacy_io(struct kobject *kobj,
> char *buf, loff_t off, size_t count)
> {
> struct pci_bus *bus = to_pci_bus(container_of(kobj,
> - struct device,
> + struct class_device,
> kobj));
>
> /* Only support 1, 2 or 4 byte accesses */
> @@ -383,7 +383,7 @@ pci_write_legacy_io(struct kobject *kobj
> char *buf, loff_t off, size_t count)
> {
> struct pci_bus *bus = to_pci_bus(container_of(kobj,
> - struct device,
> + struct class_device,
> kobj));
> /* Only support 1, 2 or 4 byte accesses */
> if (count != 1 && count != 2 && count != 4)
> @@ -407,7 +407,7 @@ pci_mmap_legacy_mem(struct kobject *kobj
> struct vm_area_struct *vma)
> {
> struct pci_bus *bus = to_pci_bus(container_of(kobj,
> - struct device,
> + struct class_device,
> kobj));
>
> return pci_mmap_legacy_page_range(bus, vma);
> Index: linux/drivers/pci/pci.h
> ===================================================================
> --- linux.orig/drivers/pci/pci.h
> +++ linux/drivers/pci/pci.h
> @@ -64,7 +64,7 @@ static inline int pci_no_d1d2(struct pci
> }
> extern int pcie_mch_quirk;
> extern struct device_attribute pci_dev_attrs[];
> -extern struct device_attribute dev_attr_cpuaffinity;
> +extern struct class_device_attribute class_device_attr_cpuaffinity;
>
> /**
> * pci_match_one_device - Tell if a PCI device structure has a matching
> Index: linux/drivers/pci/probe.c
> ===================================================================
> --- linux.orig/drivers/pci/probe.c
> +++ linux/drivers/pci/probe.c
> @@ -53,7 +53,7 @@ static void pci_create_legacy_files(stru
> b->legacy_io->attr.mode = S_IRUSR | S_IWUSR;
> b->legacy_io->read = pci_read_legacy_io;
> b->legacy_io->write = pci_write_legacy_io;
> - device_create_bin_file(&b->dev, b->legacy_io);
> + class_device_create_bin_file(&b->class_dev, b->legacy_io);
>
> /* Allocated above after the legacy_io struct */
> b->legacy_mem = b->legacy_io + 1;
> @@ -61,15 +61,15 @@ static void pci_create_legacy_files(stru
> b->legacy_mem->size = 1024*1024;
> b->legacy_mem->attr.mode = S_IRUSR | S_IWUSR;
> b->legacy_mem->mmap = pci_mmap_legacy_mem;
> - device_create_bin_file(&b->dev, b->legacy_mem);
> + class_device_create_bin_file(&b->class_dev, b->legacy_mem);
> }
> }
>
> void pci_remove_legacy_files(struct pci_bus *b)
> {
> if (b->legacy_io) {
> - device_remove_bin_file(&b->dev, b->legacy_io);
> - device_remove_bin_file(&b->dev, b->legacy_mem);
> + class_device_remove_bin_file(&b->class_dev, b->legacy_io);
> + class_device_remove_bin_file(&b->class_dev, b->legacy_mem);
> kfree(b->legacy_io); /* both are allocated here */
> }
> }
> @@ -81,27 +81,26 @@ void pci_remove_legacy_files(struct pci_
> /*
> * PCI Bus Class Devices
> */
> -static ssize_t pci_bus_show_cpuaffinity(struct device *dev,
> - struct device_attribute *attr,
> +static ssize_t pci_bus_show_cpuaffinity(struct class_device *class_dev,
> char *buf)
> {
> int ret;
> cpumask_t cpumask;
>
> - cpumask = pcibus_to_cpumask(to_pci_bus(dev));
> + cpumask = pcibus_to_cpumask(to_pci_bus(class_dev));
> ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
> if (ret < PAGE_SIZE)
> buf[ret++] = '\n';
> return ret;
> }
> -DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);
> +CLASS_DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);
>
> /*
> * PCI Bus Class
> */
> -static void release_pcibus_dev(struct device *dev)
> +static void release_pcibus_dev(struct class_device *class_dev)
> {
> - struct pci_bus *pci_bus = to_pci_bus(dev);
> + struct pci_bus *pci_bus = to_pci_bus(class_dev);
>
> if (pci_bus->bridge)
> put_device(pci_bus->bridge);
> @@ -110,7 +109,7 @@ static void release_pcibus_dev(struct de
>
> static struct class pcibus_class = {
> .name = "pci_bus",
> - .dev_release = &release_pcibus_dev,
> + .release = &release_pcibus_dev,
> };
>
> static int __init pcibus_class_init(void)
> @@ -393,6 +392,7 @@ pci_alloc_child_bus(struct pci_bus *pare
> {
> struct pci_bus *child;
> int i;
> + int retval;
>
> /*
> * Allocate a new bus, and inherit stuff from the parent..
> @@ -408,12 +408,15 @@ pci_alloc_child_bus(struct pci_bus *pare
> child->bus_flags = parent->bus_flags;
> child->bridge = get_device(&bridge->dev);
>
> - /* initialize some portions of the bus device, but don't register it
> - * now as the parent is not properly set up yet. This device will get
> - * registered later in pci_bus_add_devices()
> - */
> - child->dev.class = &pcibus_class;
> - sprintf(child->dev.bus_id, "%04x:%02x", pci_domain_nr(child), busnr);
> + child->class_dev.class = &pcibus_class;
> + sprintf(child->class_dev.class_id, "%04x:%02x", pci_domain_nr(child), busnr);
> + retval = class_device_register(&child->class_dev);
> + if (retval)
> + goto error_register;
> + retval = class_device_create_file(&child->class_dev,
> + &class_device_attr_cpuaffinity);
> + if (retval)
> + goto error_file_create;
>
> /*
> * Set up the primary, secondary and subordinate
> @@ -431,6 +434,12 @@ pci_alloc_child_bus(struct pci_bus *pare
> bridge->subordinate = child;
>
> return child;
> +
> +error_file_create:
> + class_device_unregister(&child->class_dev);
> +error_register:
> + kfree(child);
> + return NULL;
> }
>
> struct pci_bus *__ref pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev, int busnr)
> @@ -1083,27 +1092,32 @@ struct pci_bus * pci_create_bus(struct d
> goto dev_reg_err;
> b->bridge = get_device(dev);
>
> - b->dev.class = &pcibus_class;
> - b->dev.parent = b->bridge;
> - sprintf(b->dev.bus_id, "%04x:%02x", pci_domain_nr(b), bus);
> - error = device_register(&b->dev);
> + b->class_dev.class = &pcibus_class;
> + sprintf(b->class_dev.class_id, "%04x:%02x", pci_domain_nr(b), bus);
> + error = class_device_register(&b->class_dev);
> if (error)
> goto class_dev_reg_err;
> - error = device_create_file(&b->dev, &dev_attr_cpuaffinity);
> + error = class_device_create_file(&b->class_dev, &class_device_attr_cpuaffinity);
> if (error)
> - goto dev_create_file_err;
> + goto class_dev_create_file_err;
>
> /* Create legacy_io and legacy_mem files for this bus */
> pci_create_legacy_files(b);
>
> + error = sysfs_create_link(&b->class_dev.kobj, &b->bridge->kobj, "bridge");
> + if (error)
> + goto sys_create_link_err;
> +
> b->number = b->secondary = bus;
> b->resource[0] = &ioport_resource;
> b->resource[1] = &iomem_resource;
>
> return b;
>
> -dev_create_file_err:
> - device_unregister(&b->dev);
> +sys_create_link_err:
> + class_device_remove_file(&b->class_dev, &class_device_attr_cpuaffinity);
> +class_dev_create_file_err:
> + class_device_unregister(&b->class_dev);
> class_dev_reg_err:
> device_unregister(dev);
> dev_reg_err:
> Index: linux/drivers/pci/remove.c
> ===================================================================
> --- linux.orig/drivers/pci/remove.c
> +++ linux/drivers/pci/remove.c
> @@ -74,8 +74,10 @@ void pci_remove_bus(struct pci_bus *pci_
> list_del(&pci_bus->node);
> up_write(&pci_bus_sem);
> pci_remove_legacy_files(pci_bus);
> - device_remove_file(&pci_bus->dev, &dev_attr_cpuaffinity);
> - device_unregister(&pci_bus->dev);
> + class_device_remove_file(&pci_bus->class_dev,
> + &class_device_attr_cpuaffinity);
> + sysfs_remove_link(&pci_bus->class_dev.kobj, "bridge");
> + class_device_unregister(&pci_bus->class_dev);
> }
> EXPORT_SYMBOL(pci_remove_bus);
>
> Index: linux/include/linux/pci.h
> ===================================================================
> --- linux.orig/include/linux/pci.h
> +++ linux/include/linux/pci.h
> @@ -275,13 +275,13 @@ struct pci_bus {
> unsigned short bridge_ctl; /* manage NO_ISA/FBB/et al behaviors */
> pci_bus_flags_t bus_flags; /* Inherited by child busses */
> struct device *bridge;
> - struct device dev;
> + struct class_device class_dev;
> struct bin_attribute *legacy_io; /* legacy I/O for this bus */
> struct bin_attribute *legacy_mem; /* legacy mem */
> };
>
> #define pci_bus_b(n) list_entry(n, struct pci_bus, node)
> -#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
> +#define to_pci_bus(n) container_of(n, struct pci_bus, class_dev)
>
> /*
> * Error values that may be returned by PCI functions.
>

---
Guennadi Liakhovetski

2008-03-06 20:12:21

by Andrew Morton

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008 20:55:25 +0100 (CET)
Guennadi Liakhovetski <[email protected]> wrote:

> On Thu, 6 Mar 2008, Ingo Molnar wrote:
>
> >
> > * Rafael J. Wysocki <[email protected]> wrote:
> >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > > Subject : No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> > > Submitter : Guennadi Liakhovetski <[email protected]>
> > > Date : 2008-02-27 08:15
> >
> > Guennadi bisected this down to:
> >
> > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > PCI: make pci_bus a struct device
> >
> > and it's suspected that Andrew's poweroff problems might be related as
> > well. Guennadi, Andrew, find below a manual revert of this change - does
> > it make any difference?
>
> Yes, this patch fixes both startup warnings and lets the system reboot and
> power off again!

Ingo's revert doesn't fix my machine-wont-power-off regression.


Reminder: what _does_ fix it is:

a) CONFIG_DETECT_SOFTLOCKUP=n or

b) This:

--- a/kernel/softlockup.c~softlockup-workaround
+++ a/kernel/softlockup.c
@@ -289,6 +289,7 @@ cpu_callback(struct notifier_block *nfb,
 	case CPU_DEAD_FROZEN:
 		p = per_cpu(watchdog_task, hotcpu);
 		per_cpu(watchdog_task, hotcpu) = NULL;
+		msleep(1);
 		kthread_stop(p);
 		break;
 #endif /* CONFIG_HOTPLUG_CPU */
_


this one is unrelated to Greg's patch, I think. It's a timing thing.

2008-03-06 20:26:56

by Rafael J. Wysocki

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thursday, 6 of March 2008, Tilman Schmidt wrote:
> On 03.03.2008 03:16 Rafael J. Wysocki wrote:
>
> > Unresolved regressions
> > ----------------------
>
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10021
> > Subject : Linux 2.6.25-rc2 regression: LVM cannot find volume group
> > Submitter : Tilman Schmidt <[email protected]>
> > Date : 2008-02-16 20:14
> > References : http://lkml.org/lkml/2008/2/16/208
> > Handled-By : Alan Cox <[email protected]>
> > Jiri Slaby <[email protected]>
>
> Fixed in 2.6.25-rc4 by the introduction of CONFIG_SYSFS_DEPRECATED_V2.

Already closed.

Thanks,
Rafael

2008-03-06 20:52:31

by Greg KH

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, Mar 06, 2008 at 08:55:25PM +0100, Guennadi Liakhovetski wrote:
> On Thu, 6 Mar 2008, Ingo Molnar wrote:
>
> >
> > * Rafael J. Wysocki <[email protected]> wrote:
> >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > > Subject : No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> > > Submitter : Guennadi Liakhovetski <[email protected]>
> > > Date : 2008-02-27 08:15
> >
> > Guennadi bisected this down to:
> >
> > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > PCI: make pci_bus a struct device
> >
> > and it's suspected that Andrew's poweroff problems might be related as
> > well. Guennadi, Andrew, find below a manual revert of this change - does
> > it make any difference?
>
> Yes, this patch fixes both startup warnings and lets the system reboot and
> power off again!

OK, I think we have two different problems here (as Andrew showed).

I'll work with Guennadi on his issue for now, as the pci patch shows
there is still an issue here.

thanks,

greg k-h

2008-03-06 20:53:48

by Andrew Morton

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008 12:11:27 -0800
Andrew Morton <[email protected]> wrote:

> Reminder: what _does_ fix it is:
>
> a) CONFIG_DETECT_SOFTLOCKUP=n or
>
> b) This:
>
> --- a/kernel/softlockup.c~softlockup-workaround
> +++ a/kernel/softlockup.c
> @@ -289,6 +289,7 @@ cpu_callback(struct notifier_block *nfb,
> case CPU_DEAD_FROZEN:
> p = per_cpu(watchdog_task, hotcpu);
> per_cpu(watchdog_task, hotcpu) = NULL;
> + msleep(1);
> kthread_stop(p);
> break;
> #endif /* CONFIG_HOTPLUG_CPU */

sysrq-t works: http://userweb.kernel.org/~akpm/x.txt

It shows that `halt' is stuck in kthread_stop(), waiting for `watchdog' to
go away. But all the watchdog tasks are dreamily asleep, as if the wakeup
didn't work.

I'd love to poke around in kgdb (what does kthread_stop_info.k point at?)
but it seems that -mm's copy of kgdb got taken away when I wasn't looking.
Can I have it back please?

(btw, it isn't compulsory that every cpu callback function be literally
called "cpu_callback").

2008-03-06 21:00:31

by Ingo Molnar

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24


* Andrew Morton <[email protected]> wrote:

> I'd love to poke around in kgdb (what does kthread_stop_info.k point
> at?) but it seems that -mm's copy of kgdb got taken away when I wasn't
> looking. Can I have it back please?

it's in the full x86.git or you can pick up the kgdb-light tree:

http://people.redhat.com/mingo/kgdb-light.git/README

Ingo

2008-03-06 21:37:55

by Andrew Morton

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008 21:59:51 +0100
Ingo Molnar <[email protected]> wrote:

>
> * Andrew Morton <[email protected]> wrote:
>
> > I'd love to poke around in kgdb (what does kthread_stop_info.k point
> > at?) but it seems that -mm's copy of kgdb got taken away when I wasn't
> > looking. Can I have it back please?
>
> it's in the full x86.git or you can pick up the kgdb-light tree:
>
> http://people.redhat.com/mingo/kgdb-light.git/README
>

We'll see.

Meanwhile, further investigation shows that cpu_callback() (the one in
kernel/softlockup.c) is waiting on this thread:

watchdog/1 R running task 0 8 2 task_struct:ffff81025f1089e0
ffff81025f10deb0 0000000000000046 0000000000000000 0000000000000246
ffff81025f10de20 ffff81025f1089e0 ffff81025f1080c0 ffff81025f108d30
000000015f10de50 00000000ffff2adf ffffffffffffffff ffffffffffffffff
Call Trace:
[<ffffffff80263290>] ? watchdog+0x0/0x1dc
[<ffffffff802632d6>] watchdog+0x46/0x1dc
[<ffffffff80263290>] ? watchdog+0x0/0x1dc
[<ffffffff8024704d>] kthread+0x44/0x6b
[<ffffffff8020cd88>] child_rip+0xa/0x12
[<ffffffff80247009>] ? kthread+0x0/0x6b
[<ffffffff8020cd7e>] ? child_rip+0x0/0x12

kthread_stop_info.k=ffff81025f1089e0

(gdb) l *0xffffffff802632d6
0xffffffff802632d6 is in watchdog (kernel/softlockup.c:229).
224 */
225 while (!kthread_should_stop()) {
226 touch_softlockup_watchdog();
227 schedule();
228
229 if (kthread_should_stop())
230 break;
231
232 if (this_cpu == check_cpu) {
233 if (sysctl_hung_task_timeout_secs)

so this watchdog thread seems to be runnable, but not running. What would
cause this?

The only other runnable task is

events/2 R running task 0 29 2 task_struct:ffff81025f2da300
ffff81025f2e3ec0 0000000000000046 0000000000000286 ffff81000102ebc0
ffff81000102ebe0 ffff81025f2da300 ffff81025f147380 ffff81025f2da650
000000025f2e3e60 00000000ffff3310 ffff81000102ebc8 ffff81025f20e440
Call Trace:
[<ffffffff802442ce>] ? worker_thread+0x0/0xe5
[<ffffffff80244371>] worker_thread+0xa3/0xe5
[<ffffffff80247364>] ? autoremove_wake_function+0x0/0x36
[<ffffffff802442ce>] ? worker_thread+0x0/0xe5
[<ffffffff8024704d>] kthread+0x44/0x6b
[<ffffffff8020cd88>] child_rip+0xa/0x12
[<ffffffff80247009>] ? kthread+0x0/0x6b
[<ffffffff8020cd7e>] ? child_rip+0x0/0x12

which is on a different CPU.

2008-03-06 22:58:48

by Andrew Morton

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008 13:36:32 -0800
Andrew Morton <[email protected]> wrote:

> On Thu, 6 Mar 2008 21:59:51 +0100
> Ingo Molnar <[email protected]> wrote:
>
> >
> > * Andrew Morton <[email protected]> wrote:
> >
> > > I'd love to poke around in kgdb (what does kthread_stop_info.k point
> > > at?) but it seems that -mm's copy of kgdb got taken away when I wasn't
> > > looking. Can I have it back please?
> >
> > it's in the full x86.git or you can pick up the kgdb-light tree:
> >
> > http://people.redhat.com/mingo/kgdb-light.git/README
> >
>
> We'll see.
>
> Meanwhile, further investigation shows that cpu_callback() (the one in
> kernel/softlockup.c) is waiting on this thread:
>
> watchdog/1 R running task 0 8 2 task_struct:ffff81025f1089e0

Note the "/1".

> ffff81025f10deb0 0000000000000046 0000000000000000 0000000000000246
> ffff81025f10de20 ffff81025f1089e0 ffff81025f1080c0 ffff81025f108d30
> 000000015f10de50 00000000ffff2adf ffffffffffffffff ffffffffffffffff
> Call Trace:
> [<ffffffff80263290>] ? watchdog+0x0/0x1dc
> [<ffffffff802632d6>] watchdog+0x46/0x1dc
> [<ffffffff80263290>] ? watchdog+0x0/0x1dc
> [<ffffffff8024704d>] kthread+0x44/0x6b
> [<ffffffff8020cd88>] child_rip+0xa/0x12
> [<ffffffff80247009>] ? kthread+0x0/0x6b
> [<ffffffff8020cd7e>] ? child_rip+0x0/0x12
>
> kthread_stop_info.k=ffff81025f1089e0
>
> (gdb) l *0xffffffff802632d6
> 0xffffffff802632d6 is in watchdog (kernel/softlockup.c:229).
> 224 */
> 225 while (!kthread_should_stop()) {
> 226 touch_softlockup_watchdog();
> 227 schedule();
> 228
> 229 if (kthread_should_stop())
> 230 break;
> 231
> 232 if (this_cpu == check_cpu) {
> 233 if (sysctl_hung_task_timeout_secs)
>
> so this watchdog thread seems to be runnable, but not running. What would
> cause this?

At the start of the sysrq-T trace we have:

sd 1:0:0:0: [sdb] Stopping disk
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
ACPI: PCI interrupt for device 0000:05:00.1 disabled
ACPI: PCI interrupt for device 0000:05:00.0 disabled
ACPI: Preparing to enter system sleep state S5
Disabling non-boot CPUs ...
CPU 1 is now offline
SysRq : Show State
task PC stack pid father

So CPU 1 is offline. But the comatose watchdog thread is pinned to CPU 1.
Could this be related to the problem? By what means is a task which is
pinned to a going-away CPU handled? How is this guy supposed to ever run
again?
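
The question above can be illustrated with a toy userspace model. This is only a sketch under loud assumptions: `toy_task`, `toy_schedule()`, `toy_migrate_off_dead_cpu()` and `toy_stop()` are hypothetical names made up for illustration and do not mirror the real scheduler or kthread code. It shows a task that is marked runnable but sits on the run queue of an offline CPU, so a stopper waiting for it never makes progress until something migrates the task.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical toy type; it does not mirror the kernel's task_struct. */
struct toy_task {
    int cpu;            /* CPU whose run queue holds the task */
    bool runnable;      /* the "R" state seen in the sysrq-t output */
    bool should_stop;   /* set by the stopper, like kthread_should_stop() */
    bool exited;        /* set once the task has observed should_stop */
};

static bool cpu_online[NR_CPUS] = { true, true };

/* One scheduling pass: a task only executes if its CPU is online. */
static void toy_schedule(struct toy_task *t)
{
    if (!t->runnable || !cpu_online[t->cpu])
        return;         /* runnable, but no CPU will ever pick it */
    if (t->should_stop)
        t->exited = true;
}

/* Stand-in for the migration step: move the task to any online CPU. */
static bool toy_migrate_off_dead_cpu(struct toy_task *t)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu_online[cpu]) {
            t->cpu = cpu;
            return true;
        }
    }
    return false;
}

/* Stand-in for the stopper: request a stop, then poll a bounded number
 * of scheduling passes instead of blocking forever. */
static bool toy_stop(struct toy_task *t, int max_passes)
{
    t->should_stop = true;
    t->runnable = true;     /* the wakeup itself succeeds */
    for (int i = 0; i < max_passes && !t->exited; i++)
        toy_schedule(t);
    return t->exited;
}

/* Returns 0 if the model behaves as described in the thread. */
int toy_demo(void)
{
    struct toy_task watchdog1 = { .cpu = 1 };

    cpu_online[1] = false;                      /* "CPU 1 is now offline" */

    if (toy_stop(&watchdog1, 1000))             /* not migrated: never exits */
        return 1;

    if (!toy_migrate_off_dead_cpu(&watchdog1))  /* migrate first ... */
        return 2;
    if (!toy_stop(&watchdog1, 1000))            /* ... and the stop completes */
        return 3;
    return 0;
}
```

In this model the first toy_stop() call times out and the second one succeeds, which is the behaviour one would expect if the stop is attempted before the task has been moved off the dead CPU.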

2008-03-06 23:13:49

by Suresh Siddha

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, Mar 06, 2008 at 02:57:39PM -0800, Andrew Morton wrote:
> On Thu, 6 Mar 2008 13:36:32 -0800
> Andrew Morton <[email protected]> wrote:
>
> > On Thu, 6 Mar 2008 21:59:51 +0100
> > Ingo Molnar <[email protected]> wrote:
> >
> > >
> > > * Andrew Morton <[email protected]> wrote:
> > >
> > > > I'd love to poke around in kgdb (what does kthread_stop_info.k point
> > > > at?) but it seems that -mm's copy of kgdb got taken away when I wasn't
> > > > looking. Can I have it back please?
> > >
> > > it's in the full x86.git or you can pick up the kgdb-light tree:
> > >
> > > http://people.redhat.com/mingo/kgdb-light.git/README
> > >
> >
> > We'll see.
> >
> > Meanwhile, further investigation shows that cpu_callback() (the one in
> > kernel/softlockup.c) is waiting on this thread:
> >
> > watchdog/1 R running task 0 8 2 task_struct:ffff81025f1089e0
>
> Note the "/1".
>
> > ffff81025f10deb0 0000000000000046 0000000000000000 0000000000000246
> > ffff81025f10de20 ffff81025f1089e0 ffff81025f1080c0 ffff81025f108d30
> > 000000015f10de50 00000000ffff2adf ffffffffffffffff ffffffffffffffff
> > Call Trace:
> > [<ffffffff80263290>] ? watchdog+0x0/0x1dc
> > [<ffffffff802632d6>] watchdog+0x46/0x1dc
> > [<ffffffff80263290>] ? watchdog+0x0/0x1dc
> > [<ffffffff8024704d>] kthread+0x44/0x6b
> > [<ffffffff8020cd88>] child_rip+0xa/0x12
> > [<ffffffff80247009>] ? kthread+0x0/0x6b
> > [<ffffffff8020cd7e>] ? child_rip+0x0/0x12
> >
> > kthread_stop_info.k=ffff81025f1089e0
> >
> > (gdb) l *0xffffffff802632d6
> > 0xffffffff802632d6 is in watchdog (kernel/softlockup.c:229).
> > 224 */
> > 225 while (!kthread_should_stop()) {
> > 226 touch_softlockup_watchdog();
> > 227 schedule();
> > 228
> > 229 if (kthread_should_stop())
> > 230 break;
> > 231
> > 232 if (this_cpu == check_cpu) {
> > 233 if (sysctl_hung_task_timeout_secs)
> >
> > so this watchdog thread seems to be runnable, but not running. What would
> > cause this?
>
> At the start of the sysrq-T trace we have:
>
> sd 1:0:0:0: [sdb] Stopping disk
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> sd 0:0:0:0: [sda] Stopping disk
> ACPI: PCI interrupt for device 0000:05:00.1 disabled
> ACPI: PCI interrupt for device 0000:05:00.0 disabled
> ACPI: Preparing to enter system sleep state S5
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> SysRq : Show State
> task PC stack pid father

I have been looking into a similar issue, which stops my system going into
standby.

>
> So CPU 1 is offline. But the comatose watchdog thread is pinned to CPU 1.
> Could this be related to the problem? By what means is a task which is
> pinned to a going-away CPU handled? How is this guy supposed to ever run
> again?

move_task_off_dead_cpu() should move that thread to another online cpu. But
for some reason it isn't running.

thanks,
suresh

2008-03-06 23:26:05

by Andrew Morton

Subject: Re: 2.6.25-rc3-git3: Reported regressions from 2.6.24

On Thu, 6 Mar 2008 15:13:31 -0800
Suresh Siddha <[email protected]> wrote:

> > sd 1:0:0:0: [sdb] Stopping disk
> > sd 0:0:0:0: [sda] Synchronizing SCSI cache
> > sd 0:0:0:0: [sda] Stopping disk
> > ACPI: PCI interrupt for device 0000:05:00.1 disabled
> > ACPI: PCI interrupt for device 0000:05:00.0 disabled
> > ACPI: Preparing to enter system sleep state S5
> > Disabling non-boot CPUs ...
> > CPU 1 is now offline
> > SysRq : Show State
> > task PC stack pid father
>
> I have been looking into a similar issue, which stops my system going into
> standby.

OK.

> >
> > So CPU 1 is offline. But the comatose watchdog thread is pinned to CPU 1.
> > Could this be related to the problem? By what means is a task which is
> > pinned to a going-away CPU handled? How is this guy supposed to ever run
> > again?
>
> move_task_off_dead_cpu() should move that thread to another online cpu. But
> for some reason it isn't running.

hm. What guarantees that kernel/sched.c:migration_call(CPU_DEAD) is called
before kernel/softlockup.c:cpu_callback(CPU_DEAD)? Just the ordering in
do_pre_smp_initcalls(), and notifier-chain behaviour, I guess.
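
To make the ordering concern concrete, here is a minimal userspace sketch of a priority-sorted notifier chain. It is an illustration under assumptions, not the kernel's implementation: `toy_notifier`, `toy_register()` and `toy_call_chain()` are hypothetical names, and the priority values 0 and 10 are made up for the example. The insertion rule used here (higher priority first, ties resolved by registration order) is what I understand the kernel's chain registration to do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy notifier, loosely modelled on a notifier block. */
struct toy_notifier {
    const char *name;
    int priority;                 /* higher priority is called first */
    struct toy_notifier *next;
};

/* Insert in descending priority order; ties keep registration order. */
static void toy_register(struct toy_notifier **head, struct toy_notifier *n)
{
    while (*head != NULL && n->priority <= (*head)->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Record the names in call order; returns how many were recorded. */
static size_t toy_call_chain(struct toy_notifier *head,
                             const char **order, size_t max)
{
    size_t n = 0;

    for (; head != NULL && n < max; head = head->next)
        order[n++] = head->name;
    return n;
}

/* Returns 0 when "migration" is called before "softlockup" in both cases. */
int toy_order_demo(void)
{
    const char *order[2];
    struct toy_notifier *chain = NULL;

    /* Case 1: equal priority, so only registration order decides. */
    struct toy_notifier mig = { "migration", 0, NULL };
    struct toy_notifier soft = { "softlockup", 0, NULL };
    toy_register(&chain, &mig);
    toy_register(&chain, &soft);
    if (toy_call_chain(chain, order, 2) != 2 ||
        strcmp(order[0], "migration") != 0)
        return 1;

    /* Case 2: softlockup registers first, but migration has the
     * higher (made-up) priority and still ends up at the head. */
    struct toy_notifier mig2 = { "migration", 10, NULL };
    struct toy_notifier soft2 = { "softlockup", 0, NULL };
    chain = NULL;
    toy_register(&chain, &soft2);
    toy_register(&chain, &mig2);
    if (toy_call_chain(chain, order, 2) != 2 ||
        strcmp(order[0], "migration") != 0)
        return 2;
    return 0;
}
```

With equal priorities the guarantee rests entirely on who registers first, which is the fragile part; an explicit priority makes the order independent of initcall ordering.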