2010-02-01 00:52:44

by Rafael J. Wysocki

Subject: 2.6.33-rc6: Reported regressions 2.6.31 -> 2.6.32

[NOTE:
* Still growing faster than we're fixing them.]

This message contains a list of some regressions introduced between 2.6.31 and
2.6.32, for which there are no fixes in the mainline I know of. If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.31
and 2.6.32, please let me know as well and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Statistics of the listed regressions:

Date          Total  Pending  Unresolved
----------------------------------------
2010-02-01      149       50          45
2010-01-24      140       45          43
2010-01-10      130       44          40
2009-12-29      124       60          57
2009-11-21       86       29          25
2009-11-16       84       46          41
2009-10-26       66       42          37
2009-10-12       48       31          27
2009-10-02       22       15           9


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15193
Subject : kswapd continuously active
Submitter : Jan Engelhardt <[email protected]>
Date : 2010-01-22 23 (10 days old)
References : http://marc.info/?l=linux-kernel&m=126420434519039&w=4
Handled-By : Jens Axboe <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15158
Subject : oops related to i915_gem_object_save_bit_17_swizzle
Submitter : Werner Lemberg <[email protected]>
Date : 2010-01-28 08:26 (4 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15156
Subject : 2.6.32.6 hang at boot with ati x1600
Submitter : Alexey Kuznetsov <[email protected]>
Date : 2010-01-28 05:02 (4 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
Subject : Bluetooth: sleeping function called from invalid context
Submitter : David John <[email protected]>
Date : 2010-01-12 9:19 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15108
Subject : Blank screen with KMS enabled (on clevo M5xN laptop)
Submitter : Jérémy Lal <[email protected]>
Date : 2010-01-22 20:30 (10 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15100
Subject : X11 is black after resume from s2ram if my T400 was previous in docking station before
Submitter : Toralf Förster <[email protected]>
Date : 2010-01-21 08:56 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c1c7af60892070e4b82ad63bbfb95ae745056de0


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15096
Subject : Resume lock up -- bisected, commit 3a1151e3f124fd1a2c54b8153f510f1a7c715369
Submitter : Rafał Miłecki <[email protected]>
Date : 2010-01-20 23:15 (12 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3a1151e3f124fd1a2c54b8153f510f1a7c715369


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15071
Subject : IBM/Lenovo Trackpoint speed, sensitivity reset after suspend
Submitter : Marten Vance <[email protected]>
Date : 2010-01-16 16:19 (16 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15042
Subject : socket(PF_INET6 hangs when ipv6 not yet initialized
Submitter : Marc Haber <[email protected]>
Date : 2010-01-10 14:28 (22 days old)
References : http://marc.info/?l=linux-kernel&m=126313553029280&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15021
Subject : agpgart sometimes fails to initialize sometimes
Submitter : Maciej Piechotka <[email protected]>
Date : 2010-01-09 23:31 (23 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15015
Subject : blank screen at random times in laptop when sitting idle
Submitter : Jithin Emmanuel <[email protected]>
Date : 2010-01-09 16:48 (23 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15004
Subject : i915: *ERROR* Execbuf while wedged
Submitter : tomas m <[email protected]>
Date : 2010-01-07 18:53 (25 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15000
Subject : Thinkpad dock button no longer works
Submitter : Paul Martin <[email protected]>
Date : 2010-01-07 02:11 (25 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14998
Subject : Caught 32-bit read from uninitialized memory in acpi_system_read_event -- 2.6.31 regression
Submitter : Christian Casteyde <[email protected]>
Date : 2010-01-06 21:40 (26 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=eca6f534e61919b28fb21aafbd1c2983deae75be


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14997
Subject : Closing and re-opening the lid does not reactivate the backlight
Submitter : o. meijer <[email protected]>
Date : 2010-01-06 15:38 (26 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14943
Subject : nfs regression?
Submitter : Nikola Ciprich <[email protected]>
Date : 2009-12-28 12:10 (35 days old)
References : http://marc.info/?l=linux-kernel&m=126200276223524&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14939
Subject : drm: random hang with i915
Submitter : Arnd Bergmann <[email protected]>
Date : 2009-12-07 17:30 (56 days old)
References : http://marc.info/?l=linux-kernel&m=126020704125723&w=4
Handled-By : Jesse Barnes <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14922
Subject : 2.6.32 seemed to have broken nVidia MCP7A sata controller
Submitter : Mike Cui <[email protected]>
Date : 2009-12-19 6:13 (44 days old)
References : http://marc.info/?l=linux-ide&m=126120323407742&w=4
Handled-By : Jeff Garzik <[email protected]>
             Robert Hancock <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14898
Subject : ksoftirqd problem
Submitter : Nico <[email protected]>
Date : 2009-12-13 19:05 (50 days old)
References : http://marc.info/?l=linux-kernel&m=126073114325690&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14895
Subject : BUG in kernel 2.6.32 when using luks encrypted root and RAID0..
Submitter : r4 <[email protected]>
Date : 2009-12-03 18:24 (60 days old)
References : http://marc.info/?l=linux-kernel&m=125986664904751&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14894
Subject : pohmelfs: NULL pointer dereference
Submitter : Alexander Beregalov <[email protected]>
Date : 2009-12-02 1:11 (61 days old)
References : http://marc.info/?l=linux-kernel&m=125971633107940&w=4
Handled-By : Evgeniy Polyakov <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14891
Subject : Deadlock regression related to NFS root
Submitter : Stephen R. van den Berg <[email protected]>
Date : 2009-11-24 0:24 (69 days old)
References : http://marc.info/?l=linux-kernel&m=125902279909452&w=4
Handled-By : Trond Myklebust <[email protected]>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14886
Subject : Asus P2B-DS not detected as SMP moterboard
Submitter : Lorenzo Buzzi <[email protected]>
Date : 2009-12-27 17:20 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e5b8fc6ac158f65598f58dba2c0d52ba3b412f52


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14868
Subject : flood of "don't try to register things with the same name in the same directory." on upgrade to 2.6.32
Submitter : Rich Ercolani <[email protected]>
Date : 2009-12-24 02:44 (39 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14817
Subject : When is system under load, then freeze/HD fail
Submitter : okias <[email protected]>
Date : 2009-12-15 11:12 (48 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14783
Subject : Unhandled IRQ on Thinkpad R61i: "irq 16: nobody cared"
Submitter : Stefan Zegenhagen <[email protected]>
Date : 2009-12-10 19:14 (53 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14782
Subject : Suspend hangs Lenovo SL300 after gdm login
Submitter : Gary Trakhman <[email protected]>
Date : 2009-12-10 18:53 (53 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=03ba3782e8dcc5b0e1efe440d33084f066e38cae


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14748
Subject : e1000e NIC not working after reboot
Submitter : Maciek Sitarz <[email protected]>
Date : 2009-12-06 13:04 (57 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14742
Subject : 2.6.32 new menu idle governor causes very high CPU temp
Submitter : <[email protected]>
Date : 2009-12-05 17:24 (58 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14695
Subject : regression in karmic thermal control
Submitter : Bugie <[email protected]>
Date : 2009-11-26 08:45 (67 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14670
Subject : i915: playing video via XVideo extension makes the screen flicker
Submitter : Thomas Meyer <[email protected]>
Date : 2009-11-23 13:15 (70 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b42d4c5c6a872815d711e5d51a600f5122c38eee
References : http://lkml.org/lkml/2010/1/11/150


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14668
Subject : Resume from disk hangs in acpi_ex_acquire_global_lock
Submitter : Maxim Levitsky <[email protected]>
Date : 2009-11-22 21:25 (71 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14667
Subject : bisected 2.6.32 EC regression - Temperatures not correctly detected after suspend - Dell Studio XPS 16 laptop
Submitter : Federico Chiacchiaretta <[email protected]>
Date : 2009-11-22 20:42 (71 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6a63b06f3c494cc87eade97f081300bda60acec7


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14657
Subject : perf subsystem breakage in 2.6.32-rc7
Submitter : Arjan van de Ven <[email protected]>
Date : 2009-11-19 19:50 (74 days old)
References : http://marc.info/?l=linux-kernel&m=125866013419738&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14656
Subject : Oops at __rmqueue+0x98 with 2.6.32-rc6
Submitter : Lucas C. Villa Real <[email protected]>
Date : 2009-11-19 3:48 (74 days old)
References : http://marc.info/?l=linux-kernel&m=125860255229092&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14624
Subject : ath9k: BUG kmalloc-8192: Poison overwritten
Submitter : Miles Lane <[email protected]>
Date : 2009-11-12 4:58 (81 days old)
References : http://marc.info/?l=linux-kernel&m=125800196520396&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14621
Subject : specjbb2005 and aim7 regression with 2.6.32-rc kernels
Submitter : Zhang, Yanmin <[email protected]>
Date : 2009-11-06 7:38 (87 days old)
References : http://marc.info/?l=linux-kernel&m=125749310413174&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14504
Subject : intermittent hibernation problem again
Submitter : Ferenc Wágner <[email protected]>
Date : 2009-10-28 23:49 (96 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
Submitter : Justin P. Mattock <[email protected]>
Date : 2009-10-23 16:45 (101 days old)
References : http://lkml.org/lkml/2009/10/23/252


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
Submitter : Alexander Clouter <[email protected]>
Date : 2009-10-23 10:30 (101 days old)
References : http://lkml.org/lkml/2009/10/23/50


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14442
Subject : resume after hibernate: /dev/sdb drops and returns as /dev/sde
Submitter : Duncan <[email protected]>
Date : 2009-10-20 01:52 (104 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14426
Subject : CE: hpet increasing min_delta_ns flood
Submitter : Thibault Mondary <[email protected]>
Date : 2009-10-17 09:29 (107 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14376
Subject : Kernel NULL pointer dereference/ kvm subsystem
Submitter : Don Dupuis <[email protected]>
Date : 2009-10-06 14:38 (118 days old)
References : http://marc.info/?l=linux-kernel&m=125484025021737&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject : warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter : Pavel Machek <[email protected]>
Date : 2009-09-30 20:07 (124 days old)
References : http://marc.info/?l=linux-kernel&m=125434130703538&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject : console resume broken since ba15ab0e8d
Submitter : Sascha Hauer <[email protected]>
Date : 2009-09-30 15:11 (124 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba15ab0e8de0d4439a91342ad52d55ca9e313f3d
References : http://marc.info/?l=linux-kernel&m=125432349404060&w=4


Regressions with patches
------------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15197
Subject : padlock_sha1 and hmac broken?
Submitter : Wolfgang Walter <[email protected]>
Date : 2010-01-29 23:44 (3 days old)
References : http://marc.info/?l=linux-kernel&m=126480912924283&w=4
Handled-By : Herbert Xu <[email protected]>
Patch : http://patchwork.kernel.org/patch/75959/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15135
Subject : Kernel 2.6.32.x hangs during boot process
Submitter : François Figarola <[email protected]>
Date : 2010-01-16 9:58 (16 days old)
References : http://marc.info/?l=linux-kernel&m=126363593817261&w=4
Handled-By : Jun'ichi Nomura <[email protected]>
Patch : http://patchwork.kernel.org/patch/75560/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15134
Subject : gobi_loader hangs after commit 8e8dce065088
Submitter : Matthew Garrett <[email protected]>
Date : 2010-01-17 2:55 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126369696509502&w=4
Handled-By : Oliver Neukum <[email protected]>
             Alan Cox <[email protected]>
Patch : http://patchwork.kernel.org/patch/73878/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15040
Subject : High cpu temperature with 2.6.32 - bisected to cpuidle menu update
Submitter : Dimitrios Apostolou <[email protected]>
Date : 2010-01-06 17:39 (26 days old)
References : http://marc.info/?l=linux-kernel&m=126279952723036&w=4
Handled-By : Arjan van de Ven <[email protected]>
Patch : http://patchwork.kernel.org/patch/71962/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
Subject : i915: Commit 0e442c60 causes flickering
Submitter : David John <[email protected]>
Date : 2009-12-09 17:26 (54 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
Handled-By : David John <[email protected]>
Patch : http://patchwork.kernel.org/patch/75423/


For details, please visit the bug entries and follow the links given in
references.
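
For anyone who wants to process the list with a script, the entries above
follow a simple "Field : value" layout, with blank lines between entries and
an occasional continuation line (for example a second Handled-By name). The
following is only a minimal sketch of a reader for that layout, not the
script used to generate this report. The function names, the field set and
the sample addresses (example.invalid placeholders) are assumptions made for
illustration.

```python
import re

# Field names as they appear in the entries of this report.
KNOWN_FIELDS = {
    "Bug-Entry", "Subject", "Submitter", "Date",
    "First-Bad-Commit", "References", "Handled-By", "Patch",
}
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z-]*)\s*:\s*(.*)$")


def parse_entries(text):
    """Return a list of dicts mapping field name -> list of values."""
    entries = []
    current = None      # entry being filled, None outside an entry
    last_field = None   # field that a continuation line belongs to
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current entry.
            current, last_field = None, None
            continue
        match = FIELD_RE.match(line)
        if match and match.group(1) in KNOWN_FIELDS:
            name, value = match.group(1), match.group(2)
            if name == "Bug-Entry":
                current = {}
                entries.append(current)
            if current is None:
                continue  # e.g. a mail "Subject:" header outside an entry
            current.setdefault(name, []).append(value)
            last_field = name
        elif current is not None and last_field is not None:
            # Continuation line, such as a second Handled-By name.
            current[last_field].append(line)
    return entries


def bug_id(entry):
    """Extract the numeric Bugzilla id from the Bug-Entry URL, if any."""
    match = re.search(r"[?&]id=(\d+)", entry["Bug-Entry"][0])
    return int(match.group(1)) if match else None


# Hypothetical sample; the addresses are placeholders, not real ones.
SAMPLE = """\
Subject: some mail header that is not part of an entry

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14922
Subject : 2.6.32 seemed to have broken nVidia MCP7A sata controller
Handled-By : Jeff Garzik <jgarzik@example.invalid>
             Robert Hancock <hancock@example.invalid>

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
Subject : Bluetooth: sleeping function called from invalid context

For details, please visit the bug entries.
"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(bug_id(entry), entry["Subject"][0])
```

Treating a blank line as the end of an entry keeps surrounding prose and
mail headers from being attached to the preceding entry.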

As you can see, there is a Bugzilla entry for each of the listed regressions.
There is also a Bugzilla entry used for tracking the regressions introduced
between 2.6.31 and 2.6.32, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14230

Please let me know if there are any Bugzilla entries that should be added to
that list.

Thanks,
Rafael


2010-02-01 00:52:55

by Rafael J. Wysocki

Subject: [Bug #14297] console resume broken since ba15ab0e8d

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject : console resume broken since ba15ab0e8d
Submitter : Sascha Hauer <[email protected]>
Date : 2009-09-30 15:11 (124 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba15ab0e8de0d4439a91342ad52d55ca9e313f3d
References : http://marc.info/?l=linux-kernel&m=125432349404060&w=4

2010-02-01 00:57:20

by Rafael J. Wysocki

Subject: [Bug #14376] Kernel NULL pointer dereference/ kvm subsystem

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14376
Subject : Kernel NULL pointer dereference/ kvm subsystem
Submitter : Don Dupuis <[email protected]>
Date : 2009-10-06 14:38 (118 days old)
References : http://marc.info/?l=linux-kernel&m=125484025021737&w=4

2010-02-01 00:57:22

by Rafael J. Wysocki

Subject: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
Submitter : Alexander Clouter <[email protected]>
Date : 2009-10-23 10:30 (101 days old)
References : http://lkml.org/lkml/2009/10/23/50

2010-02-01 00:57:33

by Rafael J. Wysocki

Subject: [Bug #14298] warning at manage.c:361 (set_irq_wake), matrix-keypad related?

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject : warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter : Pavel Machek <[email protected]>
Date : 2009-09-30 20:07 (124 days old)
References : http://marc.info/?l=linux-kernel&m=125434130703538&w=4

2010-02-01 00:57:43

by Rafael J. Wysocki

Subject: [Bug #14426] CE: hpet increasing min_delta_ns flood

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14426
Subject : CE: hpet increasing min_delta_ns flood
Submitter : Thibault Mondary <[email protected]>
Date : 2009-10-17 09:29 (107 days old)

2010-02-01 00:57:54

by Rafael J. Wysocki

Subject: [Bug #14504] intermittent hibernation problem again

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14504
Subject : intermittent hibernation problem again
Submitter : Ferenc Wágner <[email protected]>
Date : 2009-10-28 23:49 (96 days old)

2010-02-01 00:57:57

by Rafael J. Wysocki

Subject: [Bug #14668] Resume from disk hangs in acpi_ex_acquire_global_lock

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14668
Subject : Resume from disk hangs in acpi_ex_acquire_global_lock
Submitter : Maxim Levitsky <[email protected]>
Date : 2009-11-22 21:25 (71 days old)

2010-02-01 00:58:21

by Rafael J. Wysocki

Subject: [Bug #14670] i915: playing video via XVideo extension makes the screen flicker

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14670
Subject : i915: playing video via XVideo extension makes the screen flicker
Submitter : Thomas Meyer <[email protected]>
Date : 2009-11-23 13:15 (70 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b42d4c5c6a872815d711e5d51a600f5122c38eee
References : http://lkml.org/lkml/2010/1/11/150

2010-02-01 00:59:24

by Rafael J. Wysocki

Subject: [Bug #14897] i915: Commit 0e442c60 causes flickering

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
Subject : i915: Commit 0e442c60 causes flickering
Submitter : David John <[email protected]>
Date : 2009-12-09 17:26 (54 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
Handled-By : David John <[email protected]>
Patch : http://patchwork.kernel.org/patch/75423/

2010-02-01 00:58:44

by Rafael J. Wysocki

Subject: [Bug #14886] Asus P2B-DS not detected as SMP moterboard

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14886
Subject : Asus P2B-DS not detected as SMP moterboard
Submitter : Lorenzo Buzzi <[email protected]>
Date : 2009-12-27 17:20 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e5b8fc6ac158f65598f58dba2c0d52ba3b412f52

2010-02-01 00:58:35

by Rafael J. Wysocki

Subject: [Bug #14817] When is system under load, then freeze/HD fail

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14817
Subject : When is system under load, then freeze/HD fail
Submitter : okias <[email protected]>
Date : 2009-12-15 11:12 (48 days old)

2010-02-01 00:59:19

by Rafael J. Wysocki

Subject: [Bug #14898] ksoftirqd problem

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14898
Subject : ksoftirqd problem
Submitter : Nico <[email protected]>
Date : 2009-12-13 19:05 (50 days old)
References : http://marc.info/?l=linux-kernel&m=126073114325690&w=4

2010-02-01 00:58:39

by Rafael J. Wysocki

Subject: [Bug #14891] Deadlock regression related to NFS root

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14891
Subject : Deadlock regression related to NFS root
Submitter : Stephen R. van den Berg <[email protected]>
Date : 2009-11-24 0:24 (69 days old)
References : http://marc.info/?l=linux-kernel&m=125902279909452&w=4
Handled-By : Trond Myklebust <[email protected]>

2010-02-01 00:58:46

by Rafael J. Wysocki

Subject: [Bug #14894] pohmelfs: NULL pointer dereference

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14894
Subject : pohmelfs: NULL pointer dereference
Submitter : Alexander Beregalov <[email protected]>
Date : 2009-12-02 1:11 (61 days old)
References : http://marc.info/?l=linux-kernel&m=125971633107940&w=4
Handled-By : Evgeniy Polyakov <[email protected]>

2010-02-01 00:58:41

by Rafael J. Wysocki

Subject: [Bug #14868] flood of "don't try to register things with the same name in the same directory." on upgrade to 2.6.32

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14868
Subject : flood of "don't try to register things with the same name in the same directory." on upgrade to 2.6.32
Submitter : Rich Ercolani <[email protected]>
Date : 2009-12-24 02:44 (39 days old)

2010-02-01 00:59:29

by Rafael J. Wysocki

Subject: [Bug #14943] nfs regression?

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14943
Subject : nfs regression?
Submitter : Nikola Ciprich <[email protected]>
Date : 2009-12-28 12:10 (35 days old)
References : http://marc.info/?l=linux-kernel&m=126200276223524&w=4

2010-02-01 00:59:26

by Rafael J. Wysocki

Subject: [Bug #14998] Caught 32-bit read from uninitialized memory in acpi_system_read_event -- 2.6.31 regression

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14998
Subject : Caught 32-bit read from uninitialized memory in acpi_system_read_event -- 2.6.31 regression
Submitter : Christian Casteyde <[email protected]>
Date : 2010-01-06 21:40 (26 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=eca6f534e61919b28fb21aafbd1c2983deae75be

2010-02-01 01:00:11

by Rafael J. Wysocki

Subject: [Bug #15127] Bluetooth: sleeping function called from invalid context

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
Subject : Bluetooth: sleeping function called from invalid context
Submitter : David John <[email protected]>
Date : 2010-01-12 9:19 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4

2010-02-01 01:00:07

by Rafael J. Wysocki

Subject: [Bug #15096] Resume lock up -- bisected, commit 3a1151e3f124fd1a2c54b8153f510f1a7c715369

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15096
Subject : Resume lock up -- bisected, commit 3a1151e3f124fd1a2c54b8153f510f1a7c715369
Submitter : Rafał Miłecki <[email protected]>
Date : 2010-01-20 23:15 (12 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3a1151e3f124fd1a2c54b8153f510f1a7c715369

2010-02-01 01:00:01

by Rafael J. Wysocki

Subject: [Bug #15071] IBM/Lenovo Trackpoint speed, sensitivity reset after suspend

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15071
Subject : IBM/Lenovo Trackpoint speed, sensitivity reset after suspend
Submitter : Marten Vance <[email protected]>
Date : 2010-01-16 16:19 (16 days old)

2010-02-01 01:00:45

by Rafael J. Wysocki

Subject: [Bug #15158] oops related to i915_gem_object_save_bit_17_swizzle

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15158
Subject : oops related to i915_gem_object_save_bit_17_swizzle
Submitter : Werner Lemberg <[email protected]>
Date : 2010-01-28 08:26 (4 days old)

2010-02-01 01:00:49

by Rafael J. Wysocki

Subject: [Bug #15197] padlock_sha1 and hmac broken?

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15197
Subject : padlock_sha1 and hmac broken?
Submitter : Wolfgang Walter <[email protected]>
Date : 2010-01-29 23:44 (3 days old)
References : http://marc.info/?l=linux-kernel&m=126480912924283&w=4
Handled-By : Herbert Xu <[email protected]>
Patch : http://patchwork.kernel.org/patch/75959/

2010-02-01 01:01:19

by Rafael J. Wysocki

Subject: [Bug #15193] kswapd continuously active

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15193
Subject : kswapd continuously active
Submitter : Jan Engelhardt <[email protected]>
Date : 2010-01-22 23 (10 days old)
References : http://marc.info/?l=linux-kernel&m=126420434519039&w=4
Handled-By : Jens Axboe <[email protected]>

2010-02-01 01:01:36

by Rafael J. Wysocki

Subject: [Bug #15108] Blank screen with KMS enabled (on clevo M5xN laptop)

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15108
Subject : Blank screen with KMS enabled (on clevo M5xN laptop)
Submitter : Jérémy Lal <[email protected]>
Date : 2010-01-22 20:30 (10 days old)

2010-02-01 00:59:58

by Rafael J. Wysocki

Subject: [Bug #15042] socket(PF_INET6 hangs when ipv6 not yet initialized

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15042
Subject : socket(PF_INET6 hangs when ipv6 not yet initialized
Submitter : Marc Haber <[email protected]>
Date : 2010-01-10 14:28 (22 days old)
References : http://marc.info/?l=linux-kernel&m=126313553029280&w=4

2010-02-01 01:00:43

by Rafael J. Wysocki

Subject: [Bug #15156] 2.6.32.6 hang at boot with ati x1600

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15156
Subject : 2.6.32.6 hang at boot with ati x1600
Submitter : Alexey Kuznetsov <[email protected]>
Date : 2010-01-28 05:02 (4 days old)

2010-02-01 01:01:39

by Rafael J. Wysocki

Subject: [Bug #15134] gobi_loader hangs after commit 8e8dce065088

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15134
Subject : gobi_loader hangs after commit 8e8dce065088
Submitter : Matthew Garrett <[email protected]>
Date : 2010-01-17 2:55 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126369696509502&w=4
Handled-By : Oliver Neukum <[email protected]>
Alan Cox <[email protected]>
Patch : http://patchwork.kernel.org/patch/73878/

2010-02-01 01:01:41

by Rafael J. Wysocki

Subject: [Bug #15135] Kernel 2.6.32.x hangs during boot process

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15135
Subject : Kernel 2.6.32.x hangs during boot process
Submitter : François Figarola <[email protected]>
Date : 2010-01-16 9:58 (16 days old)
References : http://marc.info/?l=linux-kernel&m=126363593817261&w=4
Handled-By : Jun'ichi Nomura <[email protected]>
Patch : http://patchwork.kernel.org/patch/75560/

2010-02-01 01:02:12

by Rafael J. Wysocki

Subject: [Bug #15100] X11 is black after resume from s2ram if my T400 was previous in docking station before

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15100
Subject : X11 is black after resume from s2ram if my T400 was previous in docking station before
Submitter : Toralf Förster <[email protected]>
Date : 2010-01-21 08:56 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c1c7af60892070e4b82ad63bbfb95ae745056de0

2010-02-01 01:02:29

by Rafael J. Wysocki

Subject: [Bug #15021] agpgart sometimes fails to initialize sometimes

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15021
Subject : agpgart sometimes fails to initialize sometimes
Submitter : Maciej Piechotka <[email protected]>
Date : 2010-01-09 23:31 (23 days old)

2010-02-01 01:02:40

by Rafael J. Wysocki

Subject: [Bug #15040] High cpu temperature with 2.6.32 - bisected to cpuidle menu update

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15040
Subject : High cpu temperature with 2.6.32 - bisected to cpuidle menu update
Submitter : Dimitrios Apostolou <[email protected]>
Date : 2010-01-06 17:39 (26 days old)
References : http://marc.info/?l=linux-kernel&m=126279952723036&w=4
Handled-By : Arjan van de Ven <[email protected]>
Patch : http://patchwork.kernel.org/patch/71962/

2010-02-01 00:59:15

by Rafael J. Wysocki

Subject: [Bug #14922] 2.6.32 seemed to have broken nVidia MCP7A sata controller

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14922
Subject : 2.6.32 seemed to have broken nVidia MCP7A sata controller
Submitter : Mike Cui <[email protected]>
Date : 2009-12-19 6:13 (44 days old)
References : http://marc.info/?l=linux-ide&m=126120323407742&w=4
Handled-By : Jeff Garzik <[email protected]>
Robert Hancock <[email protected]>

2010-02-01 01:02:56

by Rafael J. Wysocki

Subject: [Bug #15015] blank screen at random times in laptop when sitting idle

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15015
Subject : blank screen at random times in laptop when sitting idle
Submitter : Jithin Emmanuel <[email protected]>
Date : 2010-01-09 16:48 (23 days old)

2010-02-01 01:02:58

by Rafael J. Wysocki

Subject: [Bug #14997] Closing and re-opening the lid does not reactivate the backlight

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14997
Subject : Closing and re-opening the lid does not reactivate the backlight
Submitter : o. meijer <[email protected]>
Date : 2010-01-06 15:38 (26 days old)

2010-02-01 01:03:27

by Rafael J. Wysocki

Subject: [Bug #15004] i915: *ERROR* Execbuf while wedged

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15004
Subject : i915: *ERROR* Execbuf while wedged
Submitter : tomas m <[email protected]>
Date : 2010-01-07 18:53 (25 days old)

2010-02-01 01:03:40

by Rafael J. Wysocki

Subject: [Bug #14939] drm: random hang with i915

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14939
Subject : drm: random hang with i915
Submitter : Arnd Bergmann <[email protected]>
Date : 2009-12-07 17:30 (56 days old)
References : http://marc.info/?l=linux-kernel&m=126020704125723&w=4
Handled-By : Jesse Barnes <[email protected]>

2010-02-01 01:03:38

by Rafael J. Wysocki

Subject: [Bug #15000] Thinkpad dock button no longer works

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15000
Subject : Thinkpad dock button no longer works
Submitter : Paul Martin <[email protected]>
Date : 2010-01-07 02:11 (25 days old)

2010-02-01 01:04:10

by Rafael J. Wysocki

Subject: [Bug #14895] BUG in kernel 2.6.32 when using luks encrypted root and RAID0..

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14895
Subject : BUG in kernel 2.6.32 when using luks encrypted root and RAID0..
Submitter : r4 <[email protected]>
Date : 2009-12-03 18:24 (60 days old)
References : http://marc.info/?l=linux-kernel&m=125986664904751&w=4

2010-02-01 00:58:33

by Rafael J. Wysocki

Subject: [Bug #14783] Unhandled IRQ on Thinkpad R61i: "irq 16: nobody cared"

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14783
Subject : Unhandled IRQ on Thinkpad R61i: "irq 16: nobody cared"
Submitter : Stefan Zegenhagen <[email protected]>
Date : 2009-12-10 19:14 (53 days old)

2010-02-01 01:04:41

by Rafael J. Wysocki

Subject: [Bug #14782] Suspend hangs Lenovo SL300 after gdm login

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14782
Subject : Suspend hangs Lenovo SL300 after gdm login
Submitter : Gary Trakhman <[email protected]>
Date : 2009-12-10 18:53 (53 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=03ba3782e8dcc5b0e1efe440d33084f066e38cae

2010-02-01 01:04:39

by Rafael J. Wysocki

Subject: [Bug #14748] e1000e NIC not working after reboot

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14748
Subject : e1000e NIC not working after reboot
Submitter : Maciek Sitarz <[email protected]>
Date : 2009-12-06 13:04 (57 days old)

2010-02-01 00:58:18

by Rafael J. Wysocki

Subject: [Bug #14656] Oops at __rmqueue+0x98 with 2.6.32-rc6

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14656
Subject : Oops at __rmqueue+0x98 with 2.6.32-rc6
Submitter : Lucas C. Villa Real <[email protected]>
Date : 2009-11-19 3:48 (74 days old)
References : http://marc.info/?l=linux-kernel&m=125860255229092&w=4

2010-02-01 01:05:04

by Rafael J. Wysocki

Subject: [Bug #14742] 2.6.32 new menu idle governor causes very high CPU temp

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14742
Subject : 2.6.32 new menu idle governor causes very high CPU temp
Submitter : <[email protected]>
Date : 2009-12-05 17:24 (58 days old)

2010-02-01 00:58:16

by Rafael J. Wysocki

Subject: [Bug #14624] ath9k: BUG kmalloc-8192: Poison overwritten

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14624
Subject : ath9k: BUG kmalloc-8192: Poison overwritten
Submitter : Miles Lane <[email protected]>
Date : 2009-11-12 4:58 (81 days old)
References : http://marc.info/?l=linux-kernel&m=125800196520396&w=4

2010-02-01 00:58:13

by Rafael J. Wysocki

Subject: [Bug #14667] bisected 2.6.32 EC regression - Temperatures not correctly detected after suspend - Dell Studio XPS 16 laptop

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14667
Subject : bisected 2.6.32 EC regression - Temperatures not correctly detected after suspend - Dell Studio XPS 16 laptop
Submitter : Federico Chiacchiaretta <[email protected]>
Date : 2009-11-22 20:42 (71 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6a63b06f3c494cc87eade97f081300bda60acec7

2010-02-01 01:05:55

by Rafael J. Wysocki

Subject: [Bug #14695] regression in karmic thermal control

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14695
Subject : regression in karmic thermal control
Submitter : Bugie <[email protected]>
Date : 2009-11-26 08:45 (67 days old)

2010-02-01 01:06:13

by Rafael J. Wysocki

Subject: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
Submitter : Justin P. Mattock <[email protected]>
Date : 2009-10-23 16:45 (101 days old)
References : http://lkml.org/lkml/2009/10/23/252

2010-02-01 01:06:32

by Rafael J. Wysocki

Subject: [Bug #14442] resume after hibernate: /dev/sdb drops and returns as /dev/sde

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14442
Subject : resume after hibernate: /dev/sdb drops and returns as /dev/sde
Submitter : Duncan <[email protected]>
Date : 2009-10-20 01:52 (104 days old)

2010-02-01 01:06:29

by Rafael J. Wysocki

Subject: [Bug #14621] specjbb2005 and aim7 regression with 2.6.32-rc kernels

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14621
Subject : specjbb2005 and aim7 regression with 2.6.32-rc kernels
Submitter : Zhang, Yanmin <[email protected]>
Date : 2009-11-06 7:38 (87 days old)
References : http://marc.info/?l=linux-kernel&m=125749310413174&w=4

2010-02-01 01:06:27

by Rafael J. Wysocki

Subject: [Bug #14657] perf subsystem breakage in 2.6.32-rc7

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32. Please verify if it still should
be listed and let me know (either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14657
Subject : perf subsystem breakage in 2.6.32-rc7
Submitter : Arjan van de Ven <[email protected]>
Date : 2009-11-19 19:50 (74 days old)
References : http://marc.info/?l=linux-kernel&m=125866013419738&w=4

2010-02-01 01:07:35

by Marcel Holtmann

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

Hi Rafael,

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> Subject : Bluetooth: sleeping function called from invalid context
> Submitter : David John <[email protected]>
> Date : 2010-01-12 9:19 (20 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4

You have an outdated email address for Luiz; I have changed it to the
right one now.

I looked with him at the patch and I think this will fix it:

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index fc5ee32..2b50637 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long arg)
 	BT_DBG("session %p state %ld", s, s->state);
 
 	set_bit(RFCOMM_TIMED_OUT, &s->flags);
-	rfcomm_session_put(s);
 	rfcomm_schedule(RFCOMM_SCHED_TIMEO);
 }
 
@@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
 		if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
 			s->state = BT_DISCONN;
 			rfcomm_send_disc(s, 0);
+			rfcomm_session_put(s);
 			continue;
 		}

We need some extra testing on this with the actual hardware we did the
patch for. So this will take at least a few days before we get our hands
on it.
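
To make the idea behind the patch concrete, here is a small userspace model of it. This is only a sketch and not kernel code: struct session, session_put(), in_timer_context and violations are made-up stand-ins, and it assumes the problem is that dropping the last reference can tear the session down, which must not happen from the timer callback. After the change the callback only sets the flag, and the reference is dropped later by the session processing loop.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the RFCOMM session; not the kernel structure. */
struct session {
	int  refcnt;     /* models the session reference count */
	bool timed_out;  /* models the RFCOMM_TIMED_OUT flag bit */
	bool freed;      /* set once the last reference is dropped */
};

static bool in_timer_context;  /* true while a simulated timer callback runs */
static int  violations;        /* teardowns attempted from timer context */

/* Drop one reference; the last put tears the session down (may sleep). */
static void session_put(struct session *s)
{
	if (--s->refcnt == 0) {
		if (in_timer_context)
			violations++;  /* the "sleeping function" case */
		s->freed = true;
	}
}

/* Before the patch: the timer callback drops the reference itself. */
static void timeout_before_patch(struct session *s)
{
	in_timer_context = true;
	s->timed_out = true;
	session_put(s);
	in_timer_context = false;
}

/* After the patch: the timer callback only records the timeout. */
static void timeout_after_patch(struct session *s)
{
	in_timer_context = true;
	s->timed_out = true;
	in_timer_context = false;
}

/* Runs outside timer context: handle the timeout and drop the reference. */
static void process_session(struct session *s)
{
	if (s->timed_out) {
		s->timed_out = false;
		session_put(s);
	}
}
```

With a session holding a single reference, timeout_before_patch() frees it from the simulated timer context and bumps violations, while timeout_after_patch() followed by process_session() frees it with no violation.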

Regards

Marcel

2010-02-01 01:13:04

by Robert Hancock

Subject: Re: [Bug #14922] 2.6.32 seemed to have broken nVidia MCP7A sata controller

On Sun, Jan 31, 2010 at 6:43 PM, Rafael J. Wysocki <[email protected]> wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14922
> Subject : 2.6.32 seemed to have broken nVidia MCP7A sata controller
> Submitter : Mike Cui <[email protected]>
> Date : 2009-12-19 6:13 (44 days old)
> References : http://marc.info/?l=linux-ide&m=126120323407742&w=4
> Handled-By : Jeff Garzik <[email protected]>
> Robert Hancock <[email protected]>

Still outstanding. I posted a patch that should fix the problem,
waiting for feedback from the reporter.

2010-02-01 01:44:12

by Justin P. Mattock

Subject: Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

On 01/31/10 16:43, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
> Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
> Submitter : Justin P. Mattock<[email protected]>
> Date : 2009-10-23 16:45 (101 days old)
> References : http://lkml.org/lkml/2009/10/23/252
>
>
>


Yeah, still hitting this.
Looking at the issue, if I change:

@@ 260

if ((class == 0xffffffff))
continue;
to

if ((class == 0xffffffff || 0xffffffffffffffff))
continue;

I'm able to boot, but I don't have enough knowledge to know
what is really happening (or how to execute this).
I will continue looking at this
(hopefully I get somewhere on this).

Justin P. Mattock

2010-02-01 08:06:25

by Mike Galbraith

Subject: Re: [Bug #14621] specjbb2005 and aim7 regression with 2.6.32-rc kernels

On Mon, 2010-02-01 at 01:43 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).

Yes, it should remain open. The aim7 regression isn't reproducible here;
specjbb2005 is unknown, as it is not available to the general public.

-Mike

2010-02-01 09:31:51

by David John

Subject: Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
> Subject : i915: Commit 0e442c60 causes flickering
> Submitter : David John <[email protected]>
> Date : 2009-12-09 17:26 (54 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
> References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
> Handled-By : David John <[email protected]>
> Patch : http://patchwork.kernel.org/patch/75423/
>
>
>

Hi Rafael,

The patch fixing this has not been merged yet, so the bug should still
be listed.

Regards,
David.

2010-02-01 12:54:55

by Dan Carpenter

Subject: Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of regressions introduced between 2.6.31 and 2.6.32.
>>
>> The following bug entry is on the current list of known regressions
>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>> be listed and let me know (either way).
>>
>>
>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>> Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>> Submitter : Justin P. Mattock<[email protected]>
>> Date : 2009-10-23 16:45 (101 days old)
>> References : http://lkml.org/lkml/2009/10/23/252
>>
>>
>>
>
>
> yeah still hitting this.
> looking at the issue if I change:
>
> @@ 260
>
> if ((class == 0xffffffff))
> continue;
> to
>
> if ((class == 0xffffffff || 0xffffffffffffffff))
> continue;
>

Uh... 0xffffffffffffffff is always true, so basically that's the same as deleting the
if condition and always taking the continue, which means no controller is ever set up.
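
A minimal sketch of why the modified test is always true (the function names are made up for illustration; only the two conditions come from this thread). "||" is a logical OR, so the constant on the right is evaluated as a truth value of its own and is never compared against class:

```c
#include <stdbool.h>
#include <stdint.h>

/* The check as it stands in the driver: true only for the all-ones value. */
static bool original_check(uint32_t class)
{
	return class == 0xffffffff;
}

/*
 * The modified check. The right-hand operand of "||" is a non-zero
 * constant, which counts as true, so the result no longer depends on class.
 */
static bool modified_check(uint32_t class)
{
	return class == 0xffffffff || 0xffffffffffffffff;
}
```

For a real device, say a class/revision word of 0x0c001001, original_check() is false but modified_check() is true, so the loop body after the test would be skipped for every function.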

I've added the linux1394-devel people to the CC list.

Justin has found that his computer crashes when he boots with ohci1394_dma=early.

He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:

init_ohci1394_dma_on_all_controllers()
254 /* Poor man's PCI discovery, the only thing we can do at early boot */
255 for (num = 0; num < 32; num++) {
256 for (slot = 0; slot < 32; slot++) {
257 for (func = 0; func < 8; func++) {
258 u32 class = read_pci_config(num,slot,func,
259 PCI_CLASS_REVISION);
260 if ((class == 0xffffffff))
261 continue; /* No device at this func */

If he continues here then his system boots.

262
263 if (class>>8 != PCI_CLASS_SERIAL_FIREWIRE_OHCI)
264 continue; /* Not an OHCI-1394 device */
265
266 init_ohci1394_controller(num, slot, func);
267 break; /* Assume one controller per device */

This comment is not terribly clear btw. The code assumes one controller per slot.

268 }
269 }
270 }
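
To illustrate the remark about the break: it only leaves the innermost (func) loop, so at most one controller is handled per slot and scanning carries on with the next slot. Below is a self-contained sketch of the same loop shape against a made-up bus layout. fake_read_config(), count_ohci_controllers() and FAKE_CLASS_FIREWIRE_OHCI are hypothetical names for this illustration; the class value 0x0c0010 is what I believe PCI_CLASS_SERIAL_FIREWIRE_OHCI is defined as, so treat it as an assumption.

```c
#include <stdint.h>

/* Assumed to match PCI_CLASS_SERIAL_FIREWIRE_OHCI; illustration only. */
#define FAKE_CLASS_FIREWIRE_OHCI 0x0c0010u

/* Hypothetical stand-in for read_pci_config(): class/revision dword. */
static uint32_t fake_read_config(int bus, int slot, int func)
{
	/* Two OHCI functions in the same slot 0:3. */
	if (bus == 0 && slot == 3 && (func == 0 || func == 1))
		return (FAKE_CLASS_FIREWIRE_OHCI << 8) | 0x01u;
	/* Slot 2:5 has some other device at func 0 and OHCI at func 1. */
	if (bus == 2 && slot == 5 && func == 0)
		return 0x020000u << 8;
	if (bus == 2 && slot == 5 && func == 1)
		return FAKE_CLASS_FIREWIRE_OHCI << 8;
	return 0xffffffffu;  /* nothing present */
}

/* Same loop shape as the driver; counts where it would init a controller. */
static int count_ohci_controllers(void)
{
	int bus, slot, func, found = 0;

	for (bus = 0; bus < 32; bus++) {
		for (slot = 0; slot < 32; slot++) {
			for (func = 0; func < 8; func++) {
				uint32_t class = fake_read_config(bus, slot, func);

				if (class == 0xffffffffu)
					continue;  /* no device at this func */
				if (class >> 8 != FAKE_CLASS_FIREWIRE_OHCI)
					continue;  /* not an OHCI-1394 device */
				found++;
				break;  /* leaves the func loop only */
			}
		}
	}
	return found;
}
```

With this layout the count is 2, not 3: the second OHCI function in slot 0:3 is never looked at, while slot 2:5 is still reached after the break.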

regards,
dan carpenter


> I'm able to boot, but don't have enough knowledge to know
> what is really happening(or how to execute this).
> will continue looking at this
> (hopefully I get somewhere on this);
>
> Justin P. Mattock
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2010-02-01 15:47:47

by Thomas Backlund

Subject: Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

01.02.2010 02:43, Rafael J. Wysocki skrev:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
> Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
> Submitter : Alexander Clouter<[email protected]>
> Date : 2009-10-23 10:30 (101 days old)
> References : http://lkml.org/lkml/2009/10/23/50
>
>

Afaik this is the same issue as the one referenced here:

http://lkml.org/lkml/2010/1/28/292

The patch in the above thread should fix the issue.

--
Thomas

2010-02-01 17:39:29

by David John

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

On 02/01/2010 06:36 AM, Marcel Holtmann wrote:
> Hi Rafael,
>
>> This message has been generated automatically as a part of a report
>> of regressions introduced between 2.6.31 and 2.6.32.
>>
>> The following bug entry is on the current list of known regressions
>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>> be listed and let me know (either way).
>>
>>
>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>> Subject : Bluetooth: sleeping function called from invalid context
>> Submitter : David John <[email protected]>
>> Date : 2010-01-12 9:19 (20 days old)
>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>
> you have an outdated email from Luiz and I change it to the right one
> now.
>
> I looked with him at the patch and I think this will fix it:
>
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index fc5ee32..2b50637 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long arg)
> BT_DBG("session %p state %ld", s, s->state);
>
> set_bit(RFCOMM_TIMED_OUT, &s->flags);
> - rfcomm_session_put(s);
> rfcomm_schedule(RFCOMM_SCHED_TIMEO);
> }
>
> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
> if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
> s->state = BT_DISCONN;
> rfcomm_send_disc(s, 0);
> + rfcomm_session_put(s);
> continue;
> }
>
> We need some extra testing on this with the actual hardware we did the
> patch for. So this will take at least a few days before we get our hands
> on it.
>
> Regards
>
> Marcel
>
>
>

Hi Marcel,

FWIW, your patch fixes the issue.

Regards,
David.

2010-02-01 17:56:06

by Justin P. Mattock

Subject: Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

On 02/01/10 04:54, Dan Carpenter wrote:
> On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
>> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>>> This message has been generated automatically as a part of a report
>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>
>>> The following bug entry is on the current list of known regressions
>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>> be listed and let me know (either way).
>>>
>>>
>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>>> Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>>> Submitter : Justin P. Mattock<[email protected]>
>>> Date : 2009-10-23 16:45 (101 days old)
>>> References : http://lkml.org/lkml/2009/10/23/252
>>>
>>>
>>>
>>
>>
>> yeah still hitting this.
>> looking at the issue if I change:
>>
>> @@ 260
>>
>> if ((class == 0xffffffff))
>> continue;
>> to
>>
>> if ((class == 0xffffffff || 0xffffffffffffffff))
>> continue;
>>
>
> Uh... 0xffffffffffffffff is always true so basically that's the same as deleting the
> if condition.
>
> I've added the linux1394-devel people to the CC list.
>
> Justin has found an issue that when he boots with: ohci1394_dma=early his computer
> crashes.
>
> He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:
>
> init_ohci1394_dma_on_all_controllers()
> 254 /* Poor man's PCI discovery, the only thing we can do at early boot */
> 255 for (num = 0; num< 32; num++) {
> 256 for (slot = 0; slot< 32; slot++) {
> 257 for (func = 0; func< 8; func++) {
> 258 u32 class = read_pci_config(num,slot,func,
> 259 PCI_CLASS_REVISION);
> 260 if ((class == 0xffffffff))
> 261 continue; /* No device at this func */
>
> If he continues here then his system boots.
>
> 262
> 263 if (class>>8 != PCI_CLASS_SERIAL_FIREWIRE_OHCI)
> 264 continue; /* Not an OHCI-1394 device */
> 265
> 266 init_ohci1394_controller(num, slot, func);
> 267 break; /* Assume one controller per device */
>
> This comment is not terribly clear btw. The code assumes one controller per slot.
>
> 268 }
> 269 }
> 270 }
>
> regards,
> dan carpenter
>
>
>> I'm able to boot, but don't have enough knowledge to know
>> what is really happening(or how to execute this).
>> will continue looking at this
>> (hopefully I get somewhere on this);
>>
>> Justin P. Mattock
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>

yeah I'll admit it, I don't know what I'm doing
(but am willing to try).

Thanks for the response, I'll try and
give as much info on this as possible.

Justin P. Mattock

2010-02-01 19:15:09

by Marcel Holtmann

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

Hi David,

> >> This message has been generated automatically as a part of a report
> >> of regressions introduced between 2.6.31 and 2.6.32.
> >>
> >> The following bug entry is on the current list of known regressions
> >> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> >> be listed and let me know (either way).
> >>
> >>
> >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> >> Subject : Bluetooth: sleeping function called from invalid context
> >> Submitter : David John <[email protected]>
> >> Date : 2010-01-12 9:19 (20 days old)
> >> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> >> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
> >
> > you have an outdated email from Luiz and I change it to the right one
> > now.
> >
> > I looked with him at the patch and I think this will fix it:
> >
> > diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> > index fc5ee32..2b50637 100644
> > --- a/net/bluetooth/rfcomm/core.c
> > +++ b/net/bluetooth/rfcomm/core.c
> > @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
> > arg)
> > BT_DBG("session %p state %ld", s, s->state);
> >
> > set_bit(RFCOMM_TIMED_OUT, &s->flags);
> > - rfcomm_session_put(s);
> > rfcomm_schedule(RFCOMM_SCHED_TIMEO);
> > }
> >
> > @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
> > if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
> > s->state = BT_DISCONN;
> > rfcomm_send_disc(s, 0);
> > + rfcomm_session_put(s);
> > continue;
> > }
> >
> > We need some extra testing on this with the actual hardware we did the
> > patch for. So this will take at least a few days before we get our hands
> > on it.
>
> FWIW, your patch fixes the issue.

nice. So I can add a tested-by line to the final patch?

Just out of curiosity, which hardware did you test this with? We only
know about one headset that should cause this issue.

Regards

Marcel

2010-02-01 19:58:33

by Stefan Richter

Subject: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Justin P. Mattock wrote:
> On 02/01/10 04:54, Dan Carpenter wrote:
>> On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
>>> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>>>> This message has been generated automatically as a part of a report
>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>
>>>> The following bug entry is on the current list of known regressions
>>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>>> be listed and let me know (either way).
>>>>
>>>>
>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>>>> Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>>>> Submitter : Justin P. Mattock<[email protected]>
>>>> Date : 2009-10-23 16:45 (101 days old)
>>>> References : http://lkml.org/lkml/2009/10/23/252
[...]
>>> yeah still hitting this.
[...]
>> I've added the linux1394-devel people to the CC list.

Thanks. Alas the original author is MIA, and the bug seems to be tied
to the early platform setup code (rather than OHCI 1394 device specific
code) about which I for one am clueless.

The listed MAINTAINERS contact of init_ohci1394_dma.c is linux1394-devel
and me, but a good deal of this driver is very x86 platform specific.
(There was some interest in making it useful for other architectures, but
this would merely mean that the respective architecture people need to
keep an eye on their parts of this driver.)

>> Justin has found an issue that when he boots with: ohci1394_dma=early
>> his computer
>> crashes.
>>
>> He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:
[...]

This modification and some others in the LKML thread from October simply
cause init_ohci1394_controller() to be skipped for all devices.

init_ohci1394_controller() is simple enough:

static inline void __init init_ohci1394_controller(int num, int slot,
int func)
{
unsigned long ohci_base;
struct ti_ohci ohci;

printk(KERN_INFO "init_ohci1394_dma: initializing OHCI-1394"
" at %02x:%02x.%x\n", num, slot, func);

ohci_base = read_pci_config(num, slot, func,
PCI_BASE_ADDRESS_0+(0<<2)) & PCI_BASE_ADDRESS_MEM_MASK;

set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);

init_ohci1394_reset_and_init_dma(&ohci);
}

Justin, you already established that read_pci_config is not the point
where it crashes, right?

set_fixmap_nocache() and fix_to_virt() frighten me because I don't know
what they do. :-)

The rest, init_ohci1394_reset_and_init_dma(), is something which I can
easily follow. There is just a bunch of register reads and writes with
occasional mdelays. This /could/ be a cause of the crash too if the
controller is inspired to do something dangerous in there --- meaning,
if the OHCI 1394 controller starts to write something per DMA into
memory. However, we do not switch on any DMA context except for the
so-called physical DMA unit which only springs into action if a remote
FireWire-attached console instructs it to do so.

I am noticing one point where init_ohci1394_dma.c violates the OHCI 1394
specification: OHCI1394_HCControl_linkEnable is switched on while the
OHCI1394_ConfigROMmap register is still invalid. This register needs to
contain a physical address of a 1kB sized, 1kB aligned memory region
which allows DMA_TO_DEVICE. So, since this is a read-only DMA, I am
tempted to say that this potential issue should not be a cause for a
kernel crash.
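
As a rough illustration of the constraint on OHCI1394_ConfigROMmap, here
is a minimal user-space sketch of such a check. It is not code from
init_ohci1394_dma.c: the helper name config_rom_addr_ok() is hypothetical,
and the 32 bit limit is an assumption based on the register being 32 bits
wide.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONFIG_ROM_SIZE 1024u	/* 1 kB region, as described above */

/* Hypothetical helper: could 'phys' be programmed into ConfigROMmap? */
static bool config_rom_addr_ok(uint64_t phys)
{
	/* 1 kB alignment means the low 10 address bits must be zero. */
	if (phys & (CONFIG_ROM_SIZE - 1))
		return false;

	/* Assumption: the whole region has to be reachable with 32 bits. */
	return phys + CONFIG_ROM_SIZE - 1 <= UINT32_MAX;
}
```

An address such as 0x7f000400 would pass this check, while 0x7f000401
(misaligned) or anything at or above 4 GB would not.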

(Side note, the OHCI 1394 spec is freely available, see
http://ieee1394.wiki.kernel.org/index.php/Specifications#OHCI_Release_1.1.2C_January_6.2C_2000
)


Justin Mattock wrote on 2009-10-27 in http://lkml.org/lkml/2009/10/27/335:
> o.k. you should be able to view
> this:(let me know if you can't and I can
> manually write out, and in time find a public
> photo sharing suite to make things easier).
>
> http://www.flickr.com/photos/44066293@N08/4050317695
>
> When this happens I see lots of messages from the print
> during boot, then this happens.

(Now that a bugzilla.kernel.org ticket exists for this you can also use
bugzilla.kernel.org to publish screenshots if you have an account there.)

This screenshot looks like ___alloc_bootmem_node is the issue here, or
am I mistaken about what the order of functions in the backtrace means?
--
Stefan Richter
-=====-==-=- --=- ----=
http://arcgraph.de/sr/

2010-02-01 20:58:12

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/01/10 11:57, Stefan Richter wrote:
> Justin P. Mattock wrote:
>> On 02/01/10 04:54, Dan Carpenter wrote:
>>> On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
>>>> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>>>>> This message has been generated automatically as a part of a report
>>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>>
>>>>> The following bug entry is on the current list of known regressions
>>>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>>>> be listed and let me know (either way).
>>>>>
>>>>>
>>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>>>>> Subject : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>>>>> Submitter : Justin P. Mattock<[email protected]>
>>>>> Date : 2009-10-23 16:45 (101 days old)
>>>>> References : http://lkml.org/lkml/2009/10/23/252
> [...]
>>>> yeah still hitting this.
> [...]
>>> I've added the linux1394-devel people to the CC list.
>
> Thanks. Alas the original author is MIA, and the bug seems to be tied
> to the early platform setup code (rather than OHCI 1394 device specific
> code) about which I for one am clueless.
>
> The listed MAINTAINERS contact of init_ohci1394_dma.c is linux1394-devel
> and me, but a good deal of this driver is very x86 platform specific.
> (There was some interest in making it useful for other architectures, but
> this would merely mean that the respective architecture people need to
> keep an eye on their parts of this driver.)
>
>>> Justin has found an issue that when he boots with: ohci1394_dma=early
>>> his computer
>>> crashes.
>>>
>>> He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:
> [...]
>
> This modification and some others in the LKML thread from October simply
> cause init_ohci1394_controller() to be skipped for all devices.
>
> init_ohci1394_controller() is simple enough:
>
> static inline void __init init_ohci1394_controller(int num, int slot,
> int func)
> {
> unsigned long ohci_base;
> struct ti_ohci ohci;
>
> printk(KERN_INFO "init_ohci1394_dma: initializing OHCI-1394"
> " at %02x:%02x.%x\n", num, slot, func);
>
> ohci_base = read_pci_config(num, slot, func,
> PCI_BASE_ADDRESS_0+(0<<2))& PCI_BASE_ADDRESS_MEM_MASK;
>
> set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);
>
> ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);
>
> init_ohci1394_reset_and_init_dma(&ohci);
> }
>
> Justin, you already established that read_pci_config is not the point
> where it crashes, right?
>
> set_fixmap_nocache() and fix_to_virt() frighten me because I don't know
> what they do. :-)
>
> The rest, init_ohci1394_reset_and_init_dma(), is something which I can
> easily follow. There is just a bunch of register reads and writes with
> occasional mdelays. This /could/ be a cause of the crash too if the
> controller is inspired to do something dangerous in there --- meaning,
> if the OHCI 1394 controller starts to write something per DMA into
> memory. However, we do not switch on any DMA context except for the
> so-called physical DMA unit which only springs into action if a remote
> FireWire-attached console instructs it to do so.
>
> I am noticing one point where init_ohci1394_dma.c violates the OHCI 1394
> specification: OHCI1394_HCControl_linkEnable is switched on while the
> OHCI1394_ConfigROMmap register is still invalid. This register needs to
> contain a physical address of a 1kB sized, 1kB aligned memory region
> which allows DMA_TO_DEVICE. So, since this is a read-only DMA, I am
> tempted to say that this potential issue should not be a cause for a
> kernel crash.
>
> (Side note, the OHCI 1394 spec is freely available, see
> http://ieee1394.wiki.kernel.org/index.php/Specifications#OHCI_Release_1.1.2C_January_6.2C_2000
> )
>
>
> Justin Mattock wrote on 2009-10-27 in http://lkml.org/lkml/2009/10/27/335:
>> o.k. you should be able to view
>> this:(let me know if you can't and I can
>> manually write out, and in time find a public
>> photo sharing suite to make things easier).
>>
>> http://www.flickr.com/photos/44066293@N08/4050317695
>>
>> When this happens I see lots of messages from the print
>> during boot, then this happens.
>
> (Now that a bugzilla.kernel.org ticket exists for this you can also use
> bugzilla.kernel.org to publish screenshots if you have an account there.)
>
> This screenshot looks like ___alloc_bootmem_node is the issue here, or
> am I mistaken of what the order of functions in the backtrace means?


cool, thanks for the assistance and info on this.
(I'll have to read through the specification for ohci1394);

as for __alloc_bootmem_node I have not looked into that yet.
(I can read up on this today).

what I was looking at was:
set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

which led me to arch/x86/include/asm/fixmap.h
leading me to believe I was hitting something with
FIXADDR_TOP because the system is a pure64.
(reading through fixmap.h there is mention that
vsyscall only covers 32bit making me think this might
be it).

and also:

init_ohci1394_reset_and_init_dma(&ohci);
(on the bug report I have a temporary patch
which gets me up and running to do early debugging;
there you will see both calls are commented out).

(as for yesterday's 0xffffffffffffffff (just experimenting): Google gives me
no info on the difference between 8 f's and 16 f's; I was under the
impression that it's x86_32 vs. x86_64 for the PCI address).

as for bugzilla.kernel.org, I'll have to set up an
account there (flickr is nice, but having a bug report
photo next to pics of my vacation isn't);

In general I'm thinking this has to do with the arch (but could be wrong),
because one LFS system I built was x86_32, which worked fine, and
the next is a pure64, which triggers this.

Thanks for the info/help.

Justin P. Mattock

2010-02-01 21:47:06

by Nikola Ciprich

Subject: Re: [Bug #14943] nfs regression?

Hi Rafael,
Sorry, I haven't had time to test newer kernels lately :(, but according to
the changelogs, no related problems were fixed up to 2.6.32.7...
I'll update the problematic machine on Wednesday though...
regards
nik
On Mon, Feb 01, 2010 at 01:43:18AM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14943
> Subject : nfs regression?
> Submitter : Nikola Ciprich <[email protected]>
> Date : 2009-12-28 12:10 (35 days old)
> References : http://marc.info/?l=linux-kernel&m=126200276223524&w=4
>
>
>

--
-------------------------------------
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.: +420 596 603 142
fax: +420 596 621 273
mobil: +420 777 093 799

http://www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: [email protected]
-------------------------------------

2010-02-01 22:00:53

by Luiz Augusto von Dentz

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

Hi,

On Mon, Feb 1, 2010 at 11:14 AM, Marcel Holtmann <[email protected]> wrote:
> Hi David,
>
>> >> This message has been generated automatically as a part of a report
>> >> of regressions introduced between 2.6.31 and 2.6.32.
>> >>
>> >> The following bug entry is on the current list of known regressions
>> >> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>> >> be listed and let me know (either way).
>> >>
>> >>
>> >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>> >> Subject : Bluetooth: sleeping function called from invalid context
>> >> Submitter : David John <[email protected]>
>> >> Date : 2010-01-12 9:19 (20 days old)
>> >> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>> >> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>> >
>> > you have an outdated email from Luiz and I change it to the right one
>> > now.
>> >
>> > I looked with him at the patch and I think this will fix it:
>> >
>> > diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>> > index fc5ee32..2b50637 100644
>> > --- a/net/bluetooth/rfcomm/core.c
>> > +++ b/net/bluetooth/rfcomm/core.c
>> > @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
>> > arg)
>> >     BT_DBG("session %p state %ld", s, s->state);
>> >
>> >     set_bit(RFCOMM_TIMED_OUT, &s->flags);
>> > -   rfcomm_session_put(s);
>> >     rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>> >  }
>> >
>> > @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>> >             if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>> >                     s->state = BT_DISCONN;
>> >                     rfcomm_send_disc(s, 0);
>> > +                   rfcomm_session_put(s);
>> >                     continue;
>> >             }
>> >
>> > We need some extra testing on this with the actual hardware we did the
>> > patch for. So this will take at least a few days before we get our hands
>> > on it.
>>
>> FWIW, your patch fixes the issue.
>
> nice. So I can add a tested-by line to the final patch?
>
> Just out of curiosity, which hardware did you test this with. We only
> know about one headset that should cause this issue.
>

Just in case, here is the hcidump of the Nokia HS-12W, the one that
has the problem when connection authorization is denied:

> ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0041 len 4 [psm 3]
RFCOMM(s): SABM: cr 1 dlci 26 pf 1 ilen 0 fcs 0xe7
< ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn req: dcid 0x0042 scid 0x0040
< ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0044 len 4 [psm 3]
RFCOMM(s): DM: cr 1 dlci 26 pf 1 ilen 0 fcs 0xcd
> HCI Event: Number of Completed Packets (0x13) plen 5
> ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn rsp: dcid 0x0042 scid 0x0040
< ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0044 len 4 [psm 3]
RFCOMM(s): DISC: cr 0 dlci 0 pf 1 ilen 0 fcs 0x9c
> ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0041 len 4 [psm 3]
RFCOMM(s): UA: cr 0 dlci 0 pf 1 ilen 0 fcs 0xb6
< ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn req: dcid 0x0044 scid 0x0041
> HCI Event: Number of Completed Packets (0x13) plen 5
> ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn rsp: dcid 0x0044 scid 0x0041
< HCI Command: Disconnect (0x01|0x0006) plen 3
> HCI Event: Command Status (0x0f) plen 4
> HCI Event: Disconn Complete (0x05) plen 4

So this means the patch works. DISC 0 is sent from our side (due to
the session timeout) when normally it should be the other end that
disconnects right away when we respond with DM.

--
Luiz Augusto von Dentz
Engenheiro de Computação

2010-02-01 22:27:39

by Stefan Richter

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Justin P. Mattock wrote:
> (as for yesterdays 0xffffffffffffffff(just experimenting)Google gives me
> no info on the differences between 8f's to 16f's, I was under the
> impression that it's x86_32 and x86_64 for the pci address).

As Dan noted,
(class == 0xffffffff || 0xffffffffffffffff)
is always true because it is logically the same as
(class == whatever) || true

If you really meant
class == 0xffffffff || class == 0xffffffffffffffff
then the latter half will never become true because class is declared as
u32 and got its value from read_pci_config() which also returns u32.
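
To make the difference concrete, here is a small user-space sketch of the
three variants. This is not the driver code: the function names are made
up for illustration, the parameter is called class_reg rather than class,
and the 64 bit constant is kept in a named variable only to keep compilers
quiet about constant operands.

```c
#include <stdbool.h>
#include <stdint.h>

static const uint64_t ALL_ONES_64 = 0xffffffffffffffffULL;

/* The check as it stands in the driver: true only for "no device". */
static bool empty_original(uint32_t class_reg)
{
	return class_reg == 0xffffffffu;
}

/* The experiment: the right-hand operand is nonzero, so this is always true. */
static bool empty_experiment(uint32_t class_reg)
{
	return class_reg == 0xffffffffu || ALL_ONES_64;
}

/* What was probably meant: a u32 can never equal the 64 bit constant. */
static bool empty_intended(uint32_t class_reg)
{
	return class_reg == 0xffffffffu || class_reg == ALL_ONES_64;
}
```

With these definitions, empty_experiment() returns true for every input,
while empty_intended() behaves exactly like empty_original().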

BTW, whether a PCI device is capable of accessing 32 bit bus addresses
or also 64 bit bus addresses depends on the device, not on the CPU.
Originally, PCI only had a 32 bit addressing model. OHCI 1394 1.0/1.1
implementations only deal with 32 bit local bus addresses.

The 'class' however is not an address but merely a register value with
24 bits width. (Defined in the PCI Local Bus spec which is not freely
available, cited in OHCI 1394 annex A.3.) This register is read as a 32
bits wide register from which the excess byte is later discarded. If
all bits read are 1, the bus:slot:function is not actually populated.
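
A minimal user-space sketch of that decoding, for illustration only: the
helper names are hypothetical, and CLASS_FIREWIRE_OHCI is assumed to have
the same value as PCI_CLASS_SERIAL_FIREWIRE_OHCI in linux/pci_ids.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match PCI_CLASS_SERIAL_FIREWIRE_OHCI from linux/pci_ids.h. */
#define CLASS_FIREWIRE_OHCI 0x0c0010u

/* All bits set: nothing is present at this bus:slot:function. */
static bool func_is_empty(uint32_t class_rev)
{
	return class_rev == 0xffffffffu;
}

/* Drop the low byte (revision ID); the upper 24 bits are the class code. */
static uint32_t class_code(uint32_t class_rev)
{
	return class_rev >> 8;
}

static bool is_ohci1394(uint32_t class_rev)
{
	return !func_is_empty(class_rev) &&
	       class_code(class_rev) == CLASS_FIREWIRE_OHCI;
}
```

For example, a register value of 0x0c001005 would be recognized as an
OHCI-1394 controller with revision 0x05, whereas 0xffffffff is skipped as
an unpopulated function.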
--
Stefan Richter
-=====-==-=- --=- ----=
http://arcgraph.de/sr/

2010-02-01 23:50:27

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/01/10 14:27, Stefan Richter wrote:
> Justin P. Mattock wrote:
>> (as for yesterdays 0xffffffffffffffff(just experimenting)Google gives me
>> no info on the differences between 8f's to 16f's, I was under the
>> impression that it's x86_32 and x86_64 for the pci address).
>
> As Dan noted,
> (class == 0xffffffff || 0xffffffffffffffff)
> is always true because it is logically the same as
> (class == whatever) || true
>
> If you really meant
> class == 0xffffffff || class == 0xffffffffffffffff

yeah that's what I was going for(just to see).

> then the latter half will never become true because class is declared as
> u32 and got its value from read_pci_config() which also returns u32.
>

That's what I was afraid of. I'm guessing there would probably be a lot
of things to change for u64 (if this is correct).

> BTW, whether a PCI device is capable of accessing 32 bit bus addresses
> or also 64 bit bus addresses depends on the device, not on the CPU.
> Originally, PCI only had a 32 bit addressing model. OHCI 1394 1.0/1.1
> implementations only deal with 32 bit local bus addresses.
>
I haven't even looked at what the device was capable of doing.


> The 'class' however is not an address but merely a register value with
> 24 bits width. (Defined in the PCI Local Bus spec which is not freely
> available, cited in OHCI 1394 annex A.3.) This register is read as a 32
> bits wide register from which the excess byte is later discarded. If
> all bits read are 1, the bus:slot:function is not actually populated.

So (correct me if I'm wrong), I'm generating a 64-bit register
and the kernel is looking for a 32-bit register, causing the crash?


Justin P. Mattock

2010-02-02 05:17:34

by David John

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

On 02/02/2010 12:44 AM, Marcel Holtmann wrote:
> Hi David,
>
>>>> This message has been generated automatically as a part of a report
>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>
>>>> The following bug entry is on the current list of known regressions
>>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>>> be listed and let me know (either way).
>>>>
>>>>
>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>>>> Subject : Bluetooth: sleeping function called from invalid context
>>>> Submitter : David John <[email protected]>
>>>> Date : 2010-01-12 9:19 (20 days old)
>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>>>> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>>>
>>> you have an outdated email from Luiz and I change it to the right one
>>> now.
>>>
>>> I looked with him at the patch and I think this will fix it:
>>>
>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>> index fc5ee32..2b50637 100644
>>> --- a/net/bluetooth/rfcomm/core.c
>>> +++ b/net/bluetooth/rfcomm/core.c
>>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
>>> arg)
>>> BT_DBG("session %p state %ld", s, s->state);
>>>
>>> set_bit(RFCOMM_TIMED_OUT, &s->flags);
>>> - rfcomm_session_put(s);
>>> rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>>> }
>>>
>>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>>> if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>>> s->state = BT_DISCONN;
>>> rfcomm_send_disc(s, 0);
>>> + rfcomm_session_put(s);
>>> continue;
>>> }
>>>
>>> We need some extra testing on this with the actual hardware we did the
>>> patch for. So this will take at least a few days before we get our hands
>>> on it.
>>
>> FWIW, your patch fixes the issue.
>
> nice. So I can add a tested-by line to the final patch?

Sure,

Tested-by: David John <[email protected]>

>
> Just out of curiosity, which hardware did you test this with.

I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
(413c:8160). I can send more info about the device if you want.

> We only know about one headset that should cause this issue.

That's weird. I assumed it would happen for any device, since
rfcomm_session_add is called from multiple places and it adds
rfcomm_session_timeout on a timer which will cause the trace
if the timer fires.

I could be wrong though.

Regards,
David.

>
> Regards
>
> Marcel
>
>
>

2010-02-02 05:42:18

by Marcel Holtmann

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

Hi David,

> >>>> This message has been generated automatically as a part of a report
> >>>> of regressions introduced between 2.6.31 and 2.6.32.
> >>>>
> >>>> The following bug entry is on the current list of known regressions
> >>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> >>>> be listed and let me know (either way).
> >>>>
> >>>>
> >>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> >>>> Subject : Bluetooth: sleeping function called from invalid context
> >>>> Submitter : David John <[email protected]>
> >>>> Date : 2010-01-12 9:19 (20 days old)
> >>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> >>>> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
> >>>
> >>> you have an outdated email from Luiz and I change it to the right one
> >>> now.
> >>>
> >>> I looked with him at the patch and I think this will fix it:
> >>>
> >>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> >>> index fc5ee32..2b50637 100644
> >>> --- a/net/bluetooth/rfcomm/core.c
> >>> +++ b/net/bluetooth/rfcomm/core.c
> >>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
> >>> arg)
> >>> BT_DBG("session %p state %ld", s, s->state);
> >>>
> >>> set_bit(RFCOMM_TIMED_OUT, &s->flags);
> >>> - rfcomm_session_put(s);
> >>> rfcomm_schedule(RFCOMM_SCHED_TIMEO);
> >>> }
> >>>
> >>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
> >>> if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
> >>> s->state = BT_DISCONN;
> >>> rfcomm_send_disc(s, 0);
> >>> + rfcomm_session_put(s);
> >>> continue;
> >>> }
> >>>
> >>> We need some extra testing on this with the actual hardware we did the
> >>> patch for. So this will take at least a few days before we get our hands
> >>> on it.
> >>
> >> FWIW, your patch fixes the issue.
> >
> > nice. So I can add a tested-by line to the final patch?
>
> Sure,
>
> Tested-by: David John <[email protected]>
>
> >
> > Just out of curiosity, which hardware did you test this with.
>
> I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
> (413c:8160). I can send more info about the device if you want.

I meant which device you are connecting to. Is it a headset or another
computer?

> > We only know about one headset that should cause this issue.
>
> That's weird. I assumed it would happen for any device, since
> rfcomm_session_add is called from multiple places and it adds
> rfcomm_session_timeout on a timer which will cause the trace
> if the timer fires.

The timer will only fire for non-behaving remote stacks. With a proper
stack following the RFCOMM specification it should never fire.
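
For readers following along, the idea behind the patch quoted above can be
modelled in user space roughly as follows. This is only a sketch under
assumptions: the struct and function names are invented, there is no real
timer or locking here, and it assumes that dropping the last reference is
the step that may sleep and therefore must not run from the timer
callback.

```c
#include <stdbool.h>

enum { SESSION_TIMED_OUT = 1u << 0 };

struct session {
	unsigned int flags;
	int refcnt;
	bool released;	/* stands in for the teardown that may sleep */
};

/* Drop one reference; the final put triggers the teardown. */
static void session_put(struct session *s)
{
	if (--s->refcnt == 0)
		s->released = true;
}

/* Timer callback (atomic context): only record that the timeout fired. */
static void session_timeout(struct session *s)
{
	s->flags |= SESSION_TIMED_OUT;
}

/* Worker (process context): act on the flag and drop the reference here. */
static bool process_session(struct session *s)
{
	if (!(s->flags & SESSION_TIMED_OUT))
		return false;
	s->flags &= ~SESSION_TIMED_OUT;
	session_put(s);
	return true;
}
```

In this model the timeout handler never touches the reference count; the
put happens only once the worker notices the flag.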

Regards

Marcel

2010-02-02 05:46:14

by Stefan Richter

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Justin P. Mattock wrote:
> So(correct me if I'm wrong), I'm generating a 64 bit register
> and the kernel is looking for a 32 bit register causing the crash.

No, the class = read_pci_config(); if (class == ...) ... parts of the
code are entirely innocent as far as I can tell. This is just the
FireWire--PCI chip detection. It is the subsequent driver setup for the
chip that crashes somewhere.

When you modified that chip detection code earlier, you only prevented
crashes when your modifications ended up as "ignore all PCI devices,
also FireWire ones" == "do nothing at all".

Perhaps the bootup sequence of the x86(-64) platform was changed from
2.6.31 to .32 such that some assumptions in init_ohci1394_dma about which
resources are available when are no longer true. According to your
screenshot in http://lkml.org/lkml/2009/10/27/335 the issue is about
memory allocation, not about PCI bus access.
--
Stefan Richter
-=====-==-=- --=- ---=-
http://arcgraph.de/sr/

2010-02-02 05:56:50

by David John

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

On 02/02/2010 11:11 AM, Marcel Holtmann wrote:
> Hi David,
>
>>>>>> This message has been generated automatically as a part of a report
>>>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>>>
>>>>>> The following bug entry is on the current list of known regressions
>>>>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>>>>> be listed and let me know (either way).
>>>>>>
>>>>>>
>>>>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>>>>>> Subject : Bluetooth: sleeping function called from invalid context
>>>>>> Submitter : David John <[email protected]>
>>>>>> Date : 2010-01-12 9:19 (20 days old)
>>>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>>>>>> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>>>>>
>>>>> you have an outdated email from Luiz and I change it to the right one
>>>>> now.
>>>>>
>>>>> I looked with him at the patch and I think this will fix it:
>>>>>
>>>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>>>> index fc5ee32..2b50637 100644
>>>>> --- a/net/bluetooth/rfcomm/core.c
>>>>> +++ b/net/bluetooth/rfcomm/core.c
>>>>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
>>>>> arg)
>>>>> BT_DBG("session %p state %ld", s, s->state);
>>>>>
>>>>> set_bit(RFCOMM_TIMED_OUT, &s->flags);
>>>>> - rfcomm_session_put(s);
>>>>> rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>>>>> }
>>>>>
>>>>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>>>>> if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>>>>> s->state = BT_DISCONN;
>>>>> rfcomm_send_disc(s, 0);
>>>>> + rfcomm_session_put(s);
>>>>> continue;
>>>>> }
>>>>>
>>>>> We need some extra testing on this with the actual hardware we did the
>>>>> patch for. So this will take at least a few days before we get our hands
>>>>> on it.
>>>>
>>>> FWIW, your patch fixes the issue.
>>>
>>> nice. So I can add a tested-by line to the final patch?
>>
>> Sure,
>>
>> Tested-by: David John <[email protected]>
>>
>>>
>>> Just out of curiosity, which hardware did you test this with?
>>
>> I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
>> (413c:8160). I can send more info about the device if you want.
>
> I meant which device you are connecting to. Is it a headset or another
> computer?
>
>>> We only know about one headset that should cause this issue.
>>
>> That's weird. I assumed it would happen for any device, since
>> rfcomm_session_add is called from multiple places and it adds
>> rfcomm_session_timeout on a timer which will cause the trace
>> if the timer fires.
>
> The timer will only fire for non-behaving remote stacks. With a proper
> stack following the RFCOMM specification it should never fire.
>
> Regards
>
> Marcel
>
>
>

Ah. It's a Sony Ericsson W800i phone. I noticed a new problem while
testing yesterday: Transferring a file to the phone seems to happen
correctly, but at the end of the transfer, the phone reports that the
connection was lost and I get this in the log:

btusb_bulk_complete: hci0 urb ffff88007a5b59c0 failed to resubmit (19)
btusb_bulk_complete: hci0 urb ffff880077a200c0 failed to resubmit (19)
btusb_intr_complete: hci0 urb ffff88007a5b5780 failed to resubmit (19)
btusb_send_frame: hci0 urb ffff88004db809c0 submission failed

To remove btusb, I have to shut down the laptop's Bluetooth. I'll check and
see if I can reproduce and track down the issue. Note that the phone was
working okay before 2.6.32.

Regards,
David.
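
The patch quoted above moves the rfcomm_session_put() call out of the timer
callback and into rfcomm_process_sessions(). A minimal userspace sketch of
that pattern follows, assuming (the thread does not confirm this) that the
armed timer holds one reference on the session and that dropping the last
reference can sleep. The struct and function names are hypothetical
stand-ins, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rfcomm_session; not the kernel type. */
struct session {
	int refcnt;          /* references held on the session */
	bool timed_out;      /* models the RFCOMM_TIMED_OUT flag */
	bool disconnecting;  /* models s->state == BT_DISCONN */
};

/*
 * Models the timer callback after the patch. It runs in atomic context,
 * so it only records the event and leaves the reference alone.
 */
static void session_timeout(struct session *s)
{
	s->timed_out = true;
	/* the real code calls rfcomm_schedule() here to wake the worker */
}

/*
 * Models the worker loop, which runs in process context where sleeping
 * is allowed. Returns true if a timeout was handled on this pass.
 */
static bool process_session(struct session *s)
{
	if (!s->timed_out)
		return false;
	s->timed_out = false;    /* test_and_clear_bit() in the real code */
	s->disconnecting = true;
	assert(s->refcnt > 0);
	s->refcnt--;             /* the put that used to sit in the timer */
	return true;
}
```

The point of the split is only where the reference is dropped: the timer
marks the session, and the worker does everything that might sleep.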

2010-02-02 06:21:50

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/01/10 21:45, Stefan Richter wrote:
> Justin P. Mattock wrote:
>> So(correct me if I'm wrong), I'm generating a 64 bit register
>> and the kernel is looking for a 32 bit register causing the crash.
>
> No, the class = read_pci_config(); if (class == ...) ... parts of the
> code are entirely innocent as far as I can tell. This is just the
> FireWire--PCI chip detection. It is the subsequent driver setup for the
> chip that crashes somewhere.
>
> When you modified that chip detection code earlier, you only prevented
> crashes when your modifications ended up as "ignore all PCI devices,
> also FireWire ones" == "do nothing at all".
>
> Perhaps the bootup sequence of the x86(-64) platform was changed from
> 2.6.31 to .32 such that some assumptions in init_ohci1394_dma about which
> resources are available when are not true anymore. According to your
> screenshot in http://lkml.org/lkml/2009/10/27/335 the issue is about
> memory allocation, not about PCI bus access.


Alright, I'll keep the focus on that
and see if I can figure this out.

As for anything changed in the kernel
(2.6.31 - present), it's tough to say.
From what I remember, I had created a fresh
LFS system using these CFLAGS:

CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"
(without an -m option, gcc defaults (I think) to -m32).

which booted with ohci1394_dma=early just fine.

Then I decided to build another LFS system with the same CFLAGS, except that
I added -m64 (pure64) to the build process
(and then this showed up).

What I can try is a git revert to 2.6.29/27 to see if this thing
fires off (before going any further). If the system boots, then I'll do a bisect.

Justin P. Mattock

2010-02-02 06:45:33

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

OK, I feel really stupid right now.
After staring at this for some time, I didn't even
think to do a git revert to test other
kernel versions (duh!!).

So after a git revert to v2.6.27, ohci1394_dma
boots up fine.
It's a bit late now to do a bisect, but in the morning
I'll start one and see what I get from it, then
go from there.

(Man!! Let this be a lesson for me.)

Justin P. Mattock

2010-02-02 06:55:32

by Stefan Richter

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Justin P. Mattock wrote:
> As for anything changed in the kernel
> (2.6.31 - present), tough to say
> from what I remember I had created a new fresh
> lfs system using these CFLAGS:
>
> CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer"
> CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"
> (without -m option gcc defaults(I think)to -m32).
>
> which booted with ohci1394_dma=early just fine.
>
> then decided to build another lfs system with the same CFLAGS except
> added -m64 (pure64) to the build process.
> (then this showed up).
>
> What I can try is do a git revert to 2.6.29/27 to see if this thing
> fires off(before going any further). if the system boots then do a bisect.

Do I understand correctly that at this moment it is only known that the
bug could be
- *either* a 2.6.31 -> 2.6.32 regression
- *or* an x86-64 specific bug that does not occur on x86-32,
right?

I have a Core 2 Duo based PC with x86-32 kernel and userland and an AMD
based x86-64 PC and could give ohci1394_dma=early a try on both (never
tested it myself before). I could furthermore attempt to build and
install an x86-64 kernel on the Core 2 Duo PC but I am afraid I am far
too short of spare time for that.
--
Stefan Richter
-=====-==-=- --=- ---=-
http://arcgraph.de/sr/

2010-02-02 06:58:08

by Stefan Richter

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Stefan Richter wrote:
> Do I understand correctly that at this moment it is only known that the
> bug could be
> - *either* a 2.6.31 -> 2.6.32 regression
> - *or* an x86-64 specific bug that does not occur on x86-32,
> right?

(OK, according to your other post it /is/ a regression, at least on
x86-64 and definitely between 2.6.27 (good) and 2.6.32 (bad).)
--
Stefan Richter
-=====-==-=- --=- ---=-
http://arcgraph.de/sr/

2010-02-02 07:02:22

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/01/10 22:55, Stefan Richter wrote:
> Justin P. Mattock wrote:
>> As for anything changed in the kernel
>> (2.6.31 - present), tough to say
>> from what I remember I had created a new fresh
>> lfs system using these CFLAGS:
>>
>> CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer"
>> CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"
>> (without -m option gcc defaults(I think)to -m32).
>>
>> which booted with ohci1394_dma=early just fine.
>>
>> then decided to build another lfs system with the same CFLAGS except
>> added -m64 (pure64) to the build process.
>> (then this showed up).
>>
>> What I can try is do a git revert to 2.6.29/27 to see if this thing
>> fires off(before going any further). if the system boots then do a bisect.
>
> Do I understand correctly that at this moment it is only known that the
> bug could be
> - *either* a 2.6.31 -> 2.6.32 regression
> - *or* an x86-64 specific bug that does not occur on x86-32,
> right?
>

At first I was under the impression this was an arch thing because of
building an x86_32 system and then building x86_64 (and hitting this). But now,
after reverting to 2.6.27, I'm thinking otherwise (my bad, I should have
done this at first but didn't even think to).

> I have a Core 2 Duo based PC with x86-32 kernel and userland and an AMD
> based x86-64 PC and could give ohci1394_dma=early a try on both (never
> tested it myself before). I could furthermore attempt to build and
> install an x86-64 kernel on the Core 2 Duo PC but I am afraid I am far
> too short of spare time for that.

no..
I need to do a bisect from 2.6.27 to present to see
(just need to crash for a few hrs, then can start);
then I'll go from there.

Justin P. Mattock

2010-02-02 07:41:38

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/01/10 22:57, Stefan Richter wrote:
> Stefan Richter wrote:
>> Do I understand correctly that at this moment it is only known that the
>> bug could be
>> - *either* a 2.6.31 -> 2.6.32 regression
>> - *or* an x86-64 specific bug that does not occur on x86-32,
>> right?
>
> (OK, according to your other post it /is/ a regression, at least on
> x86-64 and definitely between 2.6.27 (good) and 2.6.32 (bad).)

I'll go with the bisect in the morning (it's late over here)
and then go from there (just pissed at myself
for not thinking to do this at the beginning).

Justin P. Mattock

2010-02-02 20:51:49

by Rafael J. Wysocki

Subject: Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

On Monday 01 February 2010, Marcel Holtmann wrote:
> Hi Rafael,
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> > Subject : Bluetooth: sleeping function called from invalid context
> > Submitter : David John <[email protected]>
> > Date : 2010-01-12 9:19 (20 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> > References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>
> you have an outdated email from Luiz and I change it to the right one
> now.

Thanks. Unfortunately the address is from a commit sign-off and the script
picks it up automatically, so I can't really help it (well, I'd have to
remember to edit this particular message manually each time before posting it,
but I tend to forget).

The entry has been updated with a link to your patch.

Rafael

2010-02-02 20:52:46

by Rafael J. Wysocki

Subject: Re: [Bug #14922] 2.6.32 seemed to have broken nVidia MCP7A sata controller

On Monday 01 February 2010, Robert Hancock wrote:
> On Sun, Jan 31, 2010 at 6:43 PM, Rafael J. Wysocki <[email protected]> wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14922
> > Subject : 2.6.32 seemed to have broken nVidia MCP7A sata controller
> > Submitter : Mike Cui <[email protected]>
> > Date : 2009-12-19 6:13 (44 days old)
> > References : http://marc.info/?l=linux-ide&m=126120323407742&w=4
> > Handled-By : Jeff Garzik <[email protected]>
> > Robert Hancock <[email protected]>
>
> Still outstanding. I posted a patch that should fix the problem,
> waiting for feedback from the reporter.

Thanks for the update.

Rafael

2010-02-02 20:55:08

by Rafael J. Wysocki

Subject: Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

On Monday 01 February 2010, David John wrote:
> On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
> > Subject : i915: Commit 0e442c60 causes flickering
> > Submitter : David John <[email protected]>
> > Date : 2009-12-09 17:26 (54 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
> > References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
> > Handled-By : David John <[email protected]>
> > Patch : http://patchwork.kernel.org/patch/75423/
> >
> >
> >
>
> Hi Rafael,
>
> The patch fixing this has not been merged yet, so the bug should still
> be listed.

Thanks for the update, I'll close the bug when I see the patch in Linus'
tree.

Rafael

2010-02-02 20:55:53

by Rafael J. Wysocki

Subject: Re: [Bug #14621] specjbb2005 and aim7 regression with 2.6.32-rc kernels

On Monday 01 February 2010, Mike Galbraith wrote:
> On Mon, 2010-02-01 at 01:43 +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let me know (either way).
>
> Yes, it should remain open. Aim7 regression isn't reproducible here,
> specjbb2005 unknown, not available to the general public.

Thanks for the update.

Rafael

2010-02-02 21:03:51

by Rafael J. Wysocki

Subject: Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

On Monday 01 February 2010, Thomas Backlund wrote:
> 01.02.2010 02:43, Rafael J. Wysocki skrev:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
> > Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
> > Submitter : Alexander Clouter<[email protected]>
> > Date : 2009-10-23 10:30 (101 days old)
> > References : http://lkml.org/lkml/2009/10/23/50
> >
> >
>
> Afaik this is the same issue as the one referenced here:
>
> http://lkml.org/lkml/2010/1/28/292
>
> The patch in the above thread should fix the issue.

Thanks, updated.

Rafael

2010-02-03 01:42:18

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

OK, I finally finished the bisect.

Reverting this commit gets things going on 2.6.33-rc5:

789d03f584484af85dbdc64935270c8e45f36ef7 is the first bad commit
commit 789d03f584484af85dbdc64935270c8e45f36ef7
Author: Jan Beulich <[email protected]>
Date: Tue Jun 30 11:52:23 2009 +0100

x86: Fix fixmap ordering

The merge of the 32- and 64-bit fixmap headers made a latent
bug on x86-64 a real one: with the right config settings
it is possible for FIX_OHCI1394_BASE to overlap the FIX_BTMAP_*
range.

Signed-off-by: Jan Beulich <[email protected]>
Cc: <[email protected]> # for 2.6.30.x
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

The only thing I can think of at this point
is maybe the CFLAGS I used to build this system
(as for why x86_32 works and x86_64 fails, I'm not sure).

I'm curious to see if anybody else is hitting this.

Justin P. Mattock

2010-02-03 09:17:41

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 03.02.10 02:43 >>>
>The only thing I can think of at this point
>is maybe the CFLAGS I used to build this system.
>(as for the x86_32 working and x86_64 failing not sure);
>
>I'm curious to see if anybody else is hitting this?

I think it is pretty clear how a page fault can happen here (but you're
observing a double fault, which I cannot explain [nor can I explain
why the fault apparently didn't get an error code pushed, which is
why address and error code displayed are mixed up]): I would
suspect that FIX_OHCI1394_BASE is now in a different (virtual) 2Mb
range than what is covered by level{1,2}_fixmap_pgt, but this was
a latent issue even before that patch (just waiting for sufficiently
many fixmap entries getting inserted before
__end_of_permanent_fixed_addresses).

The thing is that head_64.S uses hard-coded numbers, but doesn't
really make sure (at build time) that the fixmap page tables established
indeed cover all the entries of importance (and honestly I even can't
easily tell which of the candidates - FIX_DBGP_BASE,
FIX_EARLYCON_MEM_BASE, and FIX_OHCI1394_BASE afaict - really
matter). If either of the first does, the only reasonable solution imo
is to move FIX_OHCI1394_BASE out of the boot time only range into
the permanent range (unless the other two can be moved into the
boot time only range). And obviously the hard coded numbers
should be eliminated from head_64.S.

Jan
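
The suspicion above can be illustrated with a little address arithmetic. The
sketch below assumes 4 KiB pages, one pmd entry per 2 MiB, and fixmap slots
that grow downwards from a top address. FIXADDR_TOP_EXAMPLE and the slot
numbers are made up for illustration and are not taken from any real kernel
configuration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */
#define PMD_SHIFT  21   /* one pmd entry maps 2 MiB */

/* Illustrative top of the fixmap area: an assumption, not a real build value. */
#define FIXADDR_TOP_EXAMPLE UINT64_C(0xffffffffff5ff000)

/* Fixmap slots grow downwards: a higher index means a lower address. */
static uint64_t fix_to_virt_example(unsigned int idx)
{
	return FIXADDR_TOP_EXAMPLE - ((uint64_t)idx << PAGE_SHIFT);
}

/* Nonzero if both slots are reached through the same pmd entry. */
static int same_2mb_range(unsigned int a, unsigned int b)
{
	return (fix_to_virt_example(a) >> PMD_SHIFT) ==
	       (fix_to_virt_example(b) >> PMD_SHIFT);
}
```

With these assumed values, slots 0 through 511 share one 2 MiB range and slot
512 falls into the next lower one. Inserting enough entries ahead of a slot
can therefore push it out of the range that a statically set up page table
covers.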

2010-02-03 17:10:33

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/03/10 01:18, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 03.02.10 02:43>>>
>> The only thing I can think of at this point
>> is maybe the CFLAGS I used to build this system.
>> (as for the x86_32 working and x86_64 failing not sure);
>>
>> I'm curious to see if anybody else is hitting this?
>
> I think it is pretty clear how a page fault can happen here (but you're
> observing a double fault, which I cannot explain [nor can I explain
> why the fault apparently didn't get an error code pushed, which is
> why address and error code displayed are mixed up]): I would
> suspect that FIX_OHCI1394_BASE is now in a different (virtual) 2Mb
> range than what is covered by level{1,2}_fixmap_pgt, but this was
> a latent issue even before that patch (just waiting for sufficiently
> many fixmap entries getting inserted before
> __end_of_permanent_fixed_addresses).
>
> The thing is that head_64.S uses hard-coded numbers, but doesn't
> really make sure (at build time) that the fixmap page tables established
> indeed cover all the entries of importance (and honestly I even can't
> easily tell which of the candidates - FIX_DBGP_BASE,
> FIX_EARLYCON_MEM_BASE, and FIX_OHCI1394_BASE afaict - really
> matter). If either of the first does, the only reasonable solution imo
> is to move FIX_OHCI1394_BASE out of the boot time only range into
> the permanent range (unless the other two can be moved into the
> boot time only range). And obviously the hard coded numbers
> should be eliminated from head_64.S.
>
> Jan
>
>

Thanks for your info on this. I can try moving things
around today, just to see. Looking more into this (keep in mind I
have no idea how these page and fix_to_virt calls etc. work),
and going by the ___alloc_bootmem_node that Stefan had mentioned
(I still need to look into what that function does), maybe the fix is to keep
fixmap.h as is and look somewhere else (but I
could be wrong).

In any case I'll have another go at this today.

Justin P. Mattock

2010-02-03 19:24:41

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

Jan,

Thanks for that info. After looking at
arch/x86/kernel/head_64.S
I'm thinking this is a grub2 issue.

From what I remember, while building this system
I used Ubuntu as the host and built grub2 from git;
then, once I was able to boot, I figured it was all good.

Just to see, I'll leave the kernel as it is and
build grub2 again, just to make sure.
(Maybe there's something happening with it because
grub2 is built pure64, so anything 32-bit won't work
(could be wrong though); e.g. if the kernel does something in 32-bit mode,
then changes to 64-bit, and in the meantime grub2 can only see 64-bit, then
maybe this is what I'm hitting.)

Justin P. Mattock

2010-02-03 23:03:42

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

OK, while looking into grub2
I noticed, at compile time
and from reading posts, that -m32 is always
being passed (no matter how much I tweaked the
Makefile). Seeing this made me think that
if this thing is being built with -m32, maybe that might
be it, i.e. going from 32-bit to 64-bit might cause some issues. But
unfortunately that is not the case (with lilo you can achieve a pure 64-bit
build).

So after all of that, still no go, but on the positive side,
lilo is able to show more of the boot error message
further up:

[ 0.000000] 0100000000 - 0140000000 page 2M
[ 0.000000] kernel direct mapping tables up to 140000000 @ b000-11000
[ 0.000000] init_ohci1394_dma: initializing OHCI-1394 at 05:00.0
[ 0.000000] bootmem alloc of 4096 bytes failed!
[ 0.000000] Kernel panic - not syncing: Out of memory
[ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.33-rc6-00072-gab65832 # 39
[ 0.000000] Call Trace:

then the rest shown on the picture on the bug report.

Out of memory?

Justin P. Mattock

2010-02-04 08:54:15

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 04.02.10 00:05 >>>
>[ 0.000000] 0100000000 - 0140000000 page 2M
>[ 0.000000] kernel direct mapping tables up to 140000000 @ b000-11000
>[ 0.000000] init_ohci1394_dma: initializing OHCI-1394 at 05:00.0
>[ 0.000000] bootmem alloc of 4096 bytes failed!
>[ 0.000000] Kernel panic - not syncing: Out of memory
>[ 0.000000] Pid: 0, comm: swapper Not tainted
>2.6.33-rc6-00072-gab65832 # 39
>[ 0.000000] Call Trace:
>
>then the rest shown on the picture on the bug report.
>
>Out of memory?

bootmem allocation before bootmem was even initialized. And that's
likely because the code tries to populate the pmd that (due to the
issue explained yesterday) isn't statically initialized.

Jan

2010-02-04 09:10:23

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/04/10 00:54, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 04.02.10 00:05>>>
>> [ 0.000000] 0100000000 - 0140000000 page 2M
>> [ 0.000000] kernel direct mapping tables up to 140000000 @ b000-11000
>> [ 0.000000] init_ohci1394_dma: initializing OHCI-1394 at 05:00.0
>> [ 0.000000] bootmem alloc of 4096 bytes failed!
>> [ 0.000000] Kernel panic - not syncing: Out of memory
>> [ 0.000000] Pid: 0, comm: swapper Not tainted
>> 2.6.33-rc6-00072-gab65832 # 39
>> [ 0.000000] Call Trace:
>>
>> then the rest shown on the picture on the bug report.
>>
>> Out of memory?
>
> bootmem allocation before bootmem was even initialized. And that's
> likely because the code tries to populate the pmd that (due to the
> issue explained yesterday) isn't statically initialized.
>
> Jan
>
>

I'll have a look at this in the morning (it's late over here),
but one thing I'm seeing is the device numbers:
the error shows 05:00.0, while on a good boot
I saw the address at **3** something (I can grab
the info for you later).

That probably ties in with what you are saying:
it tries to populate the pmd.

A quick Google search on this pointed somewhere
in bootmem.c. Any ideas on this, or where
this might be caused besides fixmap
(or is fixmap the main location)?


Justin P. Mattock

2010-02-04 09:10:58

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 04.02.10 10:04 >>>
>a quick google on this showed somewhere
>at bootmem.c any ideas on this or where
>this might be caused besides fixmap?
>(or is fixmap the main location?);

__native_set_fixmap() -> set_pte_vaddr() -> set_pte_vaddr_pud() ->
fill_pte() -> spp_getpage() -> alloc_bootmem_pages() -> panic().

Jan
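
The call chain above ends in panic() because a missing page-table level is
filled in on demand, and the allocator behind it is not ready yet. A rough
userspace model of that behaviour follows. The names (early_alloc_page,
set_mapping) are hypothetical, and the simplified two-level table is not the
kernel's data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ENTRIES 512

/* Models "bootmem has been initialized"; starts out false. */
static int allocator_ready;

/* Hypothetical early allocator: returns NULL until it has been set up. */
static void *early_alloc_page(void)
{
	return allocator_ready ? calloc(ENTRIES, sizeof(unsigned long)) : NULL;
}

/* Simplified upper level: each entry points at a leaf table, or is NULL. */
struct upper_table {
	unsigned long *leaf[ENTRIES];
};

/*
 * Install one mapping. If the leaf table is missing it is allocated on
 * demand. Returns 0 on success and -1 where the kernel would panic.
 */
static int set_mapping(struct upper_table *up, unsigned int upper_idx,
		       unsigned int leaf_idx, unsigned long phys)
{
	assert(upper_idx < ENTRIES && leaf_idx < ENTRIES);
	if (!up->leaf[upper_idx]) {
		up->leaf[upper_idx] = early_alloc_page();
		if (!up->leaf[upper_idx])
			return -1;  /* allocation before the allocator exists */
	}
	up->leaf[upper_idx][leaf_idx] = phys;
	return 0;
}
```

A slot whose leaf table was set up statically never reaches the allocator,
which would explain why the same code can work or fail depending only on
where the slot ends up.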

2010-02-04 09:16:20

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/04/10 01:11, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 04.02.10 10:04>>>
>> a quick google on this showed somewhere
>> at bootmem.c any ideas on this or where
>> this might be caused besides fixmap?
>> (or is fixmap the main location?);
>
> __native_set_fixmap() -> set_pte_vaddr() -> set_pte_vaddr_pud() ->
> fill_pte() -> spp_getpage() -> alloc_bootmem_pages() -> panic().
>
> Jan
>
>

So something is using __native_set_fixmap
and hitting some memory address, and then
set_fixmap_nocache (ohci1394_dma=early)
fires off and hits the same one?

Justin P. Mattock

2010-02-04 09:34:32

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 04.02.10 10:17 >>>
>so something is using __native_set_fixmap
>that's hitting some memory address then
>set_fixmap_nocache(ohci1394_dma=early)
>fires off hitting the same?

No, afaict it is the ohci1394_dma=early code itself hitting that path.

Jan

2010-02-04 09:47:01

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/04/10 01:35, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 04.02.10 10:17>>>
>> so something is using __native_set_fixmap
>> that's hitting some memory address then
>> set_fixmap_nocache(ohci1394_dma=early)
>> fires off hitting the same?
>
> No, afaict it is the ohci1394_dma=early code itself hitting that path.
>
> Jan
>
>

alright.. looking at init_ohci1394_dma.c
I see:

ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);

then I think it calls:

set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

I'm guessing the problem might be somewhere in the fix_to_virt
(but I could be wrong).

Justin P. Mattock

2010-02-04 09:57:00

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 04.02.10 10:48 >>>
>I see:
>
>ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);
>
>then I think it calls:
>
>set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);
>
>I'm guessing somewhere with the fix_to_virt might be something
>(but could be wrong);

No, it ought to be that set_fixmap_nocache().

Jan

2010-02-04 10:12:33

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/04/10 01:57, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 04.02.10 10:48>>>
>> I see:
>>
>> ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);
>>
>> then I think it calls:
>>
>> set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);
>>
>> I'm guessing somewhere with the fix_to_virt might be something
>> (but could be wrong);
>
> No, it ought to be that set_fixmap_nocache().
>
> Jan
>
>


Hmm..
As a quick test I did try:

set_fixmap(FIX_OHCI1394_BASE, ohci_base);
(maybe ohci_base)

which still hit the panic, so maybe something else
in the chain of calls is the trigger,
e.g. something address-specific.
(I'll have to keep looking into this.)

Justin P. Mattock

2010-02-05 18:50:37

by Jesse Barnes

Subject: Re: [Bug #14670] i915: playing video via XVideo extension makes the screen flicker

On Mon, 1 Feb 2010 01:43:12 +0100 (CET)
"Rafael J. Wysocki" <[email protected]> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still
> should be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14670
> Subject : i915: playing video via XVideo extension makes the screen flicker
> Submitter : Thomas Meyer <[email protected]>
> Date : 2009-11-23 13:15 (70 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b42d4c5c6a872815d711e5d51a600f5122c38eee
> References : http://lkml.org/lkml/2010/1/11/150

Just updated the bug. Looks like we know where the problem is, just
need a couple more patches tested.

Thomas, can you check my addition and provide feedback?

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 18:52:15

by Jesse Barnes

Subject: Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

On Mon, 01 Feb 2010 15:01:59 +0530
David John <[email protected]> wrote:

> On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still
> > should be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
> > Subject : i915: Commit 0e442c60 causes flickering
> > Submitter : David John <[email protected]>
> > Date : 2009-12-09 17:26 (54 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
> > References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
> > Handled-By : David John <[email protected]>
> > Patch : http://patchwork.kernel.org/patch/75423/
> >
> >
> >
>
> Hi Rafael,
>
> The patch fixing this has not been merged yet, so the bug should still
> be listed.

Eric, can you pick up David's "Disable SR when more than one pipe is
enabled" patch? Yakui has a bigger rework of the wm code that also
fixes this problem, but it's too invasive for 2.6.33.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 18:57:58

by Jesse Barnes

Subject: Re: [Bug #14997] Closing and re-opening the lid does not reactivate the backlight

On Mon, 1 Feb 2010 01:43:18 +0100 (CET)
"Rafael J. Wysocki" <[email protected]> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still
> should be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14997
> Subject : Closing and re-opening the lid does not reactivate the backlight
> Submitter : o. meijer <[email protected]>
> Date : 2010-01-06 15:38 (26 days old)

Just updated this one as well. Since it sounds easily reproducible it
would be good to get a bisect. We haven't changed backlight handling,
but it's likely the lid changes make X handle things differently...

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 19:01:49

by Jesse Barnes

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Mon, 1 Feb 2010 01:43:19 +0100 (CET)
"Rafael J. Wysocki" <[email protected]> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still
> should be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15004
> Subject : i915: *ERROR* Execbuf while wedged
> Submitter : tomas m <[email protected]>
> Date : 2010-01-07 18:53 (25 days old)

Can you bisect this to a particular kernel commit? The particular
error message means the kernel detected a GPU hang. That's usually
a userspace bug, but the kernel should recover from it.

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 19:06:32

by Jesse Barnes

Subject: Re: [Bug #15100] X11 is black after resume from s2ram if my T400 was previous in docking station before

On Mon, 1 Feb 2010 01:43:20 +0100 (CET)
"Rafael J. Wysocki" <[email protected]> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still
> should be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15100
> Subject : X11 is black after resume from s2ram if my T400 was previous in docking station before
> Submitter : Toralf Förster <[email protected]>
> Date : 2010-01-21 08:56 (11 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c1c7af60892070e4b82ad63bbfb95ae745056de0

Just updated the bug, looks like another LID status related bug.

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 19:09:01

by Jesse Barnes

Subject: Re: [Bug #15108] Blank screen with KMS enabled (on clevo M5xN laptop)

On Mon, 1 Feb 2010 01:43:21 +0100 (CET)
"Rafael J. Wysocki" <[email protected]> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still
> should be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15108
> Subject : Blank screen with KMS enabled (on clevo M5xN laptop)
> Submitter : Jérémy Lal <[email protected]>
> Date : 2010-01-22 20:30 (10 days old)

Looks like this one has a patch available.

Rui, is the patch upstream already?

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 19:10:32

by Chris Mason

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Fri, Feb 05, 2010 at 11:01:00AM -0800, Jesse Barnes wrote:
> On Mon, 1 Feb 2010 01:43:19 +0100 (CET)
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still
> > should be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15004
> > Subject : i915: *ERROR* Execbuf while wedged
> > Submitter : tomas m <[email protected]>
> > Date : 2010-01-07 18:53 (25 days old)
>
> Can you bisect this to a particular kernel commit? The particular
> error message means the kernel detected a GPU hang. That's usually
> a userspace bug, but the kernel should recover from it.

I see these about once a week, which would be a very difficult bisect.
The kernel does recover when I reboot, but beyond that the messages loop
forever.

I haven't yet seen it on 2.6.33-rc, but I'm still struggling with
suspend/resume failures and haven't really had a one week up time yet.

-chris
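
For a sense of scale, a bisection needs roughly log2(N) test builds to isolate one commit out of N. The C sketch below counts those steps. The helper names, the commit count passed in, and the one-week soak time per step are illustrative assumptions, not figures taken from this thread.

```c
/* Number of test steps a bisection needs, in the worst case, to isolate
 * one commit out of n_commits (equivalent to ceil(log2(n_commits))). */
static unsigned int bisect_steps(unsigned long n_commits)
{
    unsigned int steps = 0;

    while (n_commits > 1) {
        /* Worst case: the bad commit is always in the larger half. */
        n_commits = (n_commits + 1) / 2;
        steps++;
    }
    return steps;
}

/* Total soak time if every step needs weeks_per_step of uptime before
 * a kernel can be called good or bad (an assumed, simplified model). */
static unsigned long bisect_weeks(unsigned long n_commits,
                                  unsigned long weeks_per_step)
{
    return bisect_steps(n_commits) * weeks_per_step;
}
```

With an assumed round figure of 10000 commits between the two releases, bisect_steps() gives 14, so at one week of uptime per step the search would take on the order of 14 weeks.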

2010-02-05 19:18:38

by Jesse Barnes

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Fri, 5 Feb 2010 14:09:16 -0500
Chris Mason <[email protected]> wrote:

> On Fri, Feb 05, 2010 at 11:01:00AM -0800, Jesse Barnes wrote:
> > On Mon, 1 Feb 2010 01:43:19 +0100 (CET)
> > "Rafael J. Wysocki" <[email protected]> wrote:
> >
> > > This message has been generated automatically as a part of a
> > > report of regressions introduced between 2.6.31 and 2.6.32.
> > >
> > > The following bug entry is on the current list of known
> > > regressions introduced between 2.6.31 and 2.6.32. Please verify
> > > if it still should be listed and let me know (either way).
> > >
> > >
> > > Bug-Entry :
> > > http://bugzilla.kernel.org/show_bug.cgi?id=15004
> > > Subject : i915: *ERROR* Execbuf while wedged
> > > Submitter : tomas m <[email protected]>
> > > Date : 2010-01-07 18:53 (25 days old)
> >
> > Can you bisect this to a particular kernel commit? The particular
> > error message means the kernel detected a GPU hang. That's usually
> > a userspace bug, but the kernel should recover from it.
>
> I see these about once a week, which would be a very difficult bisect.
> The kernel does recover when I reboot, but beyond that the messages
> loop forever.
>
> I haven't yet seen it on 2.6.33-rc, but I'm still struggling with
> suspend/resume failures and haven't really had a one week up time yet.

The fdo bug referenced from kernel bugzilla has a small workaround you
might try. It forces the driver to try to recover from the hang, so
you might not need to reboot.

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-05 22:31:18

by Rafael J. Wysocki

Subject: Re: [Bug #15108] Blank screen with KMS enabled (on clevo M5xN laptop)

On Friday 05 February 2010, Jesse Barnes wrote:
> On Mon, 1 Feb 2010 01:43:21 +0100 (CET)
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still
> > should be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15108
> > Subject : Blank screen with KMS enabled (on clevo M5xN laptop)
> > Submitter : Jérémy Lal <[email protected]>
> > Date : 2010-01-22 20:30 (10 days old)
>
> Looks like this one has a patch available.
>
> Rui, is the patch upstream already?

I don't see it in there.

Rafael

2010-02-06 23:56:19

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/04/10 01:57, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 04.02.10 10:48>>>
>> I see:
>>
>> ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);
>>
>> then I think it calls:
>>
>> set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);
>>
>> I'm guessing somewhere with the fix_to_virt might be something
>> (but could be wrong);
>
> No, it ought to be that set_fixmap_nocache().
>
> Jan
>
>


Looking into fixmap.h, I started with these definitions:
#define NR_FIX_BTMAPS 64
#define FIX_BTMAPS_SLOTS 4
FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 256 -
(__end_of_permanent_fixed_addresses & 255),
FIX_BTMAP_BEGIN = FIX_BTMAP_END +
NR_FIX_BTMAPS*FIX_BTMAPS_SLOTS - 1,

which led me to a patch you had submitted:
http://patchwork.kernel.org/patch/68719/
and another located here:
http://lists.openwall.net/linux-kernel/2008/08/29/211

Your patch works. I reapplied it to the latest HEAD, added a
Bisected-and-tested-by tag to it, and sent it as an attachment to
the bug report.

With the patch from the other thread I was also able to get the system to
boot, but that one only changes the size of the page (256 to 512, etc.).

Let me know what would be the best approach with this.

Justin P. Mattock
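
To make the arithmetic in the fixmap.h definitions quoted above concrete, here is a small stand-alone C sketch of the two expressions. The helper names are invented for illustration and the input value used below is hypothetical. The real kernel computes these as enum constants at compile time.

```c
/* Constants as quoted from fixmap.h in the message above. */
#define NR_FIX_BTMAPS    64
#define FIX_BTMAPS_SLOTS 4

/* Mirrors the FIX_BTMAP_END expression: move up to the next multiple
 * of 256. An index already on a boundary still moves up by a full 256. */
static unsigned int fix_btmap_end(unsigned int end_of_permanent)
{
    return end_of_permanent + 256 - (end_of_permanent & 255);
}

/* Mirrors the FIX_BTMAP_BEGIN expression: the last index of the
 * NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS boot-time mapping entries. */
static unsigned int fix_btmap_begin(unsigned int end_of_permanent)
{
    return fix_btmap_end(end_of_permanent)
           + NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS - 1;
}
```

For a hypothetical permanent range ending at index 20, fix_btmap_end() yields 256 and fix_btmap_begin() yields 511, so any enum entry declared after FIX_BTMAP_BEGIN would get index 512 or higher.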







2010-02-07 12:56:12

by David John

Subject: Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

On 02/03/2010 02:25 AM, Rafael J. Wysocki wrote:
> On Monday 01 February 2010, David John wrote:
>> On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
>>> This message has been generated automatically as a part of a report
>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>
>>> The following bug entry is on the current list of known regressions
>>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>>> be listed and let me know (either way).
>>>
>>>
>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
>>> Subject : i915: Commit 0e442c60 causes flickering
>>> Submitter : David John <[email protected]>
>>> Date : 2009-12-09 17:26 (54 days old)
>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
>>> References : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
>>> Handled-By : David John <[email protected]>
>>> Patch : http://patchwork.kernel.org/patch/75423/
>>>
>>>
>>>
>>
>> Hi Rafael,
>>
>> The patch fixing this has not been merged yet, so the bug should still
>> be listed.
>
> Thanks for the update, I'll close the bug when I see the patch in the Linus'
> tree.
>
> Rafael
>

Hi Rafael,

This regression entry can now be closed, it's fixed by the upstream
commit 33c5fd12 (drm/i915: Disable SR when more than one pipe is enabled).

Thanks,
David.

2010-02-07 13:14:11

by Rafael J. Wysocki

Subject: Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

On Sunday 07 February 2010, David John wrote:
> On 02/03/2010 02:25 AM, Rafael J. Wysocki wrote:
> > On Monday 01 February 2010, David John wrote:
> >> On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
> >>> This message has been generated automatically as a part of a report
> >>> of regressions introduced between 2.6.31 and 2.6.32.
> >>>
> >>> The following bug entry is on the current list of known regressions
> >>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> >>> be listed and let me know (either way).
> >>>
> >>>
> >>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14897
> >>> Subject : i915: Commit 0e442c60 causes flickering
> >>> Submitter : David John <[email protected]>
> >>> Date : 2009-12-09 17:26 (54 days old)
>
> Hi Rafael,
>
> This regression entry can now be closed, it's fixed by the upstream
> commit 33c5fd12 (drm/i915: Disable SR when more than one pipe is enabled).

Thanks, closing.

Rafael

2010-02-07 23:11:06

by Werner LEMBERG

Subject: Re: [Bug #15158] oops related to i915_gem_object_save_bit_17_swizzle

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15158
> Subject : oops related to i915_gem_object_save_bit_17_swizzle
> Submitter : Werner Lemberg <[email protected]>
> Date : 2010-01-28 08:26 (4 days old)

I've just experienced the crash again, this time with openSuSE kernel
2.6.33-rc6-11-pae and xorg-x11-driver-video-7.4-245.1


Werner

2010-02-08 10:09:05

by Alexander Clouter

Subject: Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

Hi,

* Thomas Backlund <[email protected]> [2010-02-01 17:47:42+0200]:
>
> 01.02.2010 02:43, Rafael J. Wysocki skrev:
>> This message has been generated automatically as a part of a report
>> of regressions introduced between 2.6.31 and 2.6.32.
>>
>> The following bug entry is on the current list of known regressions
>> introduced between 2.6.31 and 2.6.32. Please verify if it still should
>> be listed and let me know (either way).
>>
>>
>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
>> Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
>> Submitter : Alexander Clouter<[email protected]>
>> Date : 2009-10-23 10:30 (101 days old)
>> References : http://lkml.org/lkml/2009/10/23/50
>>
>
> Afaik this is the same issue as the one referenced here:
>
> http://lkml.org/lkml/2010/1/28/292
>
> The patch in the above thread should fix the issue.
>
I'm getting my hands on some not-in-production-state equipment to test
this with, so bear with me :)

Cheers

--
Alexander Clouter
.sigmonster says: Dealer prices may vary.

2010-02-08 17:26:44

by Chris Mason

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Fri, Feb 05, 2010 at 11:17:47AM -0800, Jesse Barnes wrote:
> On Fri, 5 Feb 2010 14:09:16 -0500
> Chris Mason <[email protected]> wrote:
>
> > On Fri, Feb 05, 2010 at 11:01:00AM -0800, Jesse Barnes wrote:
> > > On Mon, 1 Feb 2010 01:43:19 +0100 (CET)
> > > "Rafael J. Wysocki" <[email protected]> wrote:
> > >
> > > > This message has been generated automatically as a part of a
> > > > report of regressions introduced between 2.6.31 and 2.6.32.
> > > >
> > > > The following bug entry is on the current list of known
> > > > regressions introduced between 2.6.31 and 2.6.32. Please verify
> > > > if it still should be listed and let me know (either way).
> > > >
> > > >
> > > > Bug-Entry :
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=15004
> > > > Subject : i915: *ERROR* Execbuf while wedged
> > > > Submitter : tomas m <[email protected]>
> > > > Date : 2010-01-07 18:53 (25 days old)
> > >
> > > Can you bisect this to a particular kernel commit? The particular
> > > error message means the kernel detected a GPU hang. That's usually
> > > a userspace bug, but the kernel should recover from it.
> >
> > I see these about once a week, which would be a very difficult bisect.
> > The kernel does recover when I reboot, but beyond that the messages
> > loop forever.
> >
> > I haven't yet seen it on 2.6.33-rc, but I'm still struggling with
> > suspend/resume failures and haven't really had a one week up time yet.

Ok, updating to rc7 and updating my xf86 driver to 2.10 seems to have
fixed up my suspend/resume problems. So, I should be able to trigger
the execbuf problem again.

>
> The fdo bug referenced from kernel bugzilla has a small workaround you
> might try. It forces the driver to try to recover from the hang, so
> you might not need to reboot.

Well, the rebooting isn't a huge deal, but if there's something I can
track/kick or force to core dump, would it help?

-chris


2010-02-08 17:37:13

by Jesse Barnes

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Mon, 8 Feb 2010 12:24:10 -0500
Chris Mason <[email protected]> wrote:

> On Fri, Feb 05, 2010 at 11:17:47AM -0800, Jesse Barnes wrote:
> > On Fri, 5 Feb 2010 14:09:16 -0500
> > Chris Mason <[email protected]> wrote:
> >
> > > On Fri, Feb 05, 2010 at 11:01:00AM -0800, Jesse Barnes wrote:
> > > > On Mon, 1 Feb 2010 01:43:19 +0100 (CET)
> > > > "Rafael J. Wysocki" <[email protected]> wrote:
> > > >
> > > > > This message has been generated automatically as a part of a
> > > > > report of regressions introduced between 2.6.31 and 2.6.32.
> > > > >
> > > > > The following bug entry is on the current list of known
> > > > > regressions introduced between 2.6.31 and 2.6.32. Please
> > > > > verify if it still should be listed and let me know (either
> > > > > way).
> > > > >
> > > > >
> > > > > Bug-Entry :
> > > > > http://bugzilla.kernel.org/show_bug.cgi?id=15004
> > > > > Subject : i915: *ERROR* Execbuf while wedged
> > > > > Submitter : tomas m <[email protected]>
> > > > > Date : 2010-01-07 18:53 (25 days old)
> > > >
> > > > Can you bisect this to a particular kernel commit? The
> > > > particular error message means the kernel detected a GPU hang.
> > > > That's usually a userspace bug, but the kernel should recover
> > > > from it.
> > >
> > > I see these about once a week, which would be a very difficult
> > > bisect. The kernel does recover when I reboot, but beyond that
> > > the messages loop forever.
> > >
> > > I haven't yet seen it on 2.6.33-rc, but I'm still struggling with
> > > suspend/resume failures and haven't really had a one week up time
> > > yet.
>
> Ok, updating to rc7 and updating my xf86 driver to 2.10 seems to have
> fixed up my suspend/resume problems. So, I should be able to trigger
> the execbuf problem again.

I've heard some reports that the 2D driver introduces and fixes hangs,
so it's possible 2.10 will fix both issues for you.

> Well, the rebooting isn't a huge deal, but if there's something I can
> track/kick or force to core dump, would it help?

We do have a test in intel-gpu-tools that will instigate a hang, but it
would be best to figure out what's causing it in your environment.

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-08 20:01:23

by Chris Mason

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Mon, Feb 08, 2010 at 09:35:51AM -0800, Jesse Barnes wrote:
> >
> > Ok, updating to rc7 and updating my xf86 driver to 2.10 seems to have
> > fixed up my suspend/resume problems. So, I should be able to trigger
> > the execbuf problem again.
>
> I've heard some reports that the 2D driver introduces and fixes hangs,
> so it's possible 2.10 will fix both issues for you.
>
> > Well, the rebooting isn't a huge deal, but if there's something I can
> > track/kick or force to core dump, would it help?
>
> We do have a test in intel-gpu-tools that will instigate a hang, but it
> would be best to figure out what's causing it in your environment.
>

[63516.632060] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[63516.632069] render error detected, EIR: 0x00000000
[63516.632092] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 2590581 at 2590579)

So, this popped up while surfing in firefox, which is usually where I
hit the execbuf errors. X was totally stuck afterwards, but I could
switch to a vc and get the dmesg.

If there are specific procs that I can try to get traces of, just let
me know for next time.

-chris
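
A small C sketch for reading the last log line above, under two assumptions: that the return value is a negated errno code, as is the usual kernel convention, and that "awaiting A at B" means the driver is waiting for sequence number A while the GPU has last reported B. The helper names are made up for this example.

```c
#include <errno.h>
#include <string.h>

/* Turn a kernel-style negative return value into its errno description. */
static const char *kernel_ret_text(int ret)
{
    return strerror(ret < 0 ? -ret : ret);
}

/* Distance between the awaited and the last reported sequence number.
 * Unsigned subtraction keeps the result valid across counter wraparound. */
static unsigned int seqno_gap(unsigned int awaiting, unsigned int at)
{
    return awaiting - at;
}
```

On Linux, errno 5 is EIO, so "returns -5" reads as an I/O error, and seqno_gap(2590581, 2590579) is 2, meaning the awaited request is two sequence numbers ahead of the last one reported.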

2010-02-08 23:41:01

by Jesse Barnes

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Mon, 8 Feb 2010 15:00:44 -0500
Chris Mason <[email protected]> wrote:

> On Mon, Feb 08, 2010 at 09:35:51AM -0800, Jesse Barnes wrote:
> > >
> > > Ok, updating to rc7 and updating my xf86 driver to 2.10 seems to
> > > have fixed up my suspend/resume problems. So, I should be able
> > > to trigger the execbuf problem again.
> >
> > I've heard some reports that the 2D driver introduces and fixes
> > hangs, so it's possible 2.10 will fix both issues for you.
> >
> > > Well, the rebooting isn't a huge deal, but if there's something I
> > > can track/kick or force to core dump, would it help?
> >
> > We do have a test in intel-gpu-tools that will instigate a hang,
> > but it would be best to figure out what's causing it in your
> > environment.
> >
>
> [63516.632060] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
> [63516.632069] render error detected, EIR: 0x00000000
> [63516.632092] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 2590581 at 2590579)
>
> So, this popped up while surfing in firefox, which is usually where I
> hit the execbuf errors. X was totally stuck afterwards, but I could
> switch to a vc and get the dmesg.
>
> If there are specific procs that I can try to get traces of, just let
> me know for next time.

Hm, EIR is clear so this may be a failure of our hangcheck timer.

Chris Wilson saw these recently too; hoping he has ideas.

--
Jesse Barnes, Intel Open Source Technology Center

2010-02-10 16:45:50

by Jesse Barnes

Subject: Re: [Bug #15004] i915: *ERROR* Execbuf while wedged

On Mon, 8 Feb 2010 15:39:50 -0800
Jesse Barnes <[email protected]> wrote:

> On Mon, 8 Feb 2010 15:00:44 -0500
> Chris Mason <[email protected]> wrote:
>
> > On Mon, Feb 08, 2010 at 09:35:51AM -0800, Jesse Barnes wrote:
> > > >
> > > > Ok, updating to rc7 and updating my xf86 driver to 2.10 seems to
> > > > have fixed up my suspend/resume problems. So, I should be able
> > > > to trigger the execbuf problem again.
> > >
> > > I've heard some reports that the 2D driver introduces and fixes
> > > hangs, so it's possible 2.10 will fix both issues for you.
> > >
> > > > Well, the rebooting isn't a huge deal, but if there's something
> > > > I can track/kick or force to core dump, would it help?
> > >
> > > We do have a test in intel-gpu-tools that will instigate a hang,
> > > but it would be best to figure out what's causing it in your
> > > environment.
> > >
> >
> > [63516.632060] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
> > [63516.632069] render error detected, EIR: 0x00000000
> > [63516.632092] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 2590581 at 2590579)
> >
> > So, this popped up while surfing in firefox, which is usually where
> > I hit the execbuf errors. X was totally stuck afterwards, but I
> > could switch to a vc and get the dmesg.
> >
> > If there are specific procs that I can try to get traces of, just
> > let me know for next time.
>
> Hm, EIR is clear so this may be a failure of our hangcheck timer.
>
> Chris Wilson saw these recently too; hoping he has ideas.

The kernel bz was updated with a patch to libdrm that fixed this issue
for at least one user. Can you confirm?

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2010-02-15 16:24:57

by Alexander Clouter

Subject: Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

Hi,

* Alexander Clouter <[email protected]> [2010-02-08 09:59:43+0000]:
>
> * Thomas Backlund <[email protected]> [2010-02-01 17:47:42+0200]:
> >
> > 01.02.2010 02:43, Rafael J. Wysocki skrev:
> >> This message has been generated automatically as a part of a report
> >> of regressions introduced between 2.6.31 and 2.6.32.
> >>
> >> The following bug entry is on the current list of known regressions
> >> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> >> be listed and let me know (either way).
> >>
> >>
> >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
> >> Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
> >> Submitter : Alexander Clouter<[email protected]>
> >> Date : 2009-10-23 10:30 (101 days old)
> >> References : http://lkml.org/lkml/2009/10/23/50
> >>
> >
> > Afaik this is the same issue as the one referenced here:
> >
> > http://lkml.org/lkml/2010/1/28/292
> >
> > The patch in the above thread should fix the issue.
> >
> I'm getting my hands on some not-in-production-state equipment to test
> this with, so bear with me :)
>
I can confirm this patch fixes the problem I was seeing. Once it was
applied, I could no longer oops my kernel.

Cheers

--
Alexander Clouter
.sigmonster says: Please take note:

2010-02-15 20:59:20

by Rafael J. Wysocki

Subject: Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

On Monday 15 February 2010, Alexander Clouter wrote:
> Hi,
>
> * Alexander Clouter <[email protected]> [2010-02-08 09:59:43+0000]:
> >
> > * Thomas Backlund <[email protected]> [2010-02-01 17:47:42+0200]:
> > >
> > > 01.02.2010 02:43, Rafael J. Wysocki skrev:
> > >> This message has been generated automatically as a part of a report
> > >> of regressions introduced between 2.6.31 and 2.6.32.
> > >>
> > >> The following bug entry is on the current list of known regressions
> > >> introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > >> be listed and let me know (either way).
> > >>
> > >>
> > >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14482
> > >> Subject : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
> > >> Submitter : Alexander Clouter<[email protected]>
> > >> Date : 2009-10-23 10:30 (101 days old)
> > >> References : http://lkml.org/lkml/2009/10/23/50
> > >>
> > >
> > > Afaik this is the same issue as the one referenced here:
> > >
> > > http://lkml.org/lkml/2010/1/28/292
> > >
> > > The patch in the above thread should fix the issue.
> > >
> > I'm getting my hands on some not-in-production-state equipment to test
> > this with, so bear with me :)
> >
> I can confirm this patch fixes the problem I was seeing. Once it was
> applied, I could no longer oops my kernel.

The patch has already been merged, so the bug is closed.

Rafael

2010-02-24 14:38:10

by Jan Beulich

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

>>> "Justin P. Mattock" <[email protected]> 03.02.10 02:43 >>>
Could you try this simple patch (against plain 2.6.33-rc8)?

Thanks, Jan

--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -82,6 +82,9 @@ enum fixed_addresses {
 #endif
 	FIX_DBGP_BASE,
 	FIX_EARLYCON_MEM_BASE,
+#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
+	FIX_OHCI1394_BASE,
+#endif
 #ifdef CONFIG_X86_LOCAL_APIC
 	FIX_APIC_BASE,	/* local (CPU) APIC) -- required for SMP or not */
 #endif
@@ -126,9 +129,6 @@ enum fixed_addresses {
 	FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 256 -
 			(__end_of_permanent_fixed_addresses & 255),
 	FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_SLOTS - 1,
-#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
-	FIX_OHCI1394_BASE,
-#endif
 #ifdef CONFIG_X86_32
 	FIX_WP_TEST,
 #endif
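
The effect of the move on the enum index can be seen in a reduced stand-alone sketch. The two enums below are cut-down stand-ins for enum fixed_addresses with prefixed, invented names. Only the relative position of the OHCI1394 entry mirrors the patch, and the resulting numbers are not the real kernel values.

```c
/* Old layout: the OHCI1394 entry is declared after FIX_BTMAP_BEGIN. */
enum old_layout {
    OLD_FIX_DBGP_BASE,
    OLD_FIX_EARLYCON_MEM_BASE,
    OLD_END_OF_PERMANENT,
    OLD_FIX_BTMAP_END = OLD_END_OF_PERMANENT + 256 -
                        (OLD_END_OF_PERMANENT & 255),
    OLD_FIX_BTMAP_BEGIN = OLD_FIX_BTMAP_END + 64 * 4 - 1,
    OLD_FIX_OHCI1394_BASE,      /* lands past the boot-time map slots */
};

/* New layout: the entry sits among the permanent fixed addresses. */
enum new_layout {
    NEW_FIX_DBGP_BASE,
    NEW_FIX_EARLYCON_MEM_BASE,
    NEW_FIX_OHCI1394_BASE,      /* now below the end-of-permanent marker */
    NEW_END_OF_PERMANENT,
    NEW_FIX_BTMAP_END = NEW_END_OF_PERMANENT + 256 -
                        (NEW_END_OF_PERMANENT & 255),
    NEW_FIX_BTMAP_BEGIN = NEW_FIX_BTMAP_END + 64 * 4 - 1,
};
```

In this reduced layout the entry moves from index 512, past the boot-time mapping slots, to index 2, inside the permanent range.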

2010-02-24 15:59:08

by Justin P. Mattock

Subject: Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

On 02/24/2010 06:37 AM, Jan Beulich wrote:
>>>> "Justin P. Mattock"<[email protected]> 03.02.10 02:43>>>
> Could you try this simple patch (against plain 2.6.33-rc8)?
>
> Thanks, Jan
>
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -82,6 +82,9 @@ enum fixed_addresses {
> #endif
> FIX_DBGP_BASE,
> FIX_EARLYCON_MEM_BASE,
> +#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
> + FIX_OHCI1394_BASE,
> +#endif
> #ifdef CONFIG_X86_LOCAL_APIC
> FIX_APIC_BASE, /* local (CPU) APIC) -- required for SMP or not */
> #endif
> @@ -126,9 +129,6 @@ enum fixed_addresses {
> FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 256 -
> (__end_of_permanent_fixed_addresses & 255),
> FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_SLOTS - 1,
> -#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
> - FIX_OHCI1394_BASE,
> -#endif
> #ifdef CONFIG_X86_32
> FIX_WP_TEST,
> #endif
>
>
>


Here's the bug report on this:
http://bugzilla.kernel.org/show_bug.cgi?id=14487

Justin P. Mattock