2009-06-28 06:22:12

by Alberto Gonzalez

Subject: Kernel 2.6.30 and udevd problem

[Sorry, wrong udev list in previous message.]

Hello,

I have recently upgraded my kernel to 2.6.30 and I've noticed that at some
point (randomly), udevd would start using a lot of CPU time. Killing the
process and starting it with --debug option prints these messages when the
problem starts:

[22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
[22657] udev_device_new_from_syspath: device 0x2040320 has devpath '/devices/pci0000:00/0000:00:02.0/drm/card0'
[22657] udev_device_read_db: device 0x2040320 filled with db symlink data '/dev/dri/card0'
[22657] udev_rules_apply_to_event: LINK 'char/226:0' /lib/udev/rules.d/50-udev-default.rules:5
[22657] udev_rules_apply_to_event: NAME 'dri/card0' /lib/udev/rules.d/50-udev-default.rules:38
[22657] udev_rules_apply_to_event: GROUP 91 /lib/udev/rules.d/50-udev-default.rules:42
[22657] udev_device_new_from_syspath: device 0x2040790 has devpath '/devices/pci0000:00/0000:00:02.0'
[22657] udev_device_new_from_syspath: device 0x2040a80 has devpath '/devices/pci0000:00'
[22657] udev_rules_apply_to_event: RUN 'socket:@/org/freedesktop/hal/udev_event' /etc/udev/rules.d/90-hal.rules:2
[22657] udev_device_update_db: create db link (dri/card0 char/226:0)
[22657] udev_node_add: creating device node '/dev/dri/card0', devnum=226:0, mode=0660, uid=0, gid=91
[22657] udev_node_mknod: preserve file '/dev/dri/card0', because it has correct dev_t
[22657] update_link: '/dev/char/226:0' with target '/dev/dri/card0' has the highest priority 0, create it
[22657] node_symlink: preserve already existing symlink '/dev/char/226:0' to '../dri/card0'
[22657] udev_monitor_send_device: passed 230 bytes to monitor 0x2040320
[22657] udev_monitor_send_device: passed -1 bytes to monitor 0x2036150
[22657] event_fork: seq 168515 exit with 0
[22652] event_fork: seq 168515 forked, pid [22657], 'change' 'drm', 0 seconds old
[22652] udev_done: seq 168515 cleanup, pid [22657], status 0, 0 seconds old


This sequence repeats ad infinitum. I have no clue what it means, so any help
is appreciated.

My system is a Dell Studio desktop with Intel graphics (G45). I run a standard
distro kernel (Arch Linux). Please ask me for any other relevant info.

Thanks,
Alberto.
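
A repeating debug log like the one above can be summarized by counting how
often each kind of event is queued. The following is only a rough sketch: it
assumes the "event_queue_insert" line format shown in the log, and the helper
name tally_queued_events and the second sequence number in the sample are made
up for illustration.

```python
import re
from collections import Counter

# Matches the queue line format shown in the log above, e.g.
# [22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
QUEUED = re.compile(r"event_queue_insert: seq (\d+) queued, '([^']*)' '([^']*)'")


def tally_queued_events(log_text):
    """Count queued events per (action, subsystem) pair (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = QUEUED.search(line)
        if match:
            _seq, action, subsystem = match.groups()
            counts[(action, subsystem)] += 1
    return counts


# First line is copied from the log; seq 168516 is an invented follow-up event.
sample = """\
[22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
[22657] udev_device_new_from_syspath: device 0x2040320 has devpath
[22652] event_queue_insert: seq 168516 queued, 'change' 'drm'
"""

if __name__ == "__main__":
    for (action, subsystem), n in tally_queued_events(sample).most_common():
        print(f"{n:6d}  {action} {subsystem}")
```

If one (action, subsystem) pair dominates the tally, as 'change' 'drm' does
here, that pair is the one worth chasing.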


2009-06-28 10:39:20

by Kay Sievers

Subject: Re: Kernel 2.6.30 and udevd problem

On Sun, Jun 28, 2009 at 08:21, Alberto Gonzalez<[email protected]> wrote:
> I have recently upgraded my kernel to 2.6.30 and I've noticed that at some
> point (randomly), udevd would start using a lot of CPU time. Killing the
> process and starting it with --debug option prints these messages when the
> problem starts:
>
> [22652] event_queue_insert: seq 168515 queued, 'change' 'drm'

Care to run:
udevadm monitor
and provide the output of the event sequence dump?

Kay

2009-06-28 12:37:16

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Sunday 28 June 2009 12:37:28 Kay Sievers wrote:
> On Sun, Jun 28, 2009 at 08:21, Alberto Gonzalez<[email protected]>
wrote:
> > I have recently upgraded my kernel to 2.6.30 and I've noticed that at
> > some point (randomly), udevd would start using a lot of CPU time. Killing
> > the process and starting it with --debug option prints these messages
> > when the problem starts:
> >
> > [22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
>
> Care to run:
> udevadm monitor
> and provide the output of the event sequence dump?

This is what I got when the problem finally came back (it happens randomly, this
time I triggered it by running glxgears, but it can happen for other reasons
and none of them triggers it automatically):

KERNEL[1246192153.094553] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.094734] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.095273] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.095498] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.095665] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.095808] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.096463] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1246192153.099394] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.103806] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.111549] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.120800] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.127997] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.135699] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.143440] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.151183] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.157421] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.163320] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.170793] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.178593] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)


So just an endless loop of the same message again and again.

Thanks,
Alberto.
>
> Kay

2009-06-28 12:50:18

by Kay Sievers

Subject: Re: Kernel 2.6.30 and udevd problem

On Sun, Jun 28, 2009 at 14:37, Alberto Gonzalez<[email protected]> wrote:

> This is what I got when the problem finally came back (it happens randomly, this
> time I triggered it by running glxgears, but it can happen for other reasons
> and none of them triggers it automatically):
>
> KERNEL[1246192153.094553] change   /devices/pci0000:00/0000:00:02.0/drm/card0
> (drm)

> UDEV  [1246192153.178593] change   /devices/pci0000:00/0000:00:02.0/drm/card0
> (drm)
>
> So just an endless loop of the same message again and again.

If you see this loop and kill the udev daemon, the UDEV events will
stop. But do the KERNEL events continue, or do all events stop? This
should tell us if some udev rules trigger something here, or if it is
a loop in the kernel.

Kay
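
The check described above (do KERNEL events keep arriving once udevd is gone?)
can also be done on a saved copy of the "udevadm monitor" output. This is just
a sketch under assumptions: it relies on the one-event-per-line format seen in
this thread, and the function names count_by_source and kernel_keeps_firing are
invented for the example.

```python
import re
from collections import Counter

# One event per line, as printed by "udevadm monitor" in this thread:
# KERNEL[1246192153.094553] change /devices/.../drm/card0 (drm)
EVENT = re.compile(r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)")


def count_by_source(monitor_text):
    """Count events per (source, action, devpath); source is KERNEL or UDEV."""
    counts = Counter()
    for line in monitor_text.splitlines():
        match = EVENT.match(line)
        if match:
            source, _timestamp, action, devpath = match.groups()
            counts[(source, action, devpath)] += 1
    return counts


def kernel_keeps_firing(text_captured_after_kill):
    """True if any KERNEL event appears in output captured after udevd was killed."""
    return any(source == "KERNEL"
               for (source, _action, _devpath) in count_by_source(text_captured_after_kill))


# Two lines copied from the monitor output reported earlier in the thread.
sample = """\
KERNEL[1246192153.094553] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.096463] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
"""

if __name__ == "__main__":
    for key, n in count_by_source(sample).items():
        print(n, *key)
```

If kernel_keeps_firing() is true for output captured while udevd is stopped,
udev rules cannot be the only thing feeding the loop.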

2009-06-28 14:22:20

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Sunday 28 June 2009 14:49:52 Kay Sievers wrote:
> On Sun, Jun 28, 2009 at 14:37, Alberto Gonzalez<[email protected]>
wrote:
> > This is what I got when the problem finally came back (it happens
> > randomly, this time I triggered it by running glxgears, but it can happen
> > for other reasons and none of them triggers it automatically):
> >
> > KERNEL[1246192153.094553] change
> > /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
> >
> > UDEV [1246192153.178593] change
> > /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
> >
> > So just an endless loop of the same message again and again.
>
> If you see this loop and kill the udev daemon, the UDEV events will
> stop. But do the KERNEL events continue, or do all events stop? This
> should tell us if some udev rules trigger something here, or if it is
> a loop in the kernel.

When I kill udevd, the KERNEL messages continue.

>
> Kay

2009-06-28 14:29:17

by Kay Sievers

Subject: Re: Kernel 2.6.30 and udevd problem

On Sun, Jun 28, 2009 at 16:22, Alberto Gonzalez<[email protected]> wrote:
> On Sunday 28 June 2009 14:49:52 Kay Sievers wrote:
>> On Sun, Jun 28, 2009 at 14:37, Alberto Gonzalez<[email protected]>
> wrote:
>> > This is what I got when the problem finally came back (it happens
>> > randomly, this time I triggered it by running glxgears, but it can happen
>> > for other reasons and none of them triggers it automatically):
>> >
>> > KERNEL[1246192153.094553] change
>> > /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
>> >
>> > UDEV  [1246192153.178593] change
>> > /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
>> >
>> > So just an endless loop of the same message again and again.
>>
>> If you see this loop and kill the udev daemon, the UDEV events will
>> stop. But do the KERNEL events continue, or do all events stop? This
>> should tell us if some udev rules trigger something here, or if it is
>> a loop in the kernel.
>
> When I kill udevd, the KERNEL messages continue.

If there isn't something else running which acts on uevents that
trigger drm events, which I wouldn't expect, it seems like a drm
kernel problem.

Kay

2009-06-30 03:40:32

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> If there isn't something else running which acts on uevents that
> trigger drm events, which I wouldn't expect, it seems like a drm
> kernel problem.

Ok, thanks for looking at it. I'll sum up the problem for DRM people:

The problem started after upgrading to 2.6.30. At some point, udevd starts to
use a lot of CPU time. It happens randomly, but it seems easier to trigger
when running something graphics intensive (glxgears, gtkperf, tuxracer..).

Killing udevd and restarting it with the --debug switch prints this when the
problem starts:

[22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
[22657] udev_device_new_from_syspath: device 0x2040320 has devpath '/devices/pci0000:00/0000:00:02.0/drm/card0'
[22657] udev_device_read_db: device 0x2040320 filled with db symlink data '/dev/dri/card0'
[22657] udev_rules_apply_to_event: LINK 'char/226:0' /lib/udev/rules.d/50-udev-default.rules:5
[22657] udev_rules_apply_to_event: NAME 'dri/card0' /lib/udev/rules.d/50-udev-default.rules:38
[22657] udev_rules_apply_to_event: GROUP 91 /lib/udev/rules.d/50-udev-default.rules:42
[22657] udev_device_new_from_syspath: device 0x2040790 has devpath '/devices/pci0000:00/0000:00:02.0'
[22657] udev_device_new_from_syspath: device 0x2040a80 has devpath '/devices/pci0000:00'
[22657] udev_rules_apply_to_event: RUN 'socket:@/org/freedesktop/hal/udev_event' /etc/udev/rules.d/90-hal.rules:2
[22657] udev_device_update_db: create db link (dri/card0 char/226:0)
[22657] udev_node_add: creating device node '/dev/dri/card0', devnum=226:0, mode=0660, uid=0, gid=91
[22657] udev_node_mknod: preserve file '/dev/dri/card0', because it has correct dev_t
[22657] update_link: '/dev/char/226:0' with target '/dev/dri/card0' has the highest priority 0, create it
[22657] node_symlink: preserve already existing symlink '/dev/char/226:0' to '../dri/card0'
[22657] udev_monitor_send_device: passed 230 bytes to monitor 0x2040320
[22657] udev_monitor_send_device: passed -1 bytes to monitor 0x2036150
[22657] event_fork: seq 168515 exit with 0
[22652] event_fork: seq 168515 forked, pid [22657], 'change' 'drm', 0 seconds old
[22652] udev_done: seq 168515 cleanup, pid [22657], status 0, 0 seconds old

It goes on in an infinite loop.

Then using "udevadm monitor" I also get a loop of these two messages:

KERNEL[1246192153.094553] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1246192153.096463] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)

Killing udevd stops the UDEV messages (and CPU usage goes down), but the
KERNEL messages continue.

My system is a Dell Studio desktop with Intel graphics (G45). I'm using a
standard distro kernel (Arch Linux). The problem occurs both with EXA and with
UXA+KMS, and on both 32-bit and 64-bit systems.

Thanks,
Alberto.

2009-06-30 03:46:46

by Dave Airlie

Subject: Re: Kernel 2.6.30 and udevd problem


> On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > If there isn't something else running which acts on uevents that
> > trigger drm events, which I wouldn't expect, it seems like a drm
> > kernel problem.
>
> Ok, thanks for looking at it. I'll sum up the problem for DRM people:
>
> The problem started after upgrading to 2.6.30. At some point, udevd starts to
> use a lot of CPU time. It happens randomly, but it seems easier to trigger
> when running something graphics intensive (glxgears, gtkperf, tuxracer..).
>
> Killing udevd and starting it with the --debug switch throws up this when the
> problem starts:

I've added jbarnes to the list.

Jesse, are we sending events yet? What for?

Dave.

>
> [22652] event_queue_insert: seq 168515 queued, 'change' 'drm'
> [22657] udev_device_new_from_syspath: device 0x2040320 has devpath
> '/devices/pci0000:00/0000:00:02.0/drm/card0'
> [22657] udev_device_read_db: device 0x2040320 filled with db symlink data
> '/dev/dri/card0'
> [22657] udev_rules_apply_to_event: LINK 'char/226:0' /lib/udev/rules.d/50-
> udev-default.rules:5
> [22657] udev_rules_apply_to_event: NAME 'dri/card0' /lib/udev/rules.d/50-udev-
> default.rules:38
> [22657] udev_rules_apply_to_event: GROUP 91 /lib/udev/rules.d/50-udev-
> default.rules:42
> [22657] udev_device_new_from_syspath: device 0x2040790 has devpath
> '/devices/pci0000:00/0000:00:02.0'
> [22657] udev_device_new_from_syspath: device 0x2040a80 has devpath
> '/devices/pci0000:00'
> [22657] udev_rules_apply_to_event: RUN
> 'socket:@/org/freedesktop/hal/udev_event' /etc/udev/rules.d/90-hal.rules:2
> [22657] udev_device_update_db: create db link (dri/card0 char/226:0)
> [22657] udev_node_add: creating device node '/dev/dri/card0', devnum=226:0,
> mode=0660, uid=0, gid=91
> [22657] udev_node_mknod: preserve file '/dev/dri/card0', because it has correct
> dev_t
> [22657] update_link: '/dev/char/226:0' with target '/dev/dri/card0' has the
> highest priority 0, create it
> [22657] node_symlink: preserve already existing symlink '/dev/char/226:0' to
> '../dri/card0'
> [22657] udev_monitor_send_device: passed 230 bytes to monitor 0x2040320
> [22657] udev_monitor_send_device: passed -1 bytes to monitor 0x2036150
> [22657] event_fork: seq 168515 exit with 0
> [22652] event_fork: seq 168515 forked, pid [22657], 'change' 'drm', 0 seconds
> old
> [22652] udev_done: seq 168515 cleanup, pid [22657], status 0, 0 seconds old
>
> It goes on in an infinite loop.
>
> Then using "udevadm monitor" I also get a loop of these two messages:
>
> KERNEL[1246192153.094553] change /devices/pci0000:00/0000:00:02.0/drm/card0
> (drm)
> UDEV [1246192153.096463] change /devices/pci0000:00/0000:00:02.0/drm/card0
> (drm)
>
> Killing udevd stops the UDEV messages (and CPU usage goes down), but the
> KERNEL messages continue.
>
> My system is a Dell Studio desktop with Intel graphics (G45). I'm using a
> standard distro kernel (Arch Linux). The problem occurs both with EXA and with
> UXA+KMS, and in both 32 bit and 64 bit systems.
>
> Thanks,
> Alberto.
>
>

2009-06-30 16:08:45

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
Dave Airlie <[email protected]> wrote:

>
> > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > If there isn't something else running which acts on uevents that
> > > trigger drm events, which I wouldn't expect, it seems like a drm
> > > kernel problem.
> >
> > Ok, thanks for looking at it. I'll sum up the problem for DRM
> > people:
> >
> > The problem started after upgrading to 2.6.30. At some point, udevd
> > starts to use a lot of CPU time. It happens randomly, but it seems
> > easier to trigger when running something graphics intensive
> > (glxgears, gtkperf, tuxracer..).
> >
> > Killing udevd and starting it with the --debug switch throws up
> > this when the problem starts:
>
> I've added jbarnes to the list,
>
> Jesse, are we sending events yet? What for?

Right now we just send uevents at hotplug time, so maybe one of our
hotplug interrupt bits is getting stuck, resulting in a continuous
stream of events as we generate other interrupts (which would happen
when running 3D apps for example).

There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c under
the if (I915_HAS_HOTPLUG(dev)) { check; if you change it to DRM_ERROR,
we can see which bit is getting stuck.

--
Jesse Barnes, Intel Open Source Technology Center

2009-07-01 07:09:19

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Tuesday 30 June 2009 18:08:35 Jesse Barnes wrote:
> On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
>
> Dave Airlie <[email protected]> wrote:
> > > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > > If there isn't something else running which acts on uevents that
> > > > trigger drm events, which I wouldn't expect, it seems like a drm
> > > > kernel problem.
> > >
> > > Ok, thanks for looking at it. I'll sum up the problem for DRM
> > > people:
> > >
> > > The problem started after upgrading to 2.6.30. At some point, udevd
> > > starts to use a lot of CPU time. It happens randomly, but it seems
> > > easier to trigger when running something graphics intensive
> > > (glxgears, gtkperf, tuxracer..).
> > >
> > > Killing udevd and starting it with the --debug switch throws up
> > > this when the problem starts:
> >
> > I've added jbarnes to the list,
> >
> > > Jesse, are we sending events yet? What for?
>
> Right now we just send uevents at hotplug time, so maybe one of our
> hotplug interrupt bits is getting stuck, resulting in a continuous
> stream of events as we generate other interrupts (which would happen
> when running 3D apps for example).
>
> There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c under
> the if (I915_HAS_HOTPLUG(dev)) { check, if you make it into DRM_ERROR
> we can see which one is getting stuck.

I am afraid I'll need a bit more guidance here. I guess this means patching
the kernel. Would it be possible to get a test patch against 2.6.30? And then
after patching and compiling, how should I debug it?

Thanks,
Alberto.

2009-07-01 17:22:29

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Wed, 1 Jul 2009 09:09:03 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Tuesday 30 June 2009 18:08:35 Jesse Barnes wrote:
> > On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
> >
> > Dave Airlie <[email protected]> wrote:
> > > > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > > > If there isn't something else running which acts on uevents
> > > > > that trigger drm events, which I wouldn't expect, it seems
> > > > > like a drm kernel problem.
> > > >
> > > > Ok, thanks for looking at it. I'll sum up the problem for DRM
> > > > people:
> > > >
> > > > The problem started after upgrading to 2.6.30. At some point,
> > > > udevd starts to use a lot of CPU time. It happens randomly, but
> > > > it seems easier to trigger when running something graphics
> > > > intensive (glxgears, gtkperf, tuxracer..).
> > > >
> > > > Killing udevd and starting it with the --debug switch throws up
> > > > this when the problem starts:
> > >
> > > I've added jbarnes to the list,
> > >
> > > > Jesse, are we sending events yet? What for?
> >
> > Right now we just send uevents at hotplug time, so maybe one of our
> > hotplug interrupt bits is getting stuck, resulting in a continuous
> > stream of events as we generate other interrupts (which would happen
> > when running 3D apps for example).
> >
> > There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c
> > under the if (I915_HAS_HOTPLUG(dev)) { check, if you make it into
> > DRM_ERROR we can see which one is getting stuck.
>
> I am afraid I'll need a bit more guidance here. I guess this means
> patching the kernel. Would it be possible to get a test patch against
> 2.6.30? And then after patching and compiling, how should I debug it?

Here's a patch against git; I think it should apply to 2.6.30 as well.

I'll just need your dmesg output from when the problem is occurring (if
I'm right this patch should flood your logs).

--
Jesse Barnes, Intel Open Source Technology Center


Attachments:
(No filename) (1.92 kB)
i915-hotplug-error.patch (596.00 B)

2009-07-02 06:19:17

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Wednesday 01 July 2009 19:22:14 Jesse Barnes wrote:
> On Wed, 1 Jul 2009 09:09:03 +0200
>
> Alberto Gonzalez <[email protected]> wrote:
> > On Tuesday 30 June 2009 18:08:35 Jesse Barnes wrote:
> > > On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
> > >
> > > Dave Airlie <[email protected]> wrote:
> > > > > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > > > > If there isn't something else running which acts on uevents
> > > > > > that trigger drm events, which I wouldn't expect, it seems
> > > > > > like a drm kernel problem.
> > > > >
> > > > > Ok, thanks for looking at it. I'll sum up the problem for DRM
> > > > > people:
> > > > >
> > > > > The problem started after upgrading to 2.6.30. At some point,
> > > > > udevd starts to use a lot of CPU time. It happens randomly, but
> > > > > it seems easier to trigger when running something graphics
> > > > > intensive (glxgears, gtkperf, tuxracer..).
> > > > >
> > > > > Killing udevd and starting it with the --debug switch throws up
> > > > > this when the problem starts:
> > > >
> > > > I've added jbarnes to the list,
> > > >
> > > > > Jesse, are we sending events yet? What for?
> > >
> > > Right now we just send uevents at hotplug time, so maybe one of our
> > > hotplug interrupt bits is getting stuck, resulting in a continuous
> > > stream of events as we generate other interrupts (which would happen
> > > when running 3D apps for example).
> > >
> > > There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c
> > > under the if (I915_HAS_HOTPLUG(dev)) { check, if you make it into
> > > DRM_ERROR we can see which one is getting stuck.
> >
> > I am afraid I'll need a bit more guidance here. I guess this means
> > patching the kernel. Would it be possible to get a test patch against
> > 2.6.30? And then after patching and compiling, how should I debug it?
>
> Here's a patch against git, it should apply to 2.6.30 though I think.
>
> I'll just need your dmesg output from when the problem is occurring (if
> I'm right this patch should flood your logs).

Thanks, the patch applied to 2.6.30 and when the problem starts I get these two
lines repeated all the time in dmesg:

[drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x10200300
[drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x18200300

Is this enough or should I provide something else?

Alberto.
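
To see which bit differs between the two reported status values, the two
numbers can be XORed. This is only an arithmetic sketch: the two constants are
the stat values from the dmesg lines above, the helper name set_bits is made
up, and no attempt is made here to map bit positions to register field names.

```python
def set_bits(value):
    """Return the positions (0 = least significant) of the bits set in a 32-bit value."""
    return [bit for bit in range(32) if value & (1 << bit)]


# The two hotplug status values reported in dmesg.
STAT_A = 0x10200300
STAT_B = 0x18200300

common = STAT_A & STAT_B      # bits set in both reports
toggling = STAT_A ^ STAT_B    # bits that differ between the two reports

if __name__ == "__main__":
    print("common   :", hex(common), set_bits(common))
    print("toggling :", hex(toggling), set_bits(toggling))
```

The only difference between the two values is bit 27 (0x08000000); bits 8, 9,
21 and 28 are set in both. What those positions mean has to be looked up in the
driver's register definitions.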

2009-07-02 16:35:57

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Thu, 2 Jul 2009 08:18:58 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Wednesday 01 July 2009 19:22:14 Jesse Barnes wrote:
> > On Wed, 1 Jul 2009 09:09:03 +0200
> >
> > Alberto Gonzalez <[email protected]> wrote:
> > > On Tuesday 30 June 2009 18:08:35 Jesse Barnes wrote:
> > > > On Tue, 30 Jun 2009 04:46:38 +0100 (IST)
> > > >
> > > > Dave Airlie <[email protected]> wrote:
> > > > > > On Sunday 28 June 2009 16:28:44 Kay Sievers wrote:
> > > > > > > If there isn't something else running which acts on
> > > > > > > uevents that trigger drm events, which I wouldn't expect,
> > > > > > > it seems like a drm kernel problem.
> > > > > >
> > > > > > Ok, thanks for looking at it. I'll sum up the problem for
> > > > > > DRM people:
> > > > > >
> > > > > > The problem started after upgrading to 2.6.30. At some
> > > > > > point, udevd starts to use a lot of CPU time. It happens
> > > > > > randomly, but it seems easier to trigger when running
> > > > > > something graphics intensive (glxgears, gtkperf,
> > > > > > tuxracer..).
> > > > > >
> > > > > > Killing udevd and starting it with the --debug switch
> > > > > > throws up this when the problem starts:
> > > > >
> > > > > I've added jbarnes to the list,
> > > > >
> > > > > > Jesse, are we sending events yet? What for?
> > > >
> > > > Right now we just send uevents at hotplug time, so maybe one of
> > > > our hotplug interrupt bits is getting stuck, resulting in a
> > > > continuous stream of events as we generate other interrupts
> > > > (which would happen when running 3D apps for example).
> > > >
> > > > There's a DRM_DEBUG statement in drivers/gpu/drm/i915/i915_irq.c
> > > > under the if (I915_HAS_HOTPLUG(dev)) { check, if you make it
> > > > into DRM_ERROR we can see which one is getting stuck.
> > >
> > > I am afraid I'll need a bit more guidance here. I guess this means
> > > patching the kernel. Would it be possible to get a test patch
> > > against 2.6.30? And then after patching and compiling, how should
> > > I debug it?
> >
> > Here's a patch against git, it should apply to 2.6.30 though I
> > think.
> >
> > I'll just need your dmesg output from when the problem is occurring
> > (if I'm right this patch should flood your logs).
>
> Thanks, the patch applied to 2.6.30 and when the problem starts I get
> these two lines repeated all the time in dmesg:
>
> [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x10200300
> [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x18200300
>
> Is this enough or should I provide something else?

Interesting. That hotplug status shouldn't end up generating any
uevents. Can you try this patch to see what's going on?

--
Jesse Barnes, Intel Open Source Technology Center


Attachments:
(No filename) (2.70 kB)
drm-uevent-debug.patch (1.25 kB)

2009-07-02 19:37:39

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Thursday 02 July 2009 18:35:47 Jesse Barnes wrote:
> On Thu, 2 Jul 2009 08:18:58 +0200
>
> Alberto Gonzalez <[email protected]> wrote:
> > Thanks, the patch applied to 2.6.30 and when the problem starts I get
> > these two lines repeated all the time in dmesg:
> >
> > [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x10200300
> > [drm:i915_driver_irq_handler] *ERROR* hotplug event received, stat 0x18200300
> >
> > Is this enough or should I provide something else?
>
> Interesting. That hotplug status shouldn't end up generating any
> uevents. Can you try this patch to see what's going on?

This is what I get with the patch applied (repeated again and again):

------------[ cut here ]------------
WARNING: at drivers/gpu/drm/drm_sysfs.c:452 drm_sysfs_hotplug_event+0x3e/0x90
[drm]()
Hardware name: Studio 540
hotplug uevent
Modules linked in: ipv6 usbhid hid ext4 jbd2 crc16 usb_storage snd_seq_dummy
snd_hda_codec_intelhdmi snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_hda_codec_realtek snd_pcm_oss snd_mixer_oss psmouse ohci1394 snd_hda_intel
ieee1394 i2c_i801 uhci_hcd serio_raw sg dcdbas iTCO_wdt iTCO_vendor_support
snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc r8169
mii ehci_hcd usbcore evdev thermal fan button battery ac cpufreq_ondemand
acpi_cpufreq freq_table processor rtc_cmos rtc_core rtc_lib ext3 jbd mbcache
sd_mod sr_mod cdrom pata_acpi ata_generic ata_piix libata scsi_mod i915
i2c_algo_bit video output drm i2c_core intel_agp agpgart
Pid: 9, comm: events/0 Tainted: G W 2.6.30-ARCH #1
Call Trace:
[<c013b17a>] ? warn_slowpath_common+0x7a/0xc0
[<f82be0fe>] ? drm_sysfs_hotplug_event+0x3e/0x90 [drm]
[<f83bb410>] ? i915_hotplug_work_func+0x0/0x30 [i915]
[<c013b237>] ? warn_slowpath_fmt+0x37/0x60
[<f82be0fe>] ? drm_sysfs_hotplug_event+0x3e/0x90 [drm]
[<c014f61f>] ? worker_thread+0x11f/0x280
[<c0154900>] ? autoremove_wake_function+0x0/0x60
[<c014f500>] ? worker_thread+0x0/0x280
[<c0154452>] ? kthread+0x52/0x90
[<c0154400>] ? kthread+0x0/0x90
[<c01047c7>] ? kernel_thread_helper+0x7/0x10
---[ end trace a39351f6aee6be08 ]---

Thanks,
Alberto.

2009-07-02 22:32:50

by Michal Soltys

Subject: Re: Kernel 2.6.30 and udevd problem

Regarding Arch's default distro kernel - it's a bit patched. Most of the
patch file is addition of aufs, but there're other diffs included as
well, some of them modifying i915_reg.h, intel_display.c and intel_tv.c

ftp://ftp.archlinux.org/other/kernel26/patch-2.6.30-5-ARCH.bz2

Not sure how relevant that is, but fyi.

2009-07-04 22:15:59

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Friday 03 July 2009 00:00:49 Michal Soltys wrote:
> Regarding Arch's default distro kernel - it's a bit patched. Most of the
> patch file is addition of aufs, but there're other diffs included as
> well, some of them modifying i915_reg.h, intel_display.c and intel_tv.c
>
> ftp://ftp.archlinux.org/other/kernel26/patch-2.6.30-5-ARCH.bz2
>
> Not sure how relevant that is, but fyi.

Yes, thanks for pointing that out.

I now tried a vanilla 2.6.30.1 kernel with a custom config (so I don't have to
wait 45 minutes for it to compile) and I still see the problem. This is what I
get in dmesg:

[ 83.506496] ------------[ cut here ]------------
[ 83.506500] WARNING: at drivers/gpu/drm/drm_sysfs.c:452
drm_sysfs_hotplug_event+0x2b/0x63()
[ 83.506502] Hardware name: Studio 540
[ 83.506503] hotplug uevent
[ 83.506504] Modules linked in:
[ 83.506506] Pid: 7, comm: events/0 Tainted: G W 2.6.30.1 #1
[ 83.506507] Call Trace:
[ 83.506511] [<c1025df9>] warn_slowpath_common+0x65/0x7c
[ 83.506514] [<c11e9f9e>] ? drm_sysfs_hotplug_event+0x2b/0x63
[ 83.506517] [<c1025e44>] warn_slowpath_fmt+0x24/0x27
[ 83.506520] [<c11e9f9e>] drm_sysfs_hotplug_event+0x2b/0x63
[ 83.506523] [<c11f33de>] i915_hotplug_work_func+0xe/0x10
[ 83.506525] [<c1032f28>] worker_thread+0x131/0x1ab
[ 83.506528] [<c11f33d0>] ? i915_hotplug_work_func+0x0/0x10
[ 83.506531] [<c1035f61>] ? autoremove_wake_function+0x0/0x2f
[ 83.506534] [<c1032df7>] ? worker_thread+0x0/0x1ab
[ 83.506536] [<c1035c84>] kthread+0x46/0x6a
[ 83.506538] [<c1035c3e>] ? kthread+0x0/0x6a
[ 83.506541] [<c10033cf>] kernel_thread_helper+0x7/0x10
[ 83.506542] ---[ end trace 08f91010f92f7c3c ]---

Thanks,
Alberto.

2009-07-20 18:10:19

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Sun, 5 Jul 2009 00:10:43 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Friday 03 July 2009 00:00:49 Michal Soltys wrote:
> > Regarding Arch's default distro kernel - it's a bit patched. Most
> > of the patch file is addition of aufs, but there're other diffs
> > included as well, some of them modifying i915_reg.h,
> > intel_display.c and intel_tv.c
> >
> > ftp://ftp.archlinux.org/other/kernel26/patch-2.6.30-5-ARCH.bz2
> >
> > Not sure how relevant that is, but fyi.
>
> Yes, thanks for pointing that out.
>
> I now tried a vanilla 2.6.30.1 kernel with a custom config (so I
> don't have to wait 45 minutes for it to compile) and I still see the
> problem. This is what I get in dmesg:
>
> [ 83.506496] ------------[ cut here ]------------
> [ 83.506500] WARNING: at drivers/gpu/drm/drm_sysfs.c:452
> drm_sysfs_hotplug_event+0x2b/0x63()
> [ 83.506502] Hardware name: Studio 540
> [ 83.506503] hotplug uevent
> [ 83.506504] Modules linked in:
> [ 83.506506] Pid: 7, comm: events/0 Tainted: G W 2.6.30.1 #1
> [ 83.506507] Call Trace:
> [ 83.506511] [<c1025df9>] warn_slowpath_common+0x65/0x7c
> [ 83.506514] [<c11e9f9e>] ? drm_sysfs_hotplug_event+0x2b/0x63
> [ 83.506517] [<c1025e44>] warn_slowpath_fmt+0x24/0x27
> [ 83.506520] [<c11e9f9e>] drm_sysfs_hotplug_event+0x2b/0x63
> [ 83.506523] [<c11f33de>] i915_hotplug_work_func+0xe/0x10
> [ 83.506525] [<c1032f28>] worker_thread+0x131/0x1ab
> [ 83.506528] [<c11f33d0>] ? i915_hotplug_work_func+0x0/0x10
> [ 83.506531] [<c1035f61>] ? autoremove_wake_function+0x0/0x2f
> [ 83.506534] [<c1032df7>] ? worker_thread+0x0/0x1ab
> [ 83.506536] [<c1035c84>] kthread+0x46/0x6a
> [ 83.506538] [<c1035c3e>] ? kthread+0x0/0x6a
> [ 83.506541] [<c10033cf>] kernel_thread_helper+0x7/0x10
> [ 83.506542] ---[ end trace 08f91010f92f7c3c ]---

Sorry I missed this update; what about the register dump part of the
patch? Presumably you get a ton of these in your log, but with some
IIR register info beforehand?

--
Jesse Barnes, Intel Open Source Technology Center

2009-07-20 21:12:08

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Monday 20 July 2009 20:10:10 Jesse Barnes wrote:
> Alberto Gonzalez <[email protected]> wrote:
> > I now tried a vanilla 2.6.30.1 kernel with a custom config (so I
> > don't have to wait 45 minutes for it to compile) and I still see the
> > problem. This is what I get in dmesg:
> >
> > [ 83.506496] ------------[ cut here ]------------
> > [ 83.506500] WARNING: at drivers/gpu/drm/drm_sysfs.c:452
> > drm_sysfs_hotplug_event+0x2b/0x63()
> > [ 83.506502] Hardware name: Studio 540
> > [ 83.506503] hotplug uevent
> > [ 83.506504] Modules linked in:
> > [ 83.506506] Pid: 7, comm: events/0 Tainted: G W 2.6.30.1 #1
> > [ 83.506507] Call Trace:
> > [ 83.506511] [<c1025df9>] warn_slowpath_common+0x65/0x7c
> > [ 83.506514] [<c11e9f9e>] ? drm_sysfs_hotplug_event+0x2b/0x63
> > [ 83.506517] [<c1025e44>] warn_slowpath_fmt+0x24/0x27
> > [ 83.506520] [<c11e9f9e>] drm_sysfs_hotplug_event+0x2b/0x63
> > [ 83.506523] [<c11f33de>] i915_hotplug_work_func+0xe/0x10
> > [ 83.506525] [<c1032f28>] worker_thread+0x131/0x1ab
> > [ 83.506528] [<c11f33d0>] ? i915_hotplug_work_func+0x0/0x10
> > [ 83.506531] [<c1035f61>] ? autoremove_wake_function+0x0/0x2f
> > [ 83.506534] [<c1032df7>] ? worker_thread+0x0/0x1ab
> > [ 83.506536] [<c1035c84>] kthread+0x46/0x6a
> > [ 83.506538] [<c1035c3e>] ? kthread+0x0/0x6a
> > [ 83.506541] [<c10033cf>] kernel_thread_helper+0x7/0x10
> > [ 83.506542] ---[ end trace 08f91010f92f7c3c ]---
>
> Sorry I missed this update; what about the register dump part of the
> patch? Presumably you get a ton of these in your log, but with some
> IIR register info beforehand?

The problem is that when this happens I get these messages at a very high
rate, so they flood my dmesg in a few seconds. Is there a way I could access
that register dump that should precede them?

Thanks,
Alberto.
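
One possible way to keep the lines that precede the first warning, even when the ring buffer is flooded afterwards, is to stream the kernel log through a small filter that remembers only the most recent lines. The sketch below is an illustration, not something posted in the thread: the function name, the default of 50 lines, and the choice of "hotplug uevent" as the marker (taken from the trace quoted above) are all assumptions.

```python
from collections import deque
from typing import Iterable, List


def lines_before_first_match(lines: Iterable[str], marker: str,
                             keep: int = 50) -> List[str]:
    """Return up to `keep` lines seen just before the first line containing `marker`.

    `marker` and `keep` are illustrative parameters, not values from the thread.
    """
    recent = deque(maxlen=keep)  # bounded buffer: old lines fall off the front
    for line in lines:
        if marker in line:
            # Stop at the first hit so the flood that follows is never read.
            return list(recent)
        recent.append(line)
    return []  # the marker never appeared
```

Fed with a stream of log lines (for example a followed syslog file, if the system writes kernel messages to one), it returns as soon as the marker shows up, so the context is saved before later messages push it out.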

2009-07-20 22:49:10

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Monday 20 July 2009 23:06:51 Alberto Gonzalez wrote:
> On Monday 20 July 2009 20:10:10 Jesse Barnes wrote:
> > Alberto Gonzalez <[email protected]> wrote:
> > > I now tried a vanilla 2.6.30.1 kernel with a custom config (so I
> > > don't have to wait 45 minutes for it to compile) and I still see the
> > > problem. This is what I get in dmesg:
> > >
> > > [ 83.506496] ------------[ cut here ]------------
> > > [ 83.506500] WARNING: at drivers/gpu/drm/drm_sysfs.c:452
> > > drm_sysfs_hotplug_event+0x2b/0x63()
> > > [ 83.506502] Hardware name: Studio 540
> > > [ 83.506503] hotplug uevent
> > > [ 83.506504] Modules linked in:
> > > [ 83.506506] Pid: 7, comm: events/0 Tainted: G W 2.6.30.1 #1
> > > [ 83.506507] Call Trace:
> > > [ 83.506511] [<c1025df9>] warn_slowpath_common+0x65/0x7c
> > > [ 83.506514] [<c11e9f9e>] ? drm_sysfs_hotplug_event+0x2b/0x63
> > > [ 83.506517] [<c1025e44>] warn_slowpath_fmt+0x24/0x27
> > > [ 83.506520] [<c11e9f9e>] drm_sysfs_hotplug_event+0x2b/0x63
> > > [ 83.506523] [<c11f33de>] i915_hotplug_work_func+0xe/0x10
> > > [ 83.506525] [<c1032f28>] worker_thread+0x131/0x1ab
> > > [ 83.506528] [<c11f33d0>] ? i915_hotplug_work_func+0x0/0x10
> > > [ 83.506531] [<c1035f61>] ? autoremove_wake_function+0x0/0x2f
> > > [ 83.506534] [<c1032df7>] ? worker_thread+0x0/0x1ab
> > > [ 83.506536] [<c1035c84>] kthread+0x46/0x6a
> > > [ 83.506538] [<c1035c3e>] ? kthread+0x0/0x6a
> > > [ 83.506541] [<c10033cf>] kernel_thread_helper+0x7/0x10
> > > [ 83.506542] ---[ end trace 08f91010f92f7c3c ]---
> >
> > Sorry I missed this update; what about the register dump part of the
> > patch? Presumably you get a ton of these in your log, but with some
> > IIR register info beforehand?
>
> The problem is that when this happens I get these messages at a very high
> rate, so they flood my dmesg in a few seconds. Is there a way I could
> access that register dump that should precede them?

I've managed to get something hopefully more meaningful. First is the normal
boot log with just a few of these messages at the end (this is before the
"storm" starts). Then the dmesg once the problem triggers.

If what you need is the exact lines just before the "storm" starts let me know
and I'll keep trying, but it's not easy because I'll have to be very fast in
detecting the problem has started and logging the output of dmesg.

>
> Thanks,
> Alberto.
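
Since a few of these warnings already appear during a normal boot, spotting the start of the "storm" by hand is hard. A rough sketch of doing it after the fact from a saved log follows: it reads the bracketed dmesg timestamps and reports the first time at which several marker lines fall within a short window. The function name, the thresholds (10 lines within 1 second) and the marker string are assumptions chosen for illustration.

```python
import re
from typing import Iterable, Optional

# Matches the bracketed timestamp dmesg prints, e.g. "[ 83.506496]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def storm_onset(lines: Iterable[str], marker: str,
                count: int = 10, window: float = 1.0) -> Optional[float]:
    """Return the timestamp that starts the first run of `count` marker
    lines packed into `window` seconds, or None if there is no such run.

    `count` and `window` are illustrative thresholds, not measured values.
    """
    times = []
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            times.append(float(match.group(1)))
    # Slide over the marker timestamps looking for a dense run.
    for i in range(len(times) - count + 1):
        if times[i + count - 1] - times[i] <= window:
            return times[i]
    return None
```

With the returned timestamp, the lines logged just before it can be pulled out of the same file for a closer look.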


Attachments:
(No filename) (2.38 kB)
dmesg1.log (66.30 kB)
dmesg2.log (242.83 kB)

2009-07-22 13:25:55

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Tuesday 21 July 2009 00:48:36 Alberto Gonzalez wrote:
> If what you need is the exact lines just before the "storm" starts let me
> know and I'll keep trying, but it's not easy because I'll have to be very
> fast in detecting the problem has started and logging the output of dmesg.

Ok, I caught a complete log from boot to when the problem starts. There
doesn't seem to be anything significantly different, but just in case.



Attachments:
(No filename) (433.00 B)
dmesg-complete.log (199.37 kB)

2009-07-22 16:08:39

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Wed, 22 Jul 2009 15:25:29 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Tuesday 21 July 2009 00:48:36 Alberto Gonzalez wrote:
> > If what you need is the exact lines just before the "storm" starts
> > let me know and I'll keep trying, but it's not easy because I'll
> > have to be very fast in detecting the problem has started and
> > logging the output of dmesg.
>
> Ok, I caught a complete log from boot to when the problem starts.
> There doesn't seem to be anything significantly different, but just
> in case.

Oh I guess this is without the debug patch I attached... this one
should be a bit less noisy but tell me which bit is stuck.

--
Jesse Barnes, Intel Open Source Technology Center


Attachments:
(No filename) (716.00 B)
i915-hotplug-error.patch (596.00 B)

2009-07-22 16:51:39

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Wednesday 22 July 2009 18:08:34 Jesse Barnes wrote:
> On Wed, 22 Jul 2009 15:25:29 +0200
> Alberto Gonzalez <[email protected]> wrote:
> > Ok, I caught a complete log from boot to when the problem starts.
> > There doesn't seem to be anything significantly different, but just
> > in case.
>
> Oh I guess this is without the debug patch I attached... this one
> should be a bit less noisy but tell me which bit is stuck.

Actually it was with the debug patch found here applied:

http://lkml.org/lkml/2009/7/2/290

Now I captured another one with the latest patch. This time the problem
started after opening Amarok.



Attachments:
(No filename) (628.00 B)
dmesg-new.log (104.23 kB)

2009-07-22 17:12:57

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Wed, 22 Jul 2009 18:51:20 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Wednesday 22 July 2009 18:08:34 Jesse Barnes wrote:
> > On Wed, 22 Jul 2009 15:25:29 +0200
> > Alberto Gonzalez <[email protected]> wrote:
> > > Ok, I caught a complete log from boot to when the problem starts.
> > > There doesn't seem to be anything significantly different, but
> > > just in case.
> >
> > Oh I guess this is without the debug patch I attached... this one
> > should be a bit less noisy but tell me which bit is stuck.
>
> Actually it was with the debug patch found here applied:
>
> http://lkml.org/lkml/2009/7/2/290
>
> Now I captured another one with the latest patch. This time the
> problem started after opening Amarok.
>

Ah I must have been looking at the wrong register. This one makes it
look like one of your HDMI hotplug bits is getting stuck (HDMIC in
particular). This might not even be wired up on your platform...

This test hack should prevent us from responding to those interrupts...
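
The attached patch is not reproduced in this archive. As a rough illustration of the idea only, the sketch below decodes a hotplug status word into port names and clears the bits that should be ignored before deciding whether to react. The bit positions, the table and the function names are placeholders assumed for the example; the real values belong to the driver's register header and should be checked there.

```python
# Placeholder bit positions, assumed for illustration only. They are not
# quoted from this thread and must be checked against the driver headers.
HOTPLUG_BITS = {
    "HDMIB": 1 << 29,
    "HDMIC": 1 << 28,
    "HDMID": 1 << 27,
    "CRT": 1 << 11,
}


def decode_hotplug(status, bits=HOTPLUG_BITS):
    """List the names of the ports whose status bit is set."""
    return [name for name, mask in bits.items() if status & mask]


def filter_hotplug(status, ignored, bits=HOTPLUG_BITS):
    """Clear the bits of the ports in `ignored`, leaving the rest intact."""
    for name in ignored:
        status &= ~bits[name]
    return status


def should_handle(status, ignored, bits=HOTPLUG_BITS):
    """React only if some non-ignored port still reports an event."""
    return filter_hotplug(status, ignored, bits) != 0
```

With "HDMIC" in the ignored list, a status word that has only that bit stuck no longer triggers any handling, while an event on another port still does.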

--
Jesse Barnes, Intel Open Source Technology Center


Attachments:
(No filename) (1.05 kB)
i915-g4x-hotplug-hack.patch (562.00 B)

2009-07-22 18:44:07

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Wednesday 22 July 2009 19:12:51 Jesse Barnes wrote:
> Ah I must have been looking at the wrong register. This one makes it
> look like one of your HDMI hotplug bits is getting stuck (HDMIC in
> particular). This might not even be wired up on your platform...

Yes, I don't use HDMI; my screen is attached via VGA.
>
> This test hack should prevent us from responding to those interrupts...

I've been hitting it hard for over an hour with all the usual tricks to
trigger it (and a few reboots) and I've been unable to reproduce the problem
with this patch. Before I could trigger it reliably in a few minutes, so I'm
pretty sure this patch fixes it. If something new comes up, I'll let you know
anyway.

Thank you very much!

Regards,
Alberto.

2009-07-22 19:17:14

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Wed, 22 Jul 2009 20:43:56 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Wednesday 22 July 2009 19:12:51 Jesse Barnes wrote:
> > Ah I must have been looking at the wrong register. This one makes
> > it look like one of your HDMI hotplug bits is getting stuck (HDMIC
> > in particular). This might not even be wired up on your platform...
>
> Yes, I don't use HDMI, my screen is attached via VGA
> >
> > This test hack should prevent us from responding to those
> > interrupts...
>
> I've been hitting it hard for over an hour with all the usual tricks
> to trigger it (and a few reboots) and I've been unable to reproduce
> the problem with this patch. Before I could trigger it reliably in a
> few minutes, so I'm pretty sure this patch fixes it. If something new
> comes up, I'll let you know anyway.
>
> Thank you very much!

Great, thanks for testing... I'll try to come up with a better patch
and send it over to Eric.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2009-07-22 19:44:40

by Jesse Barnes

Subject: Re: Kernel 2.6.30 and udevd problem

On Wed, 22 Jul 2009 20:43:56 +0200
Alberto Gonzalez <[email protected]> wrote:

> On Wednesday 22 July 2009 19:12:51 Jesse Barnes wrote:
> > Ah I must have been looking at the wrong register. This one makes
> > it look like one of your HDMI hotplug bits is getting stuck (HDMIC
> > in particular). This might not even be wired up on your platform...
>
> Yes, I don't use HDMI, my screen is attached via VGA
> >
> > This test hack should prevent us from responding to those
> > interrupts...
>
> I've been hitting it hard for over an hour with all the usual tricks
> to trigger it (and a few reboots) and I've been unable to reproduce
> the problem with this patch. Before I could trigger it reliably in a
> few minutes, so I'm pretty sure this patch fixes it. If something new
> comes up, I'll let you know anyway.

Hm, so this type of interrupt problem is *supposed* to be handled by
setting the low bits of the PEG_BAND_GAP_DATA reg to 0xd. Do you
have that in your tree? Does git master have this problem?

Maybe we need to add a patch to 2.6.30.x to set PEG_BAND_GAP_DATA as a
workaround (git master already has code to do this properly for HDMI
and DP outputs afaict).
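
For readers unfamiliar with this kind of workaround, it amounts to a read-modify-write of the register. The sketch below shows only the bit arithmetic, assuming "low bits" means the low four bits and a 32-bit register; both are assumptions, and the function names are made up for the example. In the driver the value would be read from and written back to the hardware.

```python
LOW_BITS_MASK = 0xF      # assumption: "low bits" means bits 3:0
WORKAROUND_VALUE = 0xD   # value named in the message above
REG_WIDTH_MASK = 0xFFFFFFFF  # assumption: 32-bit register


def needs_band_gap_workaround(value: int) -> bool:
    """True if the low bits do not already hold the workaround value."""
    return (value & LOW_BITS_MASK) != WORKAROUND_VALUE


def apply_band_gap_workaround(value: int) -> int:
    """Return `value` with its low bits replaced by 0xd, upper bits untouched."""
    return (value & ~LOW_BITS_MASK & REG_WIDTH_MASK) | WORKAROUND_VALUE
```

Clearing the field before OR-ing in the new value matters: OR alone would leave any stale bits in place.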

--
Jesse Barnes, Intel Open Source Technology Center

2009-07-22 20:00:43

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Wednesday 22 July 2009 21:44:36 Jesse Barnes wrote:
> On Wed, 22 Jul 2009 20:43:56 +0200
>
> Alberto Gonzalez <[email protected]> wrote:
> > On Wednesday 22 July 2009 19:12:51 Jesse Barnes wrote:
> > > Ah I must have been looking at the wrong register. This one makes
> > > it look like one of your HDMI hotplug bits is getting stuck (HDMIC
> > > in particular). This might not even be wired up on your platform...
> >
> > Yes, I don't use HDMI, my screen is attached via VGA
> >
> > > This test hack should prevent us from responding to those
> > > interrupts...
> >
> > I've been hitting it hard for over an hour with all the usual tricks
> > to trigger it (and a few reboots) and I've been unable to reproduce
> > the problem with this patch. Before I could trigger it reliably in a
> > few minutes, so I'm pretty sure this patch fixes it. If something new
> > comes up, I'll let you know anyway.
>
> Hm, so this type of interrupt problem is *supposed* to be handled by
> setting of the PEG_BAND_GAP_DATA reg (low bits set to 0xd). Do you
> have that in your tree? Does git master have this problem?

The problem started with 2.6.30 on a standard distro kernel (Arch Linux).
Latest tests were with vanilla 2.6.30.1 and a cut down config (attached). I
haven't tested .31.

I'll try to test latest git as soon as I can and let you know if the problem
exists there.

>
> Maybe we need to add a patch to 2.6.30.x to set PEG_BAND_GAP_DATA as a
> workaround (git master already has code to do this properly for HDMI
> and DP outputs afaict).


Attachments:
(No filename) (1.52 kB)
config (61.56 kB)

2009-07-23 00:06:52

by Alberto Gonzalez

Subject: Re: Kernel 2.6.30 and udevd problem

On Wednesday 22 July 2009 22:00:21 Alberto Gonzalez wrote:
> On Wednesday 22 July 2009 21:44:36 Jesse Barnes wrote:
> > Hm, so this type of interrupt problem is *supposed* to be handled by
> > setting of the PEG_BAND_GAP_DATA reg (low bits set to 0xd). Do you
> > have that in your tree? Does git master have this problem?
>
> The problem started with 2.6.30 on a standard distro kernel (Arch Linux).
> Latest tests were with vanilla 2.6.30.1 and a cut down config (attached). I
> haven't tested .31.
>
> I'll try to test latest git as soon as I can and let you know if the
> problem exists there.

I've tested 2.6.31 from git and I can reproduce the problem. Attached is the
dmesg with the debug patch applied.

>
> > Maybe we need to add a patch to 2.6.30.x to set PEG_BAND_GAP_DATA as a
> > workaround (git master already has code to do this properly for HDMI
> > and DP outputs afaict).


Attachments:
(No filename) (895.00 B)
dmesg-2.6.31.log (78.18 kB)