2012-08-17 22:25:40

by Justin Forbes

Subject: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
cirrusdrmfb.

This is the last message displayed before the system hangs. This seems
to be hitting a large number of users in Fedora, though certainly not
everyone. This started happening with the 3.5 updates, and is still an
issue. It appears to be a race condition, because various things have
allowed boot to continue for some users, though there is no clear
workaround. Has anyone else run across this? Any ideas? For more
background we have the following bugs:

inteldrmfb:
https://bugzilla.redhat.com/show_bug.cgi?id=843826

radeondrmfb:
https://bugzilla.redhat.com/show_bug.cgi?id=845745

cirrusdrmfb <kvm>:
https://bugzilla.redhat.com/show_bug.cgi?id=843860

Note that the conflicting fb hw usage message is not new; it has been
around for a while, but it is the last message seen before the hang.

Thanks,
Justin
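
For anyone collecting logs for the bugs above, here is a minimal sketch of how a saved kernel log could be checked for the handoff message and for whether it is the last line captured. The pattern is taken from the message quoted in the subject line; the function names and the sample log lines are hypothetical, and the log prefix (timestamps and so on) is assumed to vary between systems.

```python
import re

# Pattern built from the message in the subject line. Anything before it
# on the line (timestamp, log prefix) is deliberately not matched.
HANDOFF_RE = re.compile(
    r"conflicting fb hw usage (?P<driver>\S+) vs (?P<generic>.+?)"
    r" - removing generic driver"
)


def find_handoffs(log_text):
    """Return (line_number, driver, generic) for every handoff message."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = HANDOFF_RE.search(line)
        if match:
            hits.append((number, match.group("driver"), match.group("generic")))
    return hits


def handoff_is_last_line(log_text):
    """True if the handoff message is the final non-empty line of the log."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    return bool(lines) and HANDOFF_RE.search(lines[-1]) is not None
```

A log that ends on the handoff message matches the symptom described here, but it does not by itself distinguish a real hang from output that is no longer reaching the display.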


2012-08-17 22:30:31

by Randy Dunlap

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On 08/17/2012 03:25 PM, Justin M. Forbes wrote:

> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
> cirrusdrmfb.
>
> This is the last message displayed before the system hangs. This seems
> to be hitting a large number of users in Fedora, though certainly not
> everyone. This started happening with the 3.5 updates, and is still an
> issue. It appears to be a race condition, because various things have
> allowed boot to continue for some users, though there is no clear work
> around. Has anyone else run across this? Any ideas. For more
> background we have the following bugs:
>
> inteldrmfb:
> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>
> radeondrmfb:
> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>
> cirrusdrmfb <kvm>:
> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>
> It should be noted that the conflicting fb hw usage message is not new,
> it has been around for a while, but this is the last message seen before
> the hang.


Hi, (adding dri-devel mailing list)


I started seeing this problem on 3.5-rc6.

AFAICT, the system is not actually hung, it's just that no output
is showing up on the real (physical) output device (display) -- it's
going somewhere else (or to the bit bucket).

--
~Randy

2012-08-17 22:54:14

by Dave Airlie

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>
>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>> cirrusdrmfb.
>>
>> This is the last message displayed before the system hangs. This seems
>> to be hitting a large number of users in Fedora, though certainly not
>> everyone. This started happening with the 3.5 updates, and is still an
>> issue. It appears to be a race condition, because various things have
>> allowed boot to continue for some users, though there is no clear work
>> around. Has anyone else run across this? Any ideas. For more
>> background we have the following bugs:
>>
>> inteldrmfb:
>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>
>> radeondrmfb:
>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>
>> cirrusdrmfb <kvm>:
>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>
>> It should be noted that the conflicting fb hw usage message is not new,
>> it has been around for a while, but this is the last message seen before
>> the hang.
>
>
> Hi, (adding dri-devel mailing list)
>
>
> I started seeing this problem on 3.5-rc6.
>
> AFAICT, the system is not actually hung, it's just that no output
> is showing up on the real (physical) output device (display) -- it's
> going somewhere else (or to the bit bucket).
>

Can we bisect this at all?

I worry the intel one will bisect to where we moved the conflict
resolution earlier, but I'd like to see if applying that patch earlier
causes the issue, since radeon has it.

I haven't reproduced this on any hw I own, and I also can't get it under qemu.

Dave.

2012-08-17 22:55:19

by Dave Airlie

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>
>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>> cirrusdrmfb.
>>>
>>> This is the last message displayed before the system hangs. This seems
>>> to be hitting a large number of users in Fedora, though certainly not
>>> everyone. This started happening with the 3.5 updates, and is still an
>>> issue. It appears to be a race condition, because various things have
>>> allowed boot to continue for some users, though there is no clear work
>>> around. Has anyone else run across this? Any ideas. For more
>>> background we have the following bugs:
>>>
>>> inteldrmfb:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>
>>> radeondrmfb:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>
>>> cirrusdrmfb <kvm>:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>
>>> It should be noted that the conflicting fb hw usage message is not new,
>>> it has been around for a while, but this is the last message seen before
>>> the hang.
>>
>>
>> Hi, (adding dri-devel mailing list)
>>
>>
>> I started seeing this problem on 3.5-rc6.
>>
>> AFAICT, the system is not actually hung, it's just that no output
>> is showing up on the real (physical) output device (display) -- it's
>> going somewhere else (or to the bit bucket).
>>
>
> Can we bisect this at all?
>
> I worry the intel one will bisect to where we moved the conflict
> resolution earlier, but I'd like to see if applying that patch earlier
> causes the issue, since radeon has it.
>
> I haven't reproduced this on any hw I own, I also can't get it under qemu.

I'm also wondering whether this is grub2 related in some way; grub2 is
starting to mess with the graphics adapter pointlessly.

Dave.

2012-08-17 23:07:19

by Randy Dunlap

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On 08/17/2012 03:55 PM, Dave Airlie wrote:

> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>
>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>> cirrusdrmfb.
>>>>
>>>> This is the last message displayed before the system hangs. This seems
>>>> to be hitting a large number of users in Fedora, though certainly not
>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>> issue. It appears to be a race condition, because various things have
>>>> allowed boot to continue for some users, though there is no clear work
>>>> around. Has anyone else run across this? Any ideas. For more
>>>> background we have the following bugs:
>>>>
>>>> inteldrmfb:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>
>>>> radeondrmfb:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>
>>>> cirrusdrmfb <kvm>:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>
>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>> it has been around for a while, but this is the last message seen before
>>>> the hang.
>>>
>>>
>>> Hi, (adding dri-devel mailing list)
>>>
>>>
>>> I started seeing this problem on 3.5-rc6.
>>>
>>> AFAICT, the system is not actually hung, it's just that no output
>>> is showing up on the real (physical) output device (display) -- it's
>>> going somewhere else (or to the bit bucket).
>>>
>>
>> Can we bisect this at all?

I'll try.

>> I worry the intel one will bisect to where we moved the conflict
>> resolution earlier, but I'd like to see if applying that patch earlier
>> causes the issue, since radeon has it.
>>
>> I haven't reproduced this on any hw I own, I also can't get it under qemu.
>
> I'm also wondering whether this grub2 related in some way, grub2 is
> starting to mess with the graphics adapter pointlessly.


I'm using lilo, not grub.

--
~Randy

2012-08-20 05:13:16

by Randy Dunlap

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On 08/17/12 15:55, Dave Airlie wrote:

> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>
>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>> cirrusdrmfb.
>>>>
>>>> This is the last message displayed before the system hangs. This seems
>>>> to be hitting a large number of users in Fedora, though certainly not
>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>> issue. It appears to be a race condition, because various things have
>>>> allowed boot to continue for some users, though there is no clear work
>>>> around. Has anyone else run across this? Any ideas. For more
>>>> background we have the following bugs:
>>>>
>>>> inteldrmfb:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>
>>>> radeondrmfb:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>
>>>> cirrusdrmfb <kvm>:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>
>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>> it has been around for a while, but this is the last message seen before
>>>> the hang.
>>>
>>>
>>> Hi, (adding dri-devel mailing list)
>>>
>>>
>>> I started seeing this problem on 3.5-rc6.
>>>
>>> AFAICT, the system is not actually hung, it's just that no output
>>> is showing up on the real (physical) output device (display) -- it's
>>> going somewhere else (or to the bit bucket).
>>>
>>
>> Can we bisect this at all?

I guess I'll have to try again. My first attempt did not
prove anything, I think because the conflict does not happen
100% of the time (i.e., it feels like a timing problem).

>> I worry the intel one will bisect to where we moved the conflict
>> resolution earlier, but I'd like to see if applying that patch earlier
>> causes the issue, since radeon has it.

Do you know of a specific commit that I could revert and test?

>> I haven't reproduced this on any hw I own, I also can't get it under qemu.
>
> I'm also wondering whether this grub2 related in some way, grub2 is
> starting to mess with the graphics adapter pointlessly.
>
> Dave.



--
~Randy
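
On the point that the problem does not show up on every boot: a bisect step can only mark a kernel "good" after enough clean boots to make a false "good" unlikely. The sketch below works that out under the assumption that each boot of a bad kernel fails independently with probability p_fail. Both the independence assumption and the example probabilities are hypothetical; nobody in this thread has measured a failure rate.

```python
import math


def boots_needed(p_fail, max_false_good=0.05):
    """Clean boots required before calling a kernel good.

    If a bad kernel fails each boot with probability p_fail, it survives
    n boots in a row with probability (1 - p_fail) ** n. This returns the
    smallest n that pushes that probability to max_false_good or below.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < max_false_good < 1.0:
        raise ValueError("max_false_good must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    # Hypothetical failure rates, for illustration only.
    for p in (0.5, 0.25, 0.1):
        print(f"p_fail={p}: {boots_needed(p)} clean boots")
```

With a failure rate of one boot in four, a single clean boot per bisect step would mislabel a bad kernel three times out of four, which would be consistent with a first bisect attempt proving nothing.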

2012-08-20 05:22:19

by Dave Airlie

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On Mon, Aug 20, 2012 at 3:13 PM, Randy Dunlap <[email protected]> wrote:
> On 08/17/12 15:55, Dave Airlie wrote:
>
>> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>>
>>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>>> cirrusdrmfb.
>>>>>
>>>>> This is the last message displayed before the system hangs. This seems
>>>>> to be hitting a large number of users in Fedora, though certainly not
>>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>>> issue. It appears to be a race condition, because various things have
>>>>> allowed boot to continue for some users, though there is no clear work
>>>>> around. Has anyone else run across this? Any ideas. For more
>>>>> background we have the following bugs:
>>>>>
>>>>> inteldrmfb:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>>
>>>>> radeondrmfb:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>>
>>>>> cirrusdrmfb <kvm>:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>>
>>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>>> it has been around for a while, but this is the last message seen before
>>>>> the hang.
>>>>
>>>>
>>>> Hi, (adding dri-devel mailing list)
>>>>
>>>>
>>>> I started seeing this problem on 3.5-rc6.
>>>>
>>>> AFAICT, the system is not actually hung, it's just that no output
>>>> is showing up on the real (physical) output device (display) -- it's
>>>> going somewhere else (or to the bit bucket).
>>>>
>>>
>>> Can we bisect this at all?
>
> I guess I'll have to try again. My first attempt did not
> prove anything, I think because the conflict does not happen
> 100% of the time (i.e., it feels like a timing problem).
>
>>> I worry the intel one will bisect to where we moved the conflict
>>> resolution earlier, but I'd like to see if applying that patch earlier
>>> causes the issue, since radeon has it.
>
> Do you know of a specific commit that I could revert and test?

Reverting

9f846a16d213523fbe6daea17e20df6b8ac5a1e5

might work, but it mostly just changes the timing.

Also, testing 3.4 with that commit applied on top would be good.

Dave.

2012-08-20 22:45:17

by Randy Dunlap

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On 08/19/2012 10:22 PM, Dave Airlie wrote:

> On Mon, Aug 20, 2012 at 3:13 PM, Randy Dunlap <[email protected]> wrote:
>> On 08/17/12 15:55, Dave Airlie wrote:
>>
>>> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>>>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>>>
>>>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>>>> cirrusdrmfb.
>>>>>>
>>>>>> This is the last message displayed before the system hangs. This seems
>>>>>> to be hitting a large number of users in Fedora, though certainly not
>>>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>>>> issue. It appears to be a race condition, because various things have
>>>>>> allowed boot to continue for some users, though there is no clear work
>>>>>> around. Has anyone else run across this? Any ideas. For more
>>>>>> background we have the following bugs:
>>>>>>
>>>>>> inteldrmfb:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>>>
>>>>>> radeondrmfb:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>>>
>>>>>> cirrusdrmfb <kvm>:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>>>
>>>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>>>> it has been around for a while, but this is the last message seen before
>>>>>> the hang.
>>>>>
>>>>>
>>>>> Hi, (adding dri-devel mailing list)
>>>>>
>>>>>
>>>>> I started seeing this problem on 3.5-rc6.
>>>>>
>>>>> AFAICT, the system is not actually hung, it's just that no output
>>>>> is showing up on the real (physical) output device (display) -- it's
>>>>> going somewhere else (or to the bit bucket).
>>>>>
>>>>
>>>> Can we bisect this at all?
>>
>> I guess I'll have to try again. My first attempt did not
>> prove anything, I think because the conflict does not happen
>> 100% of the time (i.e., it feels like a timing problem).
>>
>>>> I worry the intel one will bisect to where we moved the conflict
>>>> resolution earlier, but I'd like to see if applying that patch earlier
>>>> causes the issue, since radeon has it.
>>
>> Do you know of a specific commit that I could revert and test?
>
> 9f846a16d213523fbe6daea17e20df6b8ac5a1e5
>
> might work, but it just changes the timing mostly.
>
> also testing 3.4 with that on top would be good.


That commit doesn't apply cleanly to 3.4, but reverting
it on 3.5-rc6 (where I first saw the problem) allows me to boot
3.5-rc6 multiple times without a problem.

Maybe Justin can get more stable testing done also.

thanks,
--
~Randy

2012-08-21 00:23:50

by Dave Airlie

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On Tue, Aug 21, 2012 at 8:45 AM, Randy Dunlap <[email protected]> wrote:
> On 08/19/2012 10:22 PM, Dave Airlie wrote:
>
>> On Mon, Aug 20, 2012 at 3:13 PM, Randy Dunlap <[email protected]> wrote:
>>> On 08/17/12 15:55, Dave Airlie wrote:
>>>
>>>> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>>>>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>>>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>>>>
>>>>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>>>>> cirrusdrmfb.
>>>>>>>
>>>>>>> This is the last message displayed before the system hangs. This seems
>>>>>>> to be hitting a large number of users in Fedora, though certainly not
>>>>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>>>>> issue. It appears to be a race condition, because various things have
>>>>>>> allowed boot to continue for some users, though there is no clear work
>>>>>>> around. Has anyone else run across this? Any ideas. For more
>>>>>>> background we have the following bugs:
>>>>>>>
>>>>>>> inteldrmfb:
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>>>>
>>>>>>> radeondrmfb:
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>>>>
>>>>>>> cirrusdrmfb <kvm>:
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>>>>
>>>>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>>>>> it has been around for a while, but this is the last message seen before
>>>>>>> the hang.
>>>>>>
>>>>>>
>>>>>> Hi, (adding dri-devel mailing list)
>>>>>>
>>>>>>
>>>>>> I started seeing this problem on 3.5-rc6.
>>>>>>
>>>>>> AFAICT, the system is not actually hung, it's just that no output
>>>>>> is showing up on the real (physical) output device (display) -- it's
>>>>>> going somewhere else (or to the bit bucket).
>>>>>>
>>>>>
>>>>> Can we bisect this at all?
>>>
>>> I guess I'll have to try again. My first attempt did not
>>> prove anything, I think because the conflict does not happen
>>> 100% of the time (i.e., it feels like a timing problem).
>>>
>>>>> I worry the intel one will bisect to where we moved the conflict
>>>>> resolution earlier, but I'd like to see if applying that patch earlier
>>>>> causes the issue, since radeon has it.
>>>
>>> Do you know of a specific commit that I could revert and test?
>>
>> 9f846a16d213523fbe6daea17e20df6b8ac5a1e5
>>
>> might work, but it just changes the timing mostly.
>>
>> also testing 3.4 with that on top would be good.
>
>
> That commit doesn't apply cleanly to 3.4, but reverting
> it on 3.5-rc6 (where I first saw the problem) allows me to boot
> 3.5-rc6 multiple times without a problem.
>
> Maybe Justin can get more stable testing done also..

Randy, do you have a vga= on your kernel command line?

Dave.

2012-08-21 00:58:41

by Randy Dunlap

Subject: Re: 3.5.x boot hang after conflicting fb hw usage <driver> vs VESA VGA - removing generic driver

On 08/20/2012 05:23 PM, Dave Airlie wrote:

> On Tue, Aug 21, 2012 at 8:45 AM, Randy Dunlap <[email protected]> wrote:
>> On 08/19/2012 10:22 PM, Dave Airlie wrote:
>>
>>> On Mon, Aug 20, 2012 at 3:13 PM, Randy Dunlap <[email protected]> wrote:
>>>> On 08/17/12 15:55, Dave Airlie wrote:
>>>>
>>>>> On Sat, Aug 18, 2012 at 8:54 AM, Dave Airlie <[email protected]> wrote:
>>>>>> On Sat, Aug 18, 2012 at 8:28 AM, Randy Dunlap <[email protected]> wrote:
>>>>>>> On 08/17/2012 03:25 PM, Justin M. Forbes wrote:
>>>>>>>
>>>>>>>> for <driver>, we have verified cases on inteldrmfb, radeondrmfb, and
>>>>>>>> cirrusdrmfb.
>>>>>>>>
>>>>>>>> This is the last message displayed before the system hangs. This seems
>>>>>>>> to be hitting a large number of users in Fedora, though certainly not
>>>>>>>> everyone. This started happening with the 3.5 updates, and is still an
>>>>>>>> issue. It appears to be a race condition, because various things have
>>>>>>>> allowed boot to continue for some users, though there is no clear work
>>>>>>>> around. Has anyone else run across this? Any ideas. For more
>>>>>>>> background we have the following bugs:
>>>>>>>>
>>>>>>>> inteldrmfb:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843826
>>>>>>>>
>>>>>>>> radeondrmfb:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=845745
>>>>>>>>
>>>>>>>> cirrusdrmfb <kvm>:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=843860
>>>>>>>>
>>>>>>>> It should be noted that the conflicting fb hw usage message is not new,
>>>>>>>> it has been around for a while, but this is the last message seen before
>>>>>>>> the hang.
>>>>>>>
>>>>>>>
>>>>>>> Hi, (adding dri-devel mailing list)
>>>>>>>
>>>>>>>
>>>>>>> I started seeing this problem on 3.5-rc6.
>>>>>>>
>>>>>>> AFAICT, the system is not actually hung, it's just that no output
>>>>>>> is showing up on the real (physical) output device (display) -- it's
>>>>>>> going somewhere else (or to the bit bucket).
>>>>>>>
>>>>>>
>>>>>> Can we bisect this at all?
>>>>
>>>> I guess I'll have to try again. My first attempt did not
>>>> prove anything, I think because the conflict does not happen
>>>> 100% of the time (i.e., it feels like a timing problem).
>>>>
>>>>>> I worry the intel one will bisect to where we moved the conflict
>>>>>> resolution earlier, but I'd like to see if applying that patch earlier
>>>>>> causes the issue, since radeon has it.
>>>>
>>>> Do you know of a specific commit that I could revert and test?
>>>
>>> 9f846a16d213523fbe6daea17e20df6b8ac5a1e5
>>>
>>> might work, but it just changes the timing mostly.
>>>
>>> also testing 3.4 with that on top would be good.
>>
>>
>> That commit doesn't apply cleanly to 3.4, but reverting
>> it on 3.5-rc6 (where I first saw the problem) allows me to boot
>> 3.5-rc6 multiple times without a problem.
>>
>> Maybe Justin can get more stable testing done also..
>
> Randy do you have a vga= on your kernel command line?


Ah, yes: "vga=ask"

--
~Randy
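
For others who see the same symptom and want to answer the vga= question for their own setup, here is a minimal sketch that pulls any vga= values out of a kernel command line string. The helper names are hypothetical. Reading /proc/cmdline is one way to get the string on a running system, but a vga= setting made in a boot loader configuration file may not show up there, so an empty result is not conclusive.

```python
def vga_options(cmdline):
    """Return the values of all vga= parameters found, in order."""
    prefix = "vga="
    return [
        token[len(prefix):]
        for token in cmdline.split()
        if token.startswith(prefix)
    ]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    values = vga_options(read_cmdline())
    if values:
        print("vga= present:", ", ".join(values))
    else:
        print("no vga= on the kernel command line")
```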