2011-05-20 17:06:46

by Luke Dashjr

Subject: Major 2.6.38 regression ignored?

I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
against 2.6.38. Now 2.6.39 was just released without the regression being
addressed. This bug makes the system unusable... Some guys on IRC suggested I
email, so here it is.


2011-05-20 20:24:42

by Rafael J. Wysocki

Subject: Re: Major 2.6.38 regression ignored?

On Friday, May 20, 2011, Ray Lee wrote:
> [ Adding Chris Wilson (author of the problematic patch) and Rafael Wysocki
> to the message ]

It is on the list of known regressions from 2.6.37, but we're not tracking
them any more now that 2.6.39 is out.

Thanks,
Rafael


> On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <[email protected]> wrote:
>
> > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
> > against 2.6.38. Now 2.6.39 was just released without the regression being
> > addressed. This bug makes the system unusable... Some guys on IRC suggested
> > I email, so here it is.
> >
>
> See the bugzilla entry for the bisection history.
>
> ~r.
>

2011-05-20 21:12:22

by Ray Lee

Subject: Re: Major 2.6.38 regression ignored?

2011/5/20 Rafael J. Wysocki <[email protected]>
> It is on the list of known regressions from 2.6.37, but we're not tracking
> them any more now that 2.6.39 is out.

Hopefully Chris is still tracking them, even if you aren't.

Chris? What other information can the affected person provide, or what
tests can he run to help close this out?

2011-05-21 08:41:57

by Chris Wilson

Subject: Re: Major 2.6.38 regression ignored?

On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <[email protected]> wrote:
> [ Adding Chris Wilson (author of the problematic patch) and Rafael Wysocki
> to the message ]
>
> On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <[email protected]> wrote:
>
> > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
> > against 2.6.38. Now 2.6.39 was just released without the regression being
> > addressed. This bug makes the system unusable... Some guys on IRC suggested
> > I email, so here it is.
> >
>
> See the bugzilla entry for the bisection history.

Which has nothing to do with Luke's bug. Considering the thousand things
that can go wrong during X starting, without a hint as to which, it is nigh
on impossible to debug except by trial and error. If you set up
netconsole, does the kernel emit an OOPS with its last dying breath?
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-05-21 15:24:24

by Luke Dashjr

Subject: Re: Major 2.6.38 regression ignored?

On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <[email protected]> wrote:
> > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > Wysocki to the message ]
> >
> > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <[email protected]> wrote:
> > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > regression being addressed. This bug makes the system unusable... Some
> > > guys on IRC suggested I
> > > email, so here it is.
> >
> > See the bugzilla entry for the bisection history.
>
> Which has nothing to do with Luke's bug. Considering the thousand things
> that can go wrong during X starting, without a hint as to which it is nigh
> on impossible to debug except by trial and error. If you set up
> netconsole, does the kernel emit an OOPS with its last dying breath?

Why assume it's a different bug? I would almost wonder if it might affect
all Sandy Bridge GPUs. In any case, I no longer have the original
motherboard (it was recalled, as I said in the first post), nor even the
revision of it (it had other issues that weren't being fixed). I *assume* I
will have the same problem with my new motherboard (Intel DQ67SW), but I
haven't verified that yet. I'll be sure to try a netconsole when I have to
reboot next and get a chance to try the most recent 2.6.38 and .39 kernels,
but at the moment it seems reasonable to address the problem bisected in the
bug, even if it turns out to be different.

2011-05-21 15:40:24

by Chris Wilson

Subject: Re: Major 2.6.38 regression ignored?

On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <[email protected]> wrote:
> On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <[email protected]> wrote:
> > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > > Wysocki to the message ]
> > >
> > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <[email protected]> wrote:
> > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > > regression being addressed. This bug makes the system unusable... Some
> > > > guys on IRC suggested I
> > > > email, so here it is.
> > >
> > > See the bugzilla entry for the bisection history.
> >
> > Which has nothing to do with Luke's bug. Considering the thousand things
> > that can go wrong during X starting, without a hint as to which it is nigh
> > on impossible to debug except by trial and error. If you set up
> > netconsole, does the kernel emit an OOPS with its last dying breath?
>
> Why assume it's a different bug? I would almost wonder if it might affect
> all Sandy Bridge GPUs. In any case, I no longer have the original
> motherboard (it was recalled, as I said in the first post), nor even the
> revision of it (it had other issues that weren't being fixed). I *assume* I
> will have the same problem with my new motherboard (Intel DQ67SW), but I
> haven't verified that yet. I'll be sure to try a netconsole when I have to
> reboot next and get a chance to try the most recent 2.6.38 and .39 kernels,
> but at the moment it seems reasonable to address the problem bisected in the
> bug, even if it turns out to be different.

The bisection points to an old DRI1 bug on 945GM. That DRI has inadequate
locking between release and IRQ, and so is prone to such races as befell
Kirill, should not surprise anyone. As neither UMS nor DRI supported SNB,
I can quite confidently state that they are separate bugs.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-05-21 19:34:05

by Luke Dashjr

Subject: Re: Major 2.6.38 regression ignored?

On Saturday, May 21, 2011 11:40:17 AM Chris Wilson wrote:
> The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> locking between release and IRQ and so is prone to such races as befell
> Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> I can quite confidently state they are separate bugs.

Unfortunately, I cannot help troubleshoot that bug any further, as I no longer
have the affected motherboard. I was unable to reproduce it on my Intel
DQ67SW.

However, I did encounter a new regression, which I have reported as:
https://bugzilla.kernel.org/show_bug.cgi?id=35552
This one is related to Intel HD Audio, not Graphics.

2011-05-28 13:37:35

by Kirill Smelkov

Subject: Re: Major 2.6.38 / 2.6.39 regression ignored?

Hello Chris, everyone,

On Sat, May 21, 2011 at 04:40:17PM +0100, Chris Wilson wrote:
> On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <[email protected]> wrote:
> > On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> > > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <[email protected]> wrote:
> > > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > > > Wysocki to the message ]
> > > >
> > > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <[email protected]> wrote:
> > > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > > > regression being addressed. This bug makes the system unusable... Some
> > > > > guys on IRC suggested I
> > > > > email, so here it is.
> > > >
> > > > See the bugzilla entry for the bisection history.
> > >
> > > Which has nothing to do with Luke's bug. Considering the thousand things
> > > that can go wrong during X starting, without a hint as to which it is nigh
> > > on impossible to debug except by trial and error. If you set up
> > > netconsole, does the kernel emit an OOPS with its last dying breath?
> >
> > Why assume it's a different bug? I would almost wonder if it might affect
> > all Sandy Bridge GPUs. In any case, I no longer have the original
> > motherboard (it was recalled, as I said in the first post), nor even the
> > revision of it (it had other issues that weren't being fixed). I *assume* I
> > will have the same problem with my new motherboard (Intel DQ67SW), but I
> > haven't verified that yet. I'll be sure to try a netconsole when I have to
> > reboot next and get a chance to try the most recent 2.6.38 and .39 kernels,
> > but at the moment it seems reasonable to address the problem bisected in the
> > bug, even if it turns out to be different.
>
> The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> locking between release and IRQ and so is prone to such races as befell
> Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> I can quite confidently state they are separate bugs.
> -Chris

I understand DRI1 may be buggy and old, but still, pre-KMS X used to work
OK on kernels < 2.6.38, and starting from 2.6.38 the system is just
unusable because X either crashes the kernel (2.6.38) or does not start
at all (2.6.39):

https://bugzilla.kernel.org/show_bug.cgi?id=36052


It's a regression. It's blocking me from upgrading to newer kernels. I've
done my homework -- dug into it, captured a detailed OOPS over netconsole,
and bisected it to a single commit. Could this please be fixed?


Thanks,
Kirill