2006-03-22 01:30:17

by Luming Yu

Subject: RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]

>Two more experiments:
>
> With a vanilla kernel, I faked EC0.UPDT() to just return 0x00, and the
> system hung on the second sleep.
>
> Then, again in the DSDT, I also faked the 4 _TMP methods (one in each
> thermal zone), and the system hung on the second sleep.
>
>I think we've raced too far ahead by trying to debug many thermal zones
>at once. Perhaps there are two bugs. So let's find them one by one.

Hmm, you seem to prefer a depth-first search algorithm?
I like it too. :-)


>
>One bug is quite repeatable and we know a lot about it. With all zones
>except THM0 commented out, the system hung. With the EC0.UPDT line in
>THM0._TMP also commented out, the system didn't hang. So there's a
>problem related to the EC, even with only THM0. And finding that
>problem may give ideas for what else may be wrong.

Can we bisect EC0.UPDT to find out which statement causes the hang?
Hmm, we are going to fix the BIOS. :-)

My assumption is that since Windows works well, this BIOS code should
have been tested OK. The only possible excuse for the BIOS is that
Linux is using an unnecessary or untested code path for suspend/resume.
So, eventually, we need to disable the unnecessary BIOS calls for
suspend/resume.

Thanks,
Luming


2006-03-22 04:35:33

by Sanjoy Mahajan

Subject: Re: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]

> Can we bisect EC0.UPDT to find out which statement causes the
> hang?

Yes, though see below for why I don't think it'll help no matter what we
find there.

> My assumption is that since Windows works well, this BIOS code should
> have been tested OK. The only possible excuse for the BIOS is that
> Linux is using an unnecessary or untested code path for suspend/resume.
> So, eventually, we need to disable the unnecessary BIOS calls for
> suspend/resume.

Maybe we're not collecting the right data in that case. We know that
commenting out the call to UPDT in THM0._TMP fixes the hang. But it does
not follow that the OSL suspend code should avoid running UPDT.

The hang may work like this: Between boot and sleep, calling UPDT messes
up something in the EC [which is why it takes >1 sleep to cause a hang].
When the system tries to sleep, that something triggers and the EC
hangs. But it may hang somewhere other than in UPDT, in which case
avoiding UPDT during sleep will not fix it.

However, we do have one more piece of data. When it hangs, it hangs in
\_SI._SST, because I see that line on successful sleeps (as the last
method before the beep) but not when it hangs (and then I also don't
hear a beep). There are lots of calls to EC0.XXX, including to
EC0.BEEP, within _SST, which isn't surprising if the EC is the problem.
So perhaps I should bisect in _SST and put in the debug lines there?
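
A minimal sketch of what that instrumentation could look like, assuming
the kernel is built with ACPI debugging enabled so that stores to the
Debug object are logged. The checkpoint strings and the split into two
halves are illustrative placeholders, and the real _SST body is elided
here rather than reproduced:

```asl
Scope (\_SI)
{
    Method (_SST, 1, NotSerialized)
    {
        // Checkpoint 0: confirms _SST was entered and shows its argument.
        Store ("_SST: entered, Arg0 follows", Debug)
        Store (Arg0, Debug)

        // ... first half of the original _SST body, unchanged ...

        // Checkpoint 1: if this never prints on a hanging sleep,
        // the hang is in the first half.
        Store ("_SST: midpoint reached", Debug)

        // ... second half of the original _SST body, unchanged ...

        // Checkpoint 2: if the midpoint printed but this did not,
        // the hang is in the second half.
        Store ("_SST: leaving", Debug)
    }
}
```

Whichever half swallows the hang gets split again with a new checkpoint,
and so on. If the original body returns early on some path, the later
checkpoints will be skipped on that path without any hang, so a missing
line is only meaningful together with the missing beep.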

Here's another idea, which is a terrible hack. There are lots of
lines in the DSDT like
    If (LOr (SPS, WNTF))
which I imagine says "If something or if WinNT". So, what if Linux
pretends to be WinNT (or W98F -- which is another common test), at least
for the 600X? Maybe those code paths are known to work.

-Sanjoy

`A society of sheep must in time beget a government of wolves.'
- Bertrand de Jouvenel

2006-03-22 07:15:41

by Sanjoy Mahajan

Subject: Re: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]

So the kernel with this UPDT() hung at the 2nd sleep:

Method (UPDT, 0, NotSerialized)
{
    If (IGNR)
    {
        Decrement (IGNR)
    }
    Else
    {
        If (H8DR)
        {
            If (Acquire (I2CM, 0x0064)) {}
            Else
            {
                Store (I2RB (Zero, 0x01, 0x04), Local7)
                If (Local7)
                {
                    Fatal (0x01, 0x80000003, Local7)
                }

                Release (I2CM)
            }
        }
    }
}

Relative to a working kernel (well, a kernel that I could get to hang
only once, and that never hung again after any later reboot), these
are the extra lines:

    Store (I2RB (Zero, 0x01, 0x04), Local7)
    If (Local7)
    {
        Fatal (0x01, 0x80000003, Local7)
    }

Since I don't think Fatal() is being called, I guess the problem is
in I2RB. But all those magic numbers in I2RB make me reluctant to take
out lines, unless you tell me which changes won't harm the hardware.
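
One change that should not touch the hardware at all is to log the I2RB
result rather than remove anything. The sketch below is the same UPDT as
above with three added Debug stores; it assumes the kernel logs stores
to the Debug object (ACPI debugging enabled), and the message strings
are made up for illustration:

```asl
Method (UPDT, 0, NotSerialized)
{
    If (IGNR)
    {
        Decrement (IGNR)
    }
    Else
    {
        If (H8DR)
        {
            If (Acquire (I2CM, 0x0064)) {}
            Else
            {
                Store (I2RB (Zero, 0x01, 0x04), Local7)

                // Added: record what I2RB returned before it is tested.
                Store ("UPDT: I2RB result follows", Debug)
                Store (Local7, Debug)

                If (Local7)
                {
                    // Added: printed only when Fatal() is about to run.
                    Store ("UPDT: calling Fatal", Debug)
                    Fatal (0x01, 0x80000003, Local7)
                }

                Release (I2CM)
            }
        }
    }
}
```

If the logged result is always zero and "calling Fatal" never appears,
that would support the guess that Fatal() is not involved and leave I2RB
itself as the suspect. The call to I2RB and its arguments are unchanged,
so the EC sees the same traffic as before.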

-Sanjoy

`A society of sheep must in time beget a government of wolves.'
- Bertrand de Jouvenel