2008-11-10 16:11:20

by Matthew Garrett

Subject: ACPI crash on lid close - SMP race?

If the _DOS flag on my HP 2510p is set to 0 (ie, signal OS when screen
notification is requested, don't change automatically) then it'll crash
on random lid open/closes. The trace generally makes little sense and
depends on the kernel version and phase of the moon. I'd ignored this as
firmware brokenness up until lately, but since having _DOS set to 0 is
the only way to get a notification when the display switch key is
pressed on this machine I'd be interested in fixing it properly.

Unfortunately, I've got no real idea what on earth is going on. The only
clue I've found so far is that booting with maxcpus=1 leaves it working
perfectly. What parts of the ACPI stack could be triggering this?

--
Matthew Garrett | [email protected]


2008-11-10 17:27:05

by Sergio Monteiro Basto

Subject: Re: ACPI crash on lid close - SMP race?

Hi. I reported this _DOS problem a long time ago; it is now tracked at
http://bugzilla.kernel.org/show_bug.cgi?id=6001
Could you add your comments there?

The stock Fedora kernel carries a patch for this (I now see that you are
the author :) ).

I tested your patch and it works for me!

linux-2.6-acpi-video-dos.patch
Disable ACPI video display switching by default

-- mjg59

diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c
index bac2901..93b1a9e 100644
--- a/drivers/acpi/video.c
+++ b/drivers/acpi/video.c
@@ -1818,7 +1818,7 @@ static int acpi_video_bus_put_devices(struct acpi_video_bus *video)
 
 static int acpi_video_bus_start_devices(struct acpi_video_bus *video)
 {
-	return acpi_video_bus_DOS(video, 0, 0);
+	return acpi_video_bus_DOS(video, 3, 1);
 }
 
 static int acpi_video_bus_stop_devices(struct acpi_video_bus *video)


On Mon, 2008-11-10 at 16:11 +0000, Matthew Garrett wrote:
> If the _DOS flag on my HP 2510p is set to 0 (ie, signal OS when screen
> notification is requested, don't change automatically) then it'll crash
> on random lid open/closes. The trace generally makes little sense and
> depends on the kernel version and phase of the moon. I'd ignored this as
> firmware brokenness up until lately, but since having _DOS set to 0 is
> the only way to get a notification when the display switch key is
> pressed on this machine I'd be interested in fixing it properly.
>
> Unfortunately, I've got no real idea what on earth is going on. The only
> clue I've found so far is that booting with maxcpus=1 leaves it working
> perfectly. What parts of the ACPI stack could be triggering this?
>
--
Sérgio M. B.



2008-11-10 17:29:59

by Matthew Garrett

Subject: Re: ACPI crash on lid close - SMP race?

On Mon, Nov 10, 2008 at 05:26:35PM +0000, Sergio Monteiro Basto wrote:
> Hi. I reported this _DOS problem a long time ago; it is now tracked at
> http://bugzilla.kernel.org/show_bug.cgi?id=6001
> Could you add your comments there?
>
> The stock Fedora kernel carries a patch for this (I now see that you are
> the author :) ).
>
> I tested your patch and it works for me!

Yeah, that's the "safe" patch which stops us executing the codepath that
breaks, but also means I don't get display switch events. Having it set
to 1 means executing BIOS code that's likely to interfere with the rest
of the system, so a crash isn't surprising. But a setting of 0 *should*
be safe, and I'm quite confused as to why it's exploding.
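
For reference, a minimal sketch of how the two integer arguments in the
patch above end up in the single value handed to the firmware's _DOS
method. The helper name dos_encode() is hypothetical and not a kernel
function; the bit layout is an assumption based on the ACPI video
extensions (bits 1:0 select the display switching policy, bit 2 controls
automatic brightness changes on AC/DC transitions), so check it against
the spec and drivers/acpi/video.c before relying on it.

```c
/*
 * Illustrative only: dos_encode() is a hypothetical helper, not a
 * kernel API. It mirrors the assumed packing of the _DOS argument.
 *
 * bios_flag (bits 1:0), assumed meanings:
 *   0 - firmware must not switch outputs itself; it notifies the OS
 *   1 - firmware switches outputs itself; no notification to the OS
 *   2 - the _DGS values are locked
 *   3 - firmware must not switch; it raises a display switch hotkey event
 * lcd_flag (bit 2), assumed meaning:
 *   0 - firmware may adjust LCD brightness on AC/DC transitions
 *   1 - firmware must leave LCD brightness alone
 *
 * Returns the packed argument, or -1 if a flag is out of range.
 */
int dos_encode(int bios_flag, int lcd_flag)
{
	if (bios_flag < 0 || bios_flag > 3 || lcd_flag < 0 || lcd_flag > 1)
		return -1;

	return (lcd_flag << 2) | bios_flag;
}
```

Under these assumptions, the pair (0, 0) packs to 0, the older default
(1, 0) packs to 1, and the pair (3, 1) from the Fedora patch packs to 7.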

--
Matthew Garrett | [email protected]

2008-11-10 23:53:22

by Sergio Monteiro Basto

Subject: Re: ACPI crash on lid close - SMP race?


On Mon, 2008-11-10 at 17:29 +0000, Matthew Garrett wrote:
> On Mon, Nov 10, 2008 at 05:26:35PM +0000, Sergio Monteiro Basto wrote:
> > Hi. I reported this _DOS problem a long time ago; it is now tracked at
> > http://bugzilla.kernel.org/show_bug.cgi?id=6001
> > Could you add your comments there?
> >
> > The stock Fedora kernel carries a patch for this (I now see that you are
> > the author :) ).
> >
> > I tested your patch and it works for me!
>
> Yeah, that's the "safe" patch which stops us executing the codepath that
> breaks, but also means I don't get display switch events. Having it set
> to 1 means executing BIOS code that's likely to interfere with the rest
> of the system, so a crash isn't surprising.

Are you talking about the 1st or the 2nd parameter?
I prefer
"return acpi_video_bus_DOS(video, 1, 0);" (which is the original code,
before the last change in the vanilla kernel).


> But a setting of 0 *should*
> be safe, and I'm quite confused as to why it's exploding.

The second parameter?
I saw some instability (after resume from disk) with
kernel-2.6.27.4-79, which has your patch, but I am far from sure that
it is related.

Thanks,
--
Sérgio M. B.



2008-11-11 01:30:10

by Zhang, Rui

Subject: Re: ACPI crash on lid close - SMP race?

well, there is a bug report for this problem:
http://bugzilla.kernel.org/show_bug.cgi?id=11259

and there are some bug reports for a similar problem (system crashes on
lid close if _DOS=1, but on a UP platform).
http://bugzilla.kernel.org/show_bug.cgi?id=6001#c49

but unfortunately we haven't made any progress in this issue. :(

thanks,
rui

On Tue, 2008-11-11 at 00:11 +0800, Matthew Garrett wrote:
> If the _DOS flag on my HP 2510p is set to 0 (ie, signal OS when screen
> notification is requested, don't change automatically) then it'll crash
> on random lid open/closes. The trace generally makes little sense and
> depends on the kernel version and phase of the moon. I'd ignored this as
> firmware brokenness up until lately, but since having _DOS set to 0 is
> the only way to get a notification when the display switch key is
> pressed on this machine I'd be interested in fixing it properly.
>
> Unfortunately, I've got no real idea what on earth is going on. The only
> clue I've found so far is that booting with maxcpus=1 leaves it working
> perfectly. What parts of the ACPI stack could be triggering this?
>

2008-11-11 01:34:21

by Matthew Garrett

Subject: Re: ACPI crash on lid close - SMP race?

On Tue, Nov 11, 2008 at 09:27:14AM +0800, Zhang Rui wrote:
> well, there is a bug report for this problem:
> http://bugzilla.kernel.org/show_bug.cgi?id=11259

I'll give the noirqbalance case a go.

> and there are some bug reports for a similar problem (system crashes on
> lid close if _DOS=1, but on a UP platform).
> http://bugzilla.kernel.org/show_bug.cgi?id=6001#c49

I think the _DOS=1 case is somewhat different. There you'd expect
significantly more system BIOS code to be run, and that's going to stand
a much higher chance of causing unfortunate interactions with however
Linux has set up the graphics.

--
Matthew Garrett | [email protected]

2008-11-11 02:02:21

by Zhang, Rui

Subject: Re: ACPI crash on lid close - SMP race?

On Tue, 2008-11-11 at 09:33 +0800, Matthew Garrett wrote:
> On Tue, Nov 11, 2008 at 09:27:14AM +0800, Zhang Rui wrote:
> > well, there is a bug report for this problem:
> > http://bugzilla.kernel.org/show_bug.cgi?id=11259
>
> I'll give the noirqbalance case a go.
>
> > and there are some bug reports for a similar problem (system crashes on
> > lid close if _DOS=1, but on a UP platform).
> > http://bugzilla.kernel.org/show_bug.cgi?id=6001#c49
>
> I think the _DOS=1 case is somewhat different. There you'd expect
> significantly more system BIOS code to be run, and that's going to stand
> a much higher chance of causing unfortunate interactions with however
> Linux has set up the graphics.
>
oops. sorry, I mean system crashes on lid close if _DOS=0 on some UP
platform.

thanks,
rui

2008-11-11 02:05:50

by Matthew Garrett

Subject: Re: ACPI crash on lid close - SMP race?

On Tue, Nov 11, 2008 at 09:59:38AM +0800, Zhang Rui wrote:

> oops. sorry, I mean system crashes on lid close if _DOS=0 on some UP
> platform.

Oh, ugh. I don't have any with that issue, but that ought to be
"interesting" to fix.

--
Matthew Garrett | [email protected]

2008-11-12 05:31:50

by Sergio Monteiro Basto

Subject: Re: ACPI crash on lid close - SMP race?

Hi,
On my laptop, which is not SMP (an Intel Centrino at 1.7 GHz), I have a
slightly stranger problem: with _DOS=0 my lid button (backlight) doesn't
work after the HAL service is loaded, nor on first boot.
This seems similar to _DOS=0 making some laptops crash on lid close.

Report #6001 was made by another Sérgio (not me), who complains about a
system freeze when _DOS=1 and when switching displays.
Switching displays for him means switching to a CRT or to a monitor with
Fn+F4. With acpi=off it doesn't freeze, apparently because of the event,
and with ACPI on it freezes, so IMHO this problem is probably not related
to the _DOS value. Sérgio Luis, in the last comment
( http://bugzilla.kernel.org/show_bug.cgi?id=6001#c51 ), wrote that with
an updated kernel it freezes in both cases.

But this report led to a patch in the mainline kernel that changes _DOS
from 1 to 0, which I think should at least be reverted.

I'd love to learn more about these switches; if someone can help me with
some tips, many thanks.


On Tue, 2008-11-11 at 02:05 +0000, Matthew Garrett wrote:
> On Tue, Nov 11, 2008 at 09:59:38AM +0800, Zhang Rui wrote:
>
> > oops. sorry, I mean system crashes on lid close if _DOS=0 on some UP
> > platform.
>
> Oh, ugh. I don't have any with that issue, but that ought to be
> "interesting" to fix.
>
--
Sérgio M. B.

