2008-07-15 22:25:00

by Alan Jenkins

[permalink] [raw]
Subject: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

It looks like this EC clears the SMI_EVT bit after every query, even if there
are more events pending. The workaround is to repeatedly query the EC until
it reports that no events remain.

This fixes a regression in 2.6.26 (from 2.6.25.3). Initially reported as
"Asus Eee PC hotkeys stop working if pressed quickly" in bugzilla
<http://bugzilla.kernel.org/show_bug.cgi?id=11089>.

The regression was caused by a recently added check for interrupt storms.
The Eee PC triggers this check and switches to polling. When multiple events
arrive between polling intervals, only one is fetched from the EC. This causes
erroneous behaviour; ultimately events stop being delivered altogether when the
EC buffer overflows.

Signed-off-by: Alan Jenkins <[email protected]>

---
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 5622aee..2b4c5a2 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -459,14 +459,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)

EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);

-static void acpi_ec_gpe_query(void *ec_cxt)
+static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
{
- struct acpi_ec *ec = ec_cxt;
- u8 value = 0;
struct acpi_ec_query_handler *handler, copy;

- if (!ec || acpi_ec_query(ec, &value))
- return;
mutex_lock(&ec->lock);
list_for_each_entry(handler, &ec->list, node) {
if (value == handler->query_bit) {
@@ -484,6 +480,18 @@ static void acpi_ec_gpe_query(void *ec_cxt)
mutex_unlock(&ec->lock);
}

+static void acpi_ec_gpe_query(void *ec_cxt)
+{
+ struct acpi_ec *ec = ec_cxt;
+ u8 value = 0;
+
+ if (!ec)
+ return;
+
+ while (!acpi_ec_query(ec, &value))
+ acpi_ec_gpe_run_handler(ec, value);
+}
+
static u32 acpi_ec_gpe_handler(void *data)
{
acpi_status status = AE_OK;


2008-07-17 11:49:58

by Alexey Starikovskiy

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Hi Alan,

Thanks for the patch, ACK.

Regards,
Alex.

Alan Jenkins wrote:
> It looks like this EC clears the SMI_EVT bit after every query, even if there
> are more events pending. The workaround is to repeatedly query the EC until
> it reports that no events remain.
>
> This fixes a regression in 2.6.26 (from 2.6.25.3). Initially reported as
> "Asus Eee PC hotkeys stop working if pressed quickly" in bugzilla
> <http://bugzilla.kernel.org/show_bug.cgi?id=11089>.
>
> The regression was caused by a recently added check for interrupt storms.
> The Eee PC triggers this check and switches to polling. When multiple events
> arrive between polling intervals, only one is fetched from the EC. This causes
> erroneous behaviour; ultimately events stop being delivered altogether when the
> EC buffer overflows.
>
> Signed-off-by: Alan Jenkins <[email protected]>
>
> ---
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 5622aee..2b4c5a2 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -459,14 +459,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)
>
> EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);
>
> -static void acpi_ec_gpe_query(void *ec_cxt)
> +static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
> {
> - struct acpi_ec *ec = ec_cxt;
> - u8 value = 0;
> struct acpi_ec_query_handler *handler, copy;
>
> - if (!ec || acpi_ec_query(ec, &value))
> - return;
> mutex_lock(&ec->lock);
> list_for_each_entry(handler, &ec->list, node) {
> if (value == handler->query_bit) {
> @@ -484,6 +480,18 @@ static void acpi_ec_gpe_query(void *ec_cxt)
> mutex_unlock(&ec->lock);
> }
>
> +static void acpi_ec_gpe_query(void *ec_cxt)
> +{
> + struct acpi_ec *ec = ec_cxt;
> + u8 value = 0;
> +
> + if (!ec)
> + return;
> +
> + while (!acpi_ec_query(ec, &value))
> + acpi_ec_gpe_run_handler(ec, value);
> +}
> +
> static u32 acpi_ec_gpe_handler(void *data)
> {
> acpi_status status = AE_OK;
>
>

Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
> Thanks for the patch, ACK.

This one fixes a potentially bad problem, since we could ignore more than
just hot key EC events by accident. Maybe it should go to -stable?

> Alan Jenkins wrote:
>> It looks like this EC clears the SMI_EVT bit after every query, even if there
>> are more events pending. The workaround is to repeatedly query the EC until
>> it reports that no events remain.
>>
>> This fixes a regression in 2.6.26 (from 2.6.25.3). Initially reported as
>> "Asus Eee PC hotkeys stop working if pressed quickly" in bugzilla
>> <http://bugzilla.kernel.org/show_bug.cgi?id=11089>.
>>
>> The regression was caused by a recently added check for interrupt storms.
>> The Eee PC triggers this check and switches to polling. When multiple events
>> arrive between polling intervals, only one is fetched from the EC. This causes
>> erroneous behaviour; ultimately events stop being delivered altogether when the
>> EC buffer overflows.
>>
>> Signed-off-by: Alan Jenkins <[email protected]>
>>
>> ---
>> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
>> index 5622aee..2b4c5a2 100644
>> --- a/drivers/acpi/ec.c
>> +++ b/drivers/acpi/ec.c
>> @@ -459,14 +459,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)
>> EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);
>> -static void acpi_ec_gpe_query(void *ec_cxt)
>> +static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
>> {
>> - struct acpi_ec *ec = ec_cxt;
>> - u8 value = 0;
>> struct acpi_ec_query_handler *handler, copy;
>> - if (!ec || acpi_ec_query(ec, &value))
>> - return;
>> mutex_lock(&ec->lock);
>> list_for_each_entry(handler, &ec->list, node) {
>> if (value == handler->query_bit) {
>> @@ -484,6 +480,18 @@ static void acpi_ec_gpe_query(void *ec_cxt)
>> mutex_unlock(&ec->lock);
>> }
>> +static void acpi_ec_gpe_query(void *ec_cxt)
>> +{
>> + struct acpi_ec *ec = ec_cxt;
>> + u8 value = 0;
>> +
>> + if (!ec)
>> + return;
>> +
>> + while (!acpi_ec_query(ec, &value))
>> + acpi_ec_gpe_run_handler(ec, value);
>> +}
>> +
>> static u32 acpi_ec_gpe_handler(void *data)
>> {
>> acpi_status status = AE_OK;

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-07-17 12:31:21

by Alexey Starikovskiy

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Henrique de Moraes Holschuh wrote:
> On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
>> Thanks for the patch, ACK.
>
> This one fixes a potentially bad problem, since we could ignore more than
> just hot key EC events by accident. Maybe it should go to -stable?

I vote for it

Regards,
Alex.

2008-07-17 14:35:48

by Alexey Starikovskiy

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Hi Alan,

Could you please test if your patch works with the last patch in #10919?

Thanks,
Alex.


Alan Jenkins wrote:
> It looks like this EC clears the SMI_EVT bit after every query, even if there
> are more events pending. The workaround is to repeatedly query the EC until
> it reports that no events remain.
>
> This fixes a regression in 2.6.26 (from 2.6.25.3). Initially reported as
> "Asus Eee PC hotkeys stop working if pressed quickly" in bugzilla
> <http://bugzilla.kernel.org/show_bug.cgi?id=11089>.
>
> The regression was caused by a recently added check for interrupt storms.
> The Eee PC triggers this check and switches to polling. When multiple events
> arrive between polling intervals, only one is fetched from the EC. This causes
> erroneous behaviour; ultimately events stop being delivered altogether when the
> EC buffer overflows.
>
> Signed-off-by: Alan Jenkins <[email protected]>
>
> ---
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 5622aee..2b4c5a2 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -459,14 +459,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)
>
> EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);
>
> -static void acpi_ec_gpe_query(void *ec_cxt)
> +static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
> {
> - struct acpi_ec *ec = ec_cxt;
> - u8 value = 0;
> struct acpi_ec_query_handler *handler, copy;
>
> - if (!ec || acpi_ec_query(ec, &value))
> - return;
> mutex_lock(&ec->lock);
> list_for_each_entry(handler, &ec->list, node) {
> if (value == handler->query_bit) {
> @@ -484,6 +480,18 @@ static void acpi_ec_gpe_query(void *ec_cxt)
> mutex_unlock(&ec->lock);
> }
>
> +static void acpi_ec_gpe_query(void *ec_cxt)
> +{
> + struct acpi_ec *ec = ec_cxt;
> + u8 value = 0;
> +
> + if (!ec)
> + return;
> +
> + while (!acpi_ec_query(ec, &value))
> + acpi_ec_gpe_run_handler(ec, value);
> +}
> +
> static u32 acpi_ec_gpe_handler(void *data)
> {
> acpi_status status = AE_OK;
>
>

2008-07-17 16:02:53

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Alexey Starikovskiy wrote:
> Hi Alan,
>
> Could you please test if your patch works with the last patch in #10919?
>
> Thanks,
> Alex.
Vacuously so.

My patch still applies, but #10919 makes it obsolete. My patch fixed a
bug that shows up in polling mode. #10919 kills polling mode.

I've tested v2.6.26 + #10919 and it works fine (except for spamming the
kernel log - please read my Bugzilla comment).


It appears that interrupt mode suffered from a race which is very
similar to my original problem. If two GPE interrupts arrive before the
workqueue runs, then the second interrupt will be ignored because
EC_FLAGS_QUERY_PENDING is still set. This will happen with any EC if
interrupts are very close together, right?

I think my patch also fixes this theoretical problem. But I'd rather
you took over on this. I was already confused by ec.c in v2.6.26, and
with #10919 I understand it even less. E.g. why is
ec_switch_to_poll_mode() still present; what does it do now do_ec_poll()
is removed?

I'm happy to work on this with you, but I'd need to be able understand
the code first :-(.

Alan

Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
> Henrique de Moraes Holschuh wrote:
>> On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
>>> Thanks for the patch, ACK.
>>
>> This one fixes a potentially bad problem, since we could ignore more than
>> just hot key EC events by accident. Maybe it should go to -stable?
>
> I vote for it

Well, in that case, it would be best to tack a Cc: [email protected] git
footer to it right away, I think.

IMHO, it would also be nice if the commit message made it more clear that
the issue it solves can affect much more serious ACPI events than just hot
key presses.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-07-17 16:45:25

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Henrique de Moraes Holschuh wrote:
> On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
>
>> Henrique de Moraes Holschuh wrote:
>>
>>> On Thu, 17 Jul 2008, Alexey Starikovskiy wrote:
>>>
>>>> Thanks for the patch, ACK.
>>>>
>>> This one fixes a potentially bad problem, since we could ignore more than
>>> just hot key EC events by accident. Maybe it should go to -stable?
>>>
>> I vote for it
>>
>
> Well, in that case, it would be best to tack a Cc: [email protected] git
> footer to it right away, I think.
>
> IMHO, it would also be nice if the commit message made it more clear that
> the issue it solves can affect much more serious ACPI events than just hot
> key presses.
>
>
Actually Alexey has another patch in bugzilla (#10919) which resolves
this issue in a better way. It avoids polling altogether, which is good
because it means you get events immediately. My laptop has backlight
adjustment hotkeys with hardware autorepeat, so it looks really jerky
with polling.

So I think I should withdraw my patch and leave this to Alexey. I've
tested his fix on my laptop and it works fine. It needs some more work
though - e.g. at the moment it spams the kernel log.

Thanks,
Alan

2008-07-17 16:45:46

by Alexey Starikovskiy

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Alan Jenkins wrote:
> Alexey Starikovskiy wrote:
>> Hi Alan,
>>
>> Could you please test if your patch works with the last patch in #10919?
>>
>> Thanks,
>> Alex.
> Vacuously so.
>
> My patch still applies, but #10919 makes it obsolete.
Not so, there are two polls in ec.c, first is poll for change in status register,
which gave the name to the mode, and still exists; the other is for event
in embedded controller, which was introduced to properly solve #9998, and part of
it is removed by patch in #10919.
> My patch fixed a
> bug that shows up in polling mode. #10919 kills polling mode.

>
> I've tested v2.6.26 + #10919 and it works fine (except for spamming the
> kernel log - please read my Bugzilla comment).
>
>
> It appears that interrupt mode suffered from a race which is very
> similar to my original problem. If two GPE interrupts arrive before the
> workqueue runs, then the second interrupt will be ignored because
> EC_FLAGS_QUERY_PENDING is still set. This will happen with any EC if
> interrupts are very close together, right?
The notion of queue in embedded controller is new, it was never mentioned in
ACPI spec, so the driver was written with assumption that new query interrupt should
not arrive before we service previous one. There is even a chart in how interrupts
should occur near the EC query command...
>
> I think my patch also fixes this theoretical problem.
I think, this is not a theoretical problem, but the problem we've tried to solve in
#9998, #10724, and so on.
> But I'd rather
> you took over on this. I was already confused by ec.c in v2.6.26, and
> with #10919 I understand it even less. E.g. why is
> ec_switch_to_poll_mode() still present; what does it do now do_ec_poll()
> is removed?
See above, I still disable EC GPE for the time than we have pending query,
so we better not wait for it to check the status register
>
> I'm happy to work on this with you, but I'd need to be able understand
> the code first :-(.
Well, with this patch of yours, I guess, we will not have too many problems in EC left :-)
>
> Alan

Thanks again,
Alex.

Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

On Thu, 17 Jul 2008, Alan Jenkins wrote:
> Actually Alexey has another patch in bugzilla (#10919) which resolves
> this issue in a better way. It avoids polling altogether, which is good
> because it means you get events immediately. My laptop has backlight
> adjustment hotkeys with hardware autorepeat, so it looks really jerky
> with polling.

We need to get ALL pending events from the EC, whether in polling mode or in
interrupt mode (and we must make sure that we ARE going to queue any new
interrupt before we stop checking for an empty pending event queue,
otherwise there is a small window where a new event might get queued, and we
are still masking the GPE and don't notice it). As long as the fix does
that, it is all I care :-)

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-07-17 18:55:13

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Alexey Starikovskiy wrote:
> Alan Jenkins wrote:
>> Alexey Starikovskiy wrote:
>>> Hi Alan,
>>>
>>> Could you please test if your patch works with the last patch in
>>> #10919?
>>>
>>> Thanks,
>>> Alex.
>> Vacuously so.
>>
>> My patch still applies, but #10919 makes it obsolete.
> Not so, there are two polls in ec.c, first is poll for change in
> status register,
> which gave the name to the mode, and still exists; the other is for event
> in embedded controller, which was introduced to properly solve #9998,
> and part of
> it is removed by patch in #10919.
>> My patch fixed a
>> bug that shows up in polling mode. #10919 kills polling mode.
>
>>
>> I've tested v2.6.26 + #10919 and it works fine (except for spamming the
>> kernel log - please read my Bugzilla comment).
>>
>>
>> It appears that interrupt mode suffered from a race which is very
>> similar to my original problem. If two GPE interrupts arrive before the
>> workqueue runs, then the second interrupt will be ignored because
>> EC_FLAGS_QUERY_PENDING is still set. This will happen with any EC if
>> interrupts are very close together, right?
> The notion of queue in embedded controller is new, it was never
> mentioned in
> ACPI spec, so the driver was written with assumption that new query
> interrupt should
> not arrive before we service previous one. There is even a chart in
> how interrupts
> should occur near the EC query command...
>>
>> I think my patch also fixes this theoretical problem.
> I think, this is not a theoretical problem, but the problem we've
> tried to solve in
> #9998, #10724, and so on.
>> But I'd rather
>> you took over on this. I was already confused by ec.c in v2.6.26, and
>> with #10919 I understand it even less. E.g. why is
>> ec_switch_to_poll_mode() still present; what does it do now do_ec_poll()
>> is removed?
> See above, I still disable EC GPE for the time than we have pending
> query,
> so we better not wait for it to check the status register
>>
>> I'm happy to work on this with you, but I'd need to be able understand
>> the code first :-(.
> Well, with this patch of yours, I guess, we will not have too many
> problems in EC left :-)

OK, I'm happy now.

However, I'm now worried that I broke the semantics for
EC_FLAGS_QUERY_PENDING. In my patch it gets cleared after the first
query, even though I'm running queries in a loop until nothing is left.
It doesn't cause a problem in my patch, but it's unclean and might cause
confusion later on. It'd be better to clear it after the loop has
completed.

Can I fix my patch? If you ACK the new code below, then I'll resend
with a proper changelog, S-o-B, CC's from the -mm mail (including
[email protected]) and grovel to akpm, etc.

You're latest (quieter) work still applies on top and works fine.

Thanks
Alan

---

From: Alan Jenkins <[email protected]>
Tested-by: Alan Jenkins <[email protected]>

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 5622aee..2a42392 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -230,8 +230,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec, u8 command,
"finish-write timeout, command = %d\n", command);
goto end;
}
- } else if (command == ACPI_EC_COMMAND_QUERY)
- clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+ }

for (; rdata_len > 0; --rdata_len) {
result = acpi_ec_wait(ec, ACPI_EC_EVENT_OBF_1, force_poll);
@@ -459,14 +458,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)

EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);

-static void acpi_ec_gpe_query(void *ec_cxt)
+static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
{
- struct acpi_ec *ec = ec_cxt;
- u8 value = 0;
struct acpi_ec_query_handler *handler, copy;

- if (!ec || acpi_ec_query(ec, &value))
- return;
mutex_lock(&ec->lock);
list_for_each_entry(handler, &ec->list, node) {
if (value == handler->query_bit) {
@@ -484,6 +479,20 @@ static void acpi_ec_gpe_query(void *ec_cxt)
mutex_unlock(&ec->lock);
}

+static void acpi_ec_gpe_query(void *ec_cxt)
+{
+ struct acpi_ec *ec = ec_cxt;
+ u8 value = 0;
+
+ if (!ec)
+ return;
+
+ while (acpi_ec_query(ec, &value) != 0)
+ acpi_ec_gpe_run_handler(ec, value);
+
+ clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+}
+
static u32 acpi_ec_gpe_handler(void *data)
{
acpi_status status = AE_OK;


2008-07-17 19:04:44

by Alexey Starikovskiy

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Alan Jenkins wrote:
> Alexey Starikovskiy wrote:
>> Alan Jenkins wrote:
>>> Alexey Starikovskiy wrote:
>>>> Hi Alan,
>>>>
>>>> Could you please test if your patch works with the last patch in
>>>> #10919?
>>>>
>>>> Thanks,
>>>> Alex.
>>> Vacuously so.
>>>
>>> My patch still applies, but #10919 makes it obsolete.
>> Not so, there are two polls in ec.c, first is poll for change in
>> status register,
>> which gave the name to the mode, and still exists; the other is for event
>> in embedded controller, which was introduced to properly solve #9998,
>> and part of
>> it is removed by patch in #10919.
>>> My patch fixed a
>>> bug that shows up in polling mode. #10919 kills polling mode.
>>> I've tested v2.6.26 + #10919 and it works fine (except for spamming the
>>> kernel log - please read my Bugzilla comment).
>>>
>>>
>>> It appears that interrupt mode suffered from a race which is very
>>> similar to my original problem. If two GPE interrupts arrive before the
>>> workqueue runs, then the second interrupt will be ignored because
>>> EC_FLAGS_QUERY_PENDING is still set. This will happen with any EC if
>>> interrupts are very close together, right?
>> The notion of queue in embedded controller is new, it was never
>> mentioned in
>> ACPI spec, so the driver was written with assumption that new query
>> interrupt should
>> not arrive before we service previous one. There is even a chart in
>> how interrupts
>> should occur near the EC query command...
>>> I think my patch also fixes this theoretical problem.
>> I think, this is not a theoretical problem, but the problem we've
>> tried to solve in
>> #9998, #10724, and so on.
>>> But I'd rather
>>> you took over on this. I was already confused by ec.c in v2.6.26, and
>>> with #10919 I understand it even less. E.g. why is
>>> ec_switch_to_poll_mode() still present; what does it do now do_ec_poll()
>>> is removed?
>> See above, I still disable EC GPE for the time than we have pending
>> query,
>> so we better not wait for it to check the status register
>>> I'm happy to work on this with you, but I'd need to be able understand
>>> the code first :-(.
>> Well, with this patch of yours, I guess, we will not have too many
>> problems in EC left :-)
>
> OK, I'm happy now.
>
> However, I'm now worried that I broke the semantics for
> EC_FLAGS_QUERY_PENDING. In my patch it gets cleared after the first
> query, even though I'm running queries in a loop until nothing is left.
> It doesn't cause a problem in my patch, but it's unclean and might cause
> confusion later on. It'd be better to clear it after the loop has
> completed.
Right.
>
> Can I fix my patch? If you ACK the new code below, then I'll resend
> with a proper changelog, S-o-B, CC's from the -mm mail (including
> [email protected]) and grovel to akpm, etc.
ACK
>
> You're latest (quieter) work still applies on top and works fine.
Good.
>
> Thanks
> Alan
Thanks,
Alex.
>
> ---
>
> From: Alan Jenkins <[email protected]>
> Tested-by: Alan Jenkins <[email protected]>
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 5622aee..2a42392 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -230,8 +230,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec, u8 command,
> "finish-write timeout, command = %d\n", command);
> goto end;
> }
> - } else if (command == ACPI_EC_COMMAND_QUERY)
> - clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> + }
>
> for (; rdata_len > 0; --rdata_len) {
> result = acpi_ec_wait(ec, ACPI_EC_EVENT_OBF_1, force_poll);
> @@ -459,14 +458,10 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)
>
> EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);
>
> -static void acpi_ec_gpe_query(void *ec_cxt)
> +static void acpi_ec_gpe_run_handler(struct acpi_ec *ec, u8 value)
> {
> - struct acpi_ec *ec = ec_cxt;
> - u8 value = 0;
> struct acpi_ec_query_handler *handler, copy;
>
> - if (!ec || acpi_ec_query(ec, &value))
> - return;
> mutex_lock(&ec->lock);
> list_for_each_entry(handler, &ec->list, node) {
> if (value == handler->query_bit) {
> @@ -484,6 +479,20 @@ static void acpi_ec_gpe_query(void *ec_cxt)
> mutex_unlock(&ec->lock);
> }
>
> +static void acpi_ec_gpe_query(void *ec_cxt)
> +{
> + struct acpi_ec *ec = ec_cxt;
> + u8 value = 0;
> +
> + if (!ec)
> + return;
> +
> + while (acpi_ec_query(ec, &value) != 0)
> + acpi_ec_gpe_run_handler(ec, value);
> +
> + clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> +}
> +
> static u32 acpi_ec_gpe_handler(void *data)
> {
> acpi_status status = AE_OK;
>
>
>

2008-07-17 19:07:23

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Henrique de Moraes Holschuh wrote:
> On Thu, 17 Jul 2008, Alan Jenkins wrote:
>
>> Actually Alexey has another patch in bugzilla (#10919) which resolves
>> this issue in a better way. It avoids polling altogether, which is good
>> because it means you get events immediately. My laptop has backlight
>> adjustment hotkeys with hardware autorepeat, so it looks really jerky
>> with polling.
>>
>
> We need to get ALL pending events from the EC, whether in polling mode or in
> interrupt mode

> (and we must make sure that we ARE going to queue any new
> interrupt before we stop checking for an empty pending event queue,
> otherwise there is a small window where a new event might get queued, and we
> are still masking the GPE and don't notice it). As long as the fix does
> that, it is all I care :-)
>
OK, I forgot about that. I need to do some reading on kernel
programming, workqueues in particular, and redo my patch again. Sorry,
Alexey! AFAICS there's still this (very) small window where events could
be dropped.

Alan

2008-07-19 11:37:32

by Alan Jenkins

[permalink] [raw]
Subject: [PATCH 0/3] acpi: GPE fixes

Here's what I came up with -

1. I was fighting against EC_FLAGS_QUERY_PENDING. This was used to ignore
multiple successive GPE interrupts and treat them as a single GPE instead.
That's the exact opposite of what we want to do. Let's get rid of it.

2. Then we can apply my original patch to fix GPE polling on the Asus EeePC,
by repeatedly querying for GPEs until there are none left.

3. Finally, if I'm right then we now know how to handle "GPE interrupt storms".
Some EC's are raising multiple interrupts before we acknowledge them. but
they're just telling us how many events are pending. There's no harm in
that, so we don't ever need to disable GPE interrupts. Let's get rid of
GPE polling mode. (Code mainly stolen from Alexey).

Patch 3 would benefit from wider testing. Fortunately there are several open
bugs about GPEs so it should be easy to find testers :-).

Alan

2008-07-19 14:08:12

by Vegard Nossum

[permalink] [raw]
Subject: Re: [PATCH 0/3] acpi: GPE fixes

On Sat, Jul 19, 2008 at 1:37 PM, Alan Jenkins
<[email protected]> wrote:
> Here's what I came up with -
>
> 1. I was fighting against EC_FLAGS_QUERY_PENDING. This was used to ignore
> multiple successive GPE interrupts and treat them as a single GPE instead.
> That's the exact opposite of what we want to do. Let's get rid of it.
>
> 2. Then we can apply my original patch to fix GPE polling on the Asus EeePC,
> by repeatedly querying for GPEs until there are none left.
>
> 3. Finally, if I'm right then we now know how to handle "GPE interrupt storms".
> Some EC's are raising multiple interrupts before we acknowledge them. but
> they're just telling us how many events are pending. There's no harm in
> that, so we don't ever need to disable GPE interrupts. Let's get rid of
> GPE polling mode. (Code mainly stolen from Alexey).

Hi,

I have seen the "GPE storm" message before but it had no apparent
side-effects. Your patches do not seem to change this, although the
message is now gone. Thanks! (I have an Acer Aspire 5720 laptop.)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-08-12 23:29:37

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC


Did this get fixed yet?

I have an patch in -mm which I just restored (I had to tempdrop it
because the acpi tree was busted for some time). But it seems to be
old.

http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
afaik, still unfixed, as is 2.6.27-rc.

Thanks.

2008-08-13 10:21:50

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Andrew Morton wrote:
> Did this get fixed yet?
>
> I have an patch in -mm which I just restored (I had to tempdrop it
> because the acpi tree was busted for some time). But it seems to be
> old.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
> but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
> afaik, still unfixed, as is 2.6.27-rc.
>
That's correct. I think this specific patch should go in 2.6.27 and
2.6.26-stable. No objections have been raised so far.

I still need this patch to make my brightness and volume control keys
usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
bug). This is true even after applying the latest patches from bug
10919 (#25 + #27).

I think the 10919 fix makes it harder to reproduce, but it definitely
still happens. I guess this is because the polling-driven EC
transactions add 1ms delays between each byte. The slower timings leave
a window where the buggy behaviour of my EC can make a difference. (It
has been seen to clear the "pending event" bit after a single event is
read, despite having more events pending).

There are more serious consequences of this bug. After a while it can
confuse the EC enough to cause lockups or reboots during boot, or after
pressing a single hotkey. This bad state is preserved over reboots,
even into known good kernels. Fortunately the badness clears when power
is removed for a long enough period. For a while I was worried that
something had physically burnt out.

Thanks
Alan

2008-08-13 10:47:29

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

On Wed, 13 Aug 2008 11:21:10 +0100 Alan Jenkins <[email protected]> wrote:

> Andrew Morton wrote:
> > Did this get fixed yet?
> >
> > I have an patch in -mm which I just restored (I had to tempdrop it
> > because the acpi tree was busted for some time). But it seems to be
> > old.
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
> > but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
> > afaik, still unfixed, as is 2.6.27-rc.
> >
> That's correct. I think this specific patch should go in 2.6.27 and
> 2.6.26-stable. No objections have been raised so far.
>
> I still need this patch to make my brightness and volume control keys
> usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
> bug). This is true even after applying the latest patches from bug
> 10919 (#25 + #27).
>

Confusing. Please send the patch which you think we should apply.

> I think the 10919 fix makes it harder to reproduce, but it definitely
> still happens. I guess this is because the polling-driven EC
> transactions add 1ms delays between each byte. The slower timings leave
> a window where the buggy behaviour of my EC can make a difference. (It
> has been seen to clear the "pending event" bit after a single event is
> read, despite having more events pending).
>
> There are more serious consequences of this bug. After a while it can
> confuse the EC enough to cause lockups or reboots during boot, or after
> pressing a single hotkey. This bad state is preserved over reboots,
> even into known good kernels. Fortunately the badness clears when power
> is removed for a long enough period. For a while I was worried that
> something had physically burnt out.

Oh gad. And there's no workaround?

2008-08-13 11:45:53

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Andrew Morton wrote:
> On Wed, 13 Aug 2008 11:21:10 +0100 Alan Jenkins <[email protected]> wrote:
>
>
>> Andrew Morton wrote:
>>
>>> Did this get fixed yet?
>>>
>>> I have an patch in -mm which I just restored (I had to tempdrop it
>>> because the acpi tree was busted for some time). But it seems to be
>>> old.
>>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
>>> but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
>>> afaik, still unfixed, as is 2.6.27-rc.
>>>
>>>
>> That's correct. I think this specific patch should go in 2.6.27 and
>> 2.6.26-stable. No objections have been raised so far.
>>
>> I still need this patch to make my brightness and volume control keys
>> usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
>> bug). This is true even after applying the latest patches from bug
>> 10919 (#25 + #27).
>>
>>
>
> Confusing. Please send the patch which you think we should apply.
>
>
>> I think the 10919 fix makes it harder to reproduce, but it definitely
>> still happens. I guess this is because the polling-driven EC
>> transactions add 1ms delays between each byte. The slower timings leave
>> a window where the buggy behaviour of my EC can make a difference. (It
>> has been seen to clear the "pending event" bit after a single event is
>> read, despite having more events pending).
>>
>> There are more serious consequences of this bug. After a while it can
>> confuse the EC enough to cause lockups or reboots during boot, or after
>> pressing a single hotkey. This bad state is preserved over reboots,
>> even into known good kernels. Fortunately the badness clears when power
>> is removed for a long enough period. For a while I was worried that
>> something had physically burnt out.
>>
>
> Oh gad. And there's no workaround?
>
Sorry, that was confusing.

The patch in currently in -mm _is_ the workaround for this damage. It
was not initially obvious just how important it was :-). I've
re-attached it as requested.

10919, "laggy hotkeys" is just what it says; ACPI EC events are slower
because of polling. It appears to be a more cosmetic issue which is
orthogonal to the _dropping_ of events.

Thanks
Alan


Attachments:
Attached Message (2.57 kB)

2008-08-13 11:51:25

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

[Dupe apology: CC'd to [email protected], with the right address this time]

Andrew Morton wrote:
> On Wed, 13 Aug 2008 11:21:10 +0100 Alan Jenkins <[email protected]> wrote:
>
>
>> Andrew Morton wrote:
>>
>>> Did this get fixed yet?
>>>
>>> I have an patch in -mm which I just restored (I had to tempdrop it
>>> because the acpi tree was busted for some time). But it seems to be
>>> old.
>>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
>>> but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
>>> afaik, still unfixed, as is 2.6.27-rc.
>>>
>>>
>> That's correct. I think this specific patch should go in 2.6.27 and
>> 2.6.26-stable. No objections have been raised so far.
>>
>> I still need this patch to make my brightness and volume control keys
>> usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
>> bug). This is true even after applying the latest patches from bug
>> 10919 (#25 + #27).
>>
>>
>
> Confusing. Please send the patch which you think we should apply.
>
>
>> I think the 10919 fix makes it harder to reproduce, but it definitely
>> still happens. I guess this is because the polling-driven EC
>> transactions add 1ms delays between each byte. The slower timings leave
>> a window where the buggy behaviour of my EC can make a difference. (It
>> has been seen to clear the "pending event" bit after a single event is
>> read, despite having more events pending).
>>
>> There are more serious consequences of this bug. After a while it can
>> confuse the EC enough to cause lockups or reboots during boot, or after
>> pressing a single hotkey. This bad state is preserved over reboots,
>> even into known good kernels. Fortunately the badness clears when power
>> is removed for a long enough period. For a while I was worried that
>> something had physically burnt out.
>>
>
> Oh gad. And there's no workaround?
>
Sorry, that was confusing.

The patch in currently in -mm _is_ the workaround for this damage. It
was not initially obvious just how important it was :-). I've
re-attached it as requested.

10919, "laggy hotkeys" is just what it says; ACPI EC events are slower
because of polling. It appears to be a more cosmetic issue which is
orthogonal to the _dropping_ of events.

Thanks
Alan


Attachments:
Attached Message (2.57 kB)

2008-08-13 14:14:42

by Maximilian Engelhardt

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

On Wednesday 13 August 2008, Alan Jenkins wrote:
> [Dupe apology: CC'd to [email protected], with the right address this time]
>
> Andrew Morton wrote:
> > On Wed, 13 Aug 2008 11:21:10 +0100 Alan Jenkins
<[email protected]> wrote:
> >> Andrew Morton wrote:
> >>> Did this get fixed yet?
> >>>
> >>> I have an patch in -mm which I just restored (I had to tempdrop it
> >>> because the acpi tree was busted for some time). But it seems to be
> >>> old.
> >>>
> >>> http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
> >>> but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
> >>> afaik, still unfixed, as is 2.6.27-rc.
> >>
> >> That's correct. I think this specific patch should go in 2.6.27 and
> >> 2.6.26-stable. No objections have been raised so far.
> >>
> >> I still need this patch to make my brightness and volume control keys
> >> usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
> >> bug). This is true even after applying the latest patches from bug
> >> 10919 (#25 + #27).
> >
> > Confusing. Please send the patch which you think we should apply.
> >
> >> I think the 10919 fix makes it harder to reproduce, but it definitely
> >> still happens. I guess this is because the polling-driven EC
> >> transactions add 1ms delays between each byte. The slower timings leave
> >> a window where the buggy behaviour of my EC can make a difference. (It
> >> has been seen to clear the "pending event" bit after a single event is
> >> read, despite having more events pending).
> >>
> >> There are more serious consequences of this bug. After a while it can
> >> confuse the EC enough to cause lockups or reboots during boot, or after
> >> pressing a single hotkey. This bad state is preserved over reboots,
> >> even into known good kernels. Fortunately the badness clears when power
> >> is removed for a long enough period. For a while I was worried that
> >> something had physically burnt out.
> >
> > Oh gad. And there's no workaround?
>
> Sorry, that was confusing.
>
> The patch in currently in -mm _is_ the workaround for this damage. It
> was not initially obvious just how important it was :-). I've
> re-attached it as requested.
>
> 10919, "laggy hotkeys" is just what it says; ACPI EC events are slower
> because of polling. It appears to be a more cosmetic issue which is
> orthogonal to the _dropping_ of events.
>
> Thanks
> Alan

This patch doesn't fix my problem (bug 10919), it only changes it a bit.
When I press the dimming key and hold it pressed the display should dim
up/down step by step as long as I hold the key pressed (that's how it has
always been).
I'm now running a 2.6.27-rc3 kernel with your patch applied and the display
does dim as described below.
When I press the dimming key first the display brightness doesn't change at
all, then it jumps multiple steps, stays there for a short while and again
jumps some steps till it reaches the end of the dimming interval.
It looks like the key presses (assuming holding down the key does generate
multiple key presses) get queued up, then all processed all at once, then
again queued up...

Maxi


Attachments:
(No filename) (3.09 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2008-08-13 14:39:21

by Alan Jenkins

[permalink] [raw]
Subject: Re: [PATCH] acpi: Avoid dropping rapid hotkey events (or other GPEs) on Asus EeePC

Maximilian Engelhardt wrote:
> On Wednesday 13 August 2008, Alan Jenkins wrote:
>
>> [Dupe apology: CC'd to [email protected], with the right address this time]
>>
>> Andrew Morton wrote:
>>
>>> On Wed, 13 Aug 2008 11:21:10 +0100 Alan Jenkins
>>>
> <[email protected]> wrote:
>
>>>> Andrew Morton wrote:
>>>>
>>>>> Did this get fixed yet?
>>>>>
>>>>> I have an patch in -mm which I just restored (I had to tempdrop it
>>>>> because the acpi tree was busted for some time). But it seems to be
>>>>> old.
>>>>>
>>>>> http://bugzilla.kernel.org/show_bug.cgi?id=10919 is marked "resolved"
>>>>> but the reporter (Maximilian) seems to think otherwise. 2.6.26.x is,
>>>>> afaik, still unfixed, as is 2.6.27-rc.
>>>>>
>>>> That's correct. I think this specific patch should go in 2.6.27 and
>>>> 2.6.26-stable. No objections have been raised so far.
>>>>
>>>> I still need this patch to make my brightness and volume control keys
>>>> usable in 2.6.27-rc3. (They auto-repeat fast enough to trigger the
>>>> bug). This is true even after applying the latest patches from bug
>>>> 10919 (#25 + #27).
>>>>
>>> Confusing. Please send the patch which you think we should apply.
>>>
>>>
>>>> I think the 10919 fix makes it harder to reproduce, but it definitely
>>>> still happens. I guess this is because the polling-driven EC
>>>> transactions add 1ms delays between each byte. The slower timings leave
>>>> a window where the buggy behaviour of my EC can make a difference. (It
>>>> has been seen to clear the "pending event" bit after a single event is
>>>> read, despite having more events pending).
>>>>
>>>> There are more serious consequences of this bug. After a while it can
>>>> confuse the EC enough to cause lockups or reboots during boot, or after
>>>> pressing a single hotkey. This bad state is preserved over reboots,
>>>> even into known good kernels. Fortunately the badness clears when power
>>>> is removed for a long enough period. For a while I was worried that
>>>> something had physically burnt out.
>>>>
>>> Oh gad. And there's no workaround?
>>>
>> Sorry, that was confusing.
>>
>> The patch in currently in -mm _is_ the workaround for this damage. It
>> was not initially obvious just how important it was :-). I've
>> re-attached it as requested.
>>
>> 10919, "laggy hotkeys" is just what it says; ACPI EC events are slower
>> because of polling. It appears to be a more cosmetic issue which is
>> orthogonal to the _dropping_ of events.
>>
>> Thanks
>> Alan
>>
>
> This patch doesn't fix my problem (bug 10919), it only changes it a bit.
> When I press the dimming key and hold it pressed the display should dim
> up/down step by step as long as I hold the key pressed (that's how it has
> always been).
> I'm now running a 2.6.27-rc3 kernel with your patch applied and the display
> does dim as described below.
> When I press the dimming key first the display brightness doesn't change at
> all, then it jumps multiple steps, stays there for a short while and again
> jumps some steps till it reaches the end of the dimming interval.
> It looks like the key presses (assuming holding down the key does generate
> multiple key presses) get queued up, then all processed all at once, then
> again queued up...
>
> Maxi
>
This is expected behaviour - I can explain it in detail if that helps.
I see the same on my laptop, though perhaps it is more annoying on yours.

The patch I've submitted here is not intended to fix #10919. You said
the patch at <http://bugzilla.kernel.org/show_bug.cgi?id=10919#c21>
worked for you, and I think the approach there is the way forward for
this "laggy" issue. However I don't know if it will be ready for this
kernel release. I pointed out a cosmetic issue with it to Alexey but
haven't heard anything since.

Alan