2014-02-26 15:43:46

by Kieran Clancy

[permalink] [raw]
Subject: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
continue to log events during sleep (lid open/close, AC plug/unplug,
battery level change), which accumulate in the EC until a buffer fills.
After the buffer is full (tests suggest it holds 8 events), GPEs stop
being triggered for new events. This state persists on wake or even on
power cycle, and prevents new events from being registered until the EC
is manually polled.

This is the root cause of a number of bugs, including AC not being
detected properly, lid close not triggering suspend, and low ambient
light not triggering the keyboard backlight. The bug also seemed to be
responsible for performance issues on at least one user's machine.

Juan Manuel Cabo found the cause of bug and the workaround of polling
the EC manually on wake.

This patch:
- Adds a function acpi_ec_clear() which polls the EC, at most
ACPI_EC_CLEAR_MAX (currently 20) times. A warning is logged if this
limit is reached.
- Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
system vendor is Samsung. This check could be replaced by several more
specific DMI vendor/product pairs, but it's likely that the bug
affects more Samsung products than just the five series mentioned
above. Further, it should not be harmful to run acpi_ec_clear() on
systems without the bug; it will return immediately after finding no
data waiting.
- Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
- Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()

References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
Signed-off-by: Kieran Clancy <[email protected]>
Signed-off-by: Lan Tianyu <[email protected]>
Signed-off-by: Juan Manuel Cabo <[email protected]>
Signed-off-by: Dennis Jansen <[email protected]>
Tested-by: Maurizio D'Addona <[email protected]>
Tested-by: San Zamoyski <[email protected]>
---
drivers/acpi/ec.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 959d41a..f437d9a 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -67,6 +67,8 @@ enum ec_command {
#define ACPI_EC_DELAY 500 /* Wait 500ms max. during EC ops */
#define ACPI_EC_UDELAY_GLK 1000 /* Wait 1ms max. to get global lock */
#define ACPI_EC_MSI_UDELAY 550 /* Wait 550us for MSI EC */
+#define ACPI_EC_CLEAR_MAX 20 /* Maximum number of events to query
+ * when trying to clear the EC */

enum {
EC_FLAGS_QUERY_PENDING, /* Query is pending */
@@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
+static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */

/* --------------------------------------------------------------------------
Transaction Management
@@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)

EXPORT_SYMBOL(ec_get_handle);

+static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
+
+/* run with locked ec mutex */
+static void acpi_ec_clear(struct acpi_ec *ec)
+{
+ int i, status;
+ u8 value = 0;
+
+ for (i = 0; i < ACPI_EC_CLEAR_MAX; i++) {
+ status = acpi_ec_query_unlocked(ec, &value);
+ if (status || !value)
+ break;
+ }
+
+ if (i == ACPI_EC_CLEAR_MAX)
+ pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
+ else
+ pr_info("%d stale EC events cleared\n", i);
+}
+
void acpi_ec_block_transactions(void)
{
struct acpi_ec *ec = first_ec;
@@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
mutex_lock(&ec->mutex);
/* Allow transactions to be carried out again */
clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
+
+ if (EC_FLAGS_CLEAR_ON_RESUME)
+ acpi_ec_clear(ec);
+
mutex_unlock(&ec->mutex);
}

@@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)

/* EC is fully operational, allow queries */
clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+
+ /* Some hardware may need the EC to be cleared before use */
+ if (EC_FLAGS_CLEAR_ON_RESUME) {
+ mutex_lock(&ec->mutex);
+ acpi_ec_clear(ec);
+ mutex_unlock(&ec->mutex);
+ }
return ret;
}

@@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
return 0;
}

+/*
+ * On some hardware it is necessary to clear events accumulated by the EC during
+ * sleep. These ECs stop reporting GPEs until they are manually polled, if too
+ * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
+ *
+ * https://bugzilla.kernel.org/show_bug.cgi?id=44161
+ *
+ * Ideally, the EC should also be instructed not to accumulate events during
+ * sleep (which Windows seems to do somehow), but the interface to control this
+ * behaviour is not known at this time.
+ *
+ * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
+ * however it is very likely that other Samsung models are affected.
+ *
+ * On systems which don't accumulate EC events during sleep, this extra check
+ * should be harmless.
+ */
+static int ec_clear_on_resume(const struct dmi_system_id *id)
+{
+ pr_debug("Detected system needing EC poll on resume.\n");
+ EC_FLAGS_CLEAR_ON_RESUME = 1;
+ return 0;
+}
+
static struct dmi_system_id ec_dmi_table[] __initdata = {
{
ec_skip_dsdt_scan, "Compal JFL92", {
@@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
ec_validate_ecdt, "ASUS hardware", {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
+ {
+ ec_clear_on_resume, "Samsung hardware", {
+ DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
{},
};

--
1.8.5.3


2014-02-26 23:30:56

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thursday, February 27, 2014 02:12:40 AM Kieran Clancy wrote:
> A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
> continue to log events during sleep (lid open/close, AC plug/unplug,
> battery level change), which accumulate in the EC until a buffer fills.
> After the buffer is full (tests suggest it holds 8 events), GPEs stop
> being triggered for new events. This state persists on wake or even on
> power cycle, and prevents new events from being registered until the EC
> is manually polled.
>
> This is the root cause of a number of bugs, including AC not being
> detected properly, lid close not triggering suspend, and low ambient
> light not triggering the keyboard backlight. The bug also seemed to be
> responsible for performance issues on at least one user's machine.
>
> Juan Manuel Cabo found the cause of bug and the workaround of polling
> the EC manually on wake.
>
> This patch:
> - Adds a function acpi_ec_clear() which polls the EC, at most
> ACPI_EC_CLEAR_MAX (currently 20) times. A warning is logged if this
> limit is reached.
> - Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
> system vendor is Samsung. This check could be replaced by several more
> specific DMI vendor/product pairs, but it's likely that the bug
> affects more Samsung products than just the five series mentioned
> above. Further, it should not be harmful to run acpi_ec_clear() on
> systems without the bug; it will return immediately after finding no
> data waiting.
> - Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
> - Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()
>
> References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
> References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
> References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
> Signed-off-by: Kieran Clancy <[email protected]>
> Signed-off-by: Lan Tianyu <[email protected]>
> Signed-off-by: Juan Manuel Cabo <[email protected]>
> Signed-off-by: Dennis Jansen <[email protected]>

There are too many sign-offs under this patch. I suppose some of them
should be Acked-by or Reviewed-by.

Are you the author?

> Tested-by: Maurizio D'Addona <[email protected]>
> Tested-by: San Zamoyski <[email protected]>
> ---
> drivers/acpi/ec.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 61 insertions(+)
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 959d41a..f437d9a 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -67,6 +67,8 @@ enum ec_command {
> #define ACPI_EC_DELAY 500 /* Wait 500ms max. during EC ops */
> #define ACPI_EC_UDELAY_GLK 1000 /* Wait 1ms max. to get global lock */
> #define ACPI_EC_MSI_UDELAY 550 /* Wait 550us for MSI EC */
> +#define ACPI_EC_CLEAR_MAX 20 /* Maximum number of events to query
> + * when trying to clear the EC */
>
> enum {
> EC_FLAGS_QUERY_PENDING, /* Query is pending */
> @@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
> static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
> static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
> static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
>
> /* --------------------------------------------------------------------------
> Transaction Management
> @@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)
>
> EXPORT_SYMBOL(ec_get_handle);
>
> +static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
> +
> +/* run with locked ec mutex */
> +static void acpi_ec_clear(struct acpi_ec *ec)
> +{
> + int i, status;
> + u8 value = 0;
> +
> + for (i = 0; i < ACPI_EC_CLEAR_MAX; i++) {
> + status = acpi_ec_query_unlocked(ec, &value);
> + if (status || !value)
> + break;
> + }
> +
> + if (i == ACPI_EC_CLEAR_MAX)
> + pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
> + else
> + pr_info("%d stale EC events cleared\n", i);
> +}
> +
> void acpi_ec_block_transactions(void)
> {
> struct acpi_ec *ec = first_ec;
> @@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
> mutex_lock(&ec->mutex);
> /* Allow transactions to be carried out again */
> clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
> +
> + if (EC_FLAGS_CLEAR_ON_RESUME)
> + acpi_ec_clear(ec);
> +
> mutex_unlock(&ec->mutex);
> }
>
> @@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)
>
> /* EC is fully operational, allow queries */
> clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> +
> + /* Some hardware may need the EC to be cleared before use */
> + if (EC_FLAGS_CLEAR_ON_RESUME) {
> + mutex_lock(&ec->mutex);
> + acpi_ec_clear(ec);
> + mutex_unlock(&ec->mutex);
> + }
> return ret;
> }
>
> @@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
> return 0;
> }
>
> +/*
> + * On some hardware it is necessary to clear events accumulated by the EC during
> + * sleep. These ECs stop reporting GPEs until they are manually polled, if too
> + * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
> + *
> + * https://bugzilla.kernel.org/show_bug.cgi?id=44161
> + *
> + * Ideally, the EC should also be instructed not to accumulate events during
> + * sleep (which Windows seems to do somehow), but the interface to control this
> + * behaviour is not known at this time.
> + *
> + * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
> + * however it is very likely that other Samsung models are affected.
> + *
> + * On systems which don't accumulate EC events during sleep, this extra check
> + * should be harmless.
> + */
> +static int ec_clear_on_resume(const struct dmi_system_id *id)
> +{
> + pr_debug("Detected system needing EC poll on resume.\n");
> + EC_FLAGS_CLEAR_ON_RESUME = 1;
> + return 0;
> +}
> +
> static struct dmi_system_id ec_dmi_table[] __initdata = {
> {
> ec_skip_dsdt_scan, "Compal JFL92", {
> @@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
> ec_validate_ecdt, "ASUS hardware", {
> DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
> DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
> + {
> + ec_clear_on_resume, "Samsung hardware", {
> + DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
> {},
> };
>
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-02-27 00:45:38

by Kieran Clancy

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thu, Feb 27, 2014 at 10:15 AM, Rafael J. Wysocki <[email protected]> wrote:
> On Thursday, February 27, 2014 02:12:40 AM Kieran Clancy wrote:
>> Signed-off-by: Kieran Clancy <[email protected]>
>> Signed-off-by: Lan Tianyu <[email protected]>
>> Signed-off-by: Juan Manuel Cabo <[email protected]>
>> Signed-off-by: Dennis Jansen <[email protected]>
>
> There are too many sign-offs under this patch. I suppose some of them
> should be Acked-by or Reviewed-by.
>
> Are you the author?

Sorry about that, I wasn't sure of the best way to acknowledge the
people involved.

I am the primary author, but I based the loop which calls
acpi_ec_query_unlocked() on a patch by Lan Tianyu. Juan provided the
initial idea (userspace workaround), and Dennis and Lan Tianyu
reviewed and suggested some changes to the code.

What would you usually do this kind of situation?

I suppose I could put something such as:

Signed-off-by: Kieran Clancy <[email protected]>
Signed-off-by: Lan Tianyu <[email protected]>
Suggested-by: Juan Manuel Cabo <[email protected]>
Reviewed-by: Lan Tianyu <[email protected]>
Reviewed-by: Dennis Jansen <[email protected]>

Would this be preferable?

Thanks,
Kieran Clancy

2014-02-27 01:00:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thursday, February 27, 2014 11:15:05 AM Kieran Clancy wrote:
> On Thu, Feb 27, 2014 at 10:15 AM, Rafael J. Wysocki <[email protected]> wrote:
> > On Thursday, February 27, 2014 02:12:40 AM Kieran Clancy wrote:
> >> Signed-off-by: Kieran Clancy <[email protected]>
> >> Signed-off-by: Lan Tianyu <[email protected]>
> >> Signed-off-by: Juan Manuel Cabo <[email protected]>
> >> Signed-off-by: Dennis Jansen <[email protected]>
> >
> > There are too many sign-offs under this patch. I suppose some of them
> > should be Acked-by or Reviewed-by.
> >
> > Are you the author?
>
> Sorry about that, I wasn't sure of the best way to acknowledge the
> people involved.
>
> I am the primary author, but I based the loop which calls
> acpi_ec_query_unlocked() on a patch by Lan Tianyu. Juan provided the
> initial idea (userspace workaround), and Dennis and Lan Tianyu
> reviewed and suggested some changes to the code.
>
> What would you usually do this kind of situation?

I usually add acknowledgements to the changelog without using any tags,
like "This patch is based on previous work by <somebody>." And then you
can give a link to that work in References:.

> I suppose I could put something such as:
>
> Signed-off-by: Kieran Clancy <[email protected]>
> Signed-off-by: Lan Tianyu <[email protected]>
> Suggested-by: Juan Manuel Cabo <[email protected]>
> Reviewed-by: Lan Tianyu <[email protected]>
> Reviewed-by: Dennis Jansen <[email protected]>
>
> Would this be preferable?

Yes, that'd be fine by me.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-02-27 01:00:28

by Kieran Clancy

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thu, Feb 27, 2014 at 11:15 AM, Kieran Clancy <[email protected]> wrote:
>
> I am the primary author, but I based the loop which calls
> acpi_ec_query_unlocked() on a patch by Lan Tianyu. Juan provided the
> initial idea (userspace workaround), and Dennis and Lan Tianyu
> reviewed and suggested some changes to the code.

Oh, and I should add that Juan and Dennis both helped to test the
patch extensively.

Kieran

2014-02-27 01:01:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thursday, February 27, 2014 11:29:53 AM Kieran Clancy wrote:
> On Thu, Feb 27, 2014 at 11:15 AM, Kieran Clancy <[email protected]> wrote:
> >
> > I am the primary author, but I based the loop which calls
> > acpi_ec_query_unlocked() on a patch by Lan Tianyu. Juan provided the
> > initial idea (userspace workaround), and Dennis and Lan Tianyu
> > reviewed and suggested some changes to the code.
>
> Oh, and I should add that Juan and Dennis both helped to test the
> patch extensively.

You can use Tested-by or Reported-and-tested-by for that.

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-02-27 01:10:15

by Kieran Clancy

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thu, Feb 27, 2014 at 11:46 AM, Rafael J. Wysocki <[email protected]> wrote:
> On Thursday, February 27, 2014 11:29:53 AM Kieran Clancy wrote:
>>
>> Oh, and I should add that Juan and Dennis both helped to test the
>> patch extensively.
>
> You can use Tested-by or Reported-and-tested-by for that.

Is it okay to have someone listed as both 'Reviewed-by' and
'Tested-by', or both 'Suggested-by' and 'Tested-by'?

Thank you for your help. Once I fix up these things, I will send a patch v2.

Kieran.

2014-02-27 01:59:32

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

Kieran Clancy wrote:
> A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
> continue to log events during sleep (lid open/close, AC plug/unplug,
> battery level change), which accumulate in the EC until a buffer fills.
> After the buffer is full (tests suggest it holds 8 events), GPEs stop
> being triggered for new events. This state persists on wake or even on
> power cycle, and prevents new events from being registered until the EC
> is manually polled.
>
>

that's really nasty EC firmware!

> This is the root cause of a number of bugs, including AC not being
> detected properly, lid close not triggering suspend, and low ambient
> light not triggering the keyboard backlight. The bug also seemed to be
> responsible for performance issues on at least one user's machine.
>
> Juan Manuel Cabo found the cause of bug and the workaround of polling
> the EC manually on wake.
>
> This patch:
> - Adds a function acpi_ec_clear() which polls the EC, at most
> ACPI_EC_CLEAR_MAX (currently 20) times. A warning is logged if this
> limit is reached.
> - Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
> system vendor is Samsung. This check could be replaced by several more
> specific DMI vendor/product pairs, but it's likely that the bug
> affects more Samsung products than just the five series mentioned
> above. Further, it should not be harmful to run acpi_ec_clear() on
> systems without the bug; it will return immediately after finding no
> data waiting.
> - Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
> - Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()
>
> References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
> References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
> References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
> Signed-off-by: Kieran Clancy<[email protected]>
> Signed-off-by: Lan Tianyu<[email protected]>
> Signed-off-by: Juan Manuel Cabo<[email protected]>
> Signed-off-by: Dennis Jansen<[email protected]>
> Tested-by: Maurizio D'Addona<[email protected]>
> Tested-by: San Zamoyski<[email protected]>
> ---
> drivers/acpi/ec.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 61 insertions(+)
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 959d41a..f437d9a 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -67,6 +67,8 @@ enum ec_command {
> #define ACPI_EC_DELAY 500 /* Wait 500ms max. during EC ops */
> #define ACPI_EC_UDELAY_GLK 1000 /* Wait 1ms max. to get global lock */
> #define ACPI_EC_MSI_UDELAY 550 /* Wait 550us for MSI EC */
> +#define ACPI_EC_CLEAR_MAX 20 /* Maximum number of events to query
> + * when trying to clear the EC */
>

20 is enough?
the query index is length of a byte.

>
> enum {
> EC_FLAGS_QUERY_PENDING, /* Query is pending */
> @@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
> static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
> static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
> static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
>

seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
seems too long :-)

>
> /* --------------------------------------------------------------------------
> Transaction Management
> @@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)
>
> EXPORT_SYMBOL(ec_get_handle);
>
> +static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
> +
> +/* run with locked ec mutex */
> +static void acpi_ec_clear(struct acpi_ec *ec)
> +{
> + int i, status;
> + u8 value = 0;
> +
> + for (i = 0; i< ACPI_EC_CLEAR_MAX; i++) {
> + status = acpi_ec_query_unlocked(ec,&value);
> + if (status || !value)
> + break;
> + }
> +
> + if (i == ACPI_EC_CLEAR_MAX)
> + pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
> + else
> + pr_info("%d stale EC events cleared\n", i);
> +}
> +
> void acpi_ec_block_transactions(void)
> {
> struct acpi_ec *ec = first_ec;
> @@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
> mutex_lock(&ec->mutex);
> /* Allow transactions to be carried out again */
> clear_bit(EC_FLAGS_BLOCKED,&ec->flags);
> +
> + if (EC_FLAGS_CLEAR_ON_RESUME)
> + acpi_ec_clear(ec);
> +
> mutex_unlock(&ec->mutex);
> }
>
> @@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)
>
> /* EC is fully operational, allow queries */
> clear_bit(EC_FLAGS_QUERY_PENDING,&ec->flags);
> +
> + /* Some hardware may need the EC to be cleared before use */
>

description is implicit, should specify what we clear is Q event, not EC.

Thanks!
Li Guang

> + if (EC_FLAGS_CLEAR_ON_RESUME) {
> + mutex_lock(&ec->mutex);
> + acpi_ec_clear(ec);
> + mutex_unlock(&ec->mutex);
> + }
> return ret;
> }
>
> @@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
> return 0;
> }
>
> +/*
> + * On some hardware it is necessary to clear events accumulated by the EC during
> + * sleep. These ECs stop reporting GPEs until they are manually polled, if too
> + * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
> + *
> + * https://bugzilla.kernel.org/show_bug.cgi?id=44161
> + *
> + * Ideally, the EC should also be instructed not to accumulate events during
> + * sleep (which Windows seems to do somehow), but the interface to control this
> + * behaviour is not known at this time.
> + *
> + * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
> + * however it is very likely that other Samsung models are affected.
> + *
> + * On systems which don't accumulate EC events during sleep, this extra check
> + * should be harmless.
> + */
> +static int ec_clear_on_resume(const struct dmi_system_id *id)
> +{
> + pr_debug("Detected system needing EC poll on resume.\n");
> + EC_FLAGS_CLEAR_ON_RESUME = 1;
> + return 0;
> +}
> +
> static struct dmi_system_id ec_dmi_table[] __initdata = {
> {
> ec_skip_dsdt_scan, "Compal JFL92", {
> @@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
> ec_validate_ecdt, "ASUS hardware", {
> DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
> DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
> + {
> + ec_clear_on_resume, "Samsung hardware", {
> + DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
> {},
> };
>
>

2014-02-27 02:33:15

by Juan Manuel Cabo

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems


> that's really nasty EC firmware!

Yes!
And this bug has been unsolved for about two years.

> 20 is enough?
> the query index is length of a byte.
>
According to our humble tests, 8 is the maximum number of
accumulated events. For instance, if the one plugs or unplugs the PSU
16 times, (or battery % changes 16 times) during S3 sleep, then the
EC returns no more than 8 events when polled by this patch or by
my userspace util.
And having reached 8 events, it won't produce its GPE until queried.

> the query index is length of a byte.

There is no query index, unless you refer to something else (I'm sorry if its
something that escapes me).
For us, a query is just: send 0x84 through EC CMD port, and read status
from CMD port and event type from EC DATA port. This is done with
the usual ec.c functions that would handle a query after a GPE interrupt,
but using them instead to poll (not GPE initiated) at awake. The EC would
then return status without 0x20 mask and 'event type'==0 when no more left.

--
Juan Manuel Cabo <[email protected]>



>>
>> enum {
>> EC_FLAGS_QUERY_PENDING, /* Query is pending */
>> @@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
>> static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
>> static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
>> static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
>> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
>>
>
> seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
> seems too long :-)
>
>>
>> /* --------------------------------------------------------------------------
>> Transaction Management
>> @@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)
>>
>> EXPORT_SYMBOL(ec_get_handle);
>>
>> +static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
>> +
>> +/* run with locked ec mutex */
>> +static void acpi_ec_clear(struct acpi_ec *ec)
>> +{
>> + int i, status;
>> + u8 value = 0;
>> +
>> + for (i = 0; i< ACPI_EC_CLEAR_MAX; i++) {
>> + status = acpi_ec_query_unlocked(ec,&value);
>> + if (status || !value)
>> + break;
>> + }
>> +
>> + if (i == ACPI_EC_CLEAR_MAX)
>> + pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
>> + else
>> + pr_info("%d stale EC events cleared\n", i);
>> +}
>> +
>> void acpi_ec_block_transactions(void)
>> {
>> struct acpi_ec *ec = first_ec;
>> @@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
>> mutex_lock(&ec->mutex);
>> /* Allow transactions to be carried out again */
>> clear_bit(EC_FLAGS_BLOCKED,&ec->flags);
>> +
>> + if (EC_FLAGS_CLEAR_ON_RESUME)
>> + acpi_ec_clear(ec);
>> +
>> mutex_unlock(&ec->mutex);
>> }
>>
>> @@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)
>>
>> /* EC is fully operational, allow queries */
>> clear_bit(EC_FLAGS_QUERY_PENDING,&ec->flags);
>> +
>> + /* Some hardware may need the EC to be cleared before use */
>>
>
> description is implicit, should specify what we clear is Q event, not EC.
>
> Thanks!
> Li Guang
>
>> + if (EC_FLAGS_CLEAR_ON_RESUME) {
>> + mutex_lock(&ec->mutex);
>> + acpi_ec_clear(ec);
>> + mutex_unlock(&ec->mutex);
>> + }
>> return ret;
>> }
>>
>> @@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
>> return 0;
>> }
>>
>> +/*
>> + * On some hardware it is necessary to clear events accumulated by the EC during
>> + * sleep. These ECs stop reporting GPEs until they are manually polled, if too
>> + * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
>> + *
>> + * https://bugzilla.kernel.org/show_bug.cgi?id=44161
>> + *
>> + * Ideally, the EC should also be instructed not to accumulate events during
>> + * sleep (which Windows seems to do somehow), but the interface to control this
>> + * behaviour is not known at this time.
>> + *
>> + * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
>> + * however it is very likely that other Samsung models are affected.
>> + *
>> + * On systems which don't accumulate EC events during sleep, this extra check
>> + * should be harmless.
>> + */
>> +static int ec_clear_on_resume(const struct dmi_system_id *id)
>> +{
>> + pr_debug("Detected system needing EC poll on resume.\n");
>> + EC_FLAGS_CLEAR_ON_RESUME = 1;
>> + return 0;
>> +}
>> +
>> static struct dmi_system_id ec_dmi_table[] __initdata = {
>> {
>> ec_skip_dsdt_scan, "Compal JFL92", {
>> @@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>> ec_validate_ecdt, "ASUS hardware", {
>> DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>> DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>> + {
>> + ec_clear_on_resume, "Samsung hardware", {
>> + DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
>> {},
>> };
>>
>>
>
>

2014-02-27 03:14:52

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

Juan Manuel Cabo wrote:
>
>
>> that's really nasty EC firmware!
>>
> Yes!
> And this bug has been unsolved for about two years.
>
>
>> 20 is enough?
>> the query index is length of a byte.
>>
>>
> According to our humble tests, 8 is the maximum number of
> accumulated events. For instance, if the one plugs or unplugs the PSU
> 16 times, (or battery % changes 16 times) during S3 sleep, then the
> EC returns no more than 8 events when polled by this patch or by
> my userspace util.
> And having reached 8 events, it won't produce its GPE until queried.
>

that surely indicts their EC firmware only queued 8 events,

>
>> the query index is length of a byte.
>>
> There is no query index, unless you refer to something else (I'm sorry if its
> something that escapes me).
>

oh, sorry, I'm referring internal EC firmware code
for Q event queuing, not ACPI SPEC, ;-)
for machine you tested, 8 is the queue size,
but for some unknown also nasty EC firmwares(let's suppose it exists),
it may queue more Q events.
and I saw several firmwares queued 32 events by default,
then, let's say, they be used for some samsung products,
and also they also forgot to deal with sleep/resume state,
then, we'll also leave stale Q event there.

Thanks!

> For us, a query is just: send 0x84 through EC CMD port, and read status
> from CMD port and event type from EC DATA port. This is done with
> the usual ec.c functions that would handle a query after a GPE interrupt,
> but using them instead to poll (not GPE initiated) at awake. The EC would
> then return status without 0x20 mask and 'event type'==0 when no more left.
>
> --
> Juan Manuel Cabo<[email protected]>
>
>
>
>
>>> enum {
>>> EC_FLAGS_QUERY_PENDING, /* Query is pending */
>>> @@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
>>> static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
>>> static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
>>> static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
>>> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
>>>
>>>
>> seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
>> seems too long :-)
>>
>>
>>> /* --------------------------------------------------------------------------
>>> Transaction Management
>>> @@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)
>>>
>>> EXPORT_SYMBOL(ec_get_handle);
>>>
>>> +static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
>>> +
>>> +/* run with locked ec mutex */
>>> +static void acpi_ec_clear(struct acpi_ec *ec)
>>> +{
>>> + int i, status;
>>> + u8 value = 0;
>>> +
>>> + for (i = 0; i< ACPI_EC_CLEAR_MAX; i++) {
>>> + status = acpi_ec_query_unlocked(ec,&value);
>>> + if (status || !value)
>>> + break;
>>> + }
>>> +
>>> + if (i == ACPI_EC_CLEAR_MAX)
>>> + pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
>>> + else
>>> + pr_info("%d stale EC events cleared\n", i);
>>> +}
>>> +
>>> void acpi_ec_block_transactions(void)
>>> {
>>> struct acpi_ec *ec = first_ec;
>>> @@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
>>> mutex_lock(&ec->mutex);
>>> /* Allow transactions to be carried out again */
>>> clear_bit(EC_FLAGS_BLOCKED,&ec->flags);
>>> +
>>> + if (EC_FLAGS_CLEAR_ON_RESUME)
>>> + acpi_ec_clear(ec);
>>> +
>>> mutex_unlock(&ec->mutex);
>>> }
>>>
>>> @@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)
>>>
>>> /* EC is fully operational, allow queries */
>>> clear_bit(EC_FLAGS_QUERY_PENDING,&ec->flags);
>>> +
>>> + /* Some hardware may need the EC to be cleared before use */
>>>
>>>
>> description is implicit, should specify what we clear is Q event, not EC.
>>
>> Thanks!
>> Li Guang
>>
>>
>>> + if (EC_FLAGS_CLEAR_ON_RESUME) {
>>> + mutex_lock(&ec->mutex);
>>> + acpi_ec_clear(ec);
>>> + mutex_unlock(&ec->mutex);
>>> + }
>>> return ret;
>>> }
>>>
>>> @@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
>>> return 0;
>>> }
>>>
>>> +/*
>>> + * On some hardware it is necessary to clear events accumulated by the EC during
>>> + * sleep. These ECs stop reporting GPEs until they are manually polled, if too
>>> + * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
>>> + *
>>> + * https://bugzilla.kernel.org/show_bug.cgi?id=44161
>>> + *
>>> + * Ideally, the EC should also be instructed not to accumulate events during
>>> + * sleep (which Windows seems to do somehow), but the interface to control this
>>> + * behaviour is not known at this time.
>>> + *
>>> + * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
>>> + * however it is very likely that other Samsung models are affected.
>>> + *
>>> + * On systems which don't accumulate EC events during sleep, this extra check
>>> + * should be harmless.
>>> + */
>>> +static int ec_clear_on_resume(const struct dmi_system_id *id)
>>> +{
>>> + pr_debug("Detected system needing EC poll on resume.\n");
>>> + EC_FLAGS_CLEAR_ON_RESUME = 1;
>>> + return 0;
>>> +}
>>> +
>>> static struct dmi_system_id ec_dmi_table[] __initdata = {
>>> {
>>> ec_skip_dsdt_scan, "Compal JFL92", {
>>> @@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>>> ec_validate_ecdt, "ASUS hardware", {
>>> DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>>> DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>>> + {
>>> + ec_clear_on_resume, "Samsung hardware", {
>>> + DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
>>> {},
>>> };
>>>
>>>
>>>
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>

2014-02-27 03:31:09

by Juan Manuel Cabo

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On 02/27/2014 12:14 AM, Li Guang wrote:
> oh, sorry, I'm referring internal EC firmware code
> for Q event queuing, not ACPI SPEC, ;-)
> for machine you tested, 8 is the queue size,
> but for some unknown also nasty EC firmwares(let's suppose it exists),
> it may queue more Q events.
> and I saw several firmwares queued 32 events by default,
> then, let's say, they be used for some samsung products,
> and also they also forgot to deal with sleep/resume state,
> then, we'll also leave stale Q event there.
>
> Thanks!
>

We tested each on our different samsung models (intel, amd), and it
was 8 across. But you're right, there might be more in the future.

I even saw a bug report in ubuntu's launchpad of an HP with a similar
sounding problem, ( https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/89860 )
which I have no idea if it was caused by the same issue, but if in the future,
the flag ec_clear_on_resume is used to match other DMI's, it might
be a good idea to make the max iteration count bigger.

The only reason that there is a max iteration count, was to prevent
an unexpected case in which an unknown EC never returns 0 after
queue emptied. So far it hasn't been the case. Can we count on it?.
The loop currently does finish early when there are no more events.

I guess changing it 255 or 1000 would be enough, right?

Cheers!
--
Juan Manuel Cabo<[email protected]>



>> For us, a query is just: send 0x84 through EC CMD port, and read status
>> from CMD port and event type from EC DATA port. This is done with
>> the usual ec.c functions that would handle a query after a GPE interrupt,
>> but using them instead to poll (not GPE initiated) at awake. The EC would
>> then return status without 0x20 mask and 'event type'==0 when no more left.
>>
>> --
>> Juan Manuel Cabo<[email protected]>
>>
>>
>>
>>
>>>> enum {
>>>> EC_FLAGS_QUERY_PENDING, /* Query is pending */
>>>> @@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
>>>> static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
>>>> static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
>>>> static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
>>>> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
>>>>
>>>>
>>> seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
>>> seems too long :-)
>>>
>>>
>>>> /* --------------------------------------------------------------------------
>>>> Transaction Management
>>>> @@ -440,6 +443,26 @@ acpi_handle ec_get_handle(void)
>>>>
>>>> EXPORT_SYMBOL(ec_get_handle);
>>>>
>>>> +static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
>>>> +
>>>> +/* run with locked ec mutex */
>>>> +static void acpi_ec_clear(struct acpi_ec *ec)
>>>> +{
>>>> + int i, status;
>>>> + u8 value = 0;
>>>> +
>>>> + for (i = 0; i< ACPI_EC_CLEAR_MAX; i++) {
>>>> + status = acpi_ec_query_unlocked(ec,&value);
>>>> + if (status || !value)
>>>> + break;
>>>> + }
>>>> +
>>>> + if (i == ACPI_EC_CLEAR_MAX)
>>>> + pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
>>>> + else
>>>> + pr_info("%d stale EC events cleared\n", i);
>>>> +}
>>>> +
>>>> void acpi_ec_block_transactions(void)
>>>> {
>>>> struct acpi_ec *ec = first_ec;
>>>> @@ -463,6 +486,10 @@ void acpi_ec_unblock_transactions(void)
>>>> mutex_lock(&ec->mutex);
>>>> /* Allow transactions to be carried out again */
>>>> clear_bit(EC_FLAGS_BLOCKED,&ec->flags);
>>>> +
>>>> + if (EC_FLAGS_CLEAR_ON_RESUME)
>>>> + acpi_ec_clear(ec);
>>>> +
>>>> mutex_unlock(&ec->mutex);
>>>> }
>>>>
>>>> @@ -821,6 +848,13 @@ static int acpi_ec_add(struct acpi_device *device)
>>>>
>>>> /* EC is fully operational, allow queries */
>>>> clear_bit(EC_FLAGS_QUERY_PENDING,&ec->flags);
>>>> +
>>>> + /* Some hardware may need the EC to be cleared before use */
>>>>
>>>>
>>> description is implicit, should specify what we clear is Q event, not EC.
>>>
>>> Thanks!
>>> Li Guang
>>>
>>>
>>>> + if (EC_FLAGS_CLEAR_ON_RESUME) {
>>>> + mutex_lock(&ec->mutex);
>>>> + acpi_ec_clear(ec);
>>>> + mutex_unlock(&ec->mutex);
>>>> + }
>>>> return ret;
>>>> }
>>>>
>>>> @@ -922,6 +956,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
>>>> return 0;
>>>> }
>>>>
>>>> +/*
>>>> + * On some hardware it is necessary to clear events accumulated by the EC during
>>>> + * sleep. These ECs stop reporting GPEs until they are manually polled, if too
>>>> + * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
>>>> + *
>>>> + * https://bugzilla.kernel.org/show_bug.cgi?id=44161
>>>> + *
>>>> + * Ideally, the EC should also be instructed not to accumulate events during
>>>> + * sleep (which Windows seems to do somehow), but the interface to control this
>>>> + * behaviour is not known at this time.
>>>> + *
>>>> + * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
>>>> + * however it is very likely that other Samsung models are affected.
>>>> + *
>>>> + * On systems which don't accumulate EC events during sleep, this extra check
>>>> + * should be harmless.
>>>> + */
>>>> +static int ec_clear_on_resume(const struct dmi_system_id *id)
>>>> +{
>>>> + pr_debug("Detected system needing EC poll on resume.\n");
>>>> + EC_FLAGS_CLEAR_ON_RESUME = 1;
>>>> + return 0;
>>>> +}
>>>> +
>>>> static struct dmi_system_id ec_dmi_table[] __initdata = {
>>>> {
>>>> ec_skip_dsdt_scan, "Compal JFL92", {
>>>> @@ -965,6 +1023,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
>>>> ec_validate_ecdt, "ASUS hardware", {
>>>> DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
>>>> DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
>>>> + {
>>>> + ec_clear_on_resume, "Samsung hardware", {
>>>> + DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
>>>> {},
>>>> };
>>>>
>>>>
>>>>
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>>
>>
>
>

2014-02-27 03:46:13

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

Juan Manuel Cabo wrote:
> On 02/27/2014 12:14 AM, Li Guang wrote:
>
>> oh, sorry, I'm referring internal EC firmware code
>> for Q event queuing, not ACPI SPEC, ;-)
>> for machine you tested, 8 is the queue size,
>> but for some unknown also nasty EC firmwares(let's suppose it exists),
>> it may queue more Q events.
>> and I saw several firmwares queued 32 events by default,
>> then, let's say, they be used for some samsung products,
>> and also they also forgot to deal with sleep/resume state,
>> then, we'll also leave stale Q event there.
>>
>> Thanks!
>>
>>
> We tested each on our different samsung models (intel, amd), and it
> was 8 across. But you're right, there might be more in the future.
>
> I even saw a bug report in ubuntu's launchpad of an HP with a similar
> sounding problem, ( https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/89860 )
> which I have no idea if it was caused by the same issue, but if in the future,
> the flag ec_clear_on_resume is used to match other DMI's, it might
> be a good idea to make the max iteration count bigger.
>
> The only reason that there is a max iteration count, was to prevent
> an unexpected case in which an unknown EC never returns 0 after
> queue emptied. So far it hasn't been the case. Can we count on it?.
> The loop currently does finish early when there are no more events.
>
> I guess changing it 255 or 1000 would be enough, right?
>
>

can't imagine 1K bytes be dissipated on Q event,
EC's ram is usually expensive,
I think 255 is really enough. :-)

Thanks!


2014-02-27 03:49:38

by Kieran Clancy

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thu, Feb 27, 2014 at 12:29 PM, Li Guang <[email protected]> wrote:
>> +#define ACPI_EC_CLEAR_MAX 20 /* Maximum number of events to
>> query
>> + * when trying to clear the EC */
>>
>
>
> 20 is enough?
> the query index is length of a byte.

On my machine, 8 seems to be enough, so 20 seems to be a conservative
maximum. Just reading your other email, maybe we should set this to
32? or 40? 100?

If it's not enough, hopefully anyone seeing bugs will notice the
warning "maximum of X stale EC events cleared".

Here's what happens if I plug/replug the AC lots of times (more than
8) during suspend:

[ 8807.019800] ACPI : EC: ---> status = 0x29
[ 8807.019804] ACPI : EC: ---> data = 0x66
[ 8807.020790] ACPI : EC: ---> status = 0x29
[ 8807.020793] ACPI : EC: ---> data = 0x66
[ 8807.021793] ACPI : EC: ---> status = 0x29
[ 8807.021798] ACPI : EC: ---> data = 0x66
[ 8807.022831] ACPI : EC: ---> status = 0x29
[ 8807.022834] ACPI : EC: ---> data = 0x66
[ 8807.023788] ACPI : EC: ---> status = 0x29
[ 8807.023792] ACPI : EC: ---> data = 0x66
[ 8807.024787] ACPI : EC: ---> status = 0x29
[ 8807.024791] ACPI : EC: ---> data = 0x66
[ 8807.025787] ACPI : EC: ---> status = 0x29
[ 8807.025790] ACPI : EC: ---> data = 0x66
[ 8807.026787] ACPI : EC: ---> status = 0x29
[ 8807.026790] ACPI : EC: ---> data = 0x66
[ 8807.027786] ACPI : EC: ---> status = 0x09
[ 8807.027790] ACPI : EC: ---> data = 0x00
[ 8807.027792] ACPI : EC: 8 stale EC events cleared

Note that most of these have SCI_EVT set, but the OS is not notified
according to ACPI specs (seemingly because these events happened
during sleep).

The _Q66 method in my DSDT, is:

P8XH (Zero, 0x66)
If (LEqual (B1EX, One))
{
Notify (BAT1, 0x80)
}

So, basically, this is supposed to notify that the battery (BAT1 =
PNP0C0A) has changed state, but they are stale events so we don't run
the handlers.

>> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on
>> boot/resume */
>
> seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
> seems too long :-)

In my mind this is referring to the function name (acpi_ec_)clear.
Perhaps we could just make the connection more explicit in the
comment:

/* needs acpi_ec_clear() on boot/resume */

Not sure if this is better?

>> + /* Some hardware may need the EC to be cleared before use */
>
> description is implicit, should specify what we clear is Q event, not EC.

Are Q events the only thing we can get from the EC data port? I've
read the relevant parts of the ACPI spec and I can't say I am 100%
sure.

Thank you for your advice,
Kieran.

2014-02-27 04:48:09

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

Kieran Clancy wrote:
> On Thu, Feb 27, 2014 at 12:29 PM, Li Guang<[email protected]> wrote:
>
>>> +#define ACPI_EC_CLEAR_MAX 20 /* Maximum number of events to
>>> query
>>> + * when trying to clear the EC */
>>>
>>>
>>
>> 20 is enough?
>> the query index is length of a byte.
>>
> On my machine, 8 seems to be enough, so 20 seems to be a conservative
> maximum. Just reading your other email, maybe we should set this to
> 32? or 40? 100?
>
> If it's not enough, hopefully anyone seeing bugs will notice the
> warning "maximum of X stale EC events cleared".
>
> Here's what happens if I plug/replug the AC lots of times (more than
> 8) during suspend:
>
> [ 8807.019800] ACPI : EC: ---> status = 0x29
> [ 8807.019804] ACPI : EC: ---> data = 0x66
> [ 8807.020790] ACPI : EC: ---> status = 0x29
> [ 8807.020793] ACPI : EC: ---> data = 0x66
> [ 8807.021793] ACPI : EC: ---> status = 0x29
> [ 8807.021798] ACPI : EC: ---> data = 0x66
> [ 8807.022831] ACPI : EC: ---> status = 0x29
> [ 8807.022834] ACPI : EC: ---> data = 0x66
> [ 8807.023788] ACPI : EC: ---> status = 0x29
> [ 8807.023792] ACPI : EC: ---> data = 0x66
> [ 8807.024787] ACPI : EC: ---> status = 0x29
> [ 8807.024791] ACPI : EC: ---> data = 0x66
> [ 8807.025787] ACPI : EC: ---> status = 0x29
> [ 8807.025790] ACPI : EC: ---> data = 0x66
> [ 8807.026787] ACPI : EC: ---> status = 0x29
> [ 8807.026790] ACPI : EC: ---> data = 0x66
> [ 8807.027786] ACPI : EC: ---> status = 0x09
> [ 8807.027790] ACPI : EC: ---> data = 0x00
> [ 8807.027792] ACPI : EC: 8 stale EC events cleared
>
> Note that most of these have SCI_EVT set, but the OS is not notified
> according to ACPI specs (seemingly because these events happened
> during sleep).
>
> The _Q66 method in my DSDT, is:
>
> P8XH (Zero, 0x66)
> If (LEqual (B1EX, One))
> {
> Notify (BAT1, 0x80)
> }
>
> So, basically, this is supposed to notify that the battery (BAT1 =
> PNP0C0A) has changed state, but they are stale events so we don't run
> the handlers.
>
>
>>> +static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on
>>> boot/resume */
>>>
>> seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
>> seems too long :-)
>>
> In my mind this is referring to the function name (acpi_ec_)clear.
> Perhaps we could just make the connection more explicit in the
> comment:
>
> /* needs acpi_ec_clear() on boot/resume */
>
> Not sure if this is better?
>
>
>>> + /* Some hardware may need the EC to be cleared before use */
>>>
>> description is implicit, should specify what we clear is Q event, not EC.
>>
> Are Q events the only thing we can get from the EC data port? I've
> read the relevant parts of the ACPI spec and I can't say I am 100%
> sure.
>
>
I guess you want to clear Q events here,
EC usually has ACPI space to be read by cmd 80.

Thanks!


2014-02-27 13:42:05

by Juan Manuel Cabo

[permalink] [raw]
Subject: Hello Kieran

Hello Kieran!!

Dennis has an AMD laptop, NP535U3C-A04DE, and found it to be 9
(which I guess was an off by one in his modification
of the userspace workaround and was really 8):
https://bugzilla.kernel.org/show_bug.cgi?id=44161#c146

I found it to be 8 for me, both in the patch and in my new userspace workaround
(which doesn't ask for the same event twice now that it pauses for 1ms,
as the kernel does too):
https://bugzilla.kernel.org/show_bug.cgi?id=44161#c162

ANd I think I saw someone else with 8 too but don't remember where.

BUT, there might be more laptops existing or future, samsung or other,
that might queue more than 8. Li Guang ([email protected])
said that he saw EC firmware's with 32, and that true maximum is 255.


So in my opinion it's best to put max. iters. at 255.

Cheers!
--jm

2014-02-27 21:47:37

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

On Thursday, February 27, 2014 11:39:42 AM Kieran Clancy wrote:
> On Thu, Feb 27, 2014 at 11:46 AM, Rafael J. Wysocki <[email protected]> wrote:
> > On Thursday, February 27, 2014 11:29:53 AM Kieran Clancy wrote:
> >>
> >> Oh, and I should add that Juan and Dennis both helped to test the
> >> patch extensively.
> >
> > You can use Tested-by or Reported-and-tested-by for that.
>
> Is it okay to have someone listed as both 'Reviewed-by' and
> 'Tested-by', or both 'Suggested-by' and 'Tested-by'?

Yes, these are different things essentially and each of them requires time
and effort on its own.

> Thank you for your help. Once I fix up these things, I will send a patch v2.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-02-28 14:13:30

by Kieran Clancy

[permalink] [raw]
Subject: [PATCH v2] ACPI / EC: Clear stale EC events on Samsung systems

A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
continue to log events during sleep (lid open/close, AC plug/unplug,
battery level change), which accumulate in the EC until a buffer fills.
After the buffer is full (tests suggest it holds 8 events), GPEs stop
being triggered for new events. This state persists on wake or even on
power cycle, and prevents new events from being registered until the EC
is manually polled.

This is the root cause of a number of bugs, including AC not being
detected properly, lid close not triggering suspend, and low ambient
light not triggering the keyboard backlight. The bug also seemed to be
responsible for performance issues on at least one user's machine.

Juan Manuel Cabo found the cause of bug and the workaround of polling
the EC manually on wake.

The loop which clears the stale events is based on an earlier patch by
Lan Tianyu (see referenced attachment).

This patch:
- Adds a function acpi_ec_clear() which polls the EC for stale _Q
events at most ACPI_EC_CLEAR_MAX (currently 100) times. A warning is
logged if this limit is reached.
- Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
system vendor is Samsung. This check could be replaced by several
more specific DMI vendor/product pairs, but it's likely that the bug
affects more Samsung products than just the five series mentioned
above. Further, it should not be harmful to run acpi_ec_clear() on
systems without the bug; it will return immediately after finding no
data waiting.
- Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
- Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()

References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
References: https://bugzilla.kernel.org/attachment.cgi?id=126801
Signed-off-by: Kieran Clancy <[email protected]>
Suggested-by: Juan Manuel Cabo <[email protected]>
Reviewed-by: Lan Tianyu <[email protected]>
Reviewed-by: Dennis Jansen <[email protected]>
Tested-by: Kieran Clancy <[email protected]>
Tested-by: Juan Manuel Cabo <[email protected]>
Tested-by: Dennis Jansen <[email protected]>
Tested-by: Maurizio D'Addona <[email protected]>
Tested-by: San Zamoyski <[email protected]>
---

Changes in PATCH v2:
- Changed some of the 'Signed-off-by' lines to better reflect contributions,
as suggested by Rafael J. Wysocki <[email protected]>.
- Directly reference prior work by Lan Tianyu.
- Increase ACPI_EC_CLEAR_MAX to 100, after discussion with
Li Guang <[email protected]> and Juan Manuel Cabo.
- Made source comments more explicit, thanks to suggestions by Li Guang.
- Marked warning for hitting ACPI_EC_CLEAR_MAX as unlikely(), as suggested by
Dennis Jansen.

drivers/acpi/ec.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 959d41a..d7d32c2 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -67,6 +67,8 @@ enum ec_command {
#define ACPI_EC_DELAY 500 /* Wait 500ms max. during EC ops */
#define ACPI_EC_UDELAY_GLK 1000 /* Wait 1ms max. to get global lock */
#define ACPI_EC_MSI_UDELAY 550 /* Wait 550us for MSI EC */
+#define ACPI_EC_CLEAR_MAX 100 /* Maximum number of events to query
+ * when trying to clear the EC */

enum {
EC_FLAGS_QUERY_PENDING, /* Query is pending */
@@ -116,6 +118,7 @@ EXPORT_SYMBOL(first_ec);
static int EC_FLAGS_MSI; /* Out-of-spec MSI controller */
static int EC_FLAGS_VALIDATE_ECDT; /* ASUStec ECDTs need to be validated */
static int EC_FLAGS_SKIP_DSDT_SCAN; /* Not all BIOS survive early DSDT scan */
+static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */

/* --------------------------------------------------------------------------
Transaction Management
@@ -440,6 +443,29 @@ acpi_handle ec_get_handle(void)

EXPORT_SYMBOL(ec_get_handle);

+static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data);
+
+/*
+ * Clears stale _Q events that might have accumulated in the EC.
+ * Run with locked ec mutex.
+ */
+static void acpi_ec_clear(struct acpi_ec *ec)
+{
+ int i, status;
+ u8 value = 0;
+
+ for (i = 0; i < ACPI_EC_CLEAR_MAX; i++) {
+ status = acpi_ec_query_unlocked(ec, &value);
+ if (status || !value)
+ break;
+ }
+
+ if (unlikely(i == ACPI_EC_CLEAR_MAX))
+ pr_warn("Warning: Maximum of %d stale EC events cleared\n", i);
+ else
+ pr_info("%d stale EC events cleared\n", i);
+}
+
void acpi_ec_block_transactions(void)
{
struct acpi_ec *ec = first_ec;
@@ -463,6 +489,10 @@ void acpi_ec_unblock_transactions(void)
mutex_lock(&ec->mutex);
/* Allow transactions to be carried out again */
clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
+
+ if (EC_FLAGS_CLEAR_ON_RESUME)
+ acpi_ec_clear(ec);
+
mutex_unlock(&ec->mutex);
}

@@ -821,6 +851,13 @@ static int acpi_ec_add(struct acpi_device *device)

/* EC is fully operational, allow queries */
clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+
+ /* Clear stale _Q events if hardware might require that */
+ if (EC_FLAGS_CLEAR_ON_RESUME) {
+ mutex_lock(&ec->mutex);
+ acpi_ec_clear(ec);
+ mutex_unlock(&ec->mutex);
+ }
return ret;
}

@@ -922,6 +959,30 @@ static int ec_enlarge_storm_threshold(const struct dmi_system_id *id)
return 0;
}

+/*
+ * On some hardware it is necessary to clear events accumulated by the EC during
+ * sleep. These ECs stop reporting GPEs until they are manually polled, if too
+ * many events are accumulated. (e.g. Samsung Series 5/9 notebooks)
+ *
+ * https://bugzilla.kernel.org/show_bug.cgi?id=44161
+ *
+ * Ideally, the EC should also be instructed NOT to accumulate events during
+ * sleep (which Windows seems to do somehow), but the interface to control this
+ * behaviour is not known at this time.
+ *
+ * Models known to be affected are Samsung 530Uxx/535Uxx/540Uxx/550Pxx/900Xxx,
+ * however it is very likely that other Samsung models are affected.
+ *
+ * On systems which don't accumulate _Q events during sleep, this extra check
+ * should be harmless.
+ */
+static int ec_clear_on_resume(const struct dmi_system_id *id)
+{
+ pr_debug("Detected system needing EC poll on resume.\n");
+ EC_FLAGS_CLEAR_ON_RESUME = 1;
+ return 0;
+}
+
static struct dmi_system_id ec_dmi_table[] __initdata = {
{
ec_skip_dsdt_scan, "Compal JFL92", {
@@ -965,6 +1026,9 @@ static struct dmi_system_id ec_dmi_table[] __initdata = {
ec_validate_ecdt, "ASUS hardware", {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "L4R"),}, NULL},
+ {
+ ec_clear_on_resume, "Samsung hardware", {
+ DMI_MATCH(DMI_SYS_VENDOR, "SAMSUNG ELECTRONICS CO., LTD.")}, NULL},
{},
};

--
1.8.5.3