2014-11-03 05:16:17

by Zheng, Lv

Subject: [PATCH 0/6] ACPI/EC: Cleanups of command flushing and event polling.

This patchset contains 2 cleanups related to the EC driver:
1. Command flushing
   This patchset flushes EC commands before suspending/resuming, so that
   incomplete commands do not time out after resume.
2. Event polling thread
   It is not proper to send the QR_EC command from the same work queue as
   the _Qxx/_Lxx/_Exx evaluations. This patchset moves the QR_EC issuing to
   a separate thread. This also allows us to extend the EC driver in the
   future to be aware of events even when SCI_EVT fails to be set by the
   firmware.

This patchset is split from an EC GPE storming prevention enhancement
series. The original discussion can be found at:
http://www.spinics.net/lists/linux-acpi/msg51698.html

Lv Zheng (6):
ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
ACPI/EC: Enhance the checks to apply to QR_EC transactions.
ACPI/EC: Add reference counting for query handlers.
ACPI/EC: Add command flushing support.
ACPI/EC: Cleanup QR_EC command processing by adding a kernel thread
to poll EC events.
ACPI/EC: Add GPE reference counting debugging messages.

drivers/acpi/ec.c | 379 ++++++++++++++++++++++++++++++++++++++---------
drivers/acpi/internal.h | 2 +
2 files changed, 313 insertions(+), 68 deletions(-)

--
1.7.10


2014-11-03 05:16:24

by Zheng, Lv

Subject: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

By using the 2 flags, we can indicate an intermediate state where the
current transactions should be completed while new transactions should
be dropped.

The comparison of the old flag and the new flags:
  Old                     New
  about to set BLOCKED    STOPPED set / STARTED set
  BLOCKED set             STOPPED clear / STARTED clear
  BLOCKED clear           STOPPED clear / STARTED set
The new period is between the point where we are about to set BLOCKED and
the point when BLOCKED is set. The GPE is disabled during this period.
The new flags allow us to add an acpi_ec_stopped() check that tests only
the STOPPED flag in order to implement transaction flushing. This is not
done in this patch.

No functional changes except that, after applying this patch, the GPE
enabling/disabling is protected by the EC-specific lock. We can do this
because of a recent ACPICA GPE API enhancement. This is reasonable as the
GPE disabling/enabling state should only be determined by the EC driver's
state machine, which is protected by the EC spinlock.
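
For illustration only (not part of the patch), a minimal user-space model
of the two-flag state machine described above; the bit helpers are
simplified stand-ins for the kernel's atomic bitops:

#include <stdbool.h>
#include <stdio.h>

#define EC_FLAGS_STARTED 0
#define EC_FLAGS_STOPPED 1

static unsigned long flags; /* models ec->flags */

static bool flag_test(int bit)  { return flags & (1UL << bit); }
static void flag_set(int bit)   { flags |= 1UL << bit; }
static void flag_clear(int bit) { flags &= ~(1UL << bit); }

/* "running" means STARTED set and STOPPED clear, as in acpi_ec_started() */
static bool ec_started(void)
{
        return flag_test(EC_FLAGS_STARTED) && !flag_test(EC_FLAGS_STOPPED);
}

int main(void)
{
        flag_set(EC_FLAGS_STARTED);             /* acpi_ec_start() */
        printf("running: %d\n", ec_started());  /* 1 */

        flag_set(EC_FLAGS_STOPPED);             /* stopping window begins */
        printf("running: %d\n", ec_started());  /* 0: new transactions dropped */

        flag_clear(EC_FLAGS_STARTED);           /* acpi_ec_stop() completes */
        flag_clear(EC_FLAGS_STOPPED);
        printf("running: %d\n", ec_started());  /* 0: fully stopped */
        return 0;
}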

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 5f9b74b..192cd11 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -79,7 +79,8 @@ enum {
EC_FLAGS_GPE_STORM, /* GPE storm detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
- EC_FLAGS_BLOCKED, /* Transactions are blocked */
+ EC_FLAGS_STARTED, /* Driver is started */
+ EC_FLAGS_STOPPED, /* Driver is stopped */
};

#define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
@@ -129,6 +130,16 @@ static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
static int EC_FLAGS_QUERY_HANDSHAKE; /* Needs QR_EC issued when SCI_EVT set */

/* --------------------------------------------------------------------------
+ * Device Flags
+ * -------------------------------------------------------------------------- */
+
+static bool acpi_ec_started(struct acpi_ec *ec)
+{
+ return test_bit(EC_FLAGS_STARTED, &ec->flags) &&
+ !test_bit(EC_FLAGS_STOPPED, &ec->flags);
+}
+
+/* --------------------------------------------------------------------------
* Transaction Management
* -------------------------------------------------------------------------- */

@@ -354,7 +365,7 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
if (t->rdata)
memset(t->rdata, 0, t->rlen);
mutex_lock(&ec->mutex);
- if (test_bit(EC_FLAGS_BLOCKED, &ec->flags)) {
+ if (!acpi_ec_started(ec)) {
status = -EINVAL;
goto unlock;
}
@@ -511,6 +522,35 @@ static void acpi_ec_clear(struct acpi_ec *ec)
pr_info("%d stale EC events cleared\n", i);
}

+static void acpi_ec_start(struct acpi_ec *ec)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
+ pr_debug("+++++ Starting EC +++++\n");
+ acpi_enable_gpe(NULL, ec->gpe);
+ pr_info("+++++ EC started +++++\n");
+ }
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
+static void acpi_ec_stop(struct acpi_ec *ec)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ if (acpi_ec_started(ec)) {
+ pr_debug("+++++ Stopping EC +++++\n");
+ set_bit(EC_FLAGS_STOPPED, &ec->flags);
+ acpi_disable_gpe(NULL, ec->gpe);
+ clear_bit(EC_FLAGS_STARTED, &ec->flags);
+ clear_bit(EC_FLAGS_STOPPED, &ec->flags);
+ pr_info("+++++ EC stopped +++++\n");
+ }
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
void acpi_ec_block_transactions(void)
{
struct acpi_ec *ec = first_ec;
@@ -520,7 +560,7 @@ void acpi_ec_block_transactions(void)

mutex_lock(&ec->mutex);
/* Prevent transactions from being carried out */
- set_bit(EC_FLAGS_BLOCKED, &ec->flags);
+ acpi_ec_stop(ec);
mutex_unlock(&ec->mutex);
}

@@ -533,7 +573,7 @@ void acpi_ec_unblock_transactions(void)

mutex_lock(&ec->mutex);
/* Allow transactions to be carried out again */
- clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
+ acpi_ec_start(ec);

if (EC_FLAGS_CLEAR_ON_RESUME)
acpi_ec_clear(ec);
@@ -548,7 +588,7 @@ void acpi_ec_unblock_transactions_early(void)
* atomic context during wakeup, so we don't need to acquire the mutex).
*/
if (first_ec)
- clear_bit(EC_FLAGS_BLOCKED, &first_ec->flags);
+ acpi_ec_start(first_ec);
}

static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data)
@@ -816,7 +856,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
if (ACPI_FAILURE(status))
return -ENODEV;

- acpi_enable_gpe(NULL, ec->gpe);
+ acpi_ec_start(ec);
status = acpi_install_address_space_handler(ec->handle,
ACPI_ADR_SPACE_EC,
&acpi_ec_space_handler,
@@ -831,7 +871,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
pr_err("Fail in evaluating the _REG object"
" of EC device. Broken bios is suspected.\n");
} else {
- acpi_disable_gpe(NULL, ec->gpe);
+ acpi_ec_stop(ec);
acpi_remove_gpe_handler(NULL, ec->gpe,
&acpi_ec_gpe_handler);
return -ENODEV;
@@ -844,7 +884,7 @@ static int ec_install_handlers(struct acpi_ec *ec)

static void ec_remove_handlers(struct acpi_ec *ec)
{
- acpi_disable_gpe(NULL, ec->gpe);
+ acpi_ec_stop(ec);
if (ACPI_FAILURE(acpi_remove_address_space_handler(ec->handle,
ACPI_ADR_SPACE_EC, &acpi_ec_space_handler)))
pr_err("failed to remove space handler\n");
--
1.7.10

2014-11-03 05:16:29

by Zheng, Lv

Subject: [PATCH 2/6] ACPI/EC: Enhance the checks to apply to QR_EC transactions.

Currently a check is applied to new transactions, but QR_EC transactions
are not included. This patch merges the code paths so that the check also
applies to QR_EC transactions.
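
A rough user-space sketch (not the driver code) of why moving the check
into the single transaction entry point covers the query path as well:

#include <stdbool.h>
#include <stdio.h>

static bool ec_running = true; /* models acpi_ec_started() */

/* One entry point for all transactions, so the check covers QR_EC too */
static int transaction_unlocked(const char *cmd)
{
        if (!ec_running) {
                printf("%s rejected: EC not started\n", cmd);
                return -1;
        }
        printf("%s executed\n", cmd);
        return 0;
}

static int transaction(const char *cmd) { return transaction_unlocked(cmd); }
static int query_unlocked(void)         { return transaction_unlocked("QR_EC"); }

int main(void)
{
        transaction("RD_EC");   /* normal command path: checked */
        query_unlocked();       /* query path: now also checked */

        ec_running = false;     /* driver stopping */
        transaction("WR_EC");   /* rejected */
        query_unlocked();       /* rejected as well */
        return 0;
}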

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 192cd11..9a51f7a 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -336,6 +336,10 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
udelay(ACPI_EC_MSI_UDELAY);
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
+ if (!acpi_ec_started(ec)) {
+ ret = -EINVAL;
+ goto unlock;
+ }
/* following two actions should be kept atomic */
ec->curr = t;
pr_debug("***** Command(%s) started *****\n",
@@ -351,6 +355,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
pr_debug("***** Command(%s) stopped *****\n",
acpi_ec_cmd_string(t->command));
ec->curr = NULL;
+unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
}
@@ -365,10 +370,6 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
if (t->rdata)
memset(t->rdata, 0, t->rlen);
mutex_lock(&ec->mutex);
- if (!acpi_ec_started(ec)) {
- status = -EINVAL;
- goto unlock;
- }
if (ec->global_lock) {
status = acpi_acquire_global_lock(ACPI_EC_UDELAY_GLK, &glk);
if (ACPI_FAILURE(status)) {
--
1.7.10

2014-11-03 05:16:37

by Zheng, Lv

Subject: [PATCH 3/6] ACPI/EC: Add reference counting for query handlers.

This patch adds reference counting for query handlers in order to
eliminate the kmalloc()/kfree() copying of a handler each time a query
is scheduled.
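
A minimal user-space sketch of the get/put pattern this introduces (a
plain integer stands in for struct kref; the real code uses
kref_init()/kref_get()/kref_put() under ec->mutex):

#include <stdio.h>
#include <stdlib.h>

/* A plain integer stands in for struct kref */
struct query_handler {
        unsigned char query_bit;
        int refcount;
};

static struct query_handler *handler_get(struct query_handler *h)
{
        if (h)
                h->refcount++;
        return h;
}

static void handler_put(struct query_handler *h)
{
        if (--h->refcount == 0) {
                printf("releasing handler 0x%02x\n", h->query_bit);
                free(h);        /* acpi_ec_query_handler_release() */
        }
}

int main(void)
{
        struct query_handler *h = calloc(1, sizeof(*h));

        if (!h)
                return 1;
        h->query_bit = 0x50;
        h->refcount = 1;        /* kref_init() when the handler is registered */

        handler_get(h);         /* taken when a query for this bit is scheduled */
        handler_put(h);         /* dropped after the query handler has run */
        handler_put(h);         /* dropped on handler removal: freed here */
        return 0;
}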

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Steffen Weber <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/ec.c | 47 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 9a51f7a..69e4f35 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -106,6 +106,7 @@ struct acpi_ec_query_handler {
acpi_handle handle;
void *data;
u8 query_bit;
+ struct kref kref;
};

struct transaction {
@@ -120,6 +121,10 @@ struct transaction {
u8 flags;
};

+static struct acpi_ec_query_handler *
+acpi_ec_get_query_handler(struct acpi_ec_query_handler *handler);
+static void acpi_ec_put_query_handler(struct acpi_ec_query_handler *handler);
+
struct acpi_ec *boot_ec, *first_ec;
EXPORT_SYMBOL(first_ec);

@@ -619,6 +624,27 @@ static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data)
/* --------------------------------------------------------------------------
Event Management
-------------------------------------------------------------------------- */
+static struct acpi_ec_query_handler *
+acpi_ec_get_query_handler(struct acpi_ec_query_handler *handler)
+{
+ if (handler)
+ kref_get(&handler->kref);
+ return handler;
+}
+
+static void acpi_ec_query_handler_release(struct kref *kref)
+{
+ struct acpi_ec_query_handler *handler =
+ container_of(kref, struct acpi_ec_query_handler, kref);
+
+ kfree(handler);
+}
+
+static void acpi_ec_put_query_handler(struct acpi_ec_query_handler *handler)
+{
+ kref_put(&handler->kref, acpi_ec_query_handler_release);
+}
+
int acpi_ec_add_query_handler(struct acpi_ec *ec, u8 query_bit,
acpi_handle handle, acpi_ec_query_func func,
void *data)
@@ -634,6 +660,7 @@ int acpi_ec_add_query_handler(struct acpi_ec *ec, u8 query_bit,
handler->func = func;
handler->data = data;
mutex_lock(&ec->mutex);
+ kref_init(&handler->kref);
list_add(&handler->node, &ec->list);
mutex_unlock(&ec->mutex);
return 0;
@@ -643,15 +670,18 @@ EXPORT_SYMBOL_GPL(acpi_ec_add_query_handler);
void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit)
{
struct acpi_ec_query_handler *handler, *tmp;
+ LIST_HEAD(free_list);

mutex_lock(&ec->mutex);
list_for_each_entry_safe(handler, tmp, &ec->list, node) {
if (query_bit == handler->query_bit) {
- list_del(&handler->node);
- kfree(handler);
+ list_del_init(&handler->node);
+ list_add(&handler->node, &free_list);
}
}
mutex_unlock(&ec->mutex);
+ list_for_each_entry(handler, &free_list, node)
+ acpi_ec_put_query_handler(handler);
}
EXPORT_SYMBOL_GPL(acpi_ec_remove_query_handler);

@@ -667,14 +697,14 @@ static void acpi_ec_run(void *cxt)
else if (handler->handle)
acpi_evaluate_object(handler->handle, NULL, NULL, NULL);
pr_debug("##### Query(0x%02x) stopped #####\n", handler->query_bit);
- kfree(handler);
+ acpi_ec_put_query_handler(handler);
}

static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data)
{
u8 value = 0;
int status;
- struct acpi_ec_query_handler *handler, *copy;
+ struct acpi_ec_query_handler *handler;

status = acpi_ec_query_unlocked(ec, &value);
if (data)
@@ -685,15 +715,12 @@ static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data)
list_for_each_entry(handler, &ec->list, node) {
if (value == handler->query_bit) {
/* have custom handler for this bit */
- copy = kmalloc(sizeof(*handler), GFP_KERNEL);
- if (!copy)
- return -ENOMEM;
- memcpy(copy, handler, sizeof(*copy));
+ handler = acpi_ec_get_query_handler(handler);
pr_debug("##### Query(0x%02x) scheduled #####\n",
handler->query_bit);
- return acpi_os_execute((copy->func) ?
+ return acpi_os_execute((handler->func) ?
OSL_NOTIFY_HANDLER : OSL_GPE_HANDLER,
- acpi_ec_run, copy);
+ acpi_ec_run, handler);
}
}
return 0;
--
1.7.10

2014-11-03 05:16:41

by Zheng, Lv

Subject: [PATCH 4/6] ACPI/EC: Add command flushing support.

This patch implements command flushing support. It is better to wait for
all command transactions to complete before disabling the EC GPE when the
system is going to be suspended, so that the EC hardware is guaranteed to
be in the idle state when the system is resumed.

There is a good indicator for flush support:
All acpi_ec_submit_request() invocations, except the first one, happen
after checking the driver state with acpi_ec_started(). This means all
code paths can be flushed as fast as possible by discarding requests that
occur after the flush operation has started. The reference count increment
for such code paths is wrapped by acpi_ec_submit_flushable_request().
The system suspending/resuming test result is as follows:
ACPI : EC: +++++ Stopping EC +++++
* ACPI : EC: +++++ EC stopped +++++
ACPI : EC: +++++ Starting EC +++++
# ACPI : EC: +++++ EC started +++++
This was obtained by switching the wireless switch on/off and plugging the
power cord in/out frequently during suspending/resuming. The above dmesg
output shows that the EC driver is stopped during suspending (*) and is
restarted during resuming (#) correctly.
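
A minimal user-space sketch of the reference-counting idea described above
(locking and the wait queue are omitted; in the driver, acpi_ec_stop()
sleeps on ec->wait until acpi_ec_flushed() becomes true):

#include <stdbool.h>
#include <stdio.h>

static unsigned long reference_count;
static bool started = true;     /* models acpi_ec_started() */

static void submit_request(void)
{
        reference_count++;      /* the driver also enables the GPE on 0 -> 1 */
}

static void complete_request(void)
{
        reference_count--;      /* the driver also disables the GPE on 1 -> 0 */
}

/* Flushed: only the reference taken by acpi_ec_start() remains */
static bool flushed(void)
{
        return reference_count == 1;
}

static bool submit_flushable_request(void)
{
        if (!started)           /* stopping: discard the new transaction */
                return false;
        submit_request();
        return true;
}

int main(void)
{
        submit_request();       /* acpi_ec_start(): the long-lived reference */

        if (submit_flushable_request()) {       /* a command transaction begins */
                printf("flushed while busy? %d\n", flushed());  /* 0 */
                complete_request();             /* the transaction finishes */
        }

        started = false;        /* acpi_ec_stop() sets STOPPED */
        printf("flushed after stop? %d\n", flushed());  /* 1 -> wake_up() */
        return 0;
}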

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 72 +++++++++++++++++++++++++++++++++++++++++++++--
drivers/acpi/internal.h | 1 +
2 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 69e4f35..a76794a 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -144,6 +144,53 @@ static bool acpi_ec_started(struct acpi_ec *ec)
!test_bit(EC_FLAGS_STOPPED, &ec->flags);
}

+static bool acpi_ec_flushed(struct acpi_ec *ec)
+{
+ return ec->reference_count == 1;
+}
+
+/* --------------------------------------------------------------------------
+ * GPE Enhancement
+ * -------------------------------------------------------------------------- */
+
+static void acpi_ec_submit_request(struct acpi_ec *ec)
+{
+ ec->reference_count++;
+ if (ec->reference_count == 1)
+ acpi_enable_gpe(NULL, ec->gpe);
+}
+
+static void acpi_ec_complete_request(struct acpi_ec *ec)
+{
+ bool flushed = false;
+
+ ec->reference_count--;
+ if (ec->reference_count == 0)
+ acpi_disable_gpe(NULL, ec->gpe);
+ flushed = acpi_ec_flushed(ec);
+ if (flushed)
+ wake_up(&ec->wait);
+}
+
+/*
+ * acpi_ec_submit_flushable_request() - Increase the reference count unless
+ * the flush operation is not in
+ * progress
+ * @ec: the EC device
+ *
+ * This function must be used before taking a new action that should hold
+ * the reference count. If this function returns false, then the action
+ * must be discarded or it will prevent the flush operation from being
+ * completed.
+ */
+static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
+{
+ if (!acpi_ec_started(ec))
+ return false;
+ acpi_ec_submit_request(ec);
+ return true;
+}
+
/* --------------------------------------------------------------------------
* Transaction Management
* -------------------------------------------------------------------------- */
@@ -341,7 +388,8 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
udelay(ACPI_EC_MSI_UDELAY);
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
- if (!acpi_ec_started(ec)) {
+ /* Enable GPE for command processing (IBF=0/OBF=1) */
+ if (!acpi_ec_submit_flushable_request(ec)) {
ret = -EINVAL;
goto unlock;
}
@@ -360,6 +408,8 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
pr_debug("***** Command(%s) stopped *****\n",
acpi_ec_cmd_string(t->command));
ec->curr = NULL;
+ /* Disable GPE for command processing (IBF=0/OBF=1) */
+ acpi_ec_complete_request(ec);
unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
@@ -535,12 +585,24 @@ static void acpi_ec_start(struct acpi_ec *ec)
spin_lock_irqsave(&ec->lock, flags);
if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
pr_debug("+++++ Starting EC +++++\n");
- acpi_enable_gpe(NULL, ec->gpe);
+ /* Enable GPE for event processing (SCI_EVT=1) */
+ acpi_ec_submit_request(ec);
pr_info("+++++ EC started +++++\n");
}
spin_unlock_irqrestore(&ec->lock, flags);
}

+static bool acpi_ec_stopped(struct acpi_ec *ec)
+{
+ unsigned long flags;
+ bool flushed;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ flushed = acpi_ec_flushed(ec);
+ spin_unlock_irqrestore(&ec->lock, flags);
+ return flushed;
+}
+
static void acpi_ec_stop(struct acpi_ec *ec)
{
unsigned long flags;
@@ -549,7 +611,11 @@ static void acpi_ec_stop(struct acpi_ec *ec)
if (acpi_ec_started(ec)) {
pr_debug("+++++ Stopping EC +++++\n");
set_bit(EC_FLAGS_STOPPED, &ec->flags);
- acpi_disable_gpe(NULL, ec->gpe);
+ spin_unlock_irqrestore(&ec->lock, flags);
+ wait_event(ec->wait, acpi_ec_stopped(ec));
+ spin_lock_irqsave(&ec->lock, flags);
+ /* Disable GPE for event processing (SCI_EVT=1) */
+ acpi_ec_complete_request(ec);
clear_bit(EC_FLAGS_STARTED, &ec->flags);
clear_bit(EC_FLAGS_STOPPED, &ec->flags);
pr_info("+++++ EC stopped +++++\n");
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 163e82f..bbcfe0b 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -122,6 +122,7 @@ struct acpi_ec {
unsigned long data_addr;
unsigned long global_lock;
unsigned long flags;
+ unsigned long reference_count;
struct mutex mutex;
wait_queue_head_t wait;
struct list_head list;
--
1.7.10

2014-11-03 05:16:52

by Zheng, Lv

Subject: [PATCH 6/6] ACPI/EC: Add GPE reference counting debugging messages.

This patch enhances debugging by adding GPE reference count messages.
No functional changes.
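
A stand-alone sketch of the ec_debug_ref() pattern being added (printf()
stands in for pr_debug(); in the kernel both DEBUG and DEBUG_REF have to
be enabled for the messages to appear):

#include <stdio.h>

#define DEBUG_REF 1

struct ec {
        unsigned long reference_count;
};

/* printf() stands in for pr_debug() with the "ACPI : EC: " prefix */
#define ec_debug_ref(ec, fmt, ...)                                      \
        do {                                                            \
                if (DEBUG_REF)                                          \
                        printf("%lu: " fmt, (ec)->reference_count,      \
                               ## __VA_ARGS__);                         \
        } while (0)

int main(void)
{
        struct ec ec = { .reference_count = 1 };

        ec.reference_count++;
        ec_debug_ref(&ec, "Increase command\n");  /* "2: Increase command" */
        ec.reference_count--;
        ec_debug_ref(&ec, "Decrease command\n");  /* "1: Decrease command" */
        return 0;
}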

Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/ec.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 7089081..5ac189b 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -31,6 +31,7 @@

/* Uncomment next line to get verbose printout */
/* #define DEBUG */
+#define DEBUG_REF 0
#define pr_fmt(fmt) "ACPI : EC: " fmt

#include <linux/kernel.h>
@@ -88,6 +89,13 @@ enum {
#define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
#define ACPI_EC_COMMAND_COMPLETE 0x02 /* Completed last byte */

+#define ec_debug_ref(ec, fmt, ...) \
+ do { \
+ if (DEBUG_REF) \
+ pr_debug("%lu: " fmt, ec->reference_count, \
+ ## __VA_ARGS__); \
+ } while (0)
+
/* ec.c is compiled in acpi namespace so this shows up as acpi.ec_delay param */
static unsigned int ec_delay __read_mostly = ACPI_EC_DELAY;
module_param(ec_delay, uint, 0644);
@@ -218,6 +226,7 @@ static void acpi_ec_enable_event(struct acpi_ec *ec)
spin_unlock_irqrestore(&ec->lock, flags);
return;
}
+ ec_debug_ref(ec, "Increase poller(enable)\n");
set_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags);
if (test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
pr_debug("***** Event pending *****\n");
@@ -226,6 +235,7 @@ static void acpi_ec_enable_event(struct acpi_ec *ec)
return;
}
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease poller(enable)\n");
spin_unlock_irqrestore(&ec->lock, flags);
}

@@ -234,6 +244,7 @@ static void __acpi_ec_set_event(struct acpi_ec *ec)
/* Hold reference for pending event */
if (!acpi_ec_submit_flushable_request(ec, false))
return;
+ ec_debug_ref(ec, "Increase poller(set)\n");
if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
pr_debug("***** Event pending *****\n");
if (test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags)) {
@@ -242,6 +253,7 @@ static void __acpi_ec_set_event(struct acpi_ec *ec)
}
}
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease poller(set)\n");
}

static void __acpi_ec_complete_event(struct acpi_ec *ec)
@@ -249,6 +261,7 @@ static void __acpi_ec_complete_event(struct acpi_ec *ec)
if (test_and_clear_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
/* Unhold reference for pending event */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease poller\n");
pr_debug("***** Event running *****\n");
}
}
@@ -471,6 +484,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
ret = -EINVAL;
goto unlock;
}
+ ec_debug_ref(ec, "Increase command\n");
/* following two actions should be kept atomic */
ec->curr = t;
pr_debug("***** Command(%s) started *****\n",
@@ -484,6 +498,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
ec->curr = NULL;
/* Disable GPE for command processing (IBF=0/OBF=1) */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease command\n");
unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
@@ -659,6 +674,7 @@ static void acpi_ec_start(struct acpi_ec *ec)
pr_debug("+++++ Starting EC +++++\n");
/* Enable GPE for event processing (SCI_EVT=1) */
acpi_ec_submit_request(ec);
+ ec_debug_ref(ec, "Increase event\n");
pr_info("+++++ EC started +++++\n");
}
spin_unlock_irqrestore(&ec->lock, flags);
@@ -688,6 +704,7 @@ static void acpi_ec_stop(struct acpi_ec *ec)
spin_lock_irqsave(&ec->lock, flags);
/* Disable GPE for event processing (SCI_EVT=1) */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease event\n");
clear_bit(EC_FLAGS_STARTED, &ec->flags);
clear_bit(EC_FLAGS_STOPPED, &ec->flags);
pr_info("+++++ EC stopped +++++\n");
--
1.7.10

2014-11-03 05:17:15

by Zheng, Lv

Subject: [PATCH 5/6] ACPI/EC: Cleanup QR_EC command processing by adding a kernel thread to poll EC events.

There is wait code in the QR_EC command processing, which makes it
unsuitable to be put into a work queue item (see bug 82611). And there are
cases where SCI_EVT cannot trigger a GPE; although all commands have a
polling mode implemented, the event cannot be polled (see bug 77431).

So if the QR_EC command can be put into a separate IRQ thread, then the
work queue will not be blocked by the QR_EC command processing and we can
also trigger polling using the thread. Using an IRQ thread also allows us
to change the EC GPE handler into the threaded IRQ model when possible.

This patch thus adds the IRQ polling thread for SCI_EVT polling and removes
the QR_EC processing work item.

The reason why we do not put a loop in the event poller to poll events
until the returned query value is 0:
Some platforms return a non-zero query value even when SCI_EVT=0. If we
put a loop in the poller, our command flush mechanism could never run to
completion and the system suspend process could be blocked. One QR_EC per
SCI_EVT is the current logic and has been proven to work for a long time.

The reasons why this is not implemented directly using a threaded IRQ are:
1. ACPICA doesn't support threaded IRQ, as not all of the OSPMs support
   threaded IRQ, while GPE handler registrations are done through ACPICA.
2. The IRQ processing code needs to be identical for both the IRQ handler
   and the thread callback, while currently, though the command GPE
   handling is ready for both IRQ and polling mode, only the event GPE is
   polled in the event polling thread and the command is polled in the
   user threads.
So we use a standalone kernel thread. If the above situations change in
the future, we can easily convert the code into the threaded IRQ style.

The old EC_FLAGS_QUERY_PENDING is converted to EC_FLAGS_EVENT_ENABLED in
this patch, so that its naming is consistent with EC_FLAGS_EVENT_PENDING.
The original flag doesn't work well with SCI_EVT; this patch refines its
usage by enforcing an event polling wakeup indication:
EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
So unless both of the flags are set, the threaded event poller will not
be woken up.

This patch invokes acpi_ec_submit_request() after having detected SCI_EVT
and invokes acpi_ec_complete_request() before having the QR_EC command
processed. This is useful for implementing GPE storm prevention for
malicious "level triggered" SCI_EVT. But the storm prevention is not
implemented in this patch.

Since the acpi_ec_submit_request() invoked after detecting SCI_EVT is
paired with the acpi_ec_complete_request() invoked after completing the
QR_EC command, acpi_ec_submit_flushable_request() needs to be modified to
allow the QR_EC command to be submitted during this period in order to
revert the increased reference count. This period can be determined by the
event polling indication:
EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
Without enhancing this check in acpi_ec_submit_flushable_request(), the
QR_EC command would not be executed to decrease the reference count added
after detecting SCI_EVT, so system suspending would be blocked because the
reference count equals 2. Such a check is common for flushing code.
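
A user-space model of the poller's wakeup logic (a pthread condition
variable stands in for the kthread's schedule()/wake_up_process() pair;
one wakeup handles exactly one query, as explained above):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wakeup = PTHREAD_COND_INITIALIZER;
static bool event_enabled, event_pending, stop;

/* Models acpi_ec_event_poller(): one wakeup handles exactly one query */
static void *event_poller(void *unused)
{
        (void)unused;
        pthread_mutex_lock(&lock);
        while (!stop) {
                if (event_enabled && event_pending) {
                        event_pending = false;
                        pthread_mutex_unlock(&lock);
                        printf("QR_EC: query one event\n");
                        pthread_mutex_lock(&lock);
                        continue;
                }
                /* schedule() in the kthread; woken by wake_up_process() */
                pthread_cond_wait(&wakeup, &lock);
        }
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t poller;

        pthread_create(&poller, NULL, event_poller, NULL);

        pthread_mutex_lock(&lock);
        event_enabled = true;   /* acpi_ec_enable_event() after handler install */
        event_pending = true;   /* the GPE handler saw SCI_EVT=1 */
        pthread_cond_signal(&wakeup);
        pthread_mutex_unlock(&lock);

        sleep(1);               /* crude: give the poller a chance to run once */

        pthread_mutex_lock(&lock);
        stop = true;            /* kthread_stop() */
        pthread_cond_signal(&wakeup);
        pthread_mutex_unlock(&lock);
        pthread_join(poller, NULL);
        return 0;
}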

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 194 ++++++++++++++++++++++++++++++++++-------------
drivers/acpi/internal.h | 1 +
2 files changed, 144 insertions(+), 51 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index a76794a..7089081 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -44,6 +44,7 @@
#include <linux/slab.h>
#include <linux/acpi.h>
#include <linux/dmi.h>
+#include <linux/kthread.h>
#include <asm/io.h>

#include "internal.h"
@@ -75,7 +76,8 @@ enum ec_command {
* when trying to clear the EC */

enum {
- EC_FLAGS_QUERY_PENDING, /* Query is pending */
+ EC_FLAGS_EVENT_ENABLED, /* Event is enabled */
+ EC_FLAGS_EVENT_PENDING, /* Event is pending */
EC_FLAGS_GPE_STORM, /* GPE storm detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
@@ -124,6 +126,7 @@ struct transaction {
static struct acpi_ec_query_handler *
acpi_ec_get_query_handler(struct acpi_ec_query_handler *handler);
static void acpi_ec_put_query_handler(struct acpi_ec_query_handler *handler);
+static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);

struct acpi_ec *boot_ec, *first_ec;
EXPORT_SYMBOL(first_ec);
@@ -149,6 +152,12 @@ static bool acpi_ec_flushed(struct acpi_ec *ec)
return ec->reference_count == 1;
}

+static bool acpi_ec_has_pending_event(struct acpi_ec *ec)
+{
+ return test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags) &&
+ test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags);
+}
+
/* --------------------------------------------------------------------------
* GPE Enhancement
* -------------------------------------------------------------------------- */
@@ -177,20 +186,93 @@ static void acpi_ec_complete_request(struct acpi_ec *ec)
* the flush operation is not in
* progress
* @ec: the EC device
+ * @check_event: check whether event is pending
*
* This function must be used before taking a new action that should hold
* the reference count. If this function returns false, then the action
* must be discarded or it will prevent the flush operation from being
* completed.
+ *
+ * During flushing, QR_EC command need to pass this check when there is a
+ * pending event, so that the reference count held for the pending event
+ * can be decreased by the completion of the QR_EC command.
*/
-static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
+static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec,
+ bool check_event)
{
- if (!acpi_ec_started(ec))
- return false;
+ if (!acpi_ec_started(ec)) {
+ if (!check_event || !acpi_ec_has_pending_event(ec))
+ return false;
+ }
acpi_ec_submit_request(ec);
return true;
}

+static void acpi_ec_enable_event(struct acpi_ec *ec)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ /* Hold reference for pending event */
+ if (!acpi_ec_submit_flushable_request(ec, false)) {
+ spin_unlock_irqrestore(&ec->lock, flags);
+ return;
+ }
+ set_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags);
+ if (test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
+ pr_debug("***** Event pending *****\n");
+ wake_up_process(ec->thread);
+ spin_unlock_irqrestore(&ec->lock, flags);
+ return;
+ }
+ acpi_ec_complete_request(ec);
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
+static void __acpi_ec_set_event(struct acpi_ec *ec)
+{
+ /* Hold reference for pending event */
+ if (!acpi_ec_submit_flushable_request(ec, false))
+ return;
+ if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
+ pr_debug("***** Event pending *****\n");
+ if (test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags)) {
+ wake_up_process(ec->thread);
+ return;
+ }
+ }
+ acpi_ec_complete_request(ec);
+}
+
+static void __acpi_ec_complete_event(struct acpi_ec *ec)
+{
+ if (test_and_clear_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
+ /* Unhold reference for pending event */
+ acpi_ec_complete_request(ec);
+ pr_debug("***** Event running *****\n");
+ }
+}
+
+int acpi_ec_wait_for_event(struct acpi_ec *ec)
+{
+ unsigned long flags;
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ while (!kthread_should_stop()) {
+ spin_lock_irqsave(&ec->lock, flags);
+ if (acpi_ec_has_pending_event(ec)) {
+ spin_unlock_irqrestore(&ec->lock, flags);
+ __set_current_state(TASK_RUNNING);
+ return 0;
+ }
+ spin_unlock_irqrestore(&ec->lock, flags);
+ schedule();
+ set_current_state(TASK_INTERRUPTIBLE);
+ }
+ __set_current_state(TASK_RUNNING);
+ return -1;
+}
+
/* --------------------------------------------------------------------------
* Transaction Management
* -------------------------------------------------------------------------- */
@@ -298,7 +380,7 @@ static bool advance_transaction(struct acpi_ec *ec)
t->flags |= ACPI_EC_COMMAND_COMPLETE;
wakeup = true;
}
- return wakeup;
+ goto out;
} else {
if (EC_FLAGS_QUERY_HANDSHAKE &&
!(status & ACPI_EC_FLAG_SCI) &&
@@ -312,9 +394,11 @@ static bool advance_transaction(struct acpi_ec *ec)
} else if ((status & ACPI_EC_FLAG_IBF) == 0) {
acpi_ec_write_cmd(ec, t->command);
t->flags |= ACPI_EC_COMMAND_POLL;
+ if (t->command == ACPI_EC_COMMAND_QUERY)
+ __acpi_ec_complete_event(ec);
} else
goto err;
- return wakeup;
+ goto out;
}
err:
/*
@@ -325,6 +409,10 @@ err:
if (in_interrupt() && t)
++t->irq_count;
}
+out:
+ if (status & ACPI_EC_FLAG_SCI &&
+ (!t || t->flags & ACPI_EC_COMMAND_COMPLETE))
+ __acpi_ec_set_event(ec);
return wakeup;
}

@@ -335,17 +423,6 @@ static void start_transaction(struct acpi_ec *ec)
(void)advance_transaction(ec);
}

-static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);
-
-static int ec_check_sci_sync(struct acpi_ec *ec, u8 state)
-{
- if (state & ACPI_EC_FLAG_SCI) {
- if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags))
- return acpi_ec_sync_query(ec, NULL);
- }
- return 0;
-}
-
static int ec_poll(struct acpi_ec *ec)
{
unsigned long flags;
@@ -384,12 +461,13 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
unsigned long tmp;
int ret = 0;

+ bool is_query = !!(t->command == ACPI_EC_COMMAND_QUERY);
if (EC_FLAGS_MSI)
udelay(ACPI_EC_MSI_UDELAY);
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
/* Enable GPE for command processing (IBF=0/OBF=1) */
- if (!acpi_ec_submit_flushable_request(ec)) {
+ if (!acpi_ec_submit_flushable_request(ec, is_query)) {
ret = -EINVAL;
goto unlock;
}
@@ -398,10 +476,6 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
pr_debug("***** Command(%s) started *****\n",
acpi_ec_cmd_string(t->command));
start_transaction(ec);
- if (ec->curr->command == ACPI_EC_COMMAND_QUERY) {
- clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
- pr_debug("***** Event stopped *****\n");
- }
spin_unlock_irqrestore(&ec->lock, tmp);
ret = ec_poll(ec);
spin_lock_irqsave(&ec->lock, tmp);
@@ -440,8 +514,6 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)

status = acpi_ec_transaction_unlocked(ec, t);

- /* check if we received SCI during transaction */
- ec_check_sci_sync(ec, acpi_ec_read_status(ec));
if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
msleep(1);
/* It is safe to enable the GPE outside of the transaction. */
@@ -792,29 +864,6 @@ static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data)
return 0;
}

-static void acpi_ec_gpe_query(void *ec_cxt)
-{
- struct acpi_ec *ec = ec_cxt;
-
- if (!ec)
- return;
- mutex_lock(&ec->mutex);
- acpi_ec_sync_query(ec, NULL);
- mutex_unlock(&ec->mutex);
-}
-
-static int ec_check_sci(struct acpi_ec *ec, u8 state)
-{
- if (state & ACPI_EC_FLAG_SCI) {
- if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) {
- pr_debug("***** Event started *****\n");
- return acpi_os_execute(OSL_NOTIFY_HANDLER,
- acpi_ec_gpe_query, ec);
- }
- }
- return 0;
-}
-
static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
u32 gpe_number, void *data)
{
@@ -825,10 +874,46 @@ static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
if (advance_transaction(ec))
wake_up(&ec->wait);
spin_unlock_irqrestore(&ec->lock, flags);
- ec_check_sci(ec, acpi_ec_read_status(ec));
return ACPI_INTERRUPT_HANDLED | ACPI_REENABLE_GPE;
}

+static int acpi_ec_event_poller(void *context)
+{
+ struct acpi_ec *ec = context;
+
+ while (!acpi_ec_wait_for_event(ec)) {
+ pr_debug("***** Event poller started *****\n");
+ mutex_lock(&ec->mutex);
+ (void)acpi_ec_sync_query(ec, NULL);
+ mutex_unlock(&ec->mutex);
+ pr_debug("***** Event poller stopped *****\n");
+ }
+ return 0;
+}
+
+static int ec_create_event_poller(struct acpi_ec *ec)
+{
+ struct task_struct *t;
+
+ t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
+ if (IS_ERR(t)) {
+ pr_err("failed to create event poller %lu\n", ec->gpe);
+ return PTR_ERR(t);
+ }
+ get_task_struct(t);
+ ec->thread = t;
+ return 0;
+}
+
+static void ec_delete_event_poller(struct acpi_ec *ec)
+{
+ struct task_struct *t = ec->thread;
+
+ ec->thread = NULL;
+ kthread_stop(t);
+ put_task_struct(t);
+}
+
/* --------------------------------------------------------------------------
* Address Space Management
* -------------------------------------------------------------------------- */
@@ -884,7 +969,6 @@ static struct acpi_ec *make_acpi_ec(void)

if (!ec)
return NULL;
- ec->flags = 1 << EC_FLAGS_QUERY_PENDING;
mutex_init(&ec->mutex);
init_waitqueue_head(&ec->wait);
INIT_LIST_HEAD(&ec->list);
@@ -940,15 +1024,21 @@ ec_parse_device(acpi_handle handle, u32 Level, void *context, void **retval)

static int ec_install_handlers(struct acpi_ec *ec)
{
+ int ret;
acpi_status status;

if (test_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags))
return 0;
+ ret = ec_create_event_poller(ec);
+ if (ret)
+ return ret;
status = acpi_install_gpe_handler(NULL, ec->gpe,
ACPI_GPE_EDGE_TRIGGERED,
&acpi_ec_gpe_handler, ec);
- if (ACPI_FAILURE(status))
+ if (ACPI_FAILURE(status)) {
+ ec_delete_event_poller(ec);
return -ENODEV;
+ }

acpi_ec_start(ec);
status = acpi_install_address_space_handler(ec->handle,
@@ -968,6 +1058,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
acpi_ec_stop(ec);
acpi_remove_gpe_handler(NULL, ec->gpe,
&acpi_ec_gpe_handler);
+ ec_delete_event_poller(ec);
return -ENODEV;
}
}
@@ -985,6 +1076,7 @@ static void ec_remove_handlers(struct acpi_ec *ec)
if (ACPI_FAILURE(acpi_remove_gpe_handler(NULL, ec->gpe,
&acpi_ec_gpe_handler)))
pr_err("failed to remove gpe handler\n");
+ ec_delete_event_poller(ec);
clear_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags);
}

@@ -1032,7 +1124,7 @@ static int acpi_ec_add(struct acpi_device *device)
ret = ec_install_handlers(ec);

/* EC is fully operational, allow queries */
- clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+ acpi_ec_enable_event(ec);

/* Clear stale _Q events if hardware might require that */
if (EC_FLAGS_CLEAR_ON_RESUME) {
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index bbcfe0b..20a569c 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -128,6 +128,7 @@ struct acpi_ec {
struct list_head list;
struct transaction *curr;
spinlock_t lock;
+ struct task_struct *thread;
};

extern struct acpi_ec *first_ec;
--
1.7.10

2014-11-05 02:54:40

by Zheng, Lv

Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Hi, Rafael

There is one thing I should let you know.

Originally this patchset depended on the GPE "deadlock" fix, because this
patch invokes acpi_enable_gpe()/acpi_disable_gpe() with the EC lock held.

I saw a system hang during suspending using only this patchset, so we have
to find a solution.

> From: Zheng, Lv
> Sent: Monday, November 03, 2014 1:16 PM
>
> By using the 2 flags, we can indicate an inter-mediate state where the
> current transactions should be completed while the new transactions should
> be dropped.
>
> The comparison of the old flag and the new flags:
> Old New
> about to set BLOCKED STOPPED set / STARTED set
> BLOCKED set STOPPED clear / STARTED clear
> BLOCKED clear STOPPED clear / STARTED set
> The new period is between the point where we are about to set BLOCKED and
> the point when the BLOCKED is set. The GPE is disabled during this period.
> The new flags allow us to add acpi_ec_stopped() check to only check with
> STOPPED flag to implement transaction flushing. This is not done in this
> patch.
>
> No functional changes except that after applying this patch, the GPE
> enabling/disabling is protected by the EC specific lock. We can do this
> because of recent ACPICA GPE API enhancement. This is reasonable as the GPE
> disabling/enabling state should only be determined by the EC driver's state
> machine which is protected by the EC spinlock.

This paragraph is talking about the dependency.

>
> Signed-off-by: Lv Zheng <[email protected]>
> Tested-by: Ortwin Glück <[email protected]>
> ---
> drivers/acpi/ec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index 5f9b74b..192cd11 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -79,7 +79,8 @@ enum {
> EC_FLAGS_GPE_STORM, /* GPE storm detected */
> EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
> * OpReg are installed */
> - EC_FLAGS_BLOCKED, /* Transactions are blocked */
> + EC_FLAGS_STARTED, /* Driver is started */
> + EC_FLAGS_STOPPED, /* Driver is stopped */
> };
>
> #define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
> @@ -129,6 +130,16 @@ static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
> static int EC_FLAGS_QUERY_HANDSHAKE; /* Needs QR_EC issued when SCI_EVT set */
>
> /* --------------------------------------------------------------------------
> + * Device Flags
> + * -------------------------------------------------------------------------- */
> +
> +static bool acpi_ec_started(struct acpi_ec *ec)
> +{
> + return test_bit(EC_FLAGS_STARTED, &ec->flags) &&
> + !test_bit(EC_FLAGS_STOPPED, &ec->flags);
> +}
> +
> +/* --------------------------------------------------------------------------
> * Transaction Management
> * -------------------------------------------------------------------------- */
>
> @@ -354,7 +365,7 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
> if (t->rdata)
> memset(t->rdata, 0, t->rlen);
> mutex_lock(&ec->mutex);
> - if (test_bit(EC_FLAGS_BLOCKED, &ec->flags)) {
> + if (!acpi_ec_started(ec)) {
> status = -EINVAL;
> goto unlock;
> }
> @@ -511,6 +522,35 @@ static void acpi_ec_clear(struct acpi_ec *ec)
> pr_info("%d stale EC events cleared\n", i);
> }
>
> +static void acpi_ec_start(struct acpi_ec *ec)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ec->lock, flags);
> + if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
> + pr_debug("+++++ Starting EC +++++\n");
> + acpi_enable_gpe(NULL, ec->gpe);

This can work without the "GPE deadlock" fix applied because:
1. During boot, this API is called when the EC GPE is disabled.
2. During resume, this API is called when the EC GPE is disabled (because
   the EC GPE is never wake capable).

> + pr_info("+++++ EC started +++++\n");
> + }
> + spin_unlock_irqrestore(&ec->lock, flags);
> +}
> +
> +static void acpi_ec_stop(struct acpi_ec *ec)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ec->lock, flags);
> + if (acpi_ec_started(ec)) {
> + pr_debug("+++++ Stopping EC +++++\n");
> + set_bit(EC_FLAGS_STOPPED, &ec->flags);
> + acpi_disable_gpe(NULL, ec->gpe);

But this cannot work without the "GPE deadlock" fix applied because:

In acpi_pm_freeze(), the call graph would be:
  acpi_pm_freeze()
    acpi_disable_all_gpes()
    acpi_os_wait_events_complete()
    acpi_ec_block_transactions()
      acpi_ec_stop()
        hold EC lock
        acpi_disable_gpe()
          hold GPE lock

And in the GPE handler acpi_irq(), the call graph would be:
  acpi_irq()
    acpi_ev_sci_xrupt_handler()
      acpi_ev_gpe_detect()
        hold GPE lock
        acpi_ev_gpe_dispatch()
          acpi_ec_gpe_handler()
            hold EC lock

Since acpi_os_wait_events_complete() currently cannot flush the GPE but
can only flush the _Lxx/_Exx evaluation work queue, this reverse-ordered
deadlock can happen.
We need to fix acpi_os_wait_events_complete() prior to this series.
I have a fix to invoke synchronize_irq() in acpi_os_wait_events_complete().
Let me send it to you.
This cleanup should be applied after that fix.

Thanks and best regards
-Lv

> + clear_bit(EC_FLAGS_STARTED, &ec->flags);
> + clear_bit(EC_FLAGS_STOPPED, &ec->flags);
> + pr_info("+++++ EC stopped +++++\n");
> + }
> + spin_unlock_irqrestore(&ec->lock, flags);
> +}
> +
> void acpi_ec_block_transactions(void)
> {
> struct acpi_ec *ec = first_ec;
> @@ -520,7 +560,7 @@ void acpi_ec_block_transactions(void)
>
> mutex_lock(&ec->mutex);
> /* Prevent transactions from being carried out */
> - set_bit(EC_FLAGS_BLOCKED, &ec->flags);
> + acpi_ec_stop(ec);
> mutex_unlock(&ec->mutex);
> }
>
> @@ -533,7 +573,7 @@ void acpi_ec_unblock_transactions(void)
>
> mutex_lock(&ec->mutex);
> /* Allow transactions to be carried out again */
> - clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
> + acpi_ec_start(ec);
>
> if (EC_FLAGS_CLEAR_ON_RESUME)
> acpi_ec_clear(ec);
> @@ -548,7 +588,7 @@ void acpi_ec_unblock_transactions_early(void)
> * atomic context during wakeup, so we don't need to acquire the mutex).
> */
> if (first_ec)
> - clear_bit(EC_FLAGS_BLOCKED, &first_ec->flags);
> + acpi_ec_start(first_ec);
> }
>
> static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 *data)
> @@ -816,7 +856,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
> if (ACPI_FAILURE(status))
> return -ENODEV;
>
> - acpi_enable_gpe(NULL, ec->gpe);
> + acpi_ec_start(ec);
> status = acpi_install_address_space_handler(ec->handle,
> ACPI_ADR_SPACE_EC,
> &acpi_ec_space_handler,
> @@ -831,7 +871,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
> pr_err("Fail in evaluating the _REG object"
> " of EC device. Broken bios is suspected.\n");
> } else {
> - acpi_disable_gpe(NULL, ec->gpe);
> + acpi_ec_stop(ec);
> acpi_remove_gpe_handler(NULL, ec->gpe,
> &acpi_ec_gpe_handler);
> return -ENODEV;
> @@ -844,7 +884,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
>
> static void ec_remove_handlers(struct acpi_ec *ec)
> {
> - acpi_disable_gpe(NULL, ec->gpe);
> + acpi_ec_stop(ec);
> if (ACPI_FAILURE(acpi_remove_address_space_handler(ec->handle,
> ACPI_ADR_SPACE_EC, &acpi_ec_space_handler)))
> pr_err("failed to remove space handler\n");
> --
> 1.7.10


2014-11-12 00:55:43

by Rafael J. Wysocki

Subject: Re: [PATCH 5/6] ACPI/EC: Cleanup QR_EC command processing by adding a kernel thread to poll EC events.

On Monday, November 03, 2014 01:16:37 PM Lv Zheng wrote:
> There is wait code in the QR_SC command processing, which makes it not
> suitable to be put into a work queue item (see bug 82611). And there is
> case that the SCI_EVT cannot trigger GPE, though all commands have polling
> mode implemented, the event cannot be polled (see bug 77431).
>
> So if the QR_SC command can be put into a seperate IRQ thread, then the
> work queue will not be blocked by the QR_SC command processing and we can
> also trigger polling using the thread. Using IRQ thread also allows us to
> change the EC GPE handler into the threaded IRQ model when possible.
>
> This patch thus adds the IRQ polling thread for SCI_EVT polling and removes
> QR_SC processing work item.
>
> The reasons why we do not put a loop in the event poller to poll event
> until the returned query value is 0:
> Some platforms return non 0 query value even when SCI_EVT=0, if we put a
> loop in the poller, our command flush mechanism could never execute to
> an end thus the system suspending process could be blocked. One SCI_EVT
> triggering one QR_EC is current logic and has been proven to be working
> for so long time.
>
> The reasons why it is not implemented directly using threaded IRQ are:
> 1. ACPICA doesn't support threaded IRQ as not all of the OSPMs support
> threaded IRQ while GPE handler registerations are done through ACPICA.
> 2. The IRQ processing code need to be identical for both the IRQ handler
> and the thread callback, while currently, though the command GPE
> handling is ready for both IRQ and polling mode, only the event GPE is
> is polled in the event polling thread and the command is polled in the
> user threads.
> So we use a standalone kernel thread, if the above situations are changed
> in the future, we can easily convert the code into the threaded IRQ style.
>
> The old EC_FLAGS_QUERY_PENDING is converted to EC_FLAGS_EVENT_ENABLED in
> this patch, so that its naming is consistent with EC_FLAGS_EVENT_PENDING.
> The original flag doesn't co-work with SCI_EVT well, this patch refines
> its usage by enforcing a event polling wakeup indication as:
> EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
> So unless the both of the flags are set, the threaded event poller will
> not be woken up.
>
> This patch invokes acpi_ec_submit_request() after having detected SCI_EVT
> and invokes acpi_ec_complete_request() before having the QR_EC command
> processed. This is useful for implementing GPE storm prevention for
> malicous "level triggered" SCI_EVT. But the storm prevention is not
> implemented in this patch.
>
> Since the acpi_ec_submit_request() invoked after detecting the SCI_EVT is
> paired with acpi_ec_complete_request() invoked after completing QR_EC
> command, acpi_ec_submit_flushable_request() then need to be modified to
> allow QR_EC command to be submitted during this period to revert the
> increased reference count. This period can be determined by the event
> polling indication:
> EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
> Without enhancing this check in acpi_ec_submit_flushable_request(), QR_EC
> command will not be executed to decrease the reference count added after
> detecting the SCI_EVT, thus the system suspending will be blocked because
> the reference count equals to 2. Such check is common for flushing code.
>
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=82611
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=77431
> Signed-off-by: Lv Zheng <[email protected]>
> Tested-by: Ortwin Glück <[email protected]>
> ---
> drivers/acpi/ec.c | 194 ++++++++++++++++++++++++++++++++++-------------
> drivers/acpi/internal.h | 1 +
> 2 files changed, 144 insertions(+), 51 deletions(-)
>
> diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> index a76794a..7089081 100644
> --- a/drivers/acpi/ec.c
> +++ b/drivers/acpi/ec.c
> @@ -44,6 +44,7 @@
> #include <linux/slab.h>
> #include <linux/acpi.h>
> #include <linux/dmi.h>
> +#include <linux/kthread.h>
> #include <asm/io.h>
>
> #include "internal.h"
> @@ -75,7 +76,8 @@ enum ec_command {
> * when trying to clear the EC */
>
> enum {
> - EC_FLAGS_QUERY_PENDING, /* Query is pending */
> + EC_FLAGS_EVENT_ENABLED, /* Event is enabled */
> + EC_FLAGS_EVENT_PENDING, /* Event is pending */
> EC_FLAGS_GPE_STORM, /* GPE storm detected */
> EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
> * OpReg are installed */
> @@ -124,6 +126,7 @@ struct transaction {
> static struct acpi_ec_query_handler *
> acpi_ec_get_query_handler(struct acpi_ec_query_handler *handler);
> static void acpi_ec_put_query_handler(struct acpi_ec_query_handler *handler);
> +static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);
>
> struct acpi_ec *boot_ec, *first_ec;
> EXPORT_SYMBOL(first_ec);
> @@ -149,6 +152,12 @@ static bool acpi_ec_flushed(struct acpi_ec *ec)
> return ec->reference_count == 1;
> }
>
> +static bool acpi_ec_has_pending_event(struct acpi_ec *ec)
> +{
> + return test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags) &&
> + test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags);
> +}
> +
> /* --------------------------------------------------------------------------
> * GPE Enhancement
> * -------------------------------------------------------------------------- */
> @@ -177,20 +186,93 @@ static void acpi_ec_complete_request(struct acpi_ec *ec)
> * the flush operation is not in
> * progress
> * @ec: the EC device
> + * @check_event: check whether event is pending
> *
> * This function must be used before taking a new action that should hold
> * the reference count. If this function returns false, then the action
> * must be discarded or it will prevent the flush operation from being
> * completed.
> + *
> + * During flushing, QR_EC command need to pass this check when there is a
> + * pending event, so that the reference count held for the pending event
> + * can be decreased by the completion of the QR_EC command.
> */
> -static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
> +static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec,
> + bool check_event)
> {
> - if (!acpi_ec_started(ec))
> - return false;
> + if (!acpi_ec_started(ec)) {
> + if (!check_event || !acpi_ec_has_pending_event(ec))
> + return false;
> + }
> acpi_ec_submit_request(ec);
> return true;
> }
>
> +static void acpi_ec_enable_event(struct acpi_ec *ec)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ec->lock, flags);
> + /* Hold reference for pending event */
> + if (!acpi_ec_submit_flushable_request(ec, false)) {
> + spin_unlock_irqrestore(&ec->lock, flags);
> + return;
> + }
> + set_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags);
> + if (test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> + pr_debug("***** Event pending *****\n");
> + wake_up_process(ec->thread);
> + spin_unlock_irqrestore(&ec->lock, flags);
> + return;
> + }
> + acpi_ec_complete_request(ec);
> + spin_unlock_irqrestore(&ec->lock, flags);
> +}
> +
> +static void __acpi_ec_set_event(struct acpi_ec *ec)
> +{
> + /* Hold reference for pending event */
> + if (!acpi_ec_submit_flushable_request(ec, false))
> + return;
> + if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> + pr_debug("***** Event pending *****\n");
> + if (test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags)) {
> + wake_up_process(ec->thread);
> + return;
> + }
> + }
> + acpi_ec_complete_request(ec);
> +}
> +
> +static void __acpi_ec_complete_event(struct acpi_ec *ec)
> +{
> + if (test_and_clear_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> + /* Unhold reference for pending event */
> + acpi_ec_complete_request(ec);
> + pr_debug("***** Event running *****\n");
> + }
> +}
> +
> +int acpi_ec_wait_for_event(struct acpi_ec *ec)
> +{
> + unsigned long flags;
> +
> + set_current_state(TASK_INTERRUPTIBLE);
> + while (!kthread_should_stop()) {
> + spin_lock_irqsave(&ec->lock, flags);
> + if (acpi_ec_has_pending_event(ec)) {
> + spin_unlock_irqrestore(&ec->lock, flags);
> + __set_current_state(TASK_RUNNING);
> + return 0;
> + }
> + spin_unlock_irqrestore(&ec->lock, flags);
> + schedule();
> + set_current_state(TASK_INTERRUPTIBLE);
> + }
> + __set_current_state(TASK_RUNNING);
> + return -1;
> +}
> +
> /* --------------------------------------------------------------------------
> * Transaction Management
> * -------------------------------------------------------------------------- */
> @@ -298,7 +380,7 @@ static bool advance_transaction(struct acpi_ec *ec)
> t->flags |= ACPI_EC_COMMAND_COMPLETE;
> wakeup = true;
> }
> - return wakeup;
> + goto out;
> } else {
> if (EC_FLAGS_QUERY_HANDSHAKE &&
> !(status & ACPI_EC_FLAG_SCI) &&
> @@ -312,9 +394,11 @@ static bool advance_transaction(struct acpi_ec *ec)
> } else if ((status & ACPI_EC_FLAG_IBF) == 0) {
> acpi_ec_write_cmd(ec, t->command);
> t->flags |= ACPI_EC_COMMAND_POLL;
> + if (t->command == ACPI_EC_COMMAND_QUERY)
> + __acpi_ec_complete_event(ec);
> } else
> goto err;
> - return wakeup;
> + goto out;
> }
> err:
> /*
> @@ -325,6 +409,10 @@ err:
> if (in_interrupt() && t)
> ++t->irq_count;
> }
> +out:
> + if (status & ACPI_EC_FLAG_SCI &&
> + (!t || t->flags & ACPI_EC_COMMAND_COMPLETE))
> + __acpi_ec_set_event(ec);
> return wakeup;
> }
>
> @@ -335,17 +423,6 @@ static void start_transaction(struct acpi_ec *ec)
> (void)advance_transaction(ec);
> }
>
> -static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);
> -
> -static int ec_check_sci_sync(struct acpi_ec *ec, u8 state)
> -{
> - if (state & ACPI_EC_FLAG_SCI) {
> - if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags))
> - return acpi_ec_sync_query(ec, NULL);
> - }
> - return 0;
> -}
> -
> static int ec_poll(struct acpi_ec *ec)
> {
> unsigned long flags;
> @@ -384,12 +461,13 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
> unsigned long tmp;
> int ret = 0;
>
> + bool is_query = !!(t->command == ACPI_EC_COMMAND_QUERY);
> if (EC_FLAGS_MSI)
> udelay(ACPI_EC_MSI_UDELAY);
> /* start transaction */
> spin_lock_irqsave(&ec->lock, tmp);
> /* Enable GPE for command processing (IBF=0/OBF=1) */
> - if (!acpi_ec_submit_flushable_request(ec)) {
> + if (!acpi_ec_submit_flushable_request(ec, is_query)) {
> ret = -EINVAL;
> goto unlock;
> }
> @@ -398,10 +476,6 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
> pr_debug("***** Command(%s) started *****\n",
> acpi_ec_cmd_string(t->command));
> start_transaction(ec);
> - if (ec->curr->command == ACPI_EC_COMMAND_QUERY) {
> - clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> - pr_debug("***** Event stopped *****\n");
> - }
> spin_unlock_irqrestore(&ec->lock, tmp);
> ret = ec_poll(ec);
> spin_lock_irqsave(&ec->lock, tmp);
> @@ -440,8 +514,6 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
>
> status = acpi_ec_transaction_unlocked(ec, t);
>
> - /* check if we received SCI during transaction */
> - ec_check_sci_sync(ec, acpi_ec_read_status(ec));
> if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
> msleep(1);
> /* It is safe to enable the GPE outside of the transaction. */
> @@ -792,29 +864,6 @@ static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data)
> return 0;
> }
>
> -static void acpi_ec_gpe_query(void *ec_cxt)
> -{
> - struct acpi_ec *ec = ec_cxt;
> -
> - if (!ec)
> - return;
> - mutex_lock(&ec->mutex);
> - acpi_ec_sync_query(ec, NULL);
> - mutex_unlock(&ec->mutex);
> -}
> -
> -static int ec_check_sci(struct acpi_ec *ec, u8 state)
> -{
> - if (state & ACPI_EC_FLAG_SCI) {
> - if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) {
> - pr_debug("***** Event started *****\n");
> - return acpi_os_execute(OSL_NOTIFY_HANDLER,
> - acpi_ec_gpe_query, ec);
> - }
> - }
> - return 0;
> -}
> -
> static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
> u32 gpe_number, void *data)
> {
> @@ -825,10 +874,46 @@ static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
> if (advance_transaction(ec))
> wake_up(&ec->wait);
> spin_unlock_irqrestore(&ec->lock, flags);
> - ec_check_sci(ec, acpi_ec_read_status(ec));
> return ACPI_INTERRUPT_HANDLED | ACPI_REENABLE_GPE;
> }
>
> +static int acpi_ec_event_poller(void *context)
> +{
> + struct acpi_ec *ec = context;
> +
> + while (!acpi_ec_wait_for_event(ec)) {
> + pr_debug("***** Event poller started *****\n");
> + mutex_lock(&ec->mutex);
> + (void)acpi_ec_sync_query(ec, NULL);
> + mutex_unlock(&ec->mutex);
> + pr_debug("***** Event poller stopped *****\n");
> + }
> + return 0;
> +}
> +
> +static int ec_create_event_poller(struct acpi_ec *ec)
> +{
> + struct task_struct *t;
> +
> + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);

Does it have to be a kernel thread?

What about using a workqueue instead?

> + if (IS_ERR(t)) {
> + pr_err("failed to create event poller %lu\n", ec->gpe);
> + return PTR_ERR(t);
> + }
> + get_task_struct(t);
> + ec->thread = t;
> + return 0;
> +}
> +
> +static void ec_delete_event_poller(struct acpi_ec *ec)
> +{
> + struct task_struct *t = ec->thread;
> +
> + ec->thread = NULL;
> + kthread_stop(t);
> + put_task_struct(t);
> +}
> +
> /* --------------------------------------------------------------------------
> * Address Space Management
> * -------------------------------------------------------------------------- */
> @@ -884,7 +969,6 @@ static struct acpi_ec *make_acpi_ec(void)
>
> if (!ec)
> return NULL;
> - ec->flags = 1 << EC_FLAGS_QUERY_PENDING;
> mutex_init(&ec->mutex);
> init_waitqueue_head(&ec->wait);
> INIT_LIST_HEAD(&ec->list);
> @@ -940,15 +1024,21 @@ ec_parse_device(acpi_handle handle, u32 Level, void *context, void **retval)
>
> static int ec_install_handlers(struct acpi_ec *ec)
> {
> + int ret;
> acpi_status status;
>
> if (test_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags))
> return 0;
> + ret = ec_create_event_poller(ec);
> + if (ret)
> + return ret;
> status = acpi_install_gpe_handler(NULL, ec->gpe,
> ACPI_GPE_EDGE_TRIGGERED,
> &acpi_ec_gpe_handler, ec);
> - if (ACPI_FAILURE(status))
> + if (ACPI_FAILURE(status)) {
> + ec_delete_event_poller(ec);
> return -ENODEV;
> + }
>
> acpi_ec_start(ec);
> status = acpi_install_address_space_handler(ec->handle,
> @@ -968,6 +1058,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
> acpi_ec_stop(ec);
> acpi_remove_gpe_handler(NULL, ec->gpe,
> &acpi_ec_gpe_handler);
> + ec_delete_event_poller(ec);
> return -ENODEV;
> }
> }
> @@ -985,6 +1076,7 @@ static void ec_remove_handlers(struct acpi_ec *ec)
> if (ACPI_FAILURE(acpi_remove_gpe_handler(NULL, ec->gpe,
> &acpi_ec_gpe_handler)))
> pr_err("failed to remove gpe handler\n");
> + ec_delete_event_poller(ec);
> clear_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags);
> }
>
> @@ -1032,7 +1124,7 @@ static int acpi_ec_add(struct acpi_device *device)
> ret = ec_install_handlers(ec);
>
> /* EC is fully operational, allow queries */
> - clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> + acpi_ec_enable_event(ec);
>
> /* Clear stale _Q events if hardware might require that */
> if (EC_FLAGS_CLEAR_ON_RESUME) {
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index bbcfe0b..20a569c 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -128,6 +128,7 @@ struct acpi_ec {
> struct list_head list;
> struct transaction *curr;
> spinlock_t lock;
> + struct task_struct *thread;
> };
>
> extern struct acpi_ec *first_ec;
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-13 02:31:49

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

Hi, Rafael

> From: Rafael J. Wysocki [mailto:[email protected]]
> Sent: Wednesday, November 12, 2014 9:17 AM
>
> On Monday, November 03, 2014 01:16:37 PM Lv Zheng wrote:
> > There is wait code in the QR_SC command processing, which makes it not
> > suitable to be put into a work queue item (see bug 82611). And there is
> > case that the SCI_EVT cannot trigger GPE, though all commands have polling
> > mode implemented, the event cannot be polled (see bug 77431).
> >
> > So if the QR_SC command can be put into a separate IRQ thread, then the
> > work queue will not be blocked by the QR_SC command processing and we can
> > also trigger polling using the thread. Using IRQ thread also allows us to
> > change the EC GPE handler into the threaded IRQ model when possible.
> >
> > This patch thus adds the IRQ polling thread for SCI_EVT polling and removes
> > QR_SC processing work item.
> >
> > The reasons why we do not put a loop in the event poller to poll events
> > until the returned query value is 0:
> > Some platforms return a non-zero query value even when SCI_EVT=0; if we put
> > a loop in the poller, our command flush mechanism could never run to
> > completion and the system suspend process could be blocked. One SCI_EVT
> > triggering one QR_EC is the current logic and has been proven to work for
> > a long time.
> >
> > The reasons why it is not implemented directly using threaded IRQ are:
> > 1. ACPICA doesn't support threaded IRQ as not all of the OSPMs support
> > threaded IRQ while GPE handler registrations are done through ACPICA.
> > 2. The IRQ processing code needs to be identical for both the IRQ handler
> > and the thread callback, while currently, though the command GPE
> > handling is ready for both IRQ and polling mode, only the event GPE
> > is polled in the event polling thread and the command is polled in the
> > user threads.
> > So we use a standalone kernel thread, if the above situations are changed
> > in the future, we can easily convert the code into the threaded IRQ style.
> >
> > The old EC_FLAGS_QUERY_PENDING is converted to EC_FLAGS_EVENT_ENABLED in
> > this patch, so that its naming is consistent with EC_FLAGS_EVENT_PENDING.
> > The original flag doesn't work well with SCI_EVT, so this patch refines
> > its usage by enforcing an event polling wakeup indication as:
> > EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
> > So unless both of the flags are set, the threaded event poller will
> > not be woken up.
> >
> > This patch invokes acpi_ec_submit_request() after having detected SCI_EVT
> > and invokes acpi_ec_complete_request() before having the QR_EC command
> > processed. This is useful for implementing GPE storm prevention for
> > malicious "level triggered" SCI_EVT. But the storm prevention is not
> > implemented in this patch.
> >
> > Since the acpi_ec_submit_request() invoked after detecting the SCI_EVT is
> > paired with acpi_ec_complete_request() invoked after completing QR_EC
> > command, acpi_ec_submit_flushable_request() then needs to be modified to
> > allow QR_EC command to be submitted during this period to revert the
> > increased reference count. This period can be determined by the event
> > polling indication:
> > EC_FLAGS_EVENT_ENABLED && EC_FLAGS_EVENT_PENDING
> > Without enhancing this check in acpi_ec_submit_flushable_request(), QR_EC
> > command will not be executed to decrease the reference count added after
> > detecting the SCI_EVT, thus the system suspend will be blocked because
> > the reference count equals 2. Such a check is common for flushing code.
> >
> > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=82611
> > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=77431
> > Signed-off-by: Lv Zheng <[email protected]>
> > Tested-by: Ortwin Glück <[email protected]>
> > ---
> > drivers/acpi/ec.c | 194 ++++++++++++++++++++++++++++++++++-------------
> > drivers/acpi/internal.h | 1 +
> > 2 files changed, 144 insertions(+), 51 deletions(-)
> >
> > diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> > index a76794a..7089081 100644
> > --- a/drivers/acpi/ec.c
> > +++ b/drivers/acpi/ec.c
> > @@ -44,6 +44,7 @@
> > #include <linux/slab.h>
> > #include <linux/acpi.h>
> > #include <linux/dmi.h>
> > +#include <linux/kthread.h>
> > #include <asm/io.h>
> >
> > #include "internal.h"
> > @@ -75,7 +76,8 @@ enum ec_command {
> > * when trying to clear the EC */
> >
> > enum {
> > - EC_FLAGS_QUERY_PENDING, /* Query is pending */
> > + EC_FLAGS_EVENT_ENABLED, /* Event is enabled */
> > + EC_FLAGS_EVENT_PENDING, /* Event is pending */
> > EC_FLAGS_GPE_STORM, /* GPE storm detected */
> > EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
> > * OpReg are installed */
> > @@ -124,6 +126,7 @@ struct transaction {
> > static struct acpi_ec_query_handler *
> > acpi_ec_get_query_handler(struct acpi_ec_query_handler *handler);
> > static void acpi_ec_put_query_handler(struct acpi_ec_query_handler *handler);
> > +static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);
> >
> > struct acpi_ec *boot_ec, *first_ec;
> > EXPORT_SYMBOL(first_ec);
> > @@ -149,6 +152,12 @@ static bool acpi_ec_flushed(struct acpi_ec *ec)
> > return ec->reference_count == 1;
> > }
> >
> > +static bool acpi_ec_has_pending_event(struct acpi_ec *ec)
> > +{
> > + return test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags) &&
> > + test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags);
> > +}
> > +
> > /* --------------------------------------------------------------------------
> > * GPE Enhancement
> > * -------------------------------------------------------------------------- */
> > @@ -177,20 +186,93 @@ static void acpi_ec_complete_request(struct acpi_ec *ec)
> > * the flush operation is not in
> > * progress
> > * @ec: the EC device
> > + * @check_event: check whether event is pending
> > *
> > * This function must be used before taking a new action that should hold
> > * the reference count. If this function returns false, then the action
> > * must be discarded or it will prevent the flush operation from being
> > * completed.
> > + *
> > + * During flushing, QR_EC command need to pass this check when there is a
> > + * pending event, so that the reference count held for the pending event
> > + * can be decreased by the completion of the QR_EC command.
> > */
> > -static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
> > +static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec,
> > + bool check_event)
> > {
> > - if (!acpi_ec_started(ec))
> > - return false;
> > + if (!acpi_ec_started(ec)) {
> > + if (!check_event || !acpi_ec_has_pending_event(ec))
> > + return false;
> > + }
> > acpi_ec_submit_request(ec);
> > return true;
> > }
> >
> > +static void acpi_ec_enable_event(struct acpi_ec *ec)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ec->lock, flags);
> > + /* Hold reference for pending event */
> > + if (!acpi_ec_submit_flushable_request(ec, false)) {
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > + return;
> > + }
> > + set_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags);
> > + if (test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> > + pr_debug("***** Event pending *****\n");
> > + wake_up_process(ec->thread);
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > + return;
> > + }
> > + acpi_ec_complete_request(ec);
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > +}
> > +
> > +static void __acpi_ec_set_event(struct acpi_ec *ec)
> > +{
> > + /* Hold reference for pending event */
> > + if (!acpi_ec_submit_flushable_request(ec, false))
> > + return;
> > + if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> > + pr_debug("***** Event pending *****\n");
> > + if (test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags)) {
> > + wake_up_process(ec->thread);
> > + return;
> > + }
> > + }
> > + acpi_ec_complete_request(ec);
> > +}
> > +
> > +static void __acpi_ec_complete_event(struct acpi_ec *ec)
> > +{
> > + if (test_and_clear_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
> > + /* Unhold reference for pending event */
> > + acpi_ec_complete_request(ec);
> > + pr_debug("***** Event running *****\n");
> > + }
> > +}
> > +
> > +int acpi_ec_wait_for_event(struct acpi_ec *ec)
> > +{
> > + unsigned long flags;
> > +
> > + set_current_state(TASK_INTERRUPTIBLE);
> > + while (!kthread_should_stop()) {
> > + spin_lock_irqsave(&ec->lock, flags);
> > + if (acpi_ec_has_pending_event(ec)) {
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > + __set_current_state(TASK_RUNNING);
> > + return 0;
> > + }
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > + schedule();
> > + set_current_state(TASK_INTERRUPTIBLE);
> > + }
> > + __set_current_state(TASK_RUNNING);
> > + return -1;
> > +}
> > +
> > /* --------------------------------------------------------------------------
> > * Transaction Management
> > * -------------------------------------------------------------------------- */
> > @@ -298,7 +380,7 @@ static bool advance_transaction(struct acpi_ec *ec)
> > t->flags |= ACPI_EC_COMMAND_COMPLETE;
> > wakeup = true;
> > }
> > - return wakeup;
> > + goto out;
> > } else {
> > if (EC_FLAGS_QUERY_HANDSHAKE &&
> > !(status & ACPI_EC_FLAG_SCI) &&
> > @@ -312,9 +394,11 @@ static bool advance_transaction(struct acpi_ec *ec)
> > } else if ((status & ACPI_EC_FLAG_IBF) == 0) {
> > acpi_ec_write_cmd(ec, t->command);
> > t->flags |= ACPI_EC_COMMAND_POLL;
> > + if (t->command == ACPI_EC_COMMAND_QUERY)
> > + __acpi_ec_complete_event(ec);
> > } else
> > goto err;
> > - return wakeup;
> > + goto out;
> > }
> > err:
> > /*
> > @@ -325,6 +409,10 @@ err:
> > if (in_interrupt() && t)
> > ++t->irq_count;
> > }
> > +out:
> > + if (status & ACPI_EC_FLAG_SCI &&
> > + (!t || t->flags & ACPI_EC_COMMAND_COMPLETE))
> > + __acpi_ec_set_event(ec);
> > return wakeup;
> > }
> >
> > @@ -335,17 +423,6 @@ static void start_transaction(struct acpi_ec *ec)
> > (void)advance_transaction(ec);
> > }
> >
> > -static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data);
> > -
> > -static int ec_check_sci_sync(struct acpi_ec *ec, u8 state)
> > -{
> > - if (state & ACPI_EC_FLAG_SCI) {
> > - if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags))
> > - return acpi_ec_sync_query(ec, NULL);
> > - }
> > - return 0;
> > -}
> > -
> > static int ec_poll(struct acpi_ec *ec)
> > {
> > unsigned long flags;
> > @@ -384,12 +461,13 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
> > unsigned long tmp;
> > int ret = 0;
> >
> > + bool is_query = !!(t->command == ACPI_EC_COMMAND_QUERY);
> > if (EC_FLAGS_MSI)
> > udelay(ACPI_EC_MSI_UDELAY);
> > /* start transaction */
> > spin_lock_irqsave(&ec->lock, tmp);
> > /* Enable GPE for command processing (IBF=0/OBF=1) */
> > - if (!acpi_ec_submit_flushable_request(ec)) {
> > + if (!acpi_ec_submit_flushable_request(ec, is_query)) {
> > ret = -EINVAL;
> > goto unlock;
> > }
> > @@ -398,10 +476,6 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
> > pr_debug("***** Command(%s) started *****\n",
> > acpi_ec_cmd_string(t->command));
> > start_transaction(ec);
> > - if (ec->curr->command == ACPI_EC_COMMAND_QUERY) {
> > - clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> > - pr_debug("***** Event stopped *****\n");
> > - }
> > spin_unlock_irqrestore(&ec->lock, tmp);
> > ret = ec_poll(ec);
> > spin_lock_irqsave(&ec->lock, tmp);
> > @@ -440,8 +514,6 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
> >
> > status = acpi_ec_transaction_unlocked(ec, t);
> >
> > - /* check if we received SCI during transaction */
> > - ec_check_sci_sync(ec, acpi_ec_read_status(ec));
> > if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
> > msleep(1);
> > /* It is safe to enable the GPE outside of the transaction. */
> > @@ -792,29 +864,6 @@ static int acpi_ec_sync_query(struct acpi_ec *ec, u8 *data)
> > return 0;
> > }
> >
> > -static void acpi_ec_gpe_query(void *ec_cxt)
> > -{
> > - struct acpi_ec *ec = ec_cxt;
> > -
> > - if (!ec)
> > - return;
> > - mutex_lock(&ec->mutex);
> > - acpi_ec_sync_query(ec, NULL);
> > - mutex_unlock(&ec->mutex);
> > -}
> > -
> > -static int ec_check_sci(struct acpi_ec *ec, u8 state)
> > -{
> > - if (state & ACPI_EC_FLAG_SCI) {
> > - if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) {
> > - pr_debug("***** Event started *****\n");
> > - return acpi_os_execute(OSL_NOTIFY_HANDLER,
> > - acpi_ec_gpe_query, ec);
> > - }
> > - }
> > - return 0;
> > -}
> > -
> > static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
> > u32 gpe_number, void *data)
> > {
> > @@ -825,10 +874,46 @@ static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
> > if (advance_transaction(ec))
> > wake_up(&ec->wait);
> > spin_unlock_irqrestore(&ec->lock, flags);
> > - ec_check_sci(ec, acpi_ec_read_status(ec));
> > return ACPI_INTERRUPT_HANDLED | ACPI_REENABLE_GPE;
> > }
> >
> > +static int acpi_ec_event_poller(void *context)
> > +{
> > + struct acpi_ec *ec = context;
> > +
> > + while (!acpi_ec_wait_for_event(ec)) {
> > + pr_debug("***** Event poller started *****\n");
> > + mutex_lock(&ec->mutex);
> > + (void)acpi_ec_sync_query(ec, NULL);
> > + mutex_unlock(&ec->mutex);
> > + pr_debug("***** Event poller stopped *****\n");
> > + }
> > + return 0;
> > +}
> > +
> > +static int ec_create_event_poller(struct acpi_ec *ec)
> > +{
> > + struct task_struct *t;
> > +
> > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
>
> Does it have to be a kernel thread?
>
> What about using a workqueue instead?

Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the callback from it.
Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare the threaded IRQ logic.

Using a separate work queue would not decrease the kernel thread count either.
And the code written for a work item could not be carried over when things are switched to the threaded IRQ.
So I used a kthread here.
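
For illustration only, a minimal sketch of the threaded IRQ shape being referred to, assuming ACPICA one day exposes the EC GPE as a plain IRQ; acpi_ec_irq_quick(), acpi_ec_irq_thread() and acpi_ec_setup_threaded_irq() are hypothetical names, while advance_transaction(), acpi_ec_read_status() and acpi_ec_sync_query() come from the posted patch:

/* Sketch, not part of the patch: the hard handler runs in IRQ context and
 * only advances the current transaction; query handling is deferred to the
 * IRQ thread, which may sleep.  Needs <linux/interrupt.h>. */
static irqreturn_t acpi_ec_irq_quick(int irq, void *data)
{
        struct acpi_ec *ec = data;
        unsigned long flags;

        spin_lock_irqsave(&ec->lock, flags);
        if (advance_transaction(ec))
                wake_up(&ec->wait);
        spin_unlock_irqrestore(&ec->lock, flags);
        /* Only wake the thread when the firmware signals a pending event. */
        return (acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI) ?
                IRQ_WAKE_THREAD : IRQ_HANDLED;
}

static irqreturn_t acpi_ec_irq_thread(int irq, void *data)
{
        struct acpi_ec *ec = data;

        /* Thread context: safe to sleep while evaluating _Qxx. */
        mutex_lock(&ec->mutex);
        acpi_ec_sync_query(ec, NULL);
        mutex_unlock(&ec->mutex);
        return IRQ_HANDLED;
}

static int acpi_ec_setup_threaded_irq(struct acpi_ec *ec, unsigned int irq)
{
        /* Hypothetical registration that would replace the kthread poller. */
        return request_threaded_irq(irq, acpi_ec_irq_quick,
                                    acpi_ec_irq_thread,
                                    IRQF_SHARED, "acpi-ec", ec);
}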

Thanks and best regards
-Lv

>
> > + if (IS_ERR(t)) {
> > + pr_err("failed to create event poller %lu\n", ec->gpe);
> > + return PTR_ERR(t);
> > + }
> > + get_task_struct(t);
> > + ec->thread = t;
> > + return 0;
> > +}
> > +
> > +static void ec_delete_event_poller(struct acpi_ec *ec)
> > +{
> > + struct task_struct *t = ec->thread;
> > +
> > + ec->thread = NULL;
> > + kthread_stop(t);
> > + put_task_struct(t);
> > +}
> > +
> > /* --------------------------------------------------------------------------
> > * Address Space Management
> > * -------------------------------------------------------------------------- */
> > @@ -884,7 +969,6 @@ static struct acpi_ec *make_acpi_ec(void)
> >
> > if (!ec)
> > return NULL;
> > - ec->flags = 1 << EC_FLAGS_QUERY_PENDING;
> > mutex_init(&ec->mutex);
> > init_waitqueue_head(&ec->wait);
> > INIT_LIST_HEAD(&ec->list);
> > @@ -940,15 +1024,21 @@ ec_parse_device(acpi_handle handle, u32 Level, void *context, void **retval)
> >
> > static int ec_install_handlers(struct acpi_ec *ec)
> > {
> > + int ret;
> > acpi_status status;
> >
> > if (test_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags))
> > return 0;
> > + ret = ec_create_event_poller(ec);
> > + if (ret)
> > + return ret;
> > status = acpi_install_gpe_handler(NULL, ec->gpe,
> > ACPI_GPE_EDGE_TRIGGERED,
> > &acpi_ec_gpe_handler, ec);
> > - if (ACPI_FAILURE(status))
> > + if (ACPI_FAILURE(status)) {
> > + ec_delete_event_poller(ec);
> > return -ENODEV;
> > + }
> >
> > acpi_ec_start(ec);
> > status = acpi_install_address_space_handler(ec->handle,
> > @@ -968,6 +1058,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
> > acpi_ec_stop(ec);
> > acpi_remove_gpe_handler(NULL, ec->gpe,
> > &acpi_ec_gpe_handler);
> > + ec_delete_event_poller(ec);
> > return -ENODEV;
> > }
> > }
> > @@ -985,6 +1076,7 @@ static void ec_remove_handlers(struct acpi_ec *ec)
> > if (ACPI_FAILURE(acpi_remove_gpe_handler(NULL, ec->gpe,
> > &acpi_ec_gpe_handler)))
> > pr_err("failed to remove gpe handler\n");
> > + ec_delete_event_poller(ec);
> > clear_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags);
> > }
> >
> > @@ -1032,7 +1124,7 @@ static int acpi_ec_add(struct acpi_device *device)
> > ret = ec_install_handlers(ec);
> >
> > /* EC is fully operational, allow queries */
> > - clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
> > + acpi_ec_enable_event(ec);
> >
> > /* Clear stale _Q events if hardware might require that */
> > if (EC_FLAGS_CLEAR_ON_RESUME) {
> > diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> > index bbcfe0b..20a569c 100644
> > --- a/drivers/acpi/internal.h
> > +++ b/drivers/acpi/internal.h
> > @@ -128,6 +128,7 @@ struct acpi_ec {
> > struct list_head list;
> > struct transaction *curr;
> > spinlock_t lock;
> > + struct task_struct *thread;
> > };
> >
> > extern struct acpi_ec *first_ec;
> >
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-13 02:37:45

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

On Thursday, November 13, 2014 02:31:08 AM Zheng, Lv wrote:
> Hi, Rafael
>
> > From: Rafael J. Wysocki [mailto:[email protected]]
> > Sent: Wednesday, November 12, 2014 9:17 AM

[cut]

> > > +
> > > +static int ec_create_event_poller(struct acpi_ec *ec)
> > > +{
> > > + struct task_struct *t;
> > > +
> > > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
> >
> > Does it have to be a kernel thread?
> >
> > What about using a workqueue instead?
>
> Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
> If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the
> callback from it.

How so?

> Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
> So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare threaded IRQ logics.

Oh dear, no.

This isn't the way forward here.

> Using a separate work queue, we didn't decrease the kernel thread count.

Why does that matter at all?

> And the code written for the work item cannot be derived when things are
> switched to the threaded IRQ.
> So I used kthread here.

Please use a workqueue instead. If/when we need to switch over to threaded
IRQs, we'll do the work then. For now, let's not complicate things more
than necessary.

Rafael

2014-11-13 02:53:24

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

Hi, Rafael

> From: Rafael J. Wysocki [mailto:[email protected]]
> Sent: Thursday, November 13, 2014 10:59 AM
>
> On Thursday, November 13, 2014 02:31:08 AM Zheng, Lv wrote:
> > Hi, Rafael
> >
> > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > Sent: Wednesday, November 12, 2014 9:17 AM
>
> [cut]
>
> > > > +
> > > > +static int ec_create_event_poller(struct acpi_ec *ec)
> > > > +{
> > > > + struct task_struct *t;
> > > > +
> > > > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
> > >
> > > Does it have to be a kernel thread?
> > >
> > > What about using a workqueue instead?
> >
> > Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
> > If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the
> > callback from it.
>
> How so?
>
> > Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
> > So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare threaded IRQ logics.
>
> Oh dear, no.
>
> This isn't the way forward here.
>
> > Using a separate work queue, we didn't decrease the kernel thread count.
>
> Why does that matter at all?
>
> > And the code written for the work item cannot be derived when things are
> > switched to the threaded IRQ.
> > So I used kthread here.
>
> Please use a workqueue instead. If/when we need to switch over to threaded
> IRQs, we'll do the work then. For now, let's not complicate things more
> than necessary.

It seems we need the thread because we will move polling code from ec_poll() to acpi_ec_event_poller().
This will happen right after these cleanups.
That's the threaded IRQ logic - the IRQ is polled in the thread.
We cannot achieve this using a work queue.
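
To make that direction concrete, a rough sketch of what the poller loop might grow into once the polling code moves there - purely an assumption about the follow-up series, not part of this one; ec_in_polling_mode() is a hypothetical helper, the other names come from the posted patch:

static int acpi_ec_event_poller(void *context)
{
        struct acpi_ec *ec = context;
        unsigned long flags;

        while (!acpi_ec_wait_for_event(ec)) {
                /* Assumed future branch: when GPE storming forces polling
                 * mode, drive the transaction from the thread instead of
                 * relying on the GPE handler. */
                if (ec_in_polling_mode(ec)) {
                        spin_lock_irqsave(&ec->lock, flags);
                        if (advance_transaction(ec))
                                wake_up(&ec->wait);
                        spin_unlock_irqrestore(&ec->lock, flags);
                }
                /* Query handling as in the posted patch. */
                mutex_lock(&ec->mutex);
                (void)acpi_ec_sync_query(ec, NULL);
                mutex_unlock(&ec->mutex);
        }
        return 0;
}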

Best regards
-Lv

2014-11-13 22:16:45

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

On Thursday, November 13, 2014 02:52:03 AM Zheng, Lv wrote:
> Hi, Rafael
>
> > From: Rafael J. Wysocki [mailto:[email protected]]
> > Sent: Thursday, November 13, 2014 10:59 AM
> >
> > On Thursday, November 13, 2014 02:31:08 AM Zheng, Lv wrote:
> > > Hi, Rafael
> > >
> > > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > > Sent: Wednesday, November 12, 2014 9:17 AM
> >
> > [cut]
> >
> > > > > +
> > > > > +static int ec_create_event_poller(struct acpi_ec *ec)
> > > > > +{
> > > > > + struct task_struct *t;
> > > > > +
> > > > > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
> > > >
> > > > Does it have to be a kernel thread?
> > > >
> > > > What about using a workqueue instead?
> > >
> > > Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
> > > If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the
> > > callback from it.
> >
> > How so?
> >
> > > Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
> > > So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare threaded IRQ logics.
> >
> > Oh dear, no.
> >
> > This isn't the way forward here.
> >
> > > Using a separate work queue, we didn't decrease the kernel thread count.
> >
> > Why does that matter at all?
> >
> > > And the code written for the work item cannot be derived when things are
> > > switched to the threaded IRQ.
> > > So I used kthread here.
> >
> > Please use a workqueue instead. If/when we need to switch over to threaded
> > IRQs, we'll do the work then. For now, let's not complicate things more
> > than necessary.
>
> It seems we need the thread because we will move polling code from ec_poll() to acpi_ec_event_poller().
> This will happen right after these cleanups.
> That's the threaded IRQ logic - IRQ is polled in the thread.
> We cannot achieve this using work queue.

OK

In that case I'm not going to apply this patch, because it is not a cleanup.
It doesn't belong to this series, but to the series that will move the
polling code.

Does patch [6/6] depend on [5/6]?


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-14 01:22:07

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

Hi, Rafael

> From: Rafael J. Wysocki [mailto:[email protected]]
> Sent: Friday, November 14, 2014 6:38 AM
>
> On Thursday, November 13, 2014 02:52:03 AM Zheng, Lv wrote:
> > Hi, Rafael
> >
> > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > Sent: Thursday, November 13, 2014 10:59 AM
> > >
> > > On Thursday, November 13, 2014 02:31:08 AM Zheng, Lv wrote:
> > > > Hi, Rafael
> > > >
> > > > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > > > Sent: Wednesday, November 12, 2014 9:17 AM
> > >
> > > [cut]
> > >
> > > > > > +
> > > > > > +static int ec_create_event_poller(struct acpi_ec *ec)
> > > > > > +{
> > > > > > + struct task_struct *t;
> > > > > > +
> > > > > > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
> > > > >
> > > > > Does it have to be a kernel thread?
> > > > >
> > > > > What about using a workqueue instead?
> > > >
> > > > Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
> > > > If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the
> > > > callback from it.
> > >
> > > How so?
> > >
> > > > Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
> > > > So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare threaded IRQ logics.
> > >
> > > Oh dear, no.
> > >
> > > This isn't the way forward here.
> > >
> > > > Using a separate work queue, we didn't decrease the kernel thread count.
> > >
> > > Why does that matter at all?
> > >
> > > > And the code written for the work item cannot be derived when things are
> > > > switched to the threaded IRQ.
> > > > So I used kthread here.
> > >
> > > Please use a workqueue instead. If/when we need to switch over to threaded
> > > IRQs, we'll do the work then. For now, let's not complicate things more
> > > than necessary.
> >
> > It seems we need the thread because we will move polling code from ec_poll() to acpi_ec_event_poller().
> > This will happen right after these cleanups.
> > That's the threaded IRQ logic - IRQ is polled in the thread.
> > We cannot achieve this using work queue.
>
> OK
>
> In that case I'm not going to apply this patch, because it is not a cleanup.
> It doesn't belong to this series, but to the series that will move the
> polling code.

If we defer some execution and know there will be exactly one execution per indication, a work item fits.
If we need to poll, or there is no such 1:1 correspondence, using a work queue may accumulate useless work items.

We already have a work item to evaluate _Qxx in the EC driver; for the IRQ indications, IMO it's better to use an IRQ poller thread.
And it's easier for me to build future improvements on top of a kthread:
1. We need SCI_EVT draining support for Samsung firmware. For Samsung, one SCI_EVT indication may mean several QR_EC transactions, as we cannot rely on the SCI_EVT value - it can be cleared by the Samsung firmware before 0x00 is returned (see the sketch after this list).
2. For Acer firmware, the firmware will refuse to respond to QR_EC if SCI_EVT=0, and further transactions will then be blocked. Whether transaction abort support is needed is unclear to me now, because I'm not sure whether this will appear on other platforms. When supporting this, I would face the difficulty of aborting several queued-up work items, whereas with the IRQ poller thread I only need to abort the single query transaction.
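
As a minimal sketch of what such draining could look like in the poller - hypothetical, the real draining support will be posted separately; ACPI_EC_MAX_QUERIES is an illustrative bound and acpi_ec_drain_events() an illustrative name, acpi_ec_sync_query() is from the current driver:

/* Keep issuing QR_EC after one SCI_EVT indication until the firmware
 * returns 0x00, bounded so firmware that never returns 0 cannot wedge
 * the poller or block command flushing during suspend. */
#define ACPI_EC_MAX_QUERIES        8

static void acpi_ec_drain_events(struct acpi_ec *ec)
{
        u8 value;
        int i;

        mutex_lock(&ec->mutex);
        for (i = 0; i < ACPI_EC_MAX_QUERIES; i++) {
                if (acpi_ec_sync_query(ec, &value) || value == 0x00)
                        break;  /* error or no more pending _Qxx events */
        }
        mutex_unlock(&ec->mutex);
}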

> Does patch [6/6] depend on [5/6]?

Patch [6/6] depends on [5/6].
So you can just take patches 1-4 first.
I'll ask Samsung users to test an improved event draining support based on the poller thread and re-send patch [5/6] and patch [6/6] after that.

Thanks and best regards
-Lv

>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-14 23:17:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events.

On Friday, November 14, 2014 01:21:51 AM Zheng, Lv wrote:
> Hi, Rafael
>
> > From: Rafael J. Wysocki [mailto:[email protected]]
> > Sent: Friday, November 14, 2014 6:38 AM
> >
> > On Thursday, November 13, 2014 02:52:03 AM Zheng, Lv wrote:
> > > Hi, Rafael
> > >
> > > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > > Sent: Thursday, November 13, 2014 10:59 AM
> > > >
> > > > On Thursday, November 13, 2014 02:31:08 AM Zheng, Lv wrote:
> > > > > Hi, Rafael
> > > > >
> > > > > > From: Rafael J. Wysocki [mailto:[email protected]]
> > > > > > Sent: Wednesday, November 12, 2014 9:17 AM
> > > >
> > > > [cut]
> > > >
> > > > > > > +
> > > > > > > +static int ec_create_event_poller(struct acpi_ec *ec)
> > > > > > > +{
> > > > > > > + struct task_struct *t;
> > > > > > > +
> > > > > > > + t = kthread_run(acpi_ec_event_poller, ec, "ec/gpe-%lu", ec->gpe);
> > > > > >
> > > > > > Does it have to be a kernel thread?
> > > > > >
> > > > > > What about using a workqueue instead?
> > > > >
> > > > > Actually I just want to use threaded IRQ here in response to Andi Kleen's comment.
> > > > > If acpi_irq is registered as threaded IRQ, then acpi_ec_event_poller() will be the
> > > > > callback from it.
> > > >
> > > > How so?
> > > >
> > > > > Since ACPICA is not ready for threaded IRQ currently, we cannot proceed at this point.
> > > > > So I copied the threaded IRQ code from kernel/irq/manage.c here to prepare threaded IRQ logics.
> > > >
> > > > Oh dear, no.
> > > >
> > > > This isn't the way forward here.
> > > >
> > > > > Using a separate work queue, we didn't decrease the kernel thread count.
> > > >
> > > > Why does that matter at all?
> > > >
> > > > > And the code written for the work item cannot be derived when things are
> > > > > switched to the threaded IRQ.
> > > > > So I used kthread here.
> > > >
> > > > Please use a workqueue instead. If/when we need to switch over to threaded
> > > > IRQs, we'll do the work then. For now, let's not complicate things more
> > > > than necessary.
> > >
> > > It seems we need the thread because we will move polling code from ec_poll() to acpi_ec_event_poller().
> > > This will happen right after these cleanups.
> > > That's the threaded IRQ logic - IRQ is polled in the thread.
> > > We cannot achieve this using work queue.
> >
> > OK
> >
> > In that case I'm not going to apply this patch, because it is not a cleanup.
> > It doesn't belong to this series, but to the series that will move the
> > polling code.
>
> If we'll defer some execution and we know there will only be one execution corresponding to one indication, work item can fit.
> If we'll poll something or there is no such 1 to 1 correspondence, using work queue may accumulate useless work items.
>
> We have the work item to evaluate _Qxx in the EC driver, for the IRQ indications, IMO, it's better to use an IRQ poller thread.
> And it's easier for me to control future improvements using kthread:
> 1. We need the SCI_EVT draining support for Samsung firmware. For Samsung, 1 SCI_EVT indication may mean several QR_EC transactions as we cannot rely on SCI_EVT value, it can be cleared by Samsung firmware before 0x00 is returned.
> 2. For Acer firmware, firmware will refuse to respond QR_EC if SCI_EVT=0 and further transactions will be blocked. Whether a transaction abort support is needed is unclear to me now because I'm not sure if this will appear on other platforms. When supporting this, I may face the difficulty to abort several queued up work items but for IRQ poller thread, I only need to abort the very 1 query transaction.
>
> > Does patch [6/6] depend on [5/6]?
>
> Patch [6/6] depends on [5/6].
> So you can just take the patch 1-4 first..
> I'll ask Samsung users to test an improved event draining support based on the poller thread and re-send the patch [5/5] and patch [6/6] after that.

OK

So patches [1-4/6] queued up for 3.19, thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-18 13:23:42

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> Hi, Rafael
>
> There is one thing I should let you know.
>
> Originally this patchset is dependent on the GPE "dead lock" fix.
> Because this patch will invoke acpi_enable_gpe()/acpi_disable_gpe() with EC lock held.
>
> I saw system hang during suspending using only this patchset, so we have to find a solution.
>
> > From: Zheng, Lv
> > Sent: Monday, November 03, 2014 1:16 PM
> >
> > By using the 2 flags, we can indicate an inter-mediate state where the
> > current transactions should be completed while the new transactions should
> > be dropped.
> >
> > The comparison of the old flag and the new flags:
> > Old New
> > about to set BLOCKED STOPPED set / STARTED set
> > BLOCKED set STOPPED clear / STARTED clear
> > BLOCKED clear STOPPED clear / STARTED set
> > The new period is between the point where we are about to set BLOCKED and
> > the point when the BLOCKED is set. The GPE is disabled during this period.
> > The new flags allow us to add acpi_ec_stopped() check to only check with
> > STOPPED flag to implement transaction flushing. This is not done in this
> > patch.
> >
> > No functional changes except that after applying this patch, the GPE
> > enabling/disabling is protected by the EC specific lock. We can do this
> > because of recent ACPICA GPE API enhancement. This is reasonable as the GPE
> > disabling/enabling state should only be determined by the EC driver's state
> > machine which is protected by the EC spinlock.
>
> This paragraph is talking about the dependency.
>
> >
> > Signed-off-by: Lv Zheng <[email protected]>
> > Tested-by: Ortwin Glück <[email protected]>
> > ---
> > drivers/acpi/ec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++--------
> > 1 file changed, 48 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
> > index 5f9b74b..192cd11 100644
> > --- a/drivers/acpi/ec.c
> > +++ b/drivers/acpi/ec.c
> > @@ -79,7 +79,8 @@ enum {
> > EC_FLAGS_GPE_STORM, /* GPE storm detected */
> > EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
> > * OpReg are installed */
> > - EC_FLAGS_BLOCKED, /* Transactions are blocked */
> > + EC_FLAGS_STARTED, /* Driver is started */
> > + EC_FLAGS_STOPPED, /* Driver is stopped */
> > };
> >
> > #define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
> > @@ -129,6 +130,16 @@ static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
> > static int EC_FLAGS_QUERY_HANDSHAKE; /* Needs QR_EC issued when SCI_EVT set */
> >
> > /* --------------------------------------------------------------------------
> > + * Device Flags
> > + * -------------------------------------------------------------------------- */
> > +
> > +static bool acpi_ec_started(struct acpi_ec *ec)
> > +{
> > + return test_bit(EC_FLAGS_STARTED, &ec->flags) &&
> > + !test_bit(EC_FLAGS_STOPPED, &ec->flags);
> > +}
> > +
> > +/* --------------------------------------------------------------------------
> > * Transaction Management
> > * -------------------------------------------------------------------------- */
> >
> > @@ -354,7 +365,7 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
> > if (t->rdata)
> > memset(t->rdata, 0, t->rlen);
> > mutex_lock(&ec->mutex);
> > - if (test_bit(EC_FLAGS_BLOCKED, &ec->flags)) {
> > + if (!acpi_ec_started(ec)) {
> > status = -EINVAL;
> > goto unlock;
> > }
> > @@ -511,6 +522,35 @@ static void acpi_ec_clear(struct acpi_ec *ec)
> > pr_info("%d stale EC events cleared\n", i);
> > }
> >
> > +static void acpi_ec_start(struct acpi_ec *ec)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ec->lock, flags);
> > + if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
> > + pr_debug("+++++ Starting EC +++++\n");
> > + acpi_enable_gpe(NULL, ec->gpe);
>
> This can work without "GPE dead lock" fix applied because:
> 1. During boot, this API is called when the EC GPE is disabled.
> 2. During resume, this API is called when the EC GPE is disabled (because EC GPE is always not wake capable).
>
> > + pr_info("+++++ EC started +++++\n");
> > + }
> > + spin_unlock_irqrestore(&ec->lock, flags);
> > +}
> > +
> > +static void acpi_ec_stop(struct acpi_ec *ec)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ec->lock, flags);
> > + if (acpi_ec_started(ec)) {
> > + pr_debug("+++++ Stopping EC +++++\n");
> > + set_bit(EC_FLAGS_STOPPED, &ec->flags);
> > + acpi_disable_gpe(NULL, ec->gpe);
>
> But this cannot work without "GPE dead lock" fix applied because:
>
> In acpi_pm_freeze(), the call graph would be:
> acpi_pm_freeze()
> acpi_disable_all_gpes()
> acpi_os_wait_events_complete()
> acpi_ec_block_transactions()
> acpi_ec_stop()
> hold EC lock
> acpi_disable_gpe()
> hold GPE lock
>
> And in the GPE handler acpi_irq(), the call graph would be:
> acpi_irq()
> acpi_ev_sci_xrupt_handler()
> acpi_ev_gpe_detect()
> hold GPE lock
> acpi_ev_gpe_dispatch()
> acpi_ec_gpe_handler()
> hold EC lock
>
> Since acpi_os_wait_events_complete() cannot flush GPE but can only flush _Lxx/_Exx evaluation work queue currently.
> The reversed ordered dead lock can happen.
> We need to fix the acpi_os_wait_events_complete() prior than this series.
> I have a fix to invoke synchronize_irq() in acpi_os_wait_events_complete().
> Let me send it to you.
> This cleanup should be applied after that fix.
>

Here's lockdep warning I see on -next:

[ 0.510159] ======================================================
[ 0.510171] [ INFO: possible circular locking dependency detected ]
[ 0.510185] 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66 Not tainted
[ 0.510197] -------------------------------------------------------
[ 0.510209] swapper/3/0 is trying to acquire lock:
[ 0.510219] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.510254]
[ 0.510254] but task is already holding lock:
[ 0.510266] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
[ 0.510296]
[ 0.510296] which lock already depends on the new lock.
[ 0.510296]
[ 0.510312]
[ 0.510312] the existing dependency chain (in reverse order) is:
[ 0.510327]
[ 0.510327] -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
[ 0.510344] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
[ 0.510364] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.510381] [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
[ 0.510398] [<ffffffff814e31e8>] acpi_enable_gpe+0x22/0x68
[ 0.510416] [<ffffffff814d5b24>] acpi_ec_start+0x66/0x87
[ 0.510432] [<ffffffff81afc771>] ec_install_handlers+0x41/0xa4
[ 0.510449] [<ffffffff823e72b9>] acpi_ec_ecdt_probe+0x1a9/0x1ea
[ 0.510466] [<ffffffff823e6ae3>] acpi_init+0x8b/0x26e
[ 0.510480] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 0.510496] [<ffffffff8239f1dc>] kernel_init_freeable+0x1f5/0x282
[ 0.510513] [<ffffffff81af1a1e>] kernel_init+0xe/0xf0
[ 0.510527] [<ffffffff81b08cfc>] ret_from_fork+0x7c/0xb0
[ 0.510542]
[ 0.510542] -> #0 (&(&ec->lock)->rlock){-.....}:
[ 0.510558] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
[ 0.510574] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
[ 0.510589] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.510604] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.510620] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
[ 0.510636] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
[ 0.510652] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
[ 0.510669] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
[ 0.510684] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
[ 0.510702] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
[ 0.510718] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
[ 0.510733] [<ffffffff81075a22>] handle_irq+0x22/0x40
[ 0.510748] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
[ 0.510762] [<ffffffff81b09bb2>] ret_from_intr+0x0/0x1a
[ 0.510777] [<ffffffff8107e783>] default_idle+0x23/0x260
[ 0.510792] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
[ 0.510806] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
[ 0.510821] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
[ 0.510840]
[ 0.510840] other info that might help us debug this:
[ 0.510840]
[ 0.510856] Possible unsafe locking scenario:
[ 0.510856]
[ 0.510868] CPU0 CPU1
[ 0.510877] ---- ----
[ 0.510886] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
[ 0.510898] lock(&(&ec->lock)->rlock);
[ 0.510912] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
[ 0.510927] lock(&(&ec->lock)->rlock);
[ 0.510938]
[ 0.510938] *** DEADLOCK ***
[ 0.510938]
[ 0.510953] 1 lock held by swapper/3/0:
[ 0.510961] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
[ 0.510990]
[ 0.510990] stack backtrace:
[ 0.511004] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66
[ 0.511021] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
[ 0.511035] ffffffff82cb2f70 ffff88011e2c3bb8 ffffffff81afc316 0000000000000011
[ 0.511055] ffffffff82cb2f70 ffff88011e2c3c08 ffffffff81afae11 0000000000000001
[ 0.511074] ffff88011e2c3c68 ffff88011e2c3c08 ffff8801193f92d0 ffff8801193f9b20
[ 0.511094] Call Trace:
[ 0.511101] <IRQ> [<ffffffff81afc316>] dump_stack+0x4c/0x6e
[ 0.511125] [<ffffffff81afae11>] print_circular_bug+0x2b2/0x2c3
[ 0.511142] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
[ 0.511159] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
[ 0.511176] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
[ 0.511192] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.511209] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
[ 0.511225] [<ffffffff814ea192>] ? acpi_hw_write+0x4b/0x52
[ 0.511241] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.511258] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
[ 0.511274] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
[ 0.511292] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
[ 0.511309] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
[ 0.511325] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
[ 0.511342] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
[ 0.511357] [<ffffffff81171e98>] ? handle_fasteoi_irq+0x28/0x140
[ 0.511372] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
[ 0.511388] [<ffffffff81075a22>] handle_irq+0x22/0x40
[ 0.511402] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
[ 0.511417] [<ffffffff81b09bb2>] common_interrupt+0x72/0x72
[ 0.511428] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
[ 0.511454] [<ffffffff81154f3d>] ? trace_hardirqs_on+0xd/0x10
[ 0.511468] [<ffffffff8107e783>] default_idle+0x23/0x260
[ 0.511482] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
[ 0.511496] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
[ 0.511512] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0


--
Kirill A. Shutemov

2014-11-18 20:59:14

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:

[cut]

>
> Here's lockdep warning I see on -next:

Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?


> [ 0.510159] ======================================================
> [ 0.510171] [ INFO: possible circular locking dependency detected ]
> [ 0.510185] 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66 Not tainted
> [ 0.510197] -------------------------------------------------------
> [ 0.510209] swapper/3/0 is trying to acquire lock:
> [ 0.510219] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.510254]
> [ 0.510254] but task is already holding lock:
> [ 0.510266] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.510296]
> [ 0.510296] which lock already depends on the new lock.
> [ 0.510296]
> [ 0.510312]
> [ 0.510312] the existing dependency chain (in reverse order) is:
> [ 0.510327]
> [ 0.510327] -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> [ 0.510344] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> [ 0.510364] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.510381] [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.510398] [<ffffffff814e31e8>] acpi_enable_gpe+0x22/0x68
> [ 0.510416] [<ffffffff814d5b24>] acpi_ec_start+0x66/0x87
> [ 0.510432] [<ffffffff81afc771>] ec_install_handlers+0x41/0xa4
> [ 0.510449] [<ffffffff823e72b9>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> [ 0.510466] [<ffffffff823e6ae3>] acpi_init+0x8b/0x26e
> [ 0.510480] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> [ 0.510496] [<ffffffff8239f1dc>] kernel_init_freeable+0x1f5/0x282
> [ 0.510513] [<ffffffff81af1a1e>] kernel_init+0xe/0xf0
> [ 0.510527] [<ffffffff81b08cfc>] ret_from_fork+0x7c/0xb0
> [ 0.510542]
> [ 0.510542] -> #0 (&(&ec->lock)->rlock){-.....}:
> [ 0.510558] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> [ 0.510574] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> [ 0.510589] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.510604] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.510620] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> [ 0.510636] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> [ 0.510652] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> [ 0.510669] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> [ 0.510684] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> [ 0.510702] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> [ 0.510718] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> [ 0.510733] [<ffffffff81075a22>] handle_irq+0x22/0x40
> [ 0.510748] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> [ 0.510762] [<ffffffff81b09bb2>] ret_from_intr+0x0/0x1a
> [ 0.510777] [<ffffffff8107e783>] default_idle+0x23/0x260
> [ 0.510792] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> [ 0.510806] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> [ 0.510821] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> [ 0.510840]
> [ 0.510840] other info that might help us debug this:
> [ 0.510840]
> [ 0.510856] Possible unsafe locking scenario:
> [ 0.510856]
> [ 0.510868] CPU0 CPU1
> [ 0.510877] ---- ----
> [ 0.510886] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> [ 0.510898] lock(&(&ec->lock)->rlock);
> [ 0.510912] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> [ 0.510927] lock(&(&ec->lock)->rlock);
> [ 0.510938]
> [ 0.510938] *** DEADLOCK ***
> [ 0.510938]
> [ 0.510953] 1 lock held by swapper/3/0:
> [ 0.510961] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.510990]
> [ 0.510990] stack backtrace:
> [ 0.511004] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66
> [ 0.511021] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> [ 0.511035] ffffffff82cb2f70 ffff88011e2c3bb8 ffffffff81afc316 0000000000000011
> [ 0.511055] ffffffff82cb2f70 ffff88011e2c3c08 ffffffff81afae11 0000000000000001
> [ 0.511074] ffff88011e2c3c68 ffff88011e2c3c08 ffff8801193f92d0 ffff8801193f9b20
> [ 0.511094] Call Trace:
> [ 0.511101] <IRQ> [<ffffffff81afc316>] dump_stack+0x4c/0x6e
> [ 0.511125] [<ffffffff81afae11>] print_circular_bug+0x2b2/0x2c3
> [ 0.511142] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> [ 0.511159] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> [ 0.511176] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> [ 0.511192] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.511209] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> [ 0.511225] [<ffffffff814ea192>] ? acpi_hw_write+0x4b/0x52
> [ 0.511241] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.511258] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> [ 0.511274] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> [ 0.511292] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> [ 0.511309] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> [ 0.511325] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> [ 0.511342] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> [ 0.511357] [<ffffffff81171e98>] ? handle_fasteoi_irq+0x28/0x140
> [ 0.511372] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> [ 0.511388] [<ffffffff81075a22>] handle_irq+0x22/0x40
> [ 0.511402] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> [ 0.511417] [<ffffffff81b09bb2>] common_interrupt+0x72/0x72
> [ 0.511428] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> [ 0.511454] [<ffffffff81154f3d>] ? trace_hardirqs_on+0xd/0x10
> [ 0.511468] [<ffffffff8107e783>] default_idle+0x23/0x260
> [ 0.511482] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> [ 0.511496] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> [ 0.511512] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
>
>
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-11-19 09:49:45

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Hi, Rafael

I think you know this issue.

[PATCH 1] can trigger this deadlock because it is actually based on another GPE deadlock fixing series.
I have fixed the deadlock in acpi_ev_gpe_detect() or acpi_ev_gpe_dispatch().
The problem is those fixes haven't been upstreamed to ACPICA, so I couldn't post them here.

I was thinking we could work around this by applying the acpi_os_wait_events_complete() enhancement before applying this series, because the deadlock could only happen during suspend.
But it seems this can also be triggered during boot.

So we have 3 choices here in order to merge this series:
1. Merge the GPE deadlock fix before it is merged in the ACPICA upstream.
2. Change [PATCH 1] so that it does not hold the EC lock for now (this is racy, but the current code is racy as well).
3. Revert [PATCH 1-4] and wait until the GPE deadlock is fixed in the ACPICA upstream.
Which one do you prefer?

IMO, we have several issues whose fixes form a dependency circle:
1. GPE deadlock: it may depend on DISPATCH_METHOD flushing (we shouldn't bump the enabling status up in acpi_ev_asynch_enable_gpe()).
2. EC transaction flushing: it depends on the GPE deadlock fix.
3. EC event polling: it depends on EC transaction flushing; this is required to support EC event draining as mentioned in bugzilla 44161.
4. DISPATCH_METHOD flushing: it depends on EC event polling; if we don't move the EC query out of the _Lxx/_Exx work queue, it may block DISPATCH_METHOD flushing.
So it seems we need to determine which one should be merged first.
IMO, the GPE deadlock fix is the most basic one.

Thanks and best regards
-Lv


> From: Rafael J. Wysocki [mailto:[email protected]]
> Sent: Wednesday, November 19, 2014 5:20 AM
> To: Kirill A. Shutemov
> Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; [email protected]; [email protected]
> Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
>
> On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
>
> [cut]
>
> >
> > Here's lockdep warning I see on -next:
>
> Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
>
>
> > [ 0.510159] ======================================================
> > [ 0.510171] [ INFO: possible circular locking dependency detected ]
> > [ 0.510185] 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66 Not tainted
> > [ 0.510197] -------------------------------------------------------
> > [ 0.510209] swapper/3/0 is trying to acquire lock:
> > [ 0.510219] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.510254]
> > [ 0.510254] but task is already holding lock:
> > [ 0.510266] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510296]
> > [ 0.510296] which lock already depends on the new lock.
> > [ 0.510296]
> > [ 0.510312]
> > [ 0.510312] the existing dependency chain (in reverse order) is:
> > [ 0.510327]
> > [ 0.510327] -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> > [ 0.510344] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.510364] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.510381] [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510398] [<ffffffff814e31e8>] acpi_enable_gpe+0x22/0x68
> > [ 0.510416] [<ffffffff814d5b24>] acpi_ec_start+0x66/0x87
> > [ 0.510432] [<ffffffff81afc771>] ec_install_handlers+0x41/0xa4
> > [ 0.510449] [<ffffffff823e72b9>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> > [ 0.510466] [<ffffffff823e6ae3>] acpi_init+0x8b/0x26e
> > [ 0.510480] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> > [ 0.510496] [<ffffffff8239f1dc>] kernel_init_freeable+0x1f5/0x282
> > [ 0.510513] [<ffffffff81af1a1e>] kernel_init+0xe/0xf0
> > [ 0.510527] [<ffffffff81b08cfc>] ret_from_fork+0x7c/0xb0
> > [ 0.510542]
> > [ 0.510542] -> #0 (&(&ec->lock)->rlock){-.....}:
> > [ 0.510558] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [ 0.510574] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.510589] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.510604] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.510620] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [ 0.510636] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [ 0.510652] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.510669] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [ 0.510684] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.510702] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [ 0.510718] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.510733] [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.510748] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [ 0.510762] [<ffffffff81b09bb2>] ret_from_intr+0x0/0x1a
> > [ 0.510777] [<ffffffff8107e783>] default_idle+0x23/0x260
> > [ 0.510792] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [ 0.510806] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.510821] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> > [ 0.510840]
> > [ 0.510840] other info that might help us debug this:
> > [ 0.510840]
> > [ 0.510856] Possible unsafe locking scenario:
> > [ 0.510856]
> > [ 0.510868] CPU0 CPU1
> > [ 0.510877] ---- ----
> > [ 0.510886] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.510898] lock(&(&ec->lock)->rlock);
> > [ 0.510912] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.510927] lock(&(&ec->lock)->rlock);
> > [ 0.510938]
> > [ 0.510938] *** DEADLOCK ***
> > [ 0.510938]
> > [ 0.510953] 1 lock held by swapper/3/0:
> > [ 0.510961] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510990]
> > [ 0.510990] stack backtrace:
> > [ 0.511004] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66
> > [ 0.511021] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> > [ 0.511035] ffffffff82cb2f70 ffff88011e2c3bb8 ffffffff81afc316 0000000000000011
> > [ 0.511055] ffffffff82cb2f70 ffff88011e2c3c08 ffffffff81afae11 0000000000000001
> > [ 0.511074] ffff88011e2c3c68 ffff88011e2c3c08 ffff8801193f92d0 ffff8801193f9b20
> > [ 0.511094] Call Trace:
> > [ 0.511101] <IRQ> [<ffffffff81afc316>] dump_stack+0x4c/0x6e
> > [ 0.511125] [<ffffffff81afae11>] print_circular_bug+0x2b2/0x2c3
> > [ 0.511142] [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [ 0.511159] [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.511176] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511192] [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.511209] [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511225] [<ffffffff814ea192>] ? acpi_hw_write+0x4b/0x52
> > [ 0.511241] [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511258] [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [ 0.511274] [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [ 0.511292] [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.511309] [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [ 0.511325] [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.511342] [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [ 0.511357] [<ffffffff81171e98>] ? handle_fasteoi_irq+0x28/0x140
> > [ 0.511372] [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.511388] [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.511402] [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [ 0.511417] [<ffffffff81b09bb2>] common_interrupt+0x72/0x72
> > [ 0.511428] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> > [ 0.511454] [<ffffffff81154f3d>] ? trace_hardirqs_on+0xd/0x10
> > [ 0.511468] [<ffffffff8107e783>] default_idle+0x23/0x260
> > [ 0.511482] [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [ 0.511496] [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.511512] [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> >
> >
> >
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
2014-11-19 12:16:26

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
>
> [cut]
>
> >
> > Here's lockdep warning I see on -next:
>
> Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?

I only saw it on -next. I've tried to apply patches directly on -rc5 and
don't see the warning. I don't have time for proper bisecting, sorry.

--
Kirill A. Shutemov

2014-11-20 02:19:59

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v3 1/2] ACPICA: Events: Remove duplicated sanity check in acpi_ev_enable_gpe().

This patch deletes a sanity check from acpi_ev_enable_gpe().

The same kind of check is already done in
acpi_enable_gpe()/acpi_remove_gpe_handler()/acpi_update_all_gpes() before they
invoke acpi_ev_enable_gpe():
1. acpi_enable_gpe(): the same check (skip if DISPATCH_NONE) is now implemented.
2. acpi_remove_gpe_handler(): a stricter check (skip if !DISPATCH_HANDLER)
is implemented.
3. acpi_update_all_gpes(): a stricter check (skip if DISPATCH_NONE ||
DISPATCH_HANDLER || CAN_WAKE) is implemented.
4. acpi_set_gpe(): since it is invoked by an OSPM driver for which a GPE
handler is known to be available, no such check is needed.
So we can simply remove the duplicated check from acpi_ev_enable_gpe().
Lv Zheng.
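
For reference, a condensed sketch of the caller-side check in acpi_enable_gpe()
that makes the removed check redundant (paraphrased from
drivers/acpi/acpica/evxfgpe.c; not part of this patch):

	gpe_event_info = acpi_ev_get_gpe_event_info(gpe_device, gpe_number);
	if (gpe_event_info) {
		/* Only enable GPEs that have a method, handler or notify attached */
		if ((gpe_event_info->flags & ACPI_GPE_DISPATCH_MASK) !=
		    ACPI_GPE_DISPATCH_NONE) {
			status = acpi_ev_add_gpe_reference(gpe_event_info);
		} else {
			status = AE_NO_HANDLER;	/* same effect as the removed check */
		}
	}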

Signed-off-by: Lv Zheng <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/acpica/evgpe.c | 11 -----------
1 file changed, 11 deletions(-)

diff --git a/drivers/acpi/acpica/evgpe.c b/drivers/acpi/acpica/evgpe.c
index 2095dfb..9d1771d7 100644
--- a/drivers/acpi/acpica/evgpe.c
+++ b/drivers/acpi/acpica/evgpe.c
@@ -114,17 +114,6 @@ acpi_status acpi_ev_enable_gpe(struct acpi_gpe_event_info *gpe_event_info)

ACPI_FUNCTION_TRACE(ev_enable_gpe);

- /*
- * We will only allow a GPE to be enabled if it has either an associated
- * method (_Lxx/_Exx) or a handler, or is using the implicit notify
- * feature. Otherwise, the GPE will be immediately disabled by
- * acpi_ev_gpe_dispatch the first time it fires.
- */
- if ((gpe_event_info->flags & ACPI_GPE_DISPATCH_MASK) ==
- ACPI_GPE_DISPATCH_NONE) {
- return_ACPI_STATUS(AE_NO_HANDLER);
- }
-
/* Clear the GPE (of stale events) */

status = acpi_hw_clear_gpe(gpe_event_info);
--
1.7.10

2014-11-20 02:20:49

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v3 2/2] ACPICA: Events: Introduce ACPI_GPE_HANDLER_RAW to fix 2 issues for the current GPE APIs.

Since whether a GPE should be disabled/enabled/cleared should only be
determined by the GPE driver's state machine:
1. A GPE should be disabled if the driver wants to switch to the GPE polling
mode when a GPE storm condition is indicated, and should be enabled if
the driver wants to switch back to the GPE interrupt mode when all of
the storm conditions are cleared. The conditions should be protected by
the driver's specific lock.
2. A GPE should be enabled if the driver has accepted more than one request
and should be disabled if the driver has completed all of the requests.
The request count should be protected by the driver's specific lock.
3. A GPE should be cleared either when the driver is about to handle an edge
triggered GPE or when the driver has completed handling a level
triggered GPE. The handling code should be protected by the driver's
specific lock.
Thus the GPE enabling/disabling/clearing operations are likely to be
performed with the driver's specific lock held, which we currently cannot
do. This is because:
1. We hold the acpi_gbl_gpe_lock before invoking the GPE driver's
handler. The driver's specific lock is likely to be acquired inside of the
handler, so we can hit deadlocks due to reversed locking order or
recursive locking. In order to solve such deadlock issues, we need to
unlock the acpi_gbl_gpe_lock before invoking the handler. BZ 1100.
2. Since GPE disabling/enabling/clearing should be determined by the GPE
driver's state machine, we shouldn't perform such operations inside of
ACPICA for a GPE handler, as that messes up the driver's state machine.
BZ 1101.
Originally this patch included logic to flush GPE handlers; it has been
dropped for the following reasons:
1. This is a different issue;
2. Currently all Linux GPE drivers are either protected by the
acpi_gbl_gpe_lock or are not removable modules, so this is not an existing
issue for now. We will pick this topic up again when a removable GPE
driver uses the raw GPE handling model.

Note that the internal operations and the acpi_gbl_gpe_lock are currently
also used by ACPI_GPE_DISPATCH_METHOD and ACPI_GPE_DISPATCH_NOTIFY. In
order not to introduce regressions, we add an ACPI_GPE_HANDLER_RAW flag to
be paired with ACPI_GPE_DISPATCH_HANDLER; for such GPEs the
acpi_gbl_gpe_lock is unlocked before invoking the GPE handler, and the
internal enabling/disabling/clearing operations are bypassed so that
drivers can perform them at the proper places using the GPE APIs. Lv Zheng.
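
A rough usage sketch of the raw handler mode described above; the driver
names (my_dev, my_handle_event, my_dev_start) are illustrative assumptions
and not taken from this patch:

	#include <linux/acpi.h>

	struct my_dev {
		spinlock_t lock;	/* the "driver's specific lock" */
		u32 gpe;
	};

	static void my_handle_event(struct my_dev *dev);	/* assumed driver work */

	static u32 my_raw_gpe_handler(acpi_handle gpe_device, u32 gpe_number,
				      void *context)
	{
		struct my_dev *dev = context;
		unsigned long flags;

		/* acpi_gbl_gpe_lock is NOT held here, so taking dev->lock is safe */
		spin_lock_irqsave(&dev->lock, flags);
		acpi_clear_gpe(gpe_device, gpe_number);	/* edge-triggered: clear first */
		my_handle_event(dev);
		spin_unlock_irqrestore(&dev->lock, flags);

		return ACPI_INTERRUPT_HANDLED;
	}

	static acpi_status my_dev_start(struct my_dev *dev)
	{
		acpi_status status;

		/* ACPI_GPE_HANDLER_RAW: ACPICA skips its internal enable/disable/clear */
		status = acpi_install_gpe_handler(NULL, dev->gpe,
						  ACPI_GPE_EDGE_TRIGGERED |
						  ACPI_GPE_HANDLER_RAW,
						  my_raw_gpe_handler, dev);
		if (ACPI_FAILURE(status))
			return status;

		/* The driver's own state machine decides when to enable the GPE;
		 * the third argument is the clear_stale_events flag added here. */
		return acpi_enable_gpe(NULL, dev->gpe, TRUE);
	}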

Signed-off-by: Lv Zheng <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
---
arch/x86/platform/olpc/olpc-xo15-sci.c | 2 +-
drivers/acpi/acpica/acevents.h | 9 +-
drivers/acpi/acpica/aclocal.h | 1 +
drivers/acpi/acpica/evgpe.c | 331 +++++++++++++++++++++++++++-----
drivers/acpi/acpica/evgpeblk.c | 3 +-
drivers/acpi/acpica/evxface.c | 21 +-
drivers/acpi/acpica/evxfgpe.c | 8 +-
drivers/acpi/device_pm.c | 3 +-
drivers/acpi/ec.c | 4 +-
drivers/acpi/sysfs.c | 2 +-
drivers/acpi/wakeup.c | 3 +-
drivers/platform/x86/apple-gmux.c | 2 +-
drivers/platform/x86/xo15-ebook.c | 3 +-
include/acpi/acpixf.h | 3 +-
include/acpi/actypes.h | 24 ++-
15 files changed, 347 insertions(+), 72 deletions(-)

diff --git a/arch/x86/platform/olpc/olpc-xo15-sci.c b/arch/x86/platform/olpc/olpc-xo15-sci.c
index 08e350e..97268a2 100644
--- a/arch/x86/platform/olpc/olpc-xo15-sci.c
+++ b/arch/x86/platform/olpc/olpc-xo15-sci.c
@@ -173,7 +173,7 @@ static int xo15_sci_add(struct acpi_device *device)
process_sci_queue();
olpc_ec_mask_write(EC_SCI_SRC_ALL);

- acpi_enable_gpe(NULL, xo15_sci_gpe);
+ acpi_enable_gpe(NULL, xo15_sci_gpe, TRUE);

/* Enable wake-on-EC */
if (device->wakeup.flags.valid)
diff --git a/drivers/acpi/acpica/acevents.h b/drivers/acpi/acpica/acevents.h
index 7a7811a..a0c631c 100644
--- a/drivers/acpi/acpica/acevents.h
+++ b/drivers/acpi/acpica/acevents.h
@@ -80,12 +80,19 @@ acpi_status acpi_ev_remove_global_lock_handler(void);
u32 acpi_ev_gpe_detect(struct acpi_gpe_xrupt_info *gpe_xrupt_list);

acpi_status
+acpi_ev_clear_gpe_status(struct acpi_gpe_event_info *gpe_event_info);
+
+acpi_status
+acpi_ev_update_gpe_status(struct acpi_gpe_event_info *gpe_event_info);
+
+acpi_status
acpi_ev_update_gpe_enable_mask(struct acpi_gpe_event_info *gpe_event_info);

acpi_status acpi_ev_enable_gpe(struct acpi_gpe_event_info *gpe_event_info);

acpi_status
-acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info);
+acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info,
+ u8 clear_stale_events);

acpi_status
acpi_ev_remove_gpe_reference(struct acpi_gpe_event_info *gpe_event_info);
diff --git a/drivers/acpi/acpica/aclocal.h b/drivers/acpi/acpica/aclocal.h
index c00e7e4..efbeaeb 100644
--- a/drivers/acpi/acpica/aclocal.h
+++ b/drivers/acpi/acpica/aclocal.h
@@ -454,6 +454,7 @@ struct acpi_gpe_register_info {
u16 base_gpe_number; /* Base GPE number for this register */
u8 enable_for_wake; /* GPEs to keep enabled when sleeping */
u8 enable_for_run; /* GPEs to keep enabled when running */
+ u8 raw_enabled_status_byte; /* Track of status reg for raw handlers */
};

/*
diff --git a/drivers/acpi/acpica/evgpe.c b/drivers/acpi/acpica/evgpe.c
index 9d1771d7..6dd2e39 100644
--- a/drivers/acpi/acpica/evgpe.c
+++ b/drivers/acpi/acpica/evgpe.c
@@ -54,6 +54,149 @@ static void ACPI_SYSTEM_XFACE acpi_ev_asynch_execute_gpe_method(void *context);

static void ACPI_SYSTEM_XFACE acpi_ev_asynch_enable_gpe(void *context);

+static acpi_status
+acpi_ev_get_gpe_enabled_status(struct acpi_gpe_register_info *gpe_register_info,
+ u8 *enabled_status_byte);
+
+/*******************************************************************************
+ *
+ * FUNCTION: acpi_ev_get_gpe_enabled_status
+ *
+ * PARAMETERS: gpe_register_info - GPE register info to read
+ * enabled_status_byte - Where (Status & Enable) is returned
+ *
+ * RETURN: Status
+ *
+ * DESCRIPTION: Read the GPE Status and Enable registers and return their
+ * logical AND (the set of GPEs that are both enabled and active).
+ *
+ ******************************************************************************/
+
+static acpi_status
+acpi_ev_get_gpe_enabled_status(struct acpi_gpe_register_info *gpe_register_info,
+ u8 *enabled_status_byte)
+{
+ u32 status_reg;
+ u32 enable_reg;
+ acpi_status status;
+
+ ACPI_FUNCTION_TRACE(ev_get_gpe_enabled_status);
+
+ /* Read the Status Register */
+
+ status = acpi_hw_read(&status_reg, &gpe_register_info->status_address);
+ if (ACPI_FAILURE(status)) {
+ return_ACPI_STATUS(status);
+ }
+
+ /* Read the Enable Register */
+
+ status = acpi_hw_read(&enable_reg, &gpe_register_info->enable_address);
+ if (ACPI_FAILURE(status)) {
+ return_ACPI_STATUS(status);
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INTERRUPTS,
+ "Read registers for GPE %02X-%02X: Status=%02X, Enable=%02X, "
+ "RunEnable=%02X, WakeEnable=%02X\n",
+ gpe_register_info->base_gpe_number,
+ gpe_register_info->base_gpe_number +
+ (ACPI_GPE_REGISTER_WIDTH - 1), status_reg, enable_reg,
+ gpe_register_info->enable_for_run,
+ gpe_register_info->enable_for_wake));
+
+ *enabled_status_byte = (u8)(status_reg & enable_reg);
+ return_ACPI_STATUS(AE_OK);
+}
+
+/*******************************************************************************
+ *
+ * FUNCTION: acpi_ev_clear_gpe_status
+ *
+ * PARAMETERS: gpe_event_info - GPE to update
+ *
+ * RETURN: Status
+ *
+ * DESCRIPTION: Update GPE register status mask to clear the saved raw GPE
+ * status indication.
+ *
+ ******************************************************************************/
+
+acpi_status acpi_ev_clear_gpe_status(struct acpi_gpe_event_info *gpe_event_info)
+{
+ struct acpi_gpe_register_info *gpe_register_info;
+ u32 register_bit;
+
+ ACPI_FUNCTION_TRACE(ev_clear_gpe_status);
+
+ if (!(gpe_event_info->flags & ACPI_GPE_HANDLER_RAW)) {
+ return_ACPI_STATUS(AE_BAD_PARAMETER);
+ }
+
+ gpe_register_info = gpe_event_info->register_info;
+ if (!gpe_register_info) {
+ return_ACPI_STATUS(AE_NOT_EXIST);
+ }
+
+ /* Clear the raw status bit */
+
+ register_bit = acpi_hw_get_gpe_register_bit(gpe_event_info);
+ ACPI_CLEAR_BIT(gpe_register_info->raw_enabled_status_byte,
+ register_bit);
+
+ return_ACPI_STATUS(AE_OK);
+}
+
+/*******************************************************************************
+ *
+ * FUNCTION: acpi_ev_update_gpe_status
+ *
+ * PARAMETERS: gpe_event_info - GPE to update
+ *
+ * RETURN: Status
+ *
+ * DESCRIPTION: Update GPE register status mask to set the saved raw GPE status
+ * indication.
+ *
+ ******************************************************************************/
+
+acpi_status
+acpi_ev_update_gpe_status(struct acpi_gpe_event_info *gpe_event_info)
+{
+ struct acpi_gpe_register_info *gpe_register_info;
+ u32 register_bit;
+ u8 enabled_status_byte;
+ acpi_status status;
+
+ ACPI_FUNCTION_TRACE(ev_update_gpe_status);
+
+ if (!(gpe_event_info->flags & ACPI_GPE_HANDLER_RAW)) {
+ return_ACPI_STATUS(AE_BAD_PARAMETER);
+ }
+
+ gpe_register_info = gpe_event_info->register_info;
+ if (!gpe_register_info) {
+ return_ACPI_STATUS(AE_NOT_EXIST);
+ }
+
+ status = acpi_ev_get_gpe_enabled_status(gpe_register_info,
+ &enabled_status_byte);
+ if (ACPI_FAILURE(status)) {
+ return_ACPI_STATUS(status);
+ }
+
+ /* Update the raw status bit */
+
+ register_bit = acpi_hw_get_gpe_register_bit(gpe_event_info);
+ if ((enabled_status_byte & (1 << register_bit)) &&
+ !(gpe_register_info->
+ raw_enabled_status_byte & (1 << register_bit))) {
+ ACPI_SET_BIT(gpe_register_info->raw_enabled_status_byte,
+ register_bit);
+ }
+
+ return_ACPI_STATUS(AE_OK);
+}
+
/*******************************************************************************
*
* FUNCTION: acpi_ev_update_gpe_enable_mask
@@ -132,6 +275,7 @@ acpi_status acpi_ev_enable_gpe(struct acpi_gpe_event_info *gpe_event_info)
* FUNCTION: acpi_ev_add_gpe_reference
*
* PARAMETERS: gpe_event_info - Add a reference to this GPE
+ * clear_stale_events - Whether stale events should be cleared
*
* RETURN: Status
*
@@ -141,7 +285,8 @@ acpi_status acpi_ev_enable_gpe(struct acpi_gpe_event_info *gpe_event_info)
******************************************************************************/

acpi_status
-acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info)
+acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info,
+ u8 clear_stale_events)
{
acpi_status status = AE_OK;

@@ -158,7 +303,16 @@ acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info)

status = acpi_ev_update_gpe_enable_mask(gpe_event_info);
if (ACPI_SUCCESS(status)) {
- status = acpi_ev_enable_gpe(gpe_event_info);
+ if (clear_stale_events) {
+ status = acpi_ev_enable_gpe(gpe_event_info);
+ } else {
+ status =
+ acpi_hw_low_set_gpe(gpe_event_info,
+ ACPI_GPE_ENABLE);
+ }
+ if (ACPI_SUCCESS(status)) {
+ (void)acpi_ev_update_gpe_status(gpe_event_info);
+ }
}

if (ACPI_FAILURE(status)) {
@@ -203,6 +357,9 @@ acpi_ev_remove_gpe_reference(struct acpi_gpe_event_info *gpe_event_info)
status =
acpi_hw_low_set_gpe(gpe_event_info,
ACPI_GPE_DISABLE);
+ if (ACPI_SUCCESS(status)) {
+ (void)acpi_ev_clear_gpe_status(gpe_event_info);
+ }
}

if (ACPI_FAILURE(status)) {
@@ -329,10 +486,12 @@ u32 acpi_ev_gpe_detect(struct acpi_gpe_xrupt_info *gpe_xrupt_list)
acpi_status status;
struct acpi_gpe_block_info *gpe_block;
struct acpi_gpe_register_info *gpe_register_info;
+ struct acpi_gpe_event_info *gpe_event_info;
+ struct acpi_gpe_handler_info *gpe_handler_info;
+ struct acpi_namespace_node *gpe_device;
+ u32 gpe_number;
u32 int_status = ACPI_INTERRUPT_NOT_HANDLED;
u8 enabled_status_byte;
- u32 status_reg;
- u32 enable_reg;
acpi_cpu_flags flags;
u32 i;
u32 j;
@@ -387,37 +546,17 @@ u32 acpi_ev_gpe_detect(struct acpi_gpe_xrupt_info *gpe_xrupt_list)
continue;
}

- /* Read the Status Register */
-
- status =
- acpi_hw_read(&status_reg,
- &gpe_register_info->status_address);
- if (ACPI_FAILURE(status)) {
- goto unlock_and_exit;
- }
-
- /* Read the Enable Register */
+ /* Get the (Enable & Status) register value */

status =
- acpi_hw_read(&enable_reg,
- &gpe_register_info->enable_address);
+ acpi_ev_get_gpe_enabled_status(gpe_register_info,
+ &enabled_status_byte);
if (ACPI_FAILURE(status)) {
goto unlock_and_exit;
}

- ACPI_DEBUG_PRINT((ACPI_DB_INTERRUPTS,
- "Read registers for GPE %02X-%02X: Status=%02X, Enable=%02X, "
- "RunEnable=%02X, WakeEnable=%02X\n",
- gpe_register_info->base_gpe_number,
- gpe_register_info->base_gpe_number +
- (ACPI_GPE_REGISTER_WIDTH - 1),
- status_reg, enable_reg,
- gpe_register_info->enable_for_run,
- gpe_register_info->enable_for_wake));
-
/* Check if there is anything active at all in this register */

- enabled_status_byte = (u8)(status_reg & enable_reg);
if (!enabled_status_byte) {

/* No active GPEs in this register, move on */
@@ -428,19 +567,130 @@ u32 acpi_ev_gpe_detect(struct acpi_gpe_xrupt_info *gpe_xrupt_list)
/* Now look at the individual GPEs in this byte register */

for (j = 0; j < ACPI_GPE_REGISTER_WIDTH; j++) {
+ gpe_event_info =
+ &gpe_block->
+ event_info[((acpi_size) i *
+ ACPI_GPE_REGISTER_WIDTH) + j];

/* Examine one GPE bit */

if (enabled_status_byte & (1 << j)) {
- /*
- * Found an active GPE. Dispatch the event to a handler
- * or method.
- */
- int_status |=
- acpi_ev_gpe_dispatch(gpe_block->
- node,
- &gpe_block->
- event_info[((acpi_size) i * ACPI_GPE_REGISTER_WIDTH) + j], j + gpe_register_info->base_gpe_number);
+
+ /* Found an active GPE */
+
+ acpi_gpe_count++;
+
+ /* Invoke global event handler if present */
+
+ if (acpi_gbl_global_event_handler) {
+ acpi_gbl_global_event_handler
+ (ACPI_EVENT_TYPE_GPE,
+ gpe_block->node,
+ j +
+ gpe_register_info->
+ base_gpe_number,
+ acpi_gbl_global_event_handler_context);
+ }
+
+ if (gpe_event_info->
+ flags & ACPI_GPE_HANDLER_RAW) {
+
+ /* Prepare handling for a raw handler */
+
+ gpe_register_info->
+ raw_enabled_status_byte |=
+ (1 << j);
+ } else {
+ /*
+ * Dispatch the event to a standard handler or
+ * method.
+ */
+ int_status |=
+ acpi_ev_gpe_dispatch
+ (gpe_block->node,
+ gpe_event_info,
+ j +
+ gpe_register_info->
+ base_gpe_number);
+ }
+ }
+ }
+ }
+
+ gpe_block = gpe_block->next;
+ }
+
+ /*
+ * Examine the GPE status bits for raw handlers. And if an active GPE
+ * is found, dispatch the event to a raw handler. The raw handler is
+ * invoked without the GPE lock held because:
+ * 1. Whether the GPE should be enabled/disabled/cleared is actually
+ * determined by the GPE driver, thus the GPE operations should be
+ * performed with the GPE driver's specific lock held.
+ * 2. Since the GPE APIs need to be invoked with the GPE driver's
+ * specific lock held, we need to unlock before invoking the handler
+ * to avoid recursive locking or reverse-ordered locking.
+ */
+ gpe_block = gpe_xrupt_list->gpe_block_list_head;
+ while (gpe_block) {
+
+ /* Find all currently active events for raw GPE handlers */
+
+ for (i = 0; i < gpe_block->register_count; i++) {
+
+ /* Get the next saved status/enable pair */
+
+ gpe_register_info = &gpe_block->register_info[i];
+
+ if (gpe_register_info->raw_enabled_status_byte) {
+ for (j = 0; j < ACPI_GPE_REGISTER_WIDTH; j++) {
+ gpe_event_info =
+ &gpe_block->
+ event_info[((acpi_size) i *
+ ACPI_GPE_REGISTER_WIDTH)
+ + j];
+
+ if (gpe_register_info->
+ raw_enabled_status_byte & (1 << j))
+ {
+
+ /* Clear the raw status bit */
+
+ gpe_register_info->
+ raw_enabled_status_byte &=
+ ~(1 << j);
+
+ gpe_device = gpe_block->node;
+ gpe_handler_info =
+ gpe_event_info->dispatch.
+ handler;
+ gpe_number =
+ j +
+ gpe_register_info->
+ base_gpe_number;
+
+ /*
+ * There is no protection around the namespace node
+ * and the GPE handler to ensure a safe destruction
+ * because:
+ * 1. The namespace node is expected to always
+ * exist after loading a table.
+ * 2. The GPE handler is expected to be flushed by
+ * acpi_os_wait_events_complete() before the
+ * destruction.
+ */
+ acpi_os_release_lock
+ (acpi_gbl_gpe_lock, flags);
+ int_status |=
+ gpe_handler_info->
+ address(gpe_device,
+ gpe_number,
+ gpe_handler_info->
+ context);
+ flags =
+ acpi_os_acquire_lock
+ (acpi_gbl_gpe_lock);
+ }
}
}
}
@@ -678,15 +928,6 @@ acpi_ev_gpe_dispatch(struct acpi_namespace_node *gpe_device,

ACPI_FUNCTION_TRACE(ev_gpe_dispatch);

- /* Invoke global event handler if present */
-
- acpi_gpe_count++;
- if (acpi_gbl_global_event_handler) {
- acpi_gbl_global_event_handler(ACPI_EVENT_TYPE_GPE, gpe_device,
- gpe_number,
- acpi_gbl_global_event_handler_context);
- }
-
/*
* Always disable the GPE so that it does not keep firing before
* any asynchronous activity completes (either from the execution
diff --git a/drivers/acpi/acpica/evgpeblk.c b/drivers/acpi/acpica/evgpeblk.c
index d86699e..e50853b 100644
--- a/drivers/acpi/acpica/evgpeblk.c
+++ b/drivers/acpi/acpica/evgpeblk.c
@@ -482,7 +482,8 @@ acpi_ev_initialize_gpe_block(struct acpi_gpe_xrupt_info *gpe_xrupt_info,
continue;
}

- status = acpi_ev_add_gpe_reference(gpe_event_info);
+ status =
+ acpi_ev_add_gpe_reference(gpe_event_info, TRUE);
if (ACPI_FAILURE(status)) {
ACPI_EXCEPTION((AE_INFO, status,
"Could not enable GPE 0x%02X",
diff --git a/drivers/acpi/acpica/evxface.c b/drivers/acpi/acpica/evxface.c
index 55a58f3..85ffefd 100644
--- a/drivers/acpi/acpica/evxface.c
+++ b/drivers/acpi/acpica/evxface.c
@@ -723,7 +723,9 @@ ACPI_EXPORT_SYMBOL(acpi_remove_fixed_event_handler)
* defined GPEs)
* gpe_number - The GPE number within the GPE block
* type - Whether this GPE should be treated as an
- * edge- or level-triggered interrupt.
+ * edge- or level-triggered interrupt, and
+ * whether this GPE should be handled using
+ * the special GPE handler mode.
* address - Address of the handler
* context - Value passed to the handler on each GPE
*
@@ -746,7 +748,8 @@ acpi_install_gpe_handler(acpi_handle gpe_device,

/* Parameter validation */

- if ((!address) || (type & ~ACPI_GPE_XRUPT_TYPE_MASK)) {
+ if ((!address) ||
+ (type & ~(ACPI_GPE_XRUPT_TYPE_MASK | ACPI_GPE_HANDLER_MASK))) {
return_ACPI_STATUS(AE_BAD_PARAMETER);
}

@@ -801,7 +804,7 @@ acpi_install_gpe_handler(acpi_handle gpe_device,

/* Sanity check of original type against new type */

- if (type !=
+ if ((type & ACPI_GPE_XRUPT_TYPE_MASK) !=
(u32)(gpe_event_info->flags & ACPI_GPE_XRUPT_TYPE_MASK)) {
ACPI_WARNING((AE_INFO,
"GPE type mismatch (level/edge)"));
@@ -893,6 +896,10 @@ acpi_remove_gpe_handler(acpi_handle gpe_device,
goto unlock_and_exit;
}

+ /* Clearing software "status" indication for re-enabling */
+
+ acpi_ev_clear_gpe_status(gpe_event_info);
+
/* Remove the handler */

handler = gpe_event_info->dispatch.handler;
@@ -901,7 +908,8 @@ acpi_remove_gpe_handler(acpi_handle gpe_device,

gpe_event_info->dispatch.method_node = handler->method_node;
gpe_event_info->flags &=
- ~(ACPI_GPE_XRUPT_TYPE_MASK | ACPI_GPE_DISPATCH_MASK);
+ ~(ACPI_GPE_XRUPT_TYPE_MASK | ACPI_GPE_DISPATCH_MASK |
+ ACPI_GPE_HANDLER_MASK);
gpe_event_info->flags |= handler->original_flags;

/*
@@ -912,7 +920,10 @@ acpi_remove_gpe_handler(acpi_handle gpe_device,
if (((handler->original_flags & ACPI_GPE_DISPATCH_METHOD) ||
(handler->original_flags & ACPI_GPE_DISPATCH_NOTIFY)) &&
handler->originally_enabled) {
- (void)acpi_ev_add_gpe_reference(gpe_event_info);
+
+ /* Clearing hardware "status" indication for re-enabling */
+
+ (void)acpi_ev_add_gpe_reference(gpe_event_info, TRUE);
}

acpi_os_release_lock(acpi_gbl_gpe_lock, flags);
diff --git a/drivers/acpi/acpica/evxfgpe.c b/drivers/acpi/acpica/evxfgpe.c
index e889a53..e768f49 100644
--- a/drivers/acpi/acpica/evxfgpe.c
+++ b/drivers/acpi/acpica/evxfgpe.c
@@ -108,6 +108,7 @@ ACPI_EXPORT_SYMBOL(acpi_update_all_gpes)
*
* PARAMETERS: gpe_device - Parent GPE Device. NULL for GPE0/GPE1
* gpe_number - GPE level within the GPE block
+ * clear_stale_events - Whether stale events should be cleared
*
* RETURN: Status
*
@@ -115,7 +116,8 @@ ACPI_EXPORT_SYMBOL(acpi_update_all_gpes)
* hardware-enabled.
*
******************************************************************************/
-acpi_status acpi_enable_gpe(acpi_handle gpe_device, u32 gpe_number)
+acpi_status
+acpi_enable_gpe(acpi_handle gpe_device, u32 gpe_number, u8 clear_stale_events)
{
acpi_status status = AE_BAD_PARAMETER;
struct acpi_gpe_event_info *gpe_event_info;
@@ -134,7 +136,9 @@ acpi_status acpi_enable_gpe(acpi_handle gpe_device, u32 gpe_number)
if (gpe_event_info) {
if ((gpe_event_info->flags & ACPI_GPE_DISPATCH_MASK) !=
ACPI_GPE_DISPATCH_NONE) {
- status = acpi_ev_add_gpe_reference(gpe_event_info);
+ status =
+ acpi_ev_add_gpe_reference(gpe_event_info,
+ clear_stale_events);
} else {
status = AE_NO_HANDLER;
}
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 7db1931..171c467 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -680,7 +680,8 @@ static int acpi_device_wakeup(struct acpi_device *adev, u32 target_state,
if (error)
return error;

- res = acpi_enable_gpe(wakeup->gpe_device, wakeup->gpe_number);
+ res = acpi_enable_gpe(wakeup->gpe_device, wakeup->gpe_number,
+ TRUE);
if (ACPI_FAILURE(res)) {
acpi_disable_wakeup_device_power(adev);
return -EIO;
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index a76794a..8d7b742 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -157,7 +157,7 @@ static void acpi_ec_submit_request(struct acpi_ec *ec)
{
ec->reference_count++;
if (ec->reference_count == 1)
- acpi_enable_gpe(NULL, ec->gpe);
+ acpi_enable_gpe(NULL, ec->gpe, TRUE);
}

static void acpi_ec_complete_request(struct acpi_ec *ec)
@@ -445,7 +445,7 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
msleep(1);
/* It is safe to enable the GPE outside of the transaction. */
- acpi_enable_gpe(NULL, ec->gpe);
+ acpi_enable_gpe(NULL, ec->gpe, TRUE);
} else if (t->irq_count > ec_storm_threshold) {
pr_info("GPE storm detected(%d GPEs), "
"transactions will use polling mode\n",
diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
index 13e577c..db4f91c 100644
--- a/drivers/acpi/sysfs.c
+++ b/drivers/acpi/sysfs.c
@@ -593,7 +593,7 @@ static ssize_t counter_set(struct kobject *kobj,
result = acpi_disable_gpe(handle, index);
else if (!strcmp(buf, "enable\n") &&
!(status & ACPI_EVENT_FLAG_ENABLED))
- result = acpi_enable_gpe(handle, index);
+ result = acpi_enable_gpe(handle, index, TRUE);
else if (!strcmp(buf, "clear\n") &&
(status & ACPI_EVENT_FLAG_SET))
result = acpi_clear_gpe(handle, index);
diff --git a/drivers/acpi/wakeup.c b/drivers/acpi/wakeup.c
index 1638401..57c55ad 100644
--- a/drivers/acpi/wakeup.c
+++ b/drivers/acpi/wakeup.c
@@ -88,7 +88,8 @@ int __init acpi_wakeup_device_init(void)
if (device_can_wakeup(&dev->dev)) {
/* Button GPEs are supposed to be always enabled. */
acpi_enable_gpe(dev->wakeup.gpe_device,
- dev->wakeup.gpe_number);
+ dev->wakeup.gpe_number,
+ TRUE);
device_set_wakeup_enable(&dev->dev, true);
}
}
diff --git a/drivers/platform/x86/apple-gmux.c b/drivers/platform/x86/apple-gmux.c
index b9429fb..a507a2c 100644
--- a/drivers/platform/x86/apple-gmux.c
+++ b/drivers/platform/x86/apple-gmux.c
@@ -541,7 +541,7 @@ static int gmux_probe(struct pnp_dev *pnp, const struct pnp_device_id *id)
goto err_notify;
}

- status = acpi_enable_gpe(NULL, gmux_data->gpe);
+ status = acpi_enable_gpe(NULL, gmux_data->gpe, TRUE);
if (ACPI_FAILURE(status)) {
pr_err("Cannot enable gpe: %s\n",
acpi_format_exception(status));
diff --git a/drivers/platform/x86/xo15-ebook.c b/drivers/platform/x86/xo15-ebook.c
index 49cbcce..24813fe 100644
--- a/drivers/platform/x86/xo15-ebook.c
+++ b/drivers/platform/x86/xo15-ebook.c
@@ -136,7 +136,8 @@ static int ebook_switch_add(struct acpi_device *device)
if (device->wakeup.flags.valid) {
/* Button's GPE is run-wake GPE */
acpi_enable_gpe(device->wakeup.gpe_device,
- device->wakeup.gpe_number);
+ device->wakeup.gpe_number,
+ TRUE);
device_set_wakeup_enable(&device->dev, true);
}

diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index ab2acf6..62eb0f7 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -654,7 +654,8 @@ ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_update_all_gpes(void))

ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status
acpi_enable_gpe(acpi_handle gpe_device,
- u32 gpe_number))
+ u32 gpe_number,
+ u8 clear_stale_events))

ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status
acpi_disable_gpe(acpi_handle gpe_device,
diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h
index 7000e66..368dcda 100644
--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -739,14 +739,15 @@ typedef u32 acpi_event_status;

/*
* GPE info flags - Per GPE
- * +-------+-+-+---+
- * | 7:4 |3|2|1:0|
- * +-------+-+-+---+
- * | | | |
- * | | | +-- Type of dispatch:to method, handler, notify, or none
- * | | +----- Interrupt type: edge or level triggered
- * | +------- Is a Wake GPE
- * +------------ <Reserved>
+ * +-------+-+-+-+---+
+ * | 7:5 |4|3|2|1:0|
+ * +-------+-+-+-+---+
+ * | | | | |
+ * | | | | +-- Type of dispatch:to method, handler, notify, or none
+ * | | | +----- Interrupt type: edge or level triggered
+ * | | +------- Allow driver full control of GPE handling
+ * | +--------- Is a Wake GPE
+ * +-------------- <Reserved>
*/
#define ACPI_GPE_DISPATCH_NONE (u8) 0x00
#define ACPI_GPE_DISPATCH_METHOD (u8) 0x01
@@ -754,11 +755,16 @@ typedef u32 acpi_event_status;
#define ACPI_GPE_DISPATCH_NOTIFY (u8) 0x03
#define ACPI_GPE_DISPATCH_MASK (u8) 0x03

+/*
+ * Flags used to install a GPE handler
+ */
#define ACPI_GPE_LEVEL_TRIGGERED (u8) 0x04
#define ACPI_GPE_EDGE_TRIGGERED (u8) 0x00
#define ACPI_GPE_XRUPT_TYPE_MASK (u8) 0x04
+#define ACPI_GPE_HANDLER_RAW (u8) 0x08
+#define ACPI_GPE_HANDLER_MASK (u8) 0x08

-#define ACPI_GPE_CAN_WAKE (u8) 0x08
+#define ACPI_GPE_CAN_WAKE (u8) 0x10

/*
* Flags for GPE and Lock interfaces
--
1.7.10

2014-11-20 02:22:21

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Since you have an environment to trigger this, could you also help to
check whether the fix works?
I've just sent them as RFC to this thread.

Thanks and best regards
-Lv

> From: Kirill A. Shutemov [mailto:[email protected]]
> Sent: Wednesday, November 19, 2014 8:16 PM
> To: Rafael J. Wysocki
> Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; [email protected]; [email protected]
> Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
>
> On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> >
> > [cut]
> >
> > >
> > > Here's lockdep warning I see on -next:
> >
> > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
>
> I only saw it on -next. I've tried to apply patches directly on -rc5 and
> don't see the warning. I don't have time for proper bisecting, sorry.
>
> --
> Kirill A. Shutemov

2014-11-20 02:35:29

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Hi,

> From: Kirill A. Shutemov [mailto:[email protected]]
> Sent: Wednesday, November 19, 2014 8:16 PM
>
> On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> >
> > [cut]
> >
> > >
> > > Here's lockdep warning I see on -next:
> >
> > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
>
> I only saw it on -next. I've tried to apply patches directly on -rc5 and
> don't see the warning. I don't have time for proper bisecting, sorry.

Also, I want to know whether a real deadlock is actually triggered.
I think we have worked around the deadlock so that things keep working even when the fix is not merged, which lets the other EC work proceed.
So maybe I only need to prevent the warning from being triggered in the current situation.

Thanks and best regards
-Lv

2014-11-20 06:45:11

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers.

On some platforms, GPE is not disabled by default after ACPI hardware is
enabled. This confuses GPE drivers. This patch adds support to disable GPE
by default for GPE handler drivers.
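
A sketch of the ordering this change expects from GPE handler drivers
(my_gpe, my_gpe_handler and my_data are illustrative assumptions, and the
two-argument acpi_enable_gpe() is the current mainline signature):

	/* After this change, installing the handler leaves the GPE disabled... */
	status = acpi_install_gpe_handler(NULL, my_gpe, ACPI_GPE_EDGE_TRIGGERED,
					  my_gpe_handler, my_data);
	if (ACPI_SUCCESS(status)) {
		/* ...so the driver must enable it explicitly once it is ready. */
		status = acpi_enable_gpe(NULL, my_gpe);
	}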

Signed-off-by: Lv Zheng <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
---
drivers/acpi/acpica/evxface.c | 7 +++++++
1 file changed, 7 insertions(+)

Index: linux-acpica/drivers/acpi/acpica/evxface.c
===================================================================
--- linux-acpica.orig/drivers/acpi/acpica/evxface.c
+++ linux-acpica/drivers/acpi/acpica/evxface.c
@@ -821,6 +821,14 @@ acpi_install_gpe_handler(acpi_handle gpe
~(ACPI_GPE_XRUPT_TYPE_MASK | ACPI_GPE_DISPATCH_MASK);
gpe_event_info->flags |= (u8)(type | ACPI_GPE_DISPATCH_HANDLER);

+ /*
+ * Make sure that the GPE is disabled by default for the GPE
+ * handler driver. It is expected that the GPE handler driver
+ * should invoke acpi_enable_gpe() after installing the GPE
+ * handler.
+ */
+ (void)acpi_hw_low_set_gpe(gpe_event_info, ACPI_GPE_DISABLE);
+
acpi_os_release_lock(acpi_gbl_gpe_lock, flags);

unlock_and_exit:

2014-11-20 06:47:24

by Zheng, Lv

[permalink] [raw]
Subject: RE: [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers.

Hi, Kirill

Could you also help to confirm whether this patch also fixes the issue?
Please validate the two fix patchsets separately.

Thanks in advance.

Best regards
-Lv

> From: Zheng, Lv
> Sent: Thursday, November 20, 2014 2:45 PM
>
> On some platforms, GPE is not disabled by default after ACPI hardware is
> enabled. This confuses GPE drivers. This patch adds support to disable GPE
> by default for GPE handler drivers.
>
> Signed-off-by: Lv Zheng <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> ---
> drivers/acpi/acpica/evxface.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> Index: linux-acpica/drivers/acpi/acpica/evxface.c
> ===================================================================
> --- linux-acpica.orig/drivers/acpi/acpica/evxface.c
> +++ linux-acpica/drivers/acpi/acpica/evxface.c
> @@ -821,6 +821,14 @@ acpi_install_gpe_handler(acpi_handle gpe
> ~(ACPI_GPE_XRUPT_TYPE_MASK | ACPI_GPE_DISPATCH_MASK);
> gpe_event_info->flags |= (u8)(type | ACPI_GPE_DISPATCH_HANDLER);
>
> + /*
> + * Make sure that the GPE is disabled by default for the GPE
> + * handler driver. It is expected that the GPE handler driver
> + * should invoke acpi_enable_gpe() after installing the GPE
> + * handler.
> + */
> + (void)acpi_hw_low_set_gpe(gpe_event_info, ACPI_GPE_DISABLE);
> +
> acpi_os_release_lock(acpi_gbl_gpe_lock, flags);
>
> unlock_and_exit:

2014-11-20 21:33:59

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

On Thu, Nov 20, 2014 at 02:20:53AM +0000, Zheng, Lv wrote:
> Since you have an environment to trigger this, could you also help to
> check whether the fix works?
> I've just sent them as RFC to this thread.

With these two patchse on top of my -next snapshot I still see the issue:

[ 0.324119] ======================================================
[ 0.324125] [ INFO: possible circular locking dependency detected ]
[ 0.324132] 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80 Not tainted
[ 0.324138] -------------------------------------------------------
[ 0.324144] swapper/3/0 is trying to acquire lock:
[ 0.324149] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.324165]
but task is already holding lock:
[ 0.324171] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
[ 0.324185]
which lock already depends on the new lock.

[ 0.324193]
the existing dependency chain (in reverse order) is:
[ 0.324200]
-> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
[ 0.324209] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
[ 0.324218] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.324228] [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
[ 0.324235] [<ffffffff814d9945>] acpi_enable_gpe+0x27/0x75
[ 0.324244] [<ffffffff814cc960>] acpi_ec_start+0x67/0x88
[ 0.324251] [<ffffffff81af4ca9>] ec_install_handlers+0x41/0xa4
[ 0.324258] [<ffffffff823e4134>] acpi_ec_ecdt_probe+0x1a9/0x1ea
[ 0.324267] [<ffffffff823e395e>] acpi_init+0x8b/0x26e
[ 0.324275] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 0.324283] [<ffffffff8239c1dc>] kernel_init_freeable+0x1f5/0x282
[ 0.324293] [<ffffffff81aea0fe>] kernel_init+0xe/0xf0
[ 0.324300] [<ffffffff81b011bc>] ret_from_fork+0x7c/0xb0
[ 0.324307]
-> #0 (&(&ec->lock)->rlock){-.....}:
[ 0.324315] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
[ 0.324323] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
[ 0.324330] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.324338] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.324346] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
[ 0.324353] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
[ 0.324360] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
[ 0.324368] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
[ 0.324375] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
[ 0.324384] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
[ 0.324392] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
[ 0.324399] [<ffffffff81075a22>] handle_irq+0x22/0x40
[ 0.324408] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
[ 0.324416] [<ffffffff81b02072>] ret_from_intr+0x0/0x1a
[ 0.324423] [<ffffffff8107e7a3>] default_idle+0x23/0x260
[ 0.324430] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
[ 0.324438] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
[ 0.324445] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0
[ 0.324454]
other info that might help us debug this:

[ 0.324462] Possible unsafe locking scenario:

[ 0.324468] CPU0 CPU1
[ 0.324473] ---- ----
[ 0.324477] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
[ 0.324483] lock(&(&ec->lock)->rlock);
[ 0.324490] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
[ 0.324498] lock(&(&ec->lock)->rlock);
[ 0.324503]
*** DEADLOCK ***

[ 0.324510] 1 lock held by swapper/3/0:
[ 0.324514] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
[ 0.324528]
stack backtrace:
[ 0.324535] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80
[ 0.324543] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
[ 0.324550] ffffffff82cae120 ffff88011e2c3ba8 ffffffff81af484e 0000000000000011
[ 0.324560] ffffffff82cae120 ffff88011e2c3bf8 ffffffff81af3361 0000000000000001
[ 0.324569] ffff88011e2c3c58 ffff88011e2c3bf8 ffff8801193f92b0 ffff8801193f9b00
[ 0.324579] Call Trace:
[ 0.324582] <IRQ> [<ffffffff81af484e>] dump_stack+0x4c/0x6e
[ 0.324593] [<ffffffff81af3361>] print_circular_bug+0x2b2/0x2c3
[ 0.324601] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
[ 0.324609] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
[ 0.324616] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
[ 0.324624] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
[ 0.324631] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
[ 0.324640] [<ffffffff814e08f7>] ? acpi_hw_write+0x4b/0x52
[ 0.324646] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
[ 0.324653] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
[ 0.324660] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
[ 0.324668] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
[ 0.324675] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
[ 0.324683] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
[ 0.324691] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
[ 0.324698] [<ffffffff81171e88>] ? handle_fasteoi_irq+0x28/0x140
[ 0.324705] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
[ 0.324712] [<ffffffff81075a22>] handle_irq+0x22/0x40
[ 0.324719] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
[ 0.324725] [<ffffffff81b02072>] common_interrupt+0x72/0x72
[ 0.324731] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
[ 0.324743] [<ffffffff81154efd>] ? trace_hardirqs_on+0xd/0x10
[ 0.324750] [<ffffffff8107e7a3>] default_idle+0x23/0x260
[ 0.324757] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
[ 0.324763] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
[ 0.324771] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0


>
> Thanks and best regards
> -Lv
>
> > From: Kirill A. Shutemov [mailto:[email protected]]
> > Sent: Wednesday, November 19, 2014 8:16 PM
> > To: Rafael J. Wysocki
> > Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; [email protected]; [email protected]
> > Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> >
> > On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> > >
> > > [cut]
> > >
> > > >
> > > > Here's lockdep warning I see on -next:
> > >
> > > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> >
> > I only saw it on -next. I've tried to apply patches directly on -rc5 and
> > don't see the warning. I don't have time for proper bisecting, sorry.
> >
> > --
> > Kirill A. Shutemov

--
Kirill A. Shutemov

2014-11-20 21:34:57

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

On Thu, Nov 20, 2014 at 02:34:12AM +0000, Zheng, Lv wrote:
> Hi,
>
> > From: Kirill A. Shutemov [mailto:[email protected]]
> > Sent: Wednesday, November 19, 2014 8:16 PM
> >
> > On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> > >
> > > [cut]
> > >
> > > >
> > > > Here's lockdep warning I see on -next:
> > >
> > > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> >
> > I only saw it on -next. I've tried to apply patches directly on -rc5 and
> > don't see the warning. I don't have time for proper bisecting, sorry.
>
> Also, I want to know whether a real deadlock is actually triggered.

I didn't see the system hang or anything like that. The laptop is usable, at least.

> I think we have worked around the deadlock so that things keep working even when the fix is not merged, which lets the other EC work proceed.
> So maybe I only need to prevent the warning from being triggered in the current situation.
>
> Thanks and best regards
> -Lv
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Kirill A. Shutemov

2014-11-20 22:15:43

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers.

On Thu, Nov 20, 2014 at 06:47:16AM +0000, Zheng, Lv wrote:
> Hi, Kirill
>
> Could you also help to confirm whether this patch also fixes the issue?
> Please validate the two fix patchsets separately.

I've applied it on top of the other two and still see the issue.
--
Kirill A. Shutemov

2014-11-21 00:43:25

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Hi, Shutemov

> From: Kirill A. Shutemov [mailto:[email protected]]
> Sent: Friday, November 21, 2014 5:34 AM
>
> On Thu, Nov 20, 2014 at 02:20:53AM +0000, Zheng, Lv wrote:
> > Since you have an environment to trigger this, could you also help to
> > check whether the fix works?
> > I've just sent them as RFC to this thread.
>
> With these two patchse on top of my -next snapshot I still see the issue:
>
> [ 0.324119] ======================================================
> [ 0.324125] [ INFO: possible circular locking dependency detected ]
> [ 0.324132] 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80 Not tainted
> [ 0.324138] -------------------------------------------------------
> [ 0.324144] swapper/3/0 is trying to acquire lock:
> [ 0.324149] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.324165]
> but task is already holding lock:
> [ 0.324171] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.324185]
> which lock already depends on the new lock.
>
> [ 0.324193]
> the existing dependency chain (in reverse order) is:
> [ 0.324200]
> -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> [ 0.324209] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> [ 0.324218] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.324228] [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.324235] [<ffffffff814d9945>] acpi_enable_gpe+0x27/0x75
> [ 0.324244] [<ffffffff814cc960>] acpi_ec_start+0x67/0x88
> [ 0.324251] [<ffffffff81af4ca9>] ec_install_handlers+0x41/0xa4
> [ 0.324258] [<ffffffff823e4134>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> [ 0.324267] [<ffffffff823e395e>] acpi_init+0x8b/0x26e
> [ 0.324275] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> [ 0.324283] [<ffffffff8239c1dc>] kernel_init_freeable+0x1f5/0x282
> [ 0.324293] [<ffffffff81aea0fe>] kernel_init+0xe/0xf0
> [ 0.324300] [<ffffffff81b011bc>] ret_from_fork+0x7c/0xb0
> [ 0.324307]
> -> #0 (&(&ec->lock)->rlock){-.....}:
> [ 0.324315] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
> [ 0.324323] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> [ 0.324330] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.324338] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.324346] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
> [ 0.324353] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
> [ 0.324360] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
> [ 0.324368] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
> [ 0.324375] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
> [ 0.324384] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
> [ 0.324392] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
> [ 0.324399] [<ffffffff81075a22>] handle_irq+0x22/0x40
> [ 0.324408] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
> [ 0.324416] [<ffffffff81b02072>] ret_from_intr+0x0/0x1a
> [ 0.324423] [<ffffffff8107e7a3>] default_idle+0x23/0x260
> [ 0.324430] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
> [ 0.324438] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
> [ 0.324445] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0
> [ 0.324454]
> other info that might help us debug this:
>
> [ 0.324462] Possible unsafe locking scenario:
>
> [ 0.324468] CPU0 CPU1
> [ 0.324473] ---- ----
> [ 0.324477] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> [ 0.324483] lock(&(&ec->lock)->rlock);
> [ 0.324490] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> [ 0.324498] lock(&(&ec->lock)->rlock);
> [ 0.324503]

Let me convert this into call stack:
CPU0                                  CPU1
                                      acpi_irq
                                      +GPE acpi_ev_gpe_dispatch
acpi_bus_init
  acpi_ec_ecdt_probe
    acpi_install_gpe_handler()
    +EC acpi_ec_start
      +GPE acpi_enable_gpe
      -GPE
    -EC
                                      +EC acpi_ec_gpe_handler
                                      -EC
                                      -GPE

I used + to indicate spin_lock() and - to indicate spin_unlock().
GPE to indicate acpi_gbl_gpe_lock, EC to indicate ec->lock.

Are you sure you still can see this?
Please help to check the [RFC PATCH 2] to see if the following code is exactly applied:
+ /*
+ * There is no protection around the namespace node
+ * and the GPE handler to ensure a safe destruction
+ * because:
+ * 1. The namespace node is expected to always
+ * exist after loading a table.
+ * 2. The GPE handler is expected to be flushed by
+ * acpi_os_wait_events_complete() before the
+ * destruction.
+ */
+ acpi_os_release_lock
+ (acpi_gbl_gpe_lock, flags);
+ int_status |=
+ gpe_handler_info->
+ address(gpe_device,
+ gpe_number,
+ gpe_handler_info->
+ context);

This is where acpi_ec_gpe_handler() will be invoked.

+ flags =
+ acpi_os_acquire_lock
+ (acpi_gbl_gpe_lock);

So when acpi_ec_gpe_handler() is invoked, the GPE lock has been released.
There should be no way for you to still see this warning, because the call stack will be:

CPU0                                  CPU1
                                      acpi_irq
                                      +GPE acpi_ev_gpe_dispatch
acpi_bus_init
  acpi_ec_ecdt_probe
    acpi_install_gpe_handler()
    +EC acpi_ec_start
      +GPE acpi_enable_gpe
      -GPE
    -EC
                                      -GPE
                                      +EC acpi_ec_gpe_handler
                                      -EC
                                      +GPE
                                      -GPE

When acpi_ec_gpe_handler() is invoked, acpi_gbl_gpe_lock is not held.
So I really cannot understand your test result.
Could you confirm this again?

Maybe I just don't understand how this warning is generated, and this is just a kind of warning that we can ignore.
Let me ask Peter and Ingo to check if this is just a limitation of lockdep.

Thanks and best regards
-Lv

> *** DEADLOCK ***
>
> [ 0.324510] 1 lock held by swapper/3/0:
> [ 0.324514] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> [ 0.324528]
> stack backtrace:
> [ 0.324535] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80
> [ 0.324543] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> [ 0.324550] ffffffff82cae120 ffff88011e2c3ba8 ffffffff81af484e 0000000000000011
> [ 0.324560] ffffffff82cae120 ffff88011e2c3bf8 ffffffff81af3361 0000000000000001
> [ 0.324569] ffff88011e2c3c58 ffff88011e2c3bf8 ffff8801193f92b0 ffff8801193f9b00
> [ 0.324579] Call Trace:
> [ 0.324582] <IRQ> [<ffffffff81af484e>] dump_stack+0x4c/0x6e
> [ 0.324593] [<ffffffff81af3361>] print_circular_bug+0x2b2/0x2c3
> [ 0.324601] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
> [ 0.324609] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> [ 0.324616] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
> [ 0.324624] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> [ 0.324631] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
> [ 0.324640] [<ffffffff814e08f7>] ? acpi_hw_write+0x4b/0x52
> [ 0.324646] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> [ 0.324653] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
> [ 0.324660] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
> [ 0.324668] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
> [ 0.324675] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
> [ 0.324683] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
> [ 0.324691] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
> [ 0.324698] [<ffffffff81171e88>] ? handle_fasteoi_irq+0x28/0x140
> [ 0.324705] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
> [ 0.324712] [<ffffffff81075a22>] handle_irq+0x22/0x40
> [ 0.324719] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
> [ 0.324725] [<ffffffff81b02072>] common_interrupt+0x72/0x72
> [ 0.324731] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> [ 0.324743] [<ffffffff81154efd>] ? trace_hardirqs_on+0xd/0x10
> [ 0.324750] [<ffffffff8107e7a3>] default_idle+0x23/0x260
> [ 0.324757] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
> [ 0.324763] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
> [ 0.324771] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0
>
>
> >
> > Thanks and best regards
> > -Lv
> >
> > > From: Kirill A. Shutemov [mailto:[email protected]]
> > > Sent: Wednesday, November 19, 2014 8:16 PM
> > > To: Rafael J. Wysocki
> > > Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; [email protected]; [email protected]
> > > Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> > >
> > > On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > > > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> > > >
> > > > [cut]
> > > >
> > > > >
> > > > > Here's lockdep warning I see on -next:
> > > >
> > > > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> > >
> > > I only saw it on -next. I've tried to apply patches directly on -rc5 and
> > > don't see the warning. I don't have time for proper bisecting, sorry.
> > >
> > > --
> > > Kirill A. Shutemov
>
> --
> Kirill A. Shutemov

2014-11-21 00:55:49

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

Hi, All

It's my fault.
I didn't add ACPI_GPE_HANDLER_RAW flag in ec.c to enable this fix.
Sorry for the noise.
Let me post the updated [RFC PATCH 2] for you to confirm.
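
For reference, a sketch of the kind of ec.c change being referred to (the
actual updated [RFC PATCH 2] is not reproduced here); it is based on the
existing acpi_install_gpe_handler() call in ec_install_handlers():

	status = acpi_install_gpe_handler(NULL, ec->gpe,
					  ACPI_GPE_EDGE_TRIGGERED |
					  ACPI_GPE_HANDLER_RAW,	/* select the raw-handler path */
					  &acpi_ec_gpe_handler, ec);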

Thanks
-Lv

> From: Zheng, Lv
> Sent: Friday, November 21, 2014 8:43 AM
>
> Hi, Shutemov
>
> > From: Kirill A. Shutemov [mailto:[email protected]]
> > Sent: Friday, November 21, 2014 5:34 AM
> >
> > On Thu, Nov 20, 2014 at 02:20:53AM +0000, Zheng, Lv wrote:
> > > Since you have an environment to trigger this, could you also help to
> > > check whether the fix works?
> > > I've just sent them as RFC to this thread.
> >
> > With these two patchse on top of my -next snapshot I still see the issue:
> >
> > [ 0.324119] ======================================================
> > [ 0.324125] [ INFO: possible circular locking dependency detected ]
> > [ 0.324132] 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80 Not tainted
> > [ 0.324138] -------------------------------------------------------
> > [ 0.324144] swapper/3/0 is trying to acquire lock:
> > [ 0.324149] (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.324165]
> > but task is already holding lock:
> > [ 0.324171] (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.324185]
> > which lock already depends on the new lock.
> >
> > [ 0.324193]
> > the existing dependency chain (in reverse order) is:
> > [ 0.324200]
> > -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> > [ 0.324209] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> > [ 0.324218] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.324228] [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.324235] [<ffffffff814d9945>] acpi_enable_gpe+0x27/0x75
> > [ 0.324244] [<ffffffff814cc960>] acpi_ec_start+0x67/0x88
> > [ 0.324251] [<ffffffff81af4ca9>] ec_install_handlers+0x41/0xa4
> > [ 0.324258] [<ffffffff823e4134>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> > [ 0.324267] [<ffffffff823e395e>] acpi_init+0x8b/0x26e
> > [ 0.324275] [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> > [ 0.324283] [<ffffffff8239c1dc>] kernel_init_freeable+0x1f5/0x282
> > [ 0.324293] [<ffffffff81aea0fe>] kernel_init+0xe/0xf0
> > [ 0.324300] [<ffffffff81b011bc>] ret_from_fork+0x7c/0xb0
> > [ 0.324307]
> > -> #0 (&(&ec->lock)->rlock){-.....}:
> > [ 0.324315] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
> > [ 0.324323] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> > [ 0.324330] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.324338] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.324346] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
> > [ 0.324353] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
> > [ 0.324360] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.324368] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
> > [ 0.324375] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.324384] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
> > [ 0.324392] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.324399] [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.324408] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
> > [ 0.324416] [<ffffffff81b02072>] ret_from_intr+0x0/0x1a
> > [ 0.324423] [<ffffffff8107e7a3>] default_idle+0x23/0x260
> > [ 0.324430] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
> > [ 0.324438] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.324445] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0
> > [ 0.324454]
> > other info that might help us debug this:
> >
> > [ 0.324462] Possible unsafe locking scenario:
> >
> > [ 0.324468] CPU0 CPU1
> > [ 0.324473] ---- ----
> > [ 0.324477] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.324483] lock(&(&ec->lock)->rlock);
> > [ 0.324490] lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.324498] lock(&(&ec->lock)->rlock);
> > [ 0.324503]
>
> Let me convert this into call stack:
> CPU0                                  CPU1
>                                       acpi_irq
>                                       +GPE acpi_ev_gpe_dispatch
> acpi_bus_init
>   acpi_ec_ecdt_probe
>     acpi_install_gpe_handler()
>     +EC acpi_ec_start
>       +GPE acpi_enable_gpe
>       -GPE
>     -EC
>                                       +EC acpi_ec_gpe_handler
>                                       -EC
>                                       -GPE
>
> I used + to indicate spin_lock() and - to indicate spin_unlock().
> GPE to indicate acpi_gbl_gpe_lock, EC to indicate ec->lock.
>
> Are you sure you still can see this?
> Please help to check the [RFC PATCH 2] to see if the following code is exactly applied:
> + /*
> + * There is no protection around the namespace node
> + * and the GPE handler to ensure a safe destruction
> + * because:
> + * 1. The namespace node is expected to always
> + * exist after loading a table.
> + * 2. The GPE handler is expected to be flushed by
> + * acpi_os_wait_events_complete() before the
> + * destruction.
> + */
> + acpi_os_release_lock
> + (acpi_gbl_gpe_lock, flags);
> + int_status |=
> + gpe_handler_info->
> + address(gpe_device,
> + gpe_number,
> + gpe_handler_info->
> + context);
>
> This is where acpi_ec_gpe_handler() will be invoked.
>
> + flags =
> + acpi_os_acquire_lock
> + (acpi_gbl_gpe_lock);
>
> So when acpi_ec_gpe_handler() is invoked, GPE lock is release.
> There should be no reason you can see this warning, because the call stack will be:
>
CPU0                                  CPU1
> acpi_irq
> +GPE acpi_ev_gpe_dispatch
>                                       acpi_bus_init
>                                       acpi_ec_ecdt_probe
>                                       acpi_install_gpe_handler()
>                                       +EC acpi_ec_start
>                                       +GPE acpi_enable_gpe
>                                       -GPE
>                                       -EC
> -GPE
> +EC acpi_ec_gpe_handler
> -EC
> +GPE
> -GPE
>
> When acpi_ec_gpe_handler() is invoked, there is no acpi_gbl_gpe_lock locked.
> So I really cannot understand your test result.
> Could you confirm this again?
>
> Maybe I just don't understand how this warning is generated, and this is just a kind of warning that we can ignore.
> Let me ask Peter and Ingo to check if this is just a limitation of lockdep.
>
> Thanks and best regards
> -Lv
>
> > *** DEADLOCK ***
> >
> > [ 0.324510] 1 lock held by swapper/3/0:
> > [ 0.324514] #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814c3b3e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.324528]
> > stack backtrace:
> > [ 0.324535] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc5-next-20141119-07477-g4c45e54745b2 #80
> > [ 0.324543] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> > [ 0.324550] ffffffff82cae120 ffff88011e2c3ba8 ffffffff81af484e 0000000000000011
> > [ 0.324560] ffffffff82cae120 ffff88011e2c3bf8 ffffffff81af3361 0000000000000001
> > [ 0.324569] ffff88011e2c3c58 ffff88011e2c3bf8 ffff8801193f92b0 ffff8801193f9b00
> > [ 0.324579] Call Trace:
> > [ 0.324582] <IRQ> [<ffffffff81af484e>] dump_stack+0x4c/0x6e
> > [ 0.324593] [<ffffffff81af3361>] print_circular_bug+0x2b2/0x2c3
> > [ 0.324601] [<ffffffff811585af>] __lock_acquire+0x210f/0x2220
> > [ 0.324609] [<ffffffff81158f0f>] lock_acquire+0xdf/0x2d0
> > [ 0.324616] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.324624] [<ffffffff81b004c0>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.324631] [<ffffffff814cb803>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.324640] [<ffffffff814e08f7>] ? acpi_hw_write+0x4b/0x52
> > [ 0.324646] [<ffffffff814cb803>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.324653] [<ffffffff814d68e0>] acpi_ev_gpe_dispatch+0xb9/0x12e
> > [ 0.324660] [<ffffffff814d6a5a>] acpi_ev_gpe_detect+0x105/0x227
> > [ 0.324668] [<ffffffff814d8af5>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.324675] [<ffffffff814c2dae>] acpi_irq+0x16/0x31
> > [ 0.324683] [<ffffffff8116ecbf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.324691] [<ffffffff8116f1d1>] handle_irq_event+0x41/0x70
> > [ 0.324698] [<ffffffff81171e88>] ? handle_fasteoi_irq+0x28/0x140
> > [ 0.324705] [<ffffffff81171ee6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.324712] [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.324719] [<ffffffff81b0436f>] do_IRQ+0x4f/0xf0
> > [ 0.324725] [<ffffffff81b02072>] common_interrupt+0x72/0x72
> > [ 0.324731] <EOI> [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> > [ 0.324743] [<ffffffff81154efd>] ? trace_hardirqs_on+0xd/0x10
> > [ 0.324750] [<ffffffff8107e7a3>] default_idle+0x23/0x260
> > [ 0.324757] [<ffffffff8107f37f>] arch_cpu_idle+0xf/0x20
> > [ 0.324763] [<ffffffff8114a95b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.324771] [<ffffffff810a8d24>] start_secondary+0x1a4/0x1d0
> >
> >
> > >
> > > Thanks and best regards
> > > -Lv
> > >
> > > > From: Kirill A. Shutemov [mailto:[email protected]]
> > > > Sent: Wednesday, November 19, 2014 8:16 PM
> > > > To: Rafael J. Wysocki
> > > > Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; [email protected]; [email protected]
> > > > Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> > > >
> > > > On Tue, Nov 18, 2014 at 10:20:11PM +0100, Rafael J. Wysocki wrote:
> > > > > On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > > > > > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> > > > >
> > > > > [cut]
> > > > >
> > > > > >
> > > > > > Here's lockdep warning I see on -next:
> > > > >
> > > > > Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> > > >
> > > > I only saw it on -next. I've tried to apply patches directly on -rc5 and
> > > > don't see the warning. I don't have time for proper bisecting, sorry.
> > > >
> > > > --
> > > > Kirill A. Shutemov
> >
> > --
> > Kirill A. Shutemov

2014-11-21 01:38:37

by Zheng, Lv

[permalink] [raw]
Subject: RE: [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers.

This result convinces me that this might only be a theoretical deadlock for now.
So finally we won't need this in order to eliminate this warning.

Thanks and best regards
-Lv

> From: Kirill A. Shutemov [mailto:[email protected]]
> Sent: Friday, November 21, 2014 6:16 AM
> To: Zheng, Lv
> Cc: Lv Zheng; [email protected]; [email protected]; Wysocki, Rafael J; Brown, Len
> Subject: Re: [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers.
>
> On Thu, Nov 20, 2014 at 06:47:16AM +0000, Zheng, Lv wrote:
> > Hi, Kirill
> >
> > Could also help to confirm if this patch can also fix the issue.
> > Please help to validate the 2 fix patchsets separately.
>
> I've applied it on top of the other two and still see the issue.
> --
> Kirill A. Shutemov

2015-02-06 00:57:57

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support.

This patchset contains 3 cleanups related to the EC driver:
1. Command flushing (command grace period)
This patchset flushes EC commands before suspending/resuming, so that
there won't be timeouts for incomplete commands after resuming.
2. Query flushing (query grace period)
This patchset flushes EC event queries before suspending/resuming, so
that there won't be broken events remaining in the firmware queue.
3. Command storming prevention
This patchset corrects the command storm prevention logic in light of
the GPE raw handler mode.
The request reference count debugging messages can be used to detect broken
EC transactions. The count should always drop to 1 when the driver is idle
during runtime.

Note that after flushing before suspending, EC GPE is still enabled to keep
the old behavior.

Lv Zheng (5):
ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
ACPI/EC: Add command flushing support.
ACPI/EC: Refine command storm prevention support.
ACPI/EC: Add query flushing support.
ACPI/EC: Add GPE reference counting debugging messages.

drivers/acpi/ec.c | 295 ++++++++++++++++++++++++++++++++++++++++-------
drivers/acpi/internal.h | 1 +
2 files changed, 254 insertions(+), 42 deletions(-)

--
1.7.10

2015-02-06 00:58:03

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 1/5] ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.

By using the 2 flags, we can indicate an intermediate state where the
current transactions should be completed while new transactions should
be dropped.

The comparison of the old flag and the new flags:
  Old                    New
  about to set BLOCKED   STOPPED set   / STARTED set
  BLOCKED set            STOPPED clear / STARTED clear
  BLOCKED clear          STOPPED clear / STARTED set
A new period can be indicated by the 2 flags: it spans from the point where
we are about to set BLOCKED to the point where BLOCKED is actually set. The
new flags provide the acpi_ec_started() check, which allows EC transactions
to be submitted during this period, so it can be used as a grace period for
EC transaction flushing.

The only functional change after applying this patch is:
1. The GPE enabling/disabling is protected by the EC specific lock. We can
do this because of recent ACPICA GPE API enhancement. This is reasonable
as the GPE disabling/enabling state should only be determined by the EC
driver's state machine which is protected by the EC spinlock.

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 65 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 54 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 5563253..a6179b7 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -80,7 +80,8 @@ enum {
EC_FLAGS_GPE_STORM, /* GPE storm detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
- EC_FLAGS_BLOCKED, /* Transactions are blocked */
+ EC_FLAGS_STARTED, /* Driver is started */
+ EC_FLAGS_STOPPED, /* Driver is stopped */
};

#define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
@@ -135,6 +136,16 @@ static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
static int EC_FLAGS_QUERY_HANDSHAKE; /* Needs QR_EC issued when SCI_EVT set */

/* --------------------------------------------------------------------------
+ * Device Flags
+ * -------------------------------------------------------------------------- */
+
+static bool acpi_ec_started(struct acpi_ec *ec)
+{
+ return test_bit(EC_FLAGS_STARTED, &ec->flags) &&
+ !test_bit(EC_FLAGS_STOPPED, &ec->flags);
+}
+
+/* --------------------------------------------------------------------------
* EC Registers
* -------------------------------------------------------------------------- */

@@ -415,6 +426,10 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
udelay(ACPI_EC_MSI_UDELAY);
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
+ if (!acpi_ec_started(ec)) {
+ ret = -EINVAL;
+ goto unlock;
+ }
/* following two actions should be kept atomic */
ec->curr = t;
pr_debug("***** Command(%s) started *****\n",
@@ -426,6 +441,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
pr_debug("***** Command(%s) stopped *****\n",
acpi_ec_cmd_string(t->command));
ec->curr = NULL;
+unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
}
@@ -440,10 +456,6 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
if (t->rdata)
memset(t->rdata, 0, t->rlen);
mutex_lock(&ec->mutex);
- if (test_bit(EC_FLAGS_BLOCKED, &ec->flags)) {
- status = -EINVAL;
- goto unlock;
- }
if (ec->global_lock) {
status = acpi_acquire_global_lock(ACPI_EC_UDELAY_GLK, &glk);
if (ACPI_FAILURE(status)) {
@@ -595,6 +607,37 @@ static void acpi_ec_clear(struct acpi_ec *ec)
pr_info("%d stale EC events cleared\n", i);
}

+static void acpi_ec_start(struct acpi_ec *ec, bool resuming)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
+ pr_debug("+++++ Starting EC +++++\n");
+ if (!resuming)
+ acpi_ec_enable_gpe(ec, true);
+ pr_info("+++++ EC started +++++\n");
+ }
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
+static void acpi_ec_stop(struct acpi_ec *ec, bool suspending)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ if (acpi_ec_started(ec)) {
+ pr_debug("+++++ Stopping EC +++++\n");
+ set_bit(EC_FLAGS_STOPPED, &ec->flags);
+ if (!suspending)
+ acpi_ec_disable_gpe(ec, true);
+ clear_bit(EC_FLAGS_STARTED, &ec->flags);
+ clear_bit(EC_FLAGS_STOPPED, &ec->flags);
+ pr_info("+++++ EC stopped +++++\n");
+ }
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
void acpi_ec_block_transactions(void)
{
struct acpi_ec *ec = first_ec;
@@ -604,7 +647,7 @@ void acpi_ec_block_transactions(void)

mutex_lock(&ec->mutex);
/* Prevent transactions from being carried out */
- set_bit(EC_FLAGS_BLOCKED, &ec->flags);
+ acpi_ec_stop(ec, true);
mutex_unlock(&ec->mutex);
}

@@ -616,7 +659,7 @@ void acpi_ec_unblock_transactions(void)
return;

/* Allow transactions to be carried out again */
- clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
+ acpi_ec_start(ec, true);

if (EC_FLAGS_CLEAR_ON_RESUME)
acpi_ec_clear(ec);
@@ -629,7 +672,7 @@ void acpi_ec_unblock_transactions_early(void)
* atomic context during wakeup, so we don't need to acquire the mutex).
*/
if (first_ec)
- clear_bit(EC_FLAGS_BLOCKED, &first_ec->flags);
+ acpi_ec_start(first_ec, true);
}

/* --------------------------------------------------------------------------
@@ -894,7 +937,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
if (ACPI_FAILURE(status))
return -ENODEV;

- acpi_ec_enable_gpe(ec, true);
+ acpi_ec_start(ec, false);
status = acpi_install_address_space_handler(ec->handle,
ACPI_ADR_SPACE_EC,
&acpi_ec_space_handler,
@@ -909,7 +952,7 @@ static int ec_install_handlers(struct acpi_ec *ec)
pr_err("Fail in evaluating the _REG object"
" of EC device. Broken bios is suspected.\n");
} else {
- acpi_ec_disable_gpe(ec, true);
+ acpi_ec_stop(ec, false);
acpi_remove_gpe_handler(NULL, ec->gpe,
&acpi_ec_gpe_handler);
return -ENODEV;
@@ -924,7 +967,7 @@ static void ec_remove_handlers(struct acpi_ec *ec)
{
if (!test_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags))
return;
- acpi_ec_disable_gpe(ec, true);
+ acpi_ec_stop(ec, false);
if (ACPI_FAILURE(acpi_remove_address_space_handler(ec->handle,
ACPI_ADR_SPACE_EC, &acpi_ec_space_handler)))
pr_err("failed to remove space handler\n");
--
1.7.10

2015-02-06 00:58:10

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 2/5] ACPI / EC: Add command flushing support.

This patch implements the EC command flushing support.

During the grace period indicated by EC_FLAGS_STARTED and EC_FLAGS_STOPPED,
all previously submitted EC command transactions can complete while new
submissions are prevented before suspending, so that the EC hardware is
guaranteed to be in the idle state when the system resumes.

There is a good indicator for flush support:
every acpi_ec_submit_request() call except the first one is made after
checking the driver state with acpi_ec_started(). This means all code paths
can be flushed as quickly as possible, by discarding requests that occur
after the flush operation starts. The reference count increase for such a
code path is wrapped by acpi_ec_submit_flushable_request().
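
For illustration only, here is a minimal userspace sketch of the reference
counting pattern described above. It is pthread based, the helper names
merely mirror the EC driver ones, and it is not the kernel code itself:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t flush_wait = PTHREAD_COND_INITIALIZER;
static unsigned long reference_count;   /* 1 == started and otherwise idle */
static bool started, stopped;

static bool submit_flushable_request(void)
{
        bool ok;

        pthread_mutex_lock(&lock);
        ok = started && !stopped;       /* acpi_ec_started() analogue */
        if (ok)
                reference_count++;      /* acpi_ec_submit_request() analogue */
        pthread_mutex_unlock(&lock);
        return ok;
}

static void complete_request(void)
{
        pthread_mutex_lock(&lock);
        if (--reference_count == 1)     /* acpi_ec_flushed() analogue */
                pthread_cond_broadcast(&flush_wait);
        pthread_mutex_unlock(&lock);
}

static void start(void)                 /* acpi_ec_start() analogue */
{
        pthread_mutex_lock(&lock);
        started = true;
        reference_count = 1;            /* the driver's own reference */
        pthread_mutex_unlock(&lock);
}

static void stop(void)                  /* acpi_ec_stop() analogue */
{
        pthread_mutex_lock(&lock);
        stopped = true;                 /* new requests are rejected now */
        while (reference_count > 1)     /* flush the outstanding requests */
                pthread_cond_wait(&flush_wait, &lock);
        started = stopped = false;
        pthread_mutex_unlock(&lock);
}

int main(void)
{
        start();
        if (submit_flushable_request()) {
                /* ... the EC transaction would run here ... */
                complete_request();
        }
        stop();
        printf("flushed, reference count is %lu\n", reference_count);
        return 0;
}

The point is only the ordering: requests arriving after stop() has set the
STOPPED analogue are rejected, and stop() returns only after every request
taken before that point has completed.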

Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 68 ++++++++++++++++++++++++++++++++++++++++++++---
drivers/acpi/internal.h | 1 +
2 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index a6179b7..1fa1463 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -145,6 +145,11 @@ static bool acpi_ec_started(struct acpi_ec *ec)
!test_bit(EC_FLAGS_STOPPED, &ec->flags);
}

+static bool acpi_ec_flushed(struct acpi_ec *ec)
+{
+ return ec->reference_count == 1;
+}
+
/* --------------------------------------------------------------------------
* EC Registers
* -------------------------------------------------------------------------- */
@@ -266,6 +271,44 @@ static inline void acpi_ec_clear_gpe(struct acpi_ec *ec)
* Transaction Management
* -------------------------------------------------------------------------- */

+static void acpi_ec_submit_request(struct acpi_ec *ec)
+{
+ ec->reference_count++;
+ if (ec->reference_count == 1)
+ acpi_ec_enable_gpe(ec, true);
+}
+
+static void acpi_ec_complete_request(struct acpi_ec *ec)
+{
+ bool flushed = false;
+
+ ec->reference_count--;
+ if (ec->reference_count == 0)
+ acpi_ec_disable_gpe(ec, true);
+ flushed = acpi_ec_flushed(ec);
+ if (flushed)
+ wake_up(&ec->wait);
+}
+
+/*
+ * acpi_ec_submit_flushable_request() - Increase the reference count unless
+ * the flush operation is not in
+ * progress
+ * @ec: the EC device
+ *
+ * This function must be used before taking a new action that should hold
+ * the reference count. If this function returns false, then the action
+ * must be discarded or it will prevent the flush operation from being
+ * completed.
+ */
+static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
+{
+ if (!acpi_ec_started(ec))
+ return false;
+ acpi_ec_submit_request(ec);
+ return true;
+}
+
static void acpi_ec_submit_query(struct acpi_ec *ec)
{
if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) {
@@ -426,7 +469,8 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
udelay(ACPI_EC_MSI_UDELAY);
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
- if (!acpi_ec_started(ec)) {
+ /* Enable GPE for command processing (IBF=0/OBF=1) */
+ if (!acpi_ec_submit_flushable_request(ec)) {
ret = -EINVAL;
goto unlock;
}
@@ -441,6 +485,8 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
pr_debug("***** Command(%s) stopped *****\n",
acpi_ec_cmd_string(t->command));
ec->curr = NULL;
+ /* Disable GPE for command processing (IBF=0/OBF=1) */
+ acpi_ec_complete_request(ec);
unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
@@ -614,13 +660,25 @@ static void acpi_ec_start(struct acpi_ec *ec, bool resuming)
spin_lock_irqsave(&ec->lock, flags);
if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
pr_debug("+++++ Starting EC +++++\n");
+ /* Enable GPE for event processing (SCI_EVT=1) */
if (!resuming)
- acpi_ec_enable_gpe(ec, true);
+ acpi_ec_submit_request(ec);
pr_info("+++++ EC started +++++\n");
}
spin_unlock_irqrestore(&ec->lock, flags);
}

+static bool acpi_ec_stopped(struct acpi_ec *ec)
+{
+ unsigned long flags;
+ bool flushed;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ flushed = acpi_ec_flushed(ec);
+ spin_unlock_irqrestore(&ec->lock, flags);
+ return flushed;
+}
+
static void acpi_ec_stop(struct acpi_ec *ec, bool suspending)
{
unsigned long flags;
@@ -629,8 +687,12 @@ static void acpi_ec_stop(struct acpi_ec *ec, bool suspending)
if (acpi_ec_started(ec)) {
pr_debug("+++++ Stopping EC +++++\n");
set_bit(EC_FLAGS_STOPPED, &ec->flags);
+ spin_unlock_irqrestore(&ec->lock, flags);
+ wait_event(ec->wait, acpi_ec_stopped(ec));
+ spin_lock_irqsave(&ec->lock, flags);
+ /* Disable GPE for event processing (SCI_EVT=1) */
if (!suspending)
- acpi_ec_disable_gpe(ec, true);
+ acpi_ec_complete_request(ec);
clear_bit(EC_FLAGS_STARTED, &ec->flags);
clear_bit(EC_FLAGS_STOPPED, &ec->flags);
pr_info("+++++ EC stopped +++++\n");
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 63e3495..6a4420d 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -129,6 +129,7 @@ struct acpi_ec {
unsigned long data_addr;
unsigned long global_lock;
unsigned long flags;
+ unsigned long reference_count;
struct mutex mutex;
wait_queue_head_t wait;
struct list_head list;
--
1.7.10

2015-02-06 00:58:16

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 3/5] ACPI / EC: Refine command storm prevention support.

This patch refines EC command storm prevention support.

The current command storm prevention code is wrong: when the storming
condition is detected, it only flags the condition without doing anything
for the current command, and applies storm prevention only to the follow-up
commands. So:
1. The first command, which does suffer from the storm, still suffers from
   it.
2. The follow-up commands, which may not suffer from the storm, are
   unconditionally forced into the storm prevention mode.
Ideally, we should enable storm prevention immediately after detection, for
the current command only, so that the next command can try the
power/performance-efficient interrupt mode again.

This patch improves command storm prevention by disabling the GPE right
after the detection and re-enabling it right before completing the command
transaction, using the GPE storm prevention APIs. This deploys the
following GPE handling model:
1. acpi_enable_gpe()/acpi_disable_gpe() for reference count changes:
   This set of APIs is used for EC usage reference counting.
2. acpi_set_gpe(ACPI_GPE_ENABLE)/acpi_set_gpe(ACPI_GPE_DISABLE):
   This set of APIs is used for preventing GPE storms. They must only be
   invoked when the reference count is > 0.
Note that storm prevention should always happen while there is an
outstanding request, or the GPE enable value will be messed up by races.
This patch also adds BUG_ON() checks to enforce this rule and prevent
future bugs.

The msleep(1) used after completing a transaction is useless now, as it
looks like a guard time only needed by platforms that require the
EC_FLAGS_MSI quirks, while the GPE race issues have been fixed by the
earlier raw handler mode enabling. It is kept to avoid regressions. A
separate patch which deletes the EC_FLAGS_MSI quirks should take care of
deleting it.
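
To make the intended per-command behavior easier to see, here is a small
illustrative userspace sketch (the threshold value and all names are made
up for the sketch; no real GPE is touched):

#include <stdbool.h>
#include <stdio.h>

#define EC_STORM_THRESHOLD 8

static bool gpe_masked;  /* stands in for acpi_set_gpe(ACPI_GPE_DISABLE) */

/* Called for every spurious IRQ seen while one command is in flight. */
static void note_spurious_irq(unsigned int *irq_count)
{
        if (*irq_count < EC_STORM_THRESHOLD)
                ++*irq_count;
        /* Mask exactly once, as soon as the threshold is reached. */
        if (*irq_count == EC_STORM_THRESHOLD && !gpe_masked) {
                gpe_masked = true;
                puts("storm detected: polling enabled for this command");
        }
}

/* Called when the current command transaction completes. */
static void command_completed(unsigned int irq_count)
{
        if (irq_count == EC_STORM_THRESHOLD && gpe_masked) {
                gpe_masked = false;
                puts("command done: interrupt mode restored");
        }
}

int main(void)
{
        unsigned int irq_count = 0;

        for (int i = 0; i < 10; i++)    /* ten spurious IRQs for one command */
                note_spurious_irq(&irq_count);
        command_completed(irq_count);
        return 0;
}

Only the command that actually saw the storm runs in polling mode; the next
command starts over with irq_count = 0 and the GPE unmasked.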

Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/ec.c | 55 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 1fa1463..982b67f 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -77,11 +77,12 @@ enum ec_command {

enum {
EC_FLAGS_QUERY_PENDING, /* Query is pending */
- EC_FLAGS_GPE_STORM, /* GPE storm detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
EC_FLAGS_STARTED, /* Driver is started */
EC_FLAGS_STOPPED, /* Driver is stopped */
+ EC_FLAGS_COMMAND_STORM, /* GPE storms occurred to the
+ * current command processing */
};

#define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
@@ -229,8 +230,10 @@ static inline void acpi_ec_enable_gpe(struct acpi_ec *ec, bool open)
{
if (open)
acpi_enable_gpe(NULL, ec->gpe);
- else
+ else {
+ BUG_ON(ec->reference_count < 1);
acpi_set_gpe(NULL, ec->gpe, ACPI_GPE_ENABLE);
+ }
if (acpi_ec_is_gpe_raised(ec)) {
/*
* On some platforms, EN=1 writes cannot trigger GPE. So
@@ -246,8 +249,10 @@ static inline void acpi_ec_disable_gpe(struct acpi_ec *ec, bool close)
{
if (close)
acpi_disable_gpe(NULL, ec->gpe);
- else
+ else {
+ BUG_ON(ec->reference_count < 1);
acpi_set_gpe(NULL, ec->gpe, ACPI_GPE_DISABLE);
+ }
}

static inline void acpi_ec_clear_gpe(struct acpi_ec *ec)
@@ -290,6 +295,24 @@ static void acpi_ec_complete_request(struct acpi_ec *ec)
wake_up(&ec->wait);
}

+static void acpi_ec_set_storm(struct acpi_ec *ec, u8 flag)
+{
+ if (!test_bit(flag, &ec->flags)) {
+ acpi_ec_disable_gpe(ec, false);
+ pr_debug("+++++ Polling enabled +++++\n");
+ set_bit(flag, &ec->flags);
+ }
+}
+
+static void acpi_ec_clear_storm(struct acpi_ec *ec, u8 flag)
+{
+ if (test_bit(flag, &ec->flags)) {
+ clear_bit(flag, &ec->flags);
+ acpi_ec_enable_gpe(ec, false);
+ pr_debug("+++++ Polling disabled +++++\n");
+ }
+}
+
/*
* acpi_ec_submit_flushable_request() - Increase the reference count unless
* the flush operation is not in
@@ -404,8 +427,13 @@ err:
* otherwise will take a not handled IRQ as a false one.
*/
if (!(status & ACPI_EC_FLAG_SCI)) {
- if (in_interrupt() && t)
- ++t->irq_count;
+ if (in_interrupt() && t) {
+ if (t->irq_count < ec_storm_threshold)
+ ++t->irq_count;
+ /* Allow triggering on 0 threshold */
+ if (t->irq_count == ec_storm_threshold)
+ acpi_ec_set_storm(ec, EC_FLAGS_COMMAND_STORM);
+ }
}
out:
if (status & ACPI_EC_FLAG_SCI)
@@ -482,6 +510,8 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
spin_unlock_irqrestore(&ec->lock, tmp);
ret = ec_poll(ec);
spin_lock_irqsave(&ec->lock, tmp);
+ if (t->irq_count == ec_storm_threshold)
+ acpi_ec_clear_storm(ec, EC_FLAGS_COMMAND_STORM);
pr_debug("***** Command(%s) stopped *****\n",
acpi_ec_cmd_string(t->command));
ec->curr = NULL;
@@ -509,24 +539,11 @@ static int acpi_ec_transaction(struct acpi_ec *ec, struct transaction *t)
goto unlock;
}
}
- /* disable GPE during transaction if storm is detected */
- if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
- /* It has to be disabled, so that it doesn't trigger. */
- acpi_ec_disable_gpe(ec, false);
- }

status = acpi_ec_transaction_unlocked(ec, t);

- if (test_bit(EC_FLAGS_GPE_STORM, &ec->flags)) {
+ if (test_bit(EC_FLAGS_COMMAND_STORM, &ec->flags))
msleep(1);
- /* It is safe to enable the GPE outside of the transaction. */
- acpi_ec_enable_gpe(ec, false);
- } else if (t->irq_count > ec_storm_threshold) {
- pr_info("GPE storm detected(%d GPEs), "
- "transactions will use polling mode\n",
- t->irq_count);
- set_bit(EC_FLAGS_GPE_STORM, &ec->flags);
- }
if (ec->global_lock)
acpi_release_global_lock(glk);
unlock:
--
1.7.10

2015-02-06 00:58:19

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 4/5] ACPI / EC: Add query flushing support.

This patch implements QR_EC flushing support.

Grace periods are implemented from the detection of an SCI_EVT to the
submission/completion of the QR_EC transaction. During this period, all
EC command transactions are allowed to be submitted.

Note that query periods and event periods are intentionally distinguished
to allow further improvements.
1. Query period: from the detection of an SCI_EVT to the submission of the
   QR_EC command. This period is used for storm prevention: currently
   QR_EC is deferred to a work queue rather than issued directly from the
   IRQ context even when there are no other transactions pending, so a
   malicious SCI_EVT GPE can act as if it were "level triggered" and cause
   a GPE storm. We need to be prepared for this. In the future, we may
   make it a part of advance_transaction(), where we will try QR_EC
   submission at appropriate points to avoid such GPE storming.
2. Event period: from the detection of an SCI_EVT to the completion of the
   QR_EC command. We may extend it to the completion of the _Qxx
   evaluation. This is actually a grace period for event flushing, but we
   only flush queries due to the reason stated in known issue 1. That's
   also why the EC_FLAGS_EVENT_xxx names are used. During this period,
   QR_EC transactions need to pass the flushable submission check.

In this patch, the following flags are implemented:
1. EC_FLAGS_EVENT_ENABLED: this is derived from the old
   EC_FLAGS_QUERY_PENDING flag, which can block SCI_EVT handling.
   With this flag, the logic implemented by the original flag is
   extended:
   1. Old logic: unless both of the flags are set, the event poller will
      not be scheduled, and
   2. New logic: as soon as both of the flags are set, the event poller
      will be scheduled.
2. EC_FLAGS_EVENT_DETECTED: this is also derived from the old
   EC_FLAGS_QUERY_PENDING flag, which can block SCI_EVT detection. It thus
   can be used to indicate the storm prevention period for query
   submission.
   acpi_ec_submit_request()/acpi_ec_complete_request() are invoked to
   implement this period so that acpi_set_gpe() can be invoked under the
   "reference count > 0" condition.
3. EC_FLAGS_EVENT_PENDING: this is newly added to indicate the grace period
   for event flushing (query flushing for now).
   acpi_ec_submit_request()/acpi_ec_complete_request() are invoked to
   implement this period so that the flushing process can wait until the
   event handling (query transaction for now) is completed.
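
The interplay of the three flags and the reference count can be sketched as
the following single-threaded userspace simulation (illustrative only;
locking, the work queue and the EC hardware are all omitted, and the
transitions simply follow the description above):

#include <stdbool.h>
#include <stdio.h>

static bool event_enabled, event_detected, event_pending;
static unsigned long reference_count = 1;  /* 1 == driver started and idle */

static void submit_detection(void)      /* SCI_EVT observed */
{
        if (event_detected)
                return;
        reference_count++;              /* query period starts */
        event_detected = true;
}

static void submit_event(void)          /* schedule the QR_EC poller */
{
        if (!event_enabled || !event_detected || event_pending)
                return;
        reference_count++;              /* event period starts */
        event_pending = true;
}

static void complete_detection(void)    /* QR_EC command byte written */
{
        event_detected = false;
        reference_count--;              /* query period ends */
}

static void complete_event(void)        /* QR_EC transaction completed */
{
        event_pending = false;
        reference_count--;              /* event period ends */
        submit_event();                 /* another SCI_EVT already detected? */
}

int main(void)
{
        event_enabled = true;           /* acpi_ec_enable_event() analogue */
        submit_detection();
        submit_event();
        complete_detection();
        complete_event();
        printf("idle again, reference count is %lu\n", reference_count);
        return 0;
}

The flusher only has to wait for the reference count to drop back to 1;
both the query period and the event period hold it above that value.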

Known issues:
1. Event flushing support
   Currently the EC transactions that occur for the _Qxx evaluations may be
   dropped during the query flushing. Better support might implement the
   grace period from the detection of an SCI_EVT to the completion of the
   _Qxx evaluation so that the whole event handling process can run to a
   graceful end.
   But doing this is risky, as it depends not only on the quality of the
   EC driver state machine, but also on the quality of the ACPICA
   interpreter, so such an aggressive approach shouldn't be done in one
   patch.
   For that reason, the event flushing support should be a standalone
   change so that we can handle regressions more easily.
2. Event draining support
   The reason why we do not put a loop in the event poller to poll events
   until the returned query value is 0:
   Some platforms return a non-zero query value even when SCI_EVT=0; if we
   put a loop in the poller, our command flush mechanism could never run
   to an end and thus the system suspend process could be blocked. One
   SCI_EVT triggering one QR_EC is the current logic and has been proven
   to work for a long time.
   For that reason, the event draining support should be a standalone
   change so that we can handle regressions more easily.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=82611
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <[email protected]>
Tested-by: Ortwin Glück <[email protected]>
---
drivers/acpi/ec.c | 101 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 85 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 982b67f..40002ae 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -76,7 +76,9 @@ enum ec_command {
* when trying to clear the EC */

enum {
- EC_FLAGS_QUERY_PENDING, /* Query is pending */
+ EC_FLAGS_EVENT_ENABLED, /* Event is enabled */
+ EC_FLAGS_EVENT_PENDING, /* Event is pending */
+ EC_FLAGS_EVENT_DETECTED, /* Event is detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
EC_FLAGS_STARTED, /* Driver is started */
@@ -151,6 +153,12 @@ static bool acpi_ec_flushed(struct acpi_ec *ec)
return ec->reference_count == 1;
}

+static bool acpi_ec_has_pending_event(struct acpi_ec *ec)
+{
+ return test_bit(EC_FLAGS_EVENT_DETECTED, &ec->flags) ||
+ test_bit(EC_FLAGS_EVENT_PENDING, &ec->flags);
+}
+
/* --------------------------------------------------------------------------
* EC Registers
* -------------------------------------------------------------------------- */
@@ -318,36 +326,93 @@ static void acpi_ec_clear_storm(struct acpi_ec *ec, u8 flag)
* the flush operation is not in
* progress
* @ec: the EC device
+ * @allow_event: whether event should be handled
*
* This function must be used before taking a new action that should hold
* the reference count. If this function returns false, then the action
* must be discarded or it will prevent the flush operation from being
* completed.
+ *
+ * During flushing, QR_EC command need to pass this check when there is a
+ * pending event, so that the reference count held for the pending event
+ * can be decreased by the completion of the QR_EC command.
*/
-static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec)
+static bool acpi_ec_submit_flushable_request(struct acpi_ec *ec,
+ bool allow_event)
{
- if (!acpi_ec_started(ec))
- return false;
+ if (!acpi_ec_started(ec)) {
+ if (!allow_event || !acpi_ec_has_pending_event(ec))
+ return false;
+ }
acpi_ec_submit_request(ec);
return true;
}

-static void acpi_ec_submit_query(struct acpi_ec *ec)
+static void acpi_ec_submit_event(struct acpi_ec *ec)
{
- if (!test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) {
- pr_debug("***** Event started *****\n");
+ if (!test_bit(EC_FLAGS_EVENT_DETECTED, &ec->flags) ||
+ !test_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags))
+ return;
+ /* Hold reference for pending event */
+ if (!acpi_ec_submit_flushable_request(ec, true))
+ return;
+ if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
+ pr_debug("***** Event query started *****\n");
schedule_work(&ec->work);
+ return;
+ }
+ acpi_ec_complete_request(ec);
+}
+
+static void acpi_ec_complete_event(struct acpi_ec *ec)
+{
+ if (ec->curr->command == ACPI_EC_COMMAND_QUERY) {
+ clear_bit(EC_FLAGS_EVENT_PENDING, &ec->flags);
+ pr_debug("***** Event query stopped *****\n");
+ /* Unhold reference for pending event */
+ acpi_ec_complete_request(ec);
+ /* Check if there is another SCI_EVT detected */
+ acpi_ec_submit_event(ec);
+ }
+}
+
+static void acpi_ec_submit_detection(struct acpi_ec *ec)
+{
+ /* Hold reference for query submission */
+ if (!acpi_ec_submit_flushable_request(ec, false))
+ return;
+ if (!test_and_set_bit(EC_FLAGS_EVENT_DETECTED, &ec->flags)) {
+ pr_debug("***** Event detection blocked *****\n");
+ acpi_ec_submit_event(ec);
+ return;
}
+ acpi_ec_complete_request(ec);
}

-static void acpi_ec_complete_query(struct acpi_ec *ec)
+static void acpi_ec_complete_detection(struct acpi_ec *ec)
{
if (ec->curr->command == ACPI_EC_COMMAND_QUERY) {
- clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
- pr_debug("***** Event stopped *****\n");
+ clear_bit(EC_FLAGS_EVENT_DETECTED, &ec->flags);
+ pr_debug("***** Event detetion unblocked *****\n");
+ /* Unhold reference for query submission */
+ acpi_ec_complete_request(ec);
}
}

+static void acpi_ec_enable_event(struct acpi_ec *ec)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ec->lock, flags);
+ set_bit(EC_FLAGS_EVENT_ENABLED, &ec->flags);
+ /*
+ * An event may be pending even with SCI_EVT=0, so QR_EC should
+ * always be issued right after started.
+ */
+ acpi_ec_submit_detection(ec);
+ spin_unlock_irqrestore(&ec->lock, flags);
+}
+
static int ec_transaction_completed(struct acpi_ec *ec)
{
unsigned long flags;
@@ -389,6 +454,7 @@ static void advance_transaction(struct acpi_ec *ec)
t->rdata[t->ri++] = acpi_ec_read_data(ec);
if (t->rlen == t->ri) {
t->flags |= ACPI_EC_COMMAND_COMPLETE;
+ acpi_ec_complete_event(ec);
if (t->command == ACPI_EC_COMMAND_QUERY)
pr_debug("***** Command(%s) hardware completion *****\n",
acpi_ec_cmd_string(t->command));
@@ -399,6 +465,7 @@ static void advance_transaction(struct acpi_ec *ec)
} else if (t->wlen == t->wi &&
(status & ACPI_EC_FLAG_IBF) == 0) {
t->flags |= ACPI_EC_COMMAND_COMPLETE;
+ acpi_ec_complete_event(ec);
wakeup = true;
}
goto out;
@@ -407,16 +474,17 @@ static void advance_transaction(struct acpi_ec *ec)
!(status & ACPI_EC_FLAG_SCI) &&
(t->command == ACPI_EC_COMMAND_QUERY)) {
t->flags |= ACPI_EC_COMMAND_POLL;
- acpi_ec_complete_query(ec);
+ acpi_ec_complete_detection(ec);
t->rdata[t->ri++] = 0x00;
t->flags |= ACPI_EC_COMMAND_COMPLETE;
+ acpi_ec_complete_event(ec);
pr_debug("***** Command(%s) software completion *****\n",
acpi_ec_cmd_string(t->command));
wakeup = true;
} else if ((status & ACPI_EC_FLAG_IBF) == 0) {
acpi_ec_write_cmd(ec, t->command);
t->flags |= ACPI_EC_COMMAND_POLL;
- acpi_ec_complete_query(ec);
+ acpi_ec_complete_detection(ec);
} else
goto err;
goto out;
@@ -437,7 +505,7 @@ err:
}
out:
if (status & ACPI_EC_FLAG_SCI)
- acpi_ec_submit_query(ec);
+ acpi_ec_submit_detection(ec);
if (wakeup && in_interrupt())
wake_up(&ec->wait);
}
@@ -498,7 +566,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
/* start transaction */
spin_lock_irqsave(&ec->lock, tmp);
/* Enable GPE for command processing (IBF=0/OBF=1) */
- if (!acpi_ec_submit_flushable_request(ec)) {
+ if (!acpi_ec_submit_flushable_request(ec, true)) {
ret = -EINVAL;
goto unlock;
}
@@ -879,7 +947,9 @@ static void acpi_ec_gpe_poller(struct work_struct *work)
{
struct acpi_ec *ec = container_of(work, struct acpi_ec, work);

+ pr_debug("***** Event poller started *****\n");
acpi_ec_query(ec, NULL);
+ pr_debug("***** Event poller stopped *****\n");
}

static u32 acpi_ec_gpe_handler(acpi_handle gpe_device,
@@ -949,7 +1019,6 @@ static struct acpi_ec *make_acpi_ec(void)

if (!ec)
return NULL;
- ec->flags = 1 << EC_FLAGS_QUERY_PENDING;
mutex_init(&ec->mutex);
init_waitqueue_head(&ec->wait);
INIT_LIST_HEAD(&ec->list);
@@ -1100,7 +1169,7 @@ static int acpi_ec_add(struct acpi_device *device)
ret = ec_install_handlers(ec);

/* EC is fully operational, allow queries */
- clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
+ acpi_ec_enable_event(ec);

/* Clear stale _Q events if hardware might require that */
if (EC_FLAGS_CLEAR_ON_RESUME)
--
1.7.10

2015-02-06 00:58:26

by Zheng, Lv

[permalink] [raw]
Subject: [PATCH v2 5/5] ACPI / EC: Add GPE reference counting debugging messages.

This patch enhances debugging with the GPE reference count messages added.
No functional changes.

Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/ec.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 40002ae..14d0c89 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -31,6 +31,7 @@

/* Uncomment next line to get verbose printout */
/* #define DEBUG */
+#define DEBUG_REF 0
#define pr_fmt(fmt) "ACPI : EC: " fmt

#include <linux/kernel.h>
@@ -90,6 +91,13 @@ enum {
#define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
#define ACPI_EC_COMMAND_COMPLETE 0x02 /* Completed last byte */

+#define ec_debug_ref(ec, fmt, ...) \
+ do { \
+ if (DEBUG_REF) \
+ pr_debug("%lu: " fmt, ec->reference_count, \
+ ## __VA_ARGS__); \
+ } while (0)
+
/* ec.c is compiled in acpi namespace so this shows up as acpi.ec_delay param */
static unsigned int ec_delay __read_mostly = ACPI_EC_DELAY;
module_param(ec_delay, uint, 0644);
@@ -356,12 +364,14 @@ static void acpi_ec_submit_event(struct acpi_ec *ec)
/* Hold reference for pending event */
if (!acpi_ec_submit_flushable_request(ec, true))
return;
+ ec_debug_ref(ec, "Increase event\n");
if (!test_and_set_bit(EC_FLAGS_EVENT_PENDING, &ec->flags)) {
pr_debug("***** Event query started *****\n");
schedule_work(&ec->work);
return;
}
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease event\n");
}

static void acpi_ec_complete_event(struct acpi_ec *ec)
@@ -371,6 +381,7 @@ static void acpi_ec_complete_event(struct acpi_ec *ec)
pr_debug("***** Event query stopped *****\n");
/* Unhold reference for pending event */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease event\n");
/* Check if there is another SCI_EVT detected */
acpi_ec_submit_event(ec);
}
@@ -381,12 +392,14 @@ static void acpi_ec_submit_detection(struct acpi_ec *ec)
/* Hold reference for query submission */
if (!acpi_ec_submit_flushable_request(ec, false))
return;
+ ec_debug_ref(ec, "Increase query\n");
if (!test_and_set_bit(EC_FLAGS_EVENT_DETECTED, &ec->flags)) {
pr_debug("***** Event detection blocked *****\n");
acpi_ec_submit_event(ec);
return;
}
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease query\n");
}

static void acpi_ec_complete_detection(struct acpi_ec *ec)
@@ -396,6 +409,7 @@ static void acpi_ec_complete_detection(struct acpi_ec *ec)
pr_debug("***** Event detetion unblocked *****\n");
/* Unhold reference for query submission */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease query\n");
}
}

@@ -570,6 +584,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
ret = -EINVAL;
goto unlock;
}
+ ec_debug_ref(ec, "Increase command\n");
/* following two actions should be kept atomic */
ec->curr = t;
pr_debug("***** Command(%s) started *****\n",
@@ -585,6 +600,7 @@ static int acpi_ec_transaction_unlocked(struct acpi_ec *ec,
ec->curr = NULL;
/* Disable GPE for command processing (IBF=0/OBF=1) */
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease command\n");
unlock:
spin_unlock_irqrestore(&ec->lock, tmp);
return ret;
@@ -746,8 +762,10 @@ static void acpi_ec_start(struct acpi_ec *ec, bool resuming)
if (!test_and_set_bit(EC_FLAGS_STARTED, &ec->flags)) {
pr_debug("+++++ Starting EC +++++\n");
/* Enable GPE for event processing (SCI_EVT=1) */
- if (!resuming)
+ if (!resuming) {
acpi_ec_submit_request(ec);
+ ec_debug_ref(ec, "Increase driver\n");
+ }
pr_info("+++++ EC started +++++\n");
}
spin_unlock_irqrestore(&ec->lock, flags);
@@ -776,8 +794,10 @@ static void acpi_ec_stop(struct acpi_ec *ec, bool suspending)
wait_event(ec->wait, acpi_ec_stopped(ec));
spin_lock_irqsave(&ec->lock, flags);
/* Disable GPE for event processing (SCI_EVT=1) */
- if (!suspending)
+ if (!suspending) {
acpi_ec_complete_request(ec);
+ ec_debug_ref(ec, "Decrease driver\n");
+ }
clear_bit(EC_FLAGS_STARTED, &ec->flags);
clear_bit(EC_FLAGS_STOPPED, &ec->flags);
pr_info("+++++ EC stopped +++++\n");
--
1.7.10

2015-02-06 01:03:34

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support.

On Friday, February 06, 2015 08:57:37 AM Lv Zheng wrote:
> This patchset contains 3 cleanups related to the EC driver:
> 1. Command flushing (command grace period)
> This patchset flushes EC commands before suspending/resuming, so that
> there won't be timeout for the incomplete commands after resuming.
> 2. Query flushing (query grace period)
> This patchset flushes EC event queries before suspending/resuming, so
> that there won't be broken events remained in the firmware queue.
> 3. Command storming prevention
> This patchset corrects command storming prevention logic because of
> the GPE raw handler mode.
> The request reference count debugging messages can be used to detect broken
> EC transactions. It should always drop to 1 when the driver is idle during
> the runtime.
>
> Note that after flushing before suspending, EC GPE is still enabled to keep
> the old behavior.
>
> Lv Zheng (5):
> ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> ACPI/EC: Add command flushing support.
> ACPI/EC: Refine command storm prevention support.
> ACPI/EC: Add query flushing support.
> ACPI/EC: Add GPE reference counting debugging messages.
>
> drivers/acpi/ec.c | 295 ++++++++++++++++++++++++++++++++++++++++-------
> drivers/acpi/internal.h | 1 +
> 2 files changed, 254 insertions(+), 42 deletions(-)

So this is on top of the EC patches you sent previously, right?


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2015-02-09 02:23:46

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support.

Hi, Rafael

> From: Rafael J. Wysocki [mailto:[email protected]]
> Sent: Friday, February 06, 2015 9:27 AM
>
> On Friday, February 06, 2015 08:57:37 AM Lv Zheng wrote:
> > This patchset contains 3 cleanups related to the EC driver:
> > 1. Command flushing (command grace period)
> > This patchset flushes EC commands before suspending/resuming, so that
> > there won't be timeout for the incomplete commands after resuming.
> > 2. Query flushing (query grace period)
> > This patchset flushes EC event queries before suspending/resuming, so
> > that there won't be broken events remained in the firmware queue.
> > 3. Command storming prevention
> > This patchset corrects command storming prevention logic because of
> > the GPE raw handler mode.
> > The request reference count debugging messages can be used to detect broken
> > EC transactions. It should always drop to 1 when the driver is idle during
> > the runtime.
> >
> > Note that after flushing before suspending, EC GPE is still enabled to keep
> > the old behavior.
> >
> > Lv Zheng (5):
> > ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> > ACPI/EC: Add command flushing support.
> > ACPI/EC: Refine command storm prevention support.
> > ACPI/EC: Add query flushing support.
> > ACPI/EC: Add GPE reference counting debugging messages.
> >
> > drivers/acpi/ec.c | 295 ++++++++++++++++++++++++++++++++++++++++-------
> > drivers/acpi/internal.h | 1 +
> > 2 files changed, 254 insertions(+), 42 deletions(-)
>
> So this is on top of the EC patches you sent previously, right?

Yes.

The sequence is:
ACPICA 20150204 release: http://www.spinics.net/lists/linux-acpi/msg55623.html
ACPI EC GPE race fixes: http://www.spinics.net/lists/linux-acpi/msg55633.html
And this series.

Thanks and best regards
-Lv


>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

2015-02-12 01:17:48

by Wysocki, Rafael J

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support.

On 2/9/2015 3:23 AM, Zheng, Lv wrote:
> Hi, Rafael
>
>> From: Rafael J. Wysocki [mailto:[email protected]]
>> Sent: Friday, February 06, 2015 9:27 AM
>>
>> On Friday, February 06, 2015 08:57:37 AM Lv Zheng wrote:
>>> This patchset contains 3 cleanups related to the EC driver:
>>> 1. Command flushing (command grace period)
>>> This patchset flushes EC commands before suspending/resuming, so that
>>> there won't be timeout for the incomplete commands after resuming.
>>> 2. Query flushing (query grace period)
>>> This patchset flushes EC event queries before suspending/resuming, so
>>> that there won't be broken events remained in the firmware queue.
>>> 3. Command storming prevention
>>> This patchset corrects command storming prevention logic because of
>>> the GPE raw handler mode.
>>> The request reference count debugging messages can be used to detect broken
>>> EC transactions. It should always drop to 1 when the driver is idle during
>>> the runtime.
>>>
>>> Note that after flushing before suspending, EC GPE is still enabled to keep
>>> the old behavior.
>>>
>>> Lv Zheng (5):
>>> ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
>>> ACPI/EC: Add command flushing support.
>>> ACPI/EC: Refine command storm prevention support.
>>> ACPI/EC: Add query flushing support.
>>> ACPI/EC: Add GPE reference counting debugging messages.
>>>
>>> drivers/acpi/ec.c | 295 ++++++++++++++++++++++++++++++++++++++++-------
>>> drivers/acpi/internal.h | 1 +
>>> 2 files changed, 254 insertions(+), 42 deletions(-)
>> So this is on top of the EC patches you sent previously, right?
> Yes.
>
> The sequence is:
> ACPICA 20150204 release: http://www.spinics.net/lists/linux-acpi/msg55623.html
> ACPI EC GPE race fixes: http://www.spinics.net/lists/linux-acpi/msg55633.html
> And this series.

Unfortunately, patch [4/5] from this series broke system suspend on Acer
Aspire S5 for me.

Apparently, the box hangs hard during the last stage of suspend (after
offlining nonboot CPUs).

The issue is 100% reproducible and reverting the patch (along with [5/5]
that depends on it) fixes the problem, so I'm going to push the reverts
to Linus on Friday.

I can send you an acpidump output from the affected machine, but I won't
be able to do any testing on it before Feb 23, so reverting is the only
viable option for the time being.

Please let me know if you want the acpidump output.

Kind regards,
Rafael

2015-02-16 06:42:05

by Zheng, Lv

[permalink] [raw]
Subject: RE: [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support.

Hi,

> From: Wysocki, Rafael J
> Sent: Thursday, February 12, 2015 9:18 AM
>
> On 2/9/2015 3:23 AM, Zheng, Lv wrote:
> > Hi, Rafael
> >
> >> From: Rafael J. Wysocki [mailto:[email protected]]
> >> Sent: Friday, February 06, 2015 9:27 AM
> >>
> >> On Friday, February 06, 2015 08:57:37 AM Lv Zheng wrote:
> >>> This patchset contains 3 cleanups related to the EC driver:
> >>> 1. Command flushing (command grace period)
> >>> This patchset flushes EC commands before suspending/resuming, so that
> >>> there won't be timeout for the incomplete commands after resuming.
> >>> 2. Query flushing (query grace period)
> >>> This patchset flushes EC event queries before suspending/resuming, so
> >>> that there won't be broken events remained in the firmware queue.
> >>> 3. Command storming prevention
> >>> This patchset corrects command storming prevention logic because of
> >>> the GPE raw handler mode.
> >>> The request reference count debugging messages can be used to detect broken
> >>> EC transactions. It should always drop to 1 when the driver is idle during
> >>> the runtime.
> >>>
> >>> Note that after flushing before suspending, EC GPE is still enabled to keep
> >>> the old behavior.
> >>>
> >>> Lv Zheng (5):
> >>> ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> >>> ACPI/EC: Add command flushing support.
> >>> ACPI/EC: Refine command storm prevention support.
> >>> ACPI/EC: Add query flushing support.
> >>> ACPI/EC: Add GPE reference counting debugging messages.
> >>>
> >>> drivers/acpi/ec.c | 295 ++++++++++++++++++++++++++++++++++++++++-------
> >>> drivers/acpi/internal.h | 1 +
> >>> 2 files changed, 254 insertions(+), 42 deletions(-)
> >> So this is on top of the EC patches you sent previously, right?
> > Yes.
> >
> > The sequence is:
> > ACPICA 20150204 release: http://www.spinics.net/lists/linux-acpi/msg55623.html
> > ACPI EC GPE race fixes: http://www.spinics.net/lists/linux-acpi/msg55633.html
> > And this series.
>
> Unfortunately, patch [4/5] from this series broke system suspend on Acer
> Aspire S5 for me.
>
> Apparently, the box hangs hard during the last stage of suspend (after
> offlining nonboot CPUs).
>
> The issue is 100% reproducible and reverting the patch (along with [5/5]
> that depends on it) fixes the problem, so I'm going to push the reverts
> to Linus on Friday.

GPE _Lxx handlers need EC transactions to complete the GPE handling.
During runtime, EC transactions need the EC GPE to be enabled so that they can run in interrupt mode.
During suspend, EC transactions run in polling mode.

Note that in polling mode we originally didn't check the EC event (SCI_EVT).
This became an issue on Samsung platforms, so we now check it even in polling mode, in advance_transaction():
        if (status & ACPI_EC_FLAG_SCI)
                acpi_ec_submit_detection(ec);
We need to take care of the old hidden logic when making this improvement.

We originally have the following sequence for suspending:
  acpi_disable_all_gpes()
    EC GPE disabled
  acpi_os_wait_events_complete()
    Flush GPE work items (CPU0, including _Lxx)
      _Lxx invokes EC transactions in the EC polling mode
    Flush Notify work items (CPU0, including old QR_EC because QR_EC is also queued up using acpi_os_execute(), _Qxx, Notify)
  ---
  we may still see EC transactions (excluding new QR_EC because SCI_EVT is not checked in polling mode) handled in polling mode submitted here.
  ---
  acpi_ec_block_transactions()
    Flush EC transactions (excluding QR_EC)
    prevent further EC transactions (causing AE_BAD_PARAMETER for EC region accesses).
After applying all necessary improvements (excluding PATCH 4) we got the following suspend sequence:
  acpi_disable_all_gpes()
    EC GPE disabled
  acpi_os_wait_events_complete()
    Flush GPE work items (CPU0, including _Lxx)
      _Lxx invokes EC transactions in the EC polling mode
    Flush Notify work items (CPU0, including _Qxx, Notify, excluding old QR_EC because QR_EC is queued up in a separate workqueue)
  ---
  we may still see EC transactions (including new QR_EC because SCI_EVT is checked in polling mode) handled in polling mode submitted here.
  ---
  acpi_ec_block_transactions()
    Flush EC transactions (including QR_EC)
    prevent further EC transactions (causing AE_BAD_PARAMETER for EC region accesses).
The above sequence is very similar to the old behavior. The only differences are caused by the QR_EC flushing order:
1. After acpi_os_wait_events_complete(), originally no new QR_EC or _Qxx evaluations could be queued up, while now new QR_EC or _Qxx evaluations may be queued up.
2. Originally AE_BAD_PARAMETER could only be seen for EC transactions not issued by _Qxx, while now it may also be returned for EC transactions issued by _Qxx.
So IMO, before root-causing the issue, it's OK to revert just [PATCH 4].
We could even add acpi_os_wait_events_complete() after acpi_ec_block_transactions(), so that there is really no difference.
I doubt this is useful, though, because _Qxx will quickly run to an exceptional end and this can even be frozen.
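
A runnable mock of that ordering, with the three ACPI calls stubbed out
(the real suspend callback lives in drivers/acpi/sleep.c; the surrounding
function here is purely illustrative):

#include <stdio.h>

/* Stubs that only document the ordering discussed above. */
static void acpi_disable_all_gpes(void)        { puts("EC GPE disabled"); }
static void acpi_os_wait_events_complete(void) { puts("GPE/Notify work flushed"); }
static void acpi_ec_block_transactions(void)   { puts("EC transactions flushed and blocked"); }

static int suspend_freeze_like(void)
{
        acpi_disable_all_gpes();
        acpi_os_wait_events_complete();
        acpi_ec_block_transactions();
        /* The optional extra flush suggested above, so that the _Qxx/Notify
         * work queued by late QR_EC completions is drained as well. */
        acpi_os_wait_events_complete();
        return 0;
}

int main(void)
{
        return suspend_freeze_like();
}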

After applying all necessary improvements (including PATCH 4) we got the following suspend sequence:
  acpi_disable_all_gpes()
    EC GPE disabled
  acpi_os_wait_events_complete()
    Flush GPE work items (CPU0, including _Lxx)
      _Lxx invokes EC transactions in the GPE polling mode
    Flush Notify work items (CPU0, including _Qxx, Notify, excluding old QR_EC)
  ---
  we may still see EC transactions (including new QR_EC) handled in polling mode submitted here.
  ---
  acpi_ec_block_transactions()
    Flush EC transactions (including QR_EC and ensure QR_EC to be issued for SCI_EVT)
    prevent further EC transactions.
The only difference is that QR_EC is guaranteed to be issued after detecting SCI_EVT during suspend.

I didn't see issues in doing so unless the patch itself has issues.
If we see a hang in the suspend/resume test, in most cases it should happen in acpi_ec_block_transactions().
Debugging this would require the EC refcnt messages in the suspend kernel log to catch the breakage.
So [PATCH 5] will be helpful in debugging this issue.

Patch 4 might be lacking test cases. Let me check it again.

> I can send you an acpidump output from the affected machine, but I won't
> be able to do any testing on it before Feb 23, so reverting is the only
> viable option for the time being.

No problem.

> Please let me know if you want the acpidump output.

Let me first check the patch again; in most cases, such a bug can be caught by review.
I can also try to reproduce this locally using other Acer models, so please send me the .config and acpidump output.

Thanks and best regards
-Lv