2017-06-21 08:55:05

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 0/5] ACPI: button: Fix button.lid_init_state=method mode

There are platform variations implementing ACPI lid device in different
way:
1. Some platforms send "open" events to OS and the events arrive before
button driver is resumed;
2. Some platforms send "open" events to OS, but the events arrive after
button driver is resumed, ex., Samsung N210+;
3. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, and the update events arrive before
button driver is resumed;
4. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, but the update events arrive after
button driver is resumed, ex., Surface Pro 3;
5. Some platforms never send "open" events, _LID returns value sticks to
"close", ex., Surface Pro 1.

This series tries to provide solutions to fix all cases for old userspace
programs. These solutions include:
1. Fix order issue in case 2,4. Enabled by default.
2. Fix event missing issue in case 3,4. As newer systemd doesn't require
this fix, and this fix is not power friendly, it is not enabled by
default, but can be enabled by:
2.1. button.lid_periodic_update=1: periodically sends _LID value to
the input layer.
3. Fix persistate close issue in case 5, as newer systemd doesn't require
this fix, it is not enabled by default, but can be enabled in 2 means:
3.1. button.lid_init_state=ignore: only adds complement open events.
3.2. lid_unreliable=1: dymamically adds/removes input node.

Benjamin Tissoires (2):
ACPI: button: extract input creation/destruction helpers
ACPI: button: Add an optional workaround to fix a persistent close
issue for old userspace

Lv Zheng (3):
ACPI: button: Add a workaround to fix an order issue for old userspace
ACPI: button: Add an optional workaround to fix an event missing issue
for old userspace
ACPI: button: Rework lid_init_state=ignore mode

drivers/acpi/button.c | 302 +++++++++++++++++++++++++++++++-------------------
1 file changed, 186 insertions(+), 116 deletions(-)

--
2.7.4


2017-06-21 08:55:12

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 1/5] ACPI: button: Add a workaround to fix an order issue for old userspace

There are platform variations implementing ACPI lid device in different
ways:
1. Some platforms send "open" events to OS and the events arrive before
button driver is resumed;
2. Some platforms send "open" events to OS, but the events arrive after
button driver is resumed, ex., Samsung N210+;
3. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, and the update events arrive before
button driver is resumed;
4. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, but the update events arrive after
button driver is resumed, ex., Surface Pro 3;
5. Some platforms never send "open" events, _LID returns value sticks to
"close", ex., Surface Pro 1.
Currently, all cases work find with systemd 233, but only case 1 works fine
with systemd 229.

Case 2,4 can be treated as an order issue:
If the button driver always evaluates _LID after seeing a LID
notification, there shouldn't be such a problem.

Note that this order issue cannot be fixed by sorting OS drivers' resume
code:
1. When acpi.ec_freeze_events=Y, the order depends on whether PNP0C0D(LID)
or PNP0C09(EC) appears first in the namespace (probe order).
2. Even when acpi.ec_freeze_events=N, which forces OS to handle EC events
as early as possible, the order issue can still be reproduced due to
platform delays (reproduce ratio is lower than case 1).
3. Some platform simply has a very big delay for LID open events.

This patch tries to fix this issue for systemd 229 by defer sending initial
lid state 10 seconds later after resume, which ensures EC events can always
arrive before button driver evaluates _LID. It finally only fixes case 2
platforms as fixing case 4 platforms needs additional efforts (see known
issue below).

The users can configure wait timeout via button.lid_notify_timeout.

Known issu:
1. systemd/logind 229 still mis-bahaves with case 3,4 platforms
After seeing a LID "close" event, systemd 229 will wait several seconds
(HoldoffTimeoutSec) before suspending the platform. Thus on case 4
platforms, if users close lid, and re-open it during the
HoldoffTimeoutSec period, there is still no "open" events seen by the
userspace. Thus systemd still considers the last state as "close" and
suspends the platform after the HoldoffTimeoutSec times out.

Cc: <[email protected]>
Cc: Benjamin Tissoires <[email protected]>
Cc: Peter Hutterer <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/button.c | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index e19f530..67a0d78 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -28,6 +28,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/input.h>
+#include <linux/timer.h>
#include <linux/slab.h>
#include <linux/acpi.h>
#include <acpi/button.h>
@@ -79,6 +80,7 @@ MODULE_DEVICE_TABLE(acpi, button_device_ids);
static int acpi_button_add(struct acpi_device *device);
static int acpi_button_remove(struct acpi_device *device);
static void acpi_button_notify(struct acpi_device *device, u32 event);
+static void acpi_lid_timeout(ulong arg);

#ifdef CONFIG_PM_SLEEP
static int acpi_button_suspend(struct device *dev);
@@ -104,6 +106,7 @@ static struct acpi_driver acpi_button_driver = {
struct acpi_button {
unsigned int type;
struct input_dev *input;
+ struct timer_list lid_timer;
char phys[32]; /* for input device */
unsigned long pushed;
int last_state;
@@ -119,6 +122,10 @@ static unsigned long lid_report_interval __read_mostly = 500;
module_param(lid_report_interval, ulong, 0644);
MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key events");

+static unsigned long lid_notify_timeout __read_mostly = 10;
+module_param(lid_notify_timeout, ulong, 0644);
+MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");
+
/* --------------------------------------------------------------------------
FS Interface (/proc)
-------------------------------------------------------------------------- */
@@ -371,6 +378,15 @@ static int acpi_lid_update_state(struct acpi_device *device)
return acpi_lid_notify_state(device, state);
}

+static void acpi_lid_start_timer(struct acpi_device *device,
+ unsigned long msecs)
+{
+ struct acpi_button *button = acpi_driver_data(device);
+
+ mod_timer(&button->lid_timer,
+ jiffies + msecs_to_jiffies(msecs));
+}
+
static void acpi_lid_initialize_state(struct acpi_device *device)
{
switch (lid_init_state) {
@@ -386,6 +402,13 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
}
}

+static void acpi_lid_timeout(ulong arg)
+{
+ struct acpi_device *device = (struct acpi_device *)arg;
+
+ acpi_lid_initialize_state(device);
+}
+
static void acpi_button_notify(struct acpi_device *device, u32 event)
{
struct acpi_button *button = acpi_driver_data(device);
@@ -432,6 +455,8 @@ static int acpi_button_suspend(struct device *dev)
struct acpi_device *device = to_acpi_device(dev);
struct acpi_button *button = acpi_driver_data(device);

+ if (button->type == ACPI_BUTTON_TYPE_LID)
+ del_timer(&button->lid_timer);
button->suspended = true;
return 0;
}
@@ -443,7 +468,8 @@ static int acpi_button_resume(struct device *dev)

button->suspended = false;
if (button->type == ACPI_BUTTON_TYPE_LID)
- acpi_lid_initialize_state(device);
+ acpi_lid_start_timer(device,
+ lid_notify_timeout * MSEC_PER_SEC);
return 0;
}
#endif
@@ -490,6 +516,9 @@ static int acpi_button_add(struct acpi_device *device)
ACPI_BUTTON_CLASS, ACPI_BUTTON_SUBCLASS_LID);
button->last_state = !!acpi_lid_evaluate_state(device);
button->last_time = ktime_get();
+ init_timer(&button->lid_timer);
+ setup_timer(&button->lid_timer,
+ acpi_lid_timeout, (ulong)device);
} else {
printk(KERN_ERR PREFIX "Unsupported hid [%s]\n", hid);
error = -ENODEV;
@@ -526,12 +555,13 @@ static int acpi_button_add(struct acpi_device *device)
if (error)
goto err_remove_fs;
if (button->type == ACPI_BUTTON_TYPE_LID) {
- acpi_lid_initialize_state(device);
/*
* This assumes there's only one lid device, or if there are
* more we only care about the last one...
*/
lid_device = device;
+ acpi_lid_start_timer(device,
+ lid_notify_timeout * MSEC_PER_SEC);
}

printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
@@ -551,6 +581,8 @@ static int acpi_button_remove(struct acpi_device *device)
struct acpi_button *button = acpi_driver_data(device);

acpi_button_remove_fs(device);
+ if (button->type == ACPI_BUTTON_TYPE_LID)
+ del_timer_sync(&button->lid_timer);
input_unregister_device(button->input);
kfree(button);
return 0;
--
2.7.4

2017-06-21 08:55:29

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 2/5] ACPI: button: Add an optional workaround to fix an event missing issue for old userspace

There are platform variations implementing ACPI lid device in different
ways:
1. Some platforms send "open" events to OS and the events arrive before
button driver is resumed;
2. Some platforms send "open" events to OS, but the events arrive after
button driver is resumed, ex., Samsung N210+;
3. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, and the update events arrive before
button driver is resumed;
4. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, but the update events arrive after
button driver is resumed, ex., Surface Pro 3;
5. Some platforms never send "open" events, _LID returns value sticks to
"close", ex., Surface Pro 1.
Currently, all cases work fine with systemd 233, but only case 1,2 work
fine with systemd 229.

Case 3,4 can be treated as an event missing issue:
After seeing a LID "close" event, systemd 229 will wait several seconds
(HoldoffTimeoutSec) before suspending the platform. Thus on case 4
platforms, if users close lid, and re-open it during the
HoldoffTimeoutSec period, there is still no "open" events seen by the
userspace. Thus systemd still considers the last state as "close" and
suspends the platform after the HoldoffTimeoutSec times out.

Note that not only systemd 229, desktop managers (ex.,
gnome-settings-daemon) also suffer from this issue.

This patch tries to fix this issue by periodically sending _LID return
value to userspace, which ensures to trigger a SW_LID event when the
underlying hardware state has changed. As adding a periodic timer is not a
power friendly way, this patch prepares an option for users to enable on
failure platforms for old userspace programs.

Users can configure update interval via button.lid_update_interval.
This should be configured to a smaller value than HoldoffTimeoutSec in
/etc/systemd/logind.conf.

Cc: <[email protected]>
Cc: Benjamin Tissoires <[email protected]>
Cc: Peter Hutterer <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/button.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index 67a0d78..a8b119e 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -126,6 +126,14 @@ static unsigned long lid_notify_timeout __read_mostly = 10;
module_param(lid_notify_timeout, ulong, 0644);
MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");

+static bool lid_periodic_update __read_mostly = false;
+module_param(lid_periodic_update, bool, 0644);
+MODULE_PARM_DESC(lid_periodic_update, "Periodically sending lid state updates");
+
+static unsigned long lid_update_interval __read_mostly = 1 * MSEC_PER_SEC;
+module_param(lid_update_interval, ulong, 0644);
+MODULE_PARM_DESC(lid_update_interval, "Interval (ms) between lid periodic updates");
+
/* --------------------------------------------------------------------------
FS Interface (/proc)
-------------------------------------------------------------------------- */
@@ -395,6 +403,8 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
break;
case ACPI_BUTTON_LID_INIT_METHOD:
(void)acpi_lid_update_state(device);
+ if (lid_periodic_update)
+ acpi_lid_start_timer(device, lid_update_interval);
break;
case ACPI_BUTTON_LID_INIT_IGNORE:
default:
@@ -560,8 +570,11 @@ static int acpi_button_add(struct acpi_device *device)
* more we only care about the last one...
*/
lid_device = device;
- acpi_lid_start_timer(device,
- lid_notify_timeout * MSEC_PER_SEC);
+ if (lid_periodic_update)
+ acpi_lid_initialize_state(device);
+ else
+ acpi_lid_start_timer(device,
+ lid_notify_timeout * MSEC_PER_SEC);
}

printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
--
2.7.4

2017-06-21 08:55:38

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 5/5] ACPI: button: Add an optional workaround to fix a persistent close issue for old userspace

From: Benjamin Tissoires <[email protected]>

Because of the variation of firmware implementation, there is a chance
the LID state is unknown:
1. Some platforms send "open" ACPI notification to the OS and the event
arrive before the button driver is resumed;
2. Some platforms send "open" ACPI notification to the OS, but the
event
arrives after the button driver is resumed, ex., Samsung N210+;
3. Some platforms never send an "open" ACPI notification to the OS, but
update the cached _LID return value to "open", and this update
arrives
before the button driver is resumed;
4. Some platforms never send an "open" ACPI notification to the OS, but
update the cached _LID return value to "open", but this update
arrives
after the button driver is resumed, ex., Surface Pro 3;
5. Some platforms never send an "open" ACPI notification to the OS, and
_LID ACPI method returns a value which stays to "close", ex.,
Surface Pro 1.
Currently, all cases work find with systemd 233, but only case 1,2,3,4 work
fine with old userspace.

After fixing all the other issues for old userspace programs, case 5 is the
only case that the exported input node is still not fully compliant to
SW_LID ABI and thus needs quirks or ABI changes.

This patch provides a dynamic SW_LID input node solution to make sure we do
not export an input node with an unknown state to prevent suspend loops.

The database of unreliable devices is left to userspace to handle with a
hwdb file and a udev rule.

Reworked by Lv to make this solution optional, code based on top of old
"ignore" mode and make lid_reliable configurable to all lid devices to
eliminate the difficulties of synchronously handling global lid_device.

Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/button.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 88 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index 91c9989..f11045d 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -107,11 +107,13 @@ struct acpi_button {
unsigned int type;
struct input_dev *input;
struct timer_list lid_timer;
+ bool lid_reliable;
char phys[32]; /* for input device */
unsigned long pushed;
bool suspended;
};

+static DEFINE_MUTEX(lid_input_lock);
static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
static struct acpi_device *lid_device;
static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
@@ -128,6 +130,10 @@ static unsigned long lid_update_interval __read_mostly = 1 * MSEC_PER_SEC;
module_param(lid_update_interval, ulong, 0644);
MODULE_PARM_DESC(lid_update_interval, "Interval (ms) between lid periodic updates");

+static bool lid_unreliable __read_mostly = false;
+module_param(lid_unreliable, bool, 0644);
+MODULE_PARM_DESC(lid_unreliable, "Dynamically adding/removing input node for unreliable lids");
+
/* --------------------------------------------------------------------------
FS Interface (/proc)
-------------------------------------------------------------------------- */
@@ -152,6 +158,9 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
struct acpi_button *button = acpi_driver_data(device);
int ret;

+ if (!button->input)
+ return -EINVAL;
+
if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE)
input_report_switch(button->input, SW_LID, 0);

@@ -306,6 +315,8 @@ static void acpi_button_remove_input(struct acpi_device *device)
{
struct acpi_button *button = acpi_driver_data(device);

+ if (!button->input)
+ return;
input_unregister_device(button->input);
button->input = NULL;
}
@@ -316,6 +327,9 @@ static int acpi_button_add_input(struct acpi_device *device)
struct input_dev *input;
int error;

+ if (button->input)
+ return 0;
+
input = input_allocate_device();
if (!input) {
error = -ENOMEM;
@@ -378,8 +392,10 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
break;
case ACPI_BUTTON_LID_INIT_METHOD:
(void)acpi_lid_update_state(device);
+ mutex_unlock(&lid_input_lock);
if (lid_periodic_update)
acpi_lid_start_timer(device, lid_update_interval);
+ mutex_lock(&lid_input_lock);
break;
case ACPI_BUTTON_LID_INIT_IGNORE:
default:
@@ -391,7 +407,9 @@ static void acpi_lid_timeout(ulong arg)
{
struct acpi_device *device = (struct acpi_device *)arg;

+ mutex_lock(&lid_input_lock);
acpi_lid_initialize_state(device);
+ mutex_unlock(&lid_input_lock);
}

static void acpi_button_notify(struct acpi_device *device, u32 event)
@@ -406,7 +424,11 @@ static void acpi_button_notify(struct acpi_device *device, u32 event)
case ACPI_BUTTON_NOTIFY_STATUS:
input = button->input;
if (button->type == ACPI_BUTTON_TYPE_LID) {
+ mutex_lock(&lid_input_lock);
+ if (!button->input)
+ acpi_button_add_input(device);
acpi_lid_update_state(device);
+ mutex_unlock(&lid_input_lock);
} else {
int keycode;

@@ -440,8 +462,13 @@ static int acpi_button_suspend(struct device *dev)
struct acpi_device *device = to_acpi_device(dev);
struct acpi_button *button = acpi_driver_data(device);

- if (button->type == ACPI_BUTTON_TYPE_LID)
+ if (button->type == ACPI_BUTTON_TYPE_LID) {
+ mutex_lock(&lid_input_lock);
+ if (!button->lid_reliable)
+ acpi_button_remove_input(device);
+ mutex_unlock(&lid_input_lock);
del_timer(&button->lid_timer);
+ }
button->suspended = true;
return 0;
}
@@ -459,6 +486,38 @@ static int acpi_button_resume(struct device *dev)
}
#endif

+static ssize_t lid_reliable_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct acpi_device *device = to_acpi_device(dev);
+ struct acpi_button *button = acpi_driver_data(device);
+
+ return snprintf(buf, PAGE_SIZE, "%d\n", button->lid_reliable);
+}
+static ssize_t lid_reliable_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct acpi_device *device = to_acpi_device(dev);
+ struct acpi_button *button = acpi_driver_data(device);
+
+ mutex_lock(&lid_input_lock);
+ if (strtobool(buf, &button->lid_reliable) < 0) {
+ mutex_unlock(&lid_input_lock);
+ return -EINVAL;
+ }
+ if (button->lid_reliable)
+ acpi_button_add_input(device);
+ else {
+ lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
+ acpi_button_remove_input(device);
+ }
+ acpi_lid_initialize_state(device);
+ mutex_unlock(&lid_input_lock);
+ return count;
+}
+static DEVICE_ATTR_RW(lid_reliable);
+
static int acpi_button_add(struct acpi_device *device)
{
struct acpi_button *button;
@@ -507,21 +566,37 @@ static int acpi_button_add(struct acpi_device *device)

snprintf(button->phys, sizeof(button->phys), "%s/button/input0", hid);

- error = acpi_button_add_input(device);
- if (error)
- goto err_remove_fs;
-
if (button->type == ACPI_BUTTON_TYPE_LID) {
/*
* This assumes there's only one lid device, or if there are
* more we only care about the last one...
*/
lid_device = device;
+ device_create_file(&device->dev, &dev_attr_lid_reliable);
+ mutex_lock(&lid_input_lock);
+ error = acpi_button_add_input(device);
+ if (error) {
+ mutex_unlock(&lid_input_lock);
+ goto err_remove_fs;
+ }
+ if (lid_unreliable) {
+ lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
+ button->lid_reliable = false;
+ } else
+ button->lid_reliable = true;
if (lid_periodic_update)
acpi_lid_initialize_state(device);
- else
+ else {
+ mutex_unlock(&lid_input_lock);
acpi_lid_start_timer(device,
lid_notify_timeout * MSEC_PER_SEC);
+ mutex_lock(&lid_input_lock);
+ }
+ mutex_unlock(&lid_input_lock);
+ } else {
+ error = acpi_button_add_input(device);
+ if (error)
+ goto err_remove_fs;
}

printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
@@ -539,9 +614,14 @@ static int acpi_button_remove(struct acpi_device *device)
struct acpi_button *button = acpi_driver_data(device);

acpi_button_remove_fs(device);
- if (button->type == ACPI_BUTTON_TYPE_LID)
+ if (button->type == ACPI_BUTTON_TYPE_LID) {
+ mutex_lock(&lid_input_lock);
+ acpi_button_remove_input(device);
+ mutex_unlock(&lid_input_lock);
del_timer_sync(&button->lid_timer);
- acpi_button_remove_input(device);
+ device_remove_file(&device->dev, &dev_attr_lid_reliable);
+ } else
+ acpi_button_remove_input(device);
kfree(button);
return 0;
}
--
2.7.4

2017-06-21 08:55:28

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 3/5] ACPI: button: Rework lid_init_state=ignore mode

There are platform variations implementing ACPI lid device in different
ways:
1. Some platforms send "open" events to OS and the events arrive before
button driver is resumed;
2. Some platforms send "open" events to OS, but the events arrive after
button driver is resumed, ex., Samsung N210+;
3. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, and the update events arrive before
button driver is resumed;
4. Some platforms never send "open" events to OS, but send "open" events to
update the cached _LID return value, but the update events arrive after
button driver is resumed, ex., Surface Pro 3;
5. Some platforms never send "open" events, _LID returns value sticks to
"close", ex., Surface Pro 1.
Currently, all cases work find with systemd 233, but only case 1,2,3,4 work
fine with systemd 229.

After fixing all the other issues for old userspace programs, case 5 is the
only case that the exported input node is still not fully compliant to
SW_LID ABI and thus needs quirks or ABI changes.

The original button.lid_init_state=ignore ABI change solution is now too
complicated if its purpose is to only solve this final incompliant use
case. This patch re-works it by unconditionally prepending "open"
complement events.

Cc: <[email protected]>
Cc: Benjamin Tissoires <[email protected]>
Cc: Peter Hutterer <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/button.c | 85 +++------------------------------------------------
1 file changed, 5 insertions(+), 80 deletions(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index a8b119e..1256a8c 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -109,8 +109,6 @@ struct acpi_button {
struct timer_list lid_timer;
char phys[32]; /* for input device */
unsigned long pushed;
- int last_state;
- ktime_t last_time;
bool suspended;
};

@@ -118,10 +116,6 @@ static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
static struct acpi_device *lid_device;
static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;

-static unsigned long lid_report_interval __read_mostly = 500;
-module_param(lid_report_interval, ulong, 0644);
-MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key events");
-
static unsigned long lid_notify_timeout __read_mostly = 10;
module_param(lid_notify_timeout, ulong, 0644);
MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");
@@ -157,79 +151,12 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
{
struct acpi_button *button = acpi_driver_data(device);
int ret;
- ktime_t next_report;
- bool do_update;
-
- /*
- * In lid_init_state=ignore mode, if user opens/closes lid
- * frequently with "open" missing, and "last_time" is also updated
- * frequently, "close" cannot be delivered to the userspace.
- * So "last_time" is only updated after a timeout or an actual
- * switch.
- */
- if (lid_init_state != ACPI_BUTTON_LID_INIT_IGNORE ||
- button->last_state != !!state)
- do_update = true;
- else
- do_update = false;
-
- next_report = ktime_add(button->last_time,
- ms_to_ktime(lid_report_interval));
- if (button->last_state == !!state &&
- ktime_after(ktime_get(), next_report)) {
- /* Complain the buggy firmware */
- pr_warn_once("The lid device is not compliant to SW_LID.\n");

- /*
- * Send the unreliable complement switch event:
- *
- * On most platforms, the lid device is reliable. However
- * there are exceptions:
- * 1. Platforms returning initial lid state as "close" by
- * default after booting/resuming:
- * https://bugzilla.kernel.org/show_bug.cgi?id=89211
- * https://bugzilla.kernel.org/show_bug.cgi?id=106151
- * 2. Platforms never reporting "open" events:
- * https://bugzilla.kernel.org/show_bug.cgi?id=106941
- * On these buggy platforms, the usage model of the ACPI
- * lid device actually is:
- * 1. The initial returning value of _LID may not be
- * reliable.
- * 2. The open event may not be reliable.
- * 3. The close event is reliable.
- *
- * But SW_LID is typed as input switch event, the input
- * layer checks if the event is redundant. Hence if the
- * state is not switched, the userspace cannot see this
- * platform triggered reliable event. By inserting a
- * complement switch event, it then is guaranteed that the
- * platform triggered reliable one can always be seen by
- * the userspace.
- */
- if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE) {
- do_update = true;
- /*
- * Do generate complement switch event for "close"
- * as "close" is reliable and wrong "open" won't
- * trigger unexpected behaviors.
- * Do not generate complement switch event for
- * "open" as "open" is not reliable and wrong
- * "close" will trigger unexpected behaviors.
- */
- if (!state) {
- input_report_switch(button->input,
- SW_LID, state);
- input_sync(button->input);
- }
- }
- }
- /* Send the platform triggered reliable event */
- if (do_update) {
- input_report_switch(button->input, SW_LID, !state);
- input_sync(button->input);
- button->last_state = !!state;
- button->last_time = ktime_get();
- }
+ if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE)
+ input_report_switch(button->input, SW_LID, 0);
+
+ input_report_switch(button->input, SW_LID, !state);
+ input_sync(button->input);

if (state)
pm_wakeup_event(&device->dev, 0);
@@ -524,8 +451,6 @@ static int acpi_button_add(struct acpi_device *device)
strcpy(name, ACPI_BUTTON_DEVICE_NAME_LID);
sprintf(class, "%s/%s",
ACPI_BUTTON_CLASS, ACPI_BUTTON_SUBCLASS_LID);
- button->last_state = !!acpi_lid_evaluate_state(device);
- button->last_time = ktime_get();
init_timer(&button->lid_timer);
setup_timer(&button->lid_timer,
acpi_lid_timeout, (ulong)device);
--
2.7.4

2017-06-21 08:56:45

by Zheng, Lv

[permalink] [raw]
Subject: [RFC PATCH v6 4/5] ACPI: button: extract input creation/destruction helpers

From: Benjamin Tissoires <[email protected]>

When the LID switch ACPI implementation is unreliable, we might want
to remove the device when we are not sure about the state. This should
prevent any suspend/resume loops given that in that case, there will be
no more LID switch input node.

This patch prepares the dynamic creation/destruction of the input node.

Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
---
drivers/acpi/button.c | 86 +++++++++++++++++++++++++++++++--------------------
1 file changed, 53 insertions(+), 33 deletions(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index 1256a8c..91c9989 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -302,6 +302,54 @@ int acpi_lid_open(void)
}
EXPORT_SYMBOL(acpi_lid_open);

+static void acpi_button_remove_input(struct acpi_device *device)
+{
+ struct acpi_button *button = acpi_driver_data(device);
+
+ input_unregister_device(button->input);
+ button->input = NULL;
+}
+
+static int acpi_button_add_input(struct acpi_device *device)
+{
+ struct acpi_button *button = acpi_driver_data(device);
+ struct input_dev *input;
+ int error;
+
+ input = input_allocate_device();
+ if (!input) {
+ error = -ENOMEM;
+ goto err;
+ }
+
+ input->name = acpi_device_name(device);
+ input->phys = button->phys;
+ input->id.bustype = BUS_HOST;
+ input->id.product = button->type;
+ input->dev.parent = &device->dev;
+ switch (button->type) {
+ case ACPI_BUTTON_TYPE_POWER:
+ input_set_capability(input, EV_KEY, KEY_POWER);
+ break;
+ case ACPI_BUTTON_TYPE_SLEEP:
+ input_set_capability(input, EV_KEY, KEY_SLEEP);
+ break;
+ case ACPI_BUTTON_TYPE_LID:
+ input_set_capability(input, EV_SW, SW_LID);
+ break;
+ }
+
+ error = input_register_device(input);
+ if (error)
+ goto err;
+
+ button->input = input;
+ return 0;
+ err:
+ input_free_device(input);
+ return error;
+}
+
static int acpi_lid_update_state(struct acpi_device *device)
{
int state;
@@ -414,7 +462,6 @@ static int acpi_button_resume(struct device *dev)
static int acpi_button_add(struct acpi_device *device)
{
struct acpi_button *button;
- struct input_dev *input;
const char *hid = acpi_device_hid(device);
char *name, *class;
int error;
@@ -425,12 +472,6 @@ static int acpi_button_add(struct acpi_device *device)

device->driver_data = button;

- button->input = input = input_allocate_device();
- if (!input) {
- error = -ENOMEM;
- goto err_free_button;
- }
-
name = acpi_device_name(device);
class = acpi_device_class(device);

@@ -457,38 +498,19 @@ static int acpi_button_add(struct acpi_device *device)
} else {
printk(KERN_ERR PREFIX "Unsupported hid [%s]\n", hid);
error = -ENODEV;
- goto err_free_input;
+ goto err_free_button;
}

error = acpi_button_add_fs(device);
if (error)
- goto err_free_input;
+ goto err_free_button;

snprintf(button->phys, sizeof(button->phys), "%s/button/input0", hid);

- input->name = name;
- input->phys = button->phys;
- input->id.bustype = BUS_HOST;
- input->id.product = button->type;
- input->dev.parent = &device->dev;
-
- switch (button->type) {
- case ACPI_BUTTON_TYPE_POWER:
- input_set_capability(input, EV_KEY, KEY_POWER);
- break;
-
- case ACPI_BUTTON_TYPE_SLEEP:
- input_set_capability(input, EV_KEY, KEY_SLEEP);
- break;
-
- case ACPI_BUTTON_TYPE_LID:
- input_set_capability(input, EV_SW, SW_LID);
- break;
- }
-
- error = input_register_device(input);
+ error = acpi_button_add_input(device);
if (error)
goto err_remove_fs;
+
if (button->type == ACPI_BUTTON_TYPE_LID) {
/*
* This assumes there's only one lid device, or if there are
@@ -507,8 +529,6 @@ static int acpi_button_add(struct acpi_device *device)

err_remove_fs:
acpi_button_remove_fs(device);
- err_free_input:
- input_free_device(input);
err_free_button:
kfree(button);
return error;
@@ -521,7 +541,7 @@ static int acpi_button_remove(struct acpi_device *device)
acpi_button_remove_fs(device);
if (button->type == ACPI_BUTTON_TYPE_LID)
del_timer_sync(&button->lid_timer);
- input_unregister_device(button->input);
+ acpi_button_remove_input(device);
kfree(button);
return 0;
}
--
2.7.4

2017-06-23 14:02:57

by Benjamin Tissoires

[permalink] [raw]
Subject: Re: [RFC PATCH v6 1/5] ACPI: button: Add a workaround to fix an order issue for old userspace

On Jun 21 2017 or thereabouts, Lv Zheng wrote:
> There are platform variations implementing ACPI lid device in different
> ways:
> 1. Some platforms send "open" events to OS and the events arrive before
> button driver is resumed;
> 2. Some platforms send "open" events to OS, but the events arrive after
> button driver is resumed, ex., Samsung N210+;
> 3. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, and the update events arrive before
> button driver is resumed;
> 4. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, but the update events arrive after
> button driver is resumed, ex., Surface Pro 3;
> 5. Some platforms never send "open" events, _LID returns value sticks to
> "close", ex., Surface Pro 1.
> Currently, all cases work find with systemd 233, but only case 1 works fine
> with systemd 229.
>
> Case 2,4 can be treated as an order issue:
> If the button driver always evaluates _LID after seeing a LID
> notification, there shouldn't be such a problem.
>
> Note that this order issue cannot be fixed by sorting OS drivers' resume
> code:
> 1. When acpi.ec_freeze_events=Y, the order depends on whether PNP0C0D(LID)
> or PNP0C09(EC) appears first in the namespace (probe order).
> 2. Even when acpi.ec_freeze_events=N, which forces OS to handle EC events
> as early as possible, the order issue can still be reproduced due to
> platform delays (reproduce ratio is lower than case 1).
> 3. Some platform simply has a very big delay for LID open events.
>
> This patch tries to fix this issue for systemd 229 by defer sending initial
> lid state 10 seconds later after resume, which ensures EC events can always
> arrive before button driver evaluates _LID. It finally only fixes case 2
> platforms as fixing case 4 platforms needs additional efforts (see known
> issue below).
>
> The users can configure wait timeout via button.lid_notify_timeout.
>
> Known issu:
> 1. systemd/logind 229 still mis-bahaves with case 3,4 platforms
> After seeing a LID "close" event, systemd 229 will wait several seconds
> (HoldoffTimeoutSec) before suspending the platform. Thus on case 4
> platforms, if users close lid, and re-open it during the
> HoldoffTimeoutSec period, there is still no "open" events seen by the
> userspace. Thus systemd still considers the last state as "close" and
> suspends the platform after the HoldoffTimeoutSec times out.
>
> Cc: <[email protected]>
> Cc: Benjamin Tissoires <[email protected]>
> Cc: Peter Hutterer <[email protected]>
> Signed-off-by: Lv Zheng <[email protected]>
> ---

NACK on this one (at least in this current form): You are presenting a
device to user space with an unknown state.
If you want to delay the query of the state, you also need to delay the
call of input_register_device().

Cheers,
Benjamin

> drivers/acpi/button.c | 36 ++++++++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index e19f530..67a0d78 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -28,6 +28,7 @@
> #include <linux/proc_fs.h>
> #include <linux/seq_file.h>
> #include <linux/input.h>
> +#include <linux/timer.h>
> #include <linux/slab.h>
> #include <linux/acpi.h>
> #include <acpi/button.h>
> @@ -79,6 +80,7 @@ MODULE_DEVICE_TABLE(acpi, button_device_ids);
> static int acpi_button_add(struct acpi_device *device);
> static int acpi_button_remove(struct acpi_device *device);
> static void acpi_button_notify(struct acpi_device *device, u32 event);
> +static void acpi_lid_timeout(ulong arg);
>
> #ifdef CONFIG_PM_SLEEP
> static int acpi_button_suspend(struct device *dev);
> @@ -104,6 +106,7 @@ static struct acpi_driver acpi_button_driver = {
> struct acpi_button {
> unsigned int type;
> struct input_dev *input;
> + struct timer_list lid_timer;
> char phys[32]; /* for input device */
> unsigned long pushed;
> int last_state;
> @@ -119,6 +122,10 @@ static unsigned long lid_report_interval __read_mostly = 500;
> module_param(lid_report_interval, ulong, 0644);
> MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key events");
>
> +static unsigned long lid_notify_timeout __read_mostly = 10;
> +module_param(lid_notify_timeout, ulong, 0644);
> +MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");
> +
> /* --------------------------------------------------------------------------
> FS Interface (/proc)
> -------------------------------------------------------------------------- */
> @@ -371,6 +378,15 @@ static int acpi_lid_update_state(struct acpi_device *device)
> return acpi_lid_notify_state(device, state);
> }
>
> +static void acpi_lid_start_timer(struct acpi_device *device,
> + unsigned long msecs)
> +{
> + struct acpi_button *button = acpi_driver_data(device);
> +
> + mod_timer(&button->lid_timer,
> + jiffies + msecs_to_jiffies(msecs));
> +}
> +
> static void acpi_lid_initialize_state(struct acpi_device *device)
> {
> switch (lid_init_state) {
> @@ -386,6 +402,13 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> }
> }
>
> +static void acpi_lid_timeout(ulong arg)
> +{
> + struct acpi_device *device = (struct acpi_device *)arg;
> +
> + acpi_lid_initialize_state(device);
> +}
> +
> static void acpi_button_notify(struct acpi_device *device, u32 event)
> {
> struct acpi_button *button = acpi_driver_data(device);
> @@ -432,6 +455,8 @@ static int acpi_button_suspend(struct device *dev)
> struct acpi_device *device = to_acpi_device(dev);
> struct acpi_button *button = acpi_driver_data(device);
>
> + if (button->type == ACPI_BUTTON_TYPE_LID)
> + del_timer(&button->lid_timer);
> button->suspended = true;
> return 0;
> }
> @@ -443,7 +468,8 @@ static int acpi_button_resume(struct device *dev)
>
> button->suspended = false;
> if (button->type == ACPI_BUTTON_TYPE_LID)
> - acpi_lid_initialize_state(device);
> + acpi_lid_start_timer(device,
> + lid_notify_timeout * MSEC_PER_SEC);
> return 0;
> }
> #endif
> @@ -490,6 +516,9 @@ static int acpi_button_add(struct acpi_device *device)
> ACPI_BUTTON_CLASS, ACPI_BUTTON_SUBCLASS_LID);
> button->last_state = !!acpi_lid_evaluate_state(device);
> button->last_time = ktime_get();
> + init_timer(&button->lid_timer);
> + setup_timer(&button->lid_timer,
> + acpi_lid_timeout, (ulong)device);
> } else {
> printk(KERN_ERR PREFIX "Unsupported hid [%s]\n", hid);
> error = -ENODEV;
> @@ -526,12 +555,13 @@ static int acpi_button_add(struct acpi_device *device)
> if (error)
> goto err_remove_fs;
> if (button->type == ACPI_BUTTON_TYPE_LID) {
> - acpi_lid_initialize_state(device);
> /*
> * This assumes there's only one lid device, or if there are
> * more we only care about the last one...
> */
> lid_device = device;
> + acpi_lid_start_timer(device,
> + lid_notify_timeout * MSEC_PER_SEC);
> }
>
> printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
> @@ -551,6 +581,8 @@ static int acpi_button_remove(struct acpi_device *device)
> struct acpi_button *button = acpi_driver_data(device);
>
> acpi_button_remove_fs(device);
> + if (button->type == ACPI_BUTTON_TYPE_LID)
> + del_timer_sync(&button->lid_timer);
> input_unregister_device(button->input);
> kfree(button);
> return 0;
> --
> 2.7.4
>

2017-06-23 14:03:23

by Benjamin Tissoires

[permalink] [raw]
Subject: Re: [RFC PATCH v6 2/5] ACPI: button: Add an optional workaround to fix an event missing issue for old userspace

On Jun 21 2017 or thereabouts, Lv Zheng wrote:
> There are platform variations implementing ACPI lid device in different
> ways:
> 1. Some platforms send "open" events to OS and the events arrive before
> button driver is resumed;
> 2. Some platforms send "open" events to OS, but the events arrive after
> button driver is resumed, ex., Samsung N210+;
> 3. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, and the update events arrive before
> button driver is resumed;
> 4. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, but the update events arrive after
> button driver is resumed, ex., Surface Pro 3;
> 5. Some platforms never send "open" events, _LID returns value sticks to
> "close", ex., Surface Pro 1.
> Currently, all cases work fine with systemd 233, but only case 1,2 work
> fine with systemd 229.
>
> Case 3,4 can be treated as an event missing issue:
> After seeing a LID "close" event, systemd 229 will wait several seconds
> (HoldoffTimeoutSec) before suspending the platform. Thus on case 4
> platforms, if users close lid, and re-open it during the
> HoldoffTimeoutSec period, there is still no "open" events seen by the
> userspace. Thus systemd still considers the last state as "close" and
> suspends the platform after the HoldoffTimeoutSec times out.
>
> Note that not only systemd 229, desktop managers (ex.,
> gnome-settings-daemon) also suffer from this issue.
>
> This patch tries to fix this issue by periodically sending _LID return
> value to userspace, which ensures to trigger a SW_LID event when the
> underlying hardware state has changed. As adding a periodic timer is not a
> power friendly way, this patch prepares an option for users to enable on
> failure platforms for old userspace programs.
>
> Users can configure update interval via button.lid_update_interval.
> This should be configured to a smaller value than HoldoffTimeoutSec in
> /etc/systemd/logind.conf.
>
> Cc: <[email protected]>
> Cc: Benjamin Tissoires <[email protected]>
> Cc: Peter Hutterer <[email protected]>
> Signed-off-by: Lv Zheng <[email protected]>
> ---

NACK: what is the point if you just want to provide an event to user
space? We already explained to you that the events do not matter, only
the state is. If user space has an issue with a state not being in sync,
it's a user space bug and it has to be fixed in user space.

This patch could be useful if the purpose was to monitor the changes
while the state is unreliable (in case 4):
- if _LID returns a different value than right after suspend, it is
expected to be something reliable, and so we could then re-create the
input node and present it to user space
- if _LID still returns the same value after a somewhat long delay
(10-20 seconds), then we can consider the value to be correct and
re-create the input node.

Polling for polling and hoping user space will have more chances of
seeing an event for an EV_SW is moot.

Cheers,
Benjamin

> drivers/acpi/button.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index 67a0d78..a8b119e 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -126,6 +126,14 @@ static unsigned long lid_notify_timeout __read_mostly = 10;
> module_param(lid_notify_timeout, ulong, 0644);
> MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");
>
> +static bool lid_periodic_update __read_mostly = false;
> +module_param(lid_periodic_update, bool, 0644);
> +MODULE_PARM_DESC(lid_periodic_update, "Periodically sending lid state updates");
> +
> +static unsigned long lid_update_interval __read_mostly = 1 * MSEC_PER_SEC;
> +module_param(lid_update_interval, ulong, 0644);
> +MODULE_PARM_DESC(lid_update_interval, "Interval (ms) between lid periodic updates");
> +
> /* --------------------------------------------------------------------------
> FS Interface (/proc)
> -------------------------------------------------------------------------- */
> @@ -395,6 +403,8 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> break;
> case ACPI_BUTTON_LID_INIT_METHOD:
> (void)acpi_lid_update_state(device);
> + if (lid_periodic_update)
> + acpi_lid_start_timer(device, lid_update_interval);
> break;
> case ACPI_BUTTON_LID_INIT_IGNORE:
> default:
> @@ -560,8 +570,11 @@ static int acpi_button_add(struct acpi_device *device)
> * more we only care about the last one...
> */
> lid_device = device;
> - acpi_lid_start_timer(device,
> - lid_notify_timeout * MSEC_PER_SEC);
> + if (lid_periodic_update)
> + acpi_lid_initialize_state(device);
> + else
> + acpi_lid_start_timer(device,
> + lid_notify_timeout * MSEC_PER_SEC);
> }
>
> printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
> --
> 2.7.4
>

2017-06-23 14:03:38

by Benjamin Tissoires

[permalink] [raw]
Subject: Re: [RFC PATCH v6 3/5] ACPI: button: Rework lid_init_state=ignore mode

On Jun 21 2017 or thereabouts, Lv Zheng wrote:
> There are platform variations implementing ACPI lid device in different
> ways:
> 1. Some platforms send "open" events to OS and the events arrive before
> button driver is resumed;
> 2. Some platforms send "open" events to OS, but the events arrive after
> button driver is resumed, ex., Samsung N210+;
> 3. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, and the update events arrive before
> button driver is resumed;
> 4. Some platforms never send "open" events to OS, but send "open" events to
> update the cached _LID return value, but the update events arrive after
> button driver is resumed, ex., Surface Pro 3;
> 5. Some platforms never send "open" events, _LID returns value sticks to
> "close", ex., Surface Pro 1.
> Currently, all cases work find with systemd 233, but only case 1,2,3,4 work
> fine with systemd 229.
>
> After fixing all the other issues for old userspace programs, case 5 is the
> only case that the exported input node is still not fully compliant to
> SW_LID ABI and thus needs quirks or ABI changes.
>
> The original button.lid_init_state=ignore ABI change solution is now too
> complicated if its purpose is to only solve this final incompliant use
> case. This patch re-works it by unconditionally prepending "open"
> complement events.

Ouch. I thought the purpose of "ignore" was to not query the state on
boot/resume and only rely on acpi notification to switch the states.

I like the patch, the commit message is IMO too far from what it
actually does:
- it removes the caching update of _LID switch
- it removes the filtering of too close notifications.

One more note inlined:

>
> Cc: <[email protected]>
> Cc: Benjamin Tissoires <[email protected]>
> Cc: Peter Hutterer <[email protected]>
> Signed-off-by: Lv Zheng <[email protected]>
> ---
> drivers/acpi/button.c | 85 +++------------------------------------------------
> 1 file changed, 5 insertions(+), 80 deletions(-)
>
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index a8b119e..1256a8c 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -109,8 +109,6 @@ struct acpi_button {
> struct timer_list lid_timer;
> char phys[32]; /* for input device */
> unsigned long pushed;
> - int last_state;
> - ktime_t last_time;
> bool suspended;
> };
>
> @@ -118,10 +116,6 @@ static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> static struct acpi_device *lid_device;
> static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
>
> -static unsigned long lid_report_interval __read_mostly = 500;
> -module_param(lid_report_interval, ulong, 0644);
> -MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key events");
> -

You shouldn't do that. It is kernel ABI, and if you remove the module,
people that tuned their kernel with this parameter will have the module
failing to load. You should keep this around for a few kernel releases,
and mark its usage deprecated by issuing a warning in the dmesg.

Cheers,
Benjamin

> static unsigned long lid_notify_timeout __read_mostly = 10;
> module_param(lid_notify_timeout, ulong, 0644);
> MODULE_PARM_DESC(lid_notify_timeout, "Timeout (s) before receiving lid notification");
> @@ -157,79 +151,12 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
> {
> struct acpi_button *button = acpi_driver_data(device);
> int ret;
> - ktime_t next_report;
> - bool do_update;
> -
> - /*
> - * In lid_init_state=ignore mode, if user opens/closes lid
> - * frequently with "open" missing, and "last_time" is also updated
> - * frequently, "close" cannot be delivered to the userspace.
> - * So "last_time" is only updated after a timeout or an actual
> - * switch.
> - */
> - if (lid_init_state != ACPI_BUTTON_LID_INIT_IGNORE ||
> - button->last_state != !!state)
> - do_update = true;
> - else
> - do_update = false;
> -
> - next_report = ktime_add(button->last_time,
> - ms_to_ktime(lid_report_interval));
> - if (button->last_state == !!state &&
> - ktime_after(ktime_get(), next_report)) {
> - /* Complain the buggy firmware */
> - pr_warn_once("The lid device is not compliant to SW_LID.\n");
>
> - /*
> - * Send the unreliable complement switch event:
> - *
> - * On most platforms, the lid device is reliable. However
> - * there are exceptions:
> - * 1. Platforms returning initial lid state as "close" by
> - * default after booting/resuming:
> - * https://bugzilla.kernel.org/show_bug.cgi?id=89211
> - * https://bugzilla.kernel.org/show_bug.cgi?id=106151
> - * 2. Platforms never reporting "open" events:
> - * https://bugzilla.kernel.org/show_bug.cgi?id=106941
> - * On these buggy platforms, the usage model of the ACPI
> - * lid device actually is:
> - * 1. The initial returning value of _LID may not be
> - * reliable.
> - * 2. The open event may not be reliable.
> - * 3. The close event is reliable.
> - *
> - * But SW_LID is typed as input switch event, the input
> - * layer checks if the event is redundant. Hence if the
> - * state is not switched, the userspace cannot see this
> - * platform triggered reliable event. By inserting a
> - * complement switch event, it then is guaranteed that the
> - * platform triggered reliable one can always be seen by
> - * the userspace.
> - */
> - if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE) {
> - do_update = true;
> - /*
> - * Do generate complement switch event for "close"
> - * as "close" is reliable and wrong "open" won't
> - * trigger unexpected behaviors.
> - * Do not generate complement switch event for
> - * "open" as "open" is not reliable and wrong
> - * "close" will trigger unexpected behaviors.
> - */
> - if (!state) {
> - input_report_switch(button->input,
> - SW_LID, state);
> - input_sync(button->input);
> - }
> - }
> - }
> - /* Send the platform triggered reliable event */
> - if (do_update) {
> - input_report_switch(button->input, SW_LID, !state);
> - input_sync(button->input);
> - button->last_state = !!state;
> - button->last_time = ktime_get();
> - }
> + if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE)
> + input_report_switch(button->input, SW_LID, 0);
> +
> + input_report_switch(button->input, SW_LID, !state);
> + input_sync(button->input);
>
> if (state)
> pm_wakeup_event(&device->dev, 0);
> @@ -524,8 +451,6 @@ static int acpi_button_add(struct acpi_device *device)
> strcpy(name, ACPI_BUTTON_DEVICE_NAME_LID);
> sprintf(class, "%s/%s",
> ACPI_BUTTON_CLASS, ACPI_BUTTON_SUBCLASS_LID);
> - button->last_state = !!acpi_lid_evaluate_state(device);
> - button->last_time = ktime_get();
> init_timer(&button->lid_timer);
> setup_timer(&button->lid_timer,
> acpi_lid_timeout, (ulong)device);
> --
> 2.7.4
>

2017-06-23 14:03:59

by Benjamin Tissoires

[permalink] [raw]
Subject: Re: [RFC PATCH v6 5/5] ACPI: button: Add an optional workaround to fix a persistent close issue for old userspace

On Jun 21 2017 or thereabouts, Lv Zheng wrote:
> From: Benjamin Tissoires <[email protected]>
>
> Because of the variation of firmware implementation, there is a chance
> the LID state is unknown:
> 1. Some platforms send "open" ACPI notification to the OS and the event
> arrive before the button driver is resumed;
> 2. Some platforms send "open" ACPI notification to the OS, but the
> event
> arrives after the button driver is resumed, ex., Samsung N210+;
> 3. Some platforms never send an "open" ACPI notification to the OS, but
> update the cached _LID return value to "open", and this update
> arrives
> before the button driver is resumed;
> 4. Some platforms never send an "open" ACPI notification to the OS, but
> update the cached _LID return value to "open", but this update
> arrives
> after the button driver is resumed, ex., Surface Pro 3;
> 5. Some platforms never send an "open" ACPI notification to the OS, and
> _LID ACPI method returns a value which stays to "close", ex.,
> Surface Pro 1.
> Currently, all cases work find with systemd 233, but only case 1,2,3,4 work
> fine with old userspace.
>
> After fixing all the other issues for old userspace programs, case 5 is the
> only case that the exported input node is still not fully compliant to
> SW_LID ABI and thus needs quirks or ABI changes.
>
> This patch provides a dynamic SW_LID input node solution to make sure we do
> not export an input node with an unknown state to prevent suspend loops.
>
> The database of unreliable devices is left to userspace to handle with a
> hwdb file and a udev rule.
>
> Reworked by Lv to make this solution optional, code based on top of old
> "ignore" mode and make lid_reliable configurable to all lid devices to
> eliminate the difficulties of synchronously handling global lid_device.
>

Well, you changed it enough to make it wrong now. See inlined:

> Signed-off-by: Benjamin Tissoires <[email protected]>
> Signed-off-by: Lv Zheng <[email protected]>
> ---
> drivers/acpi/button.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 88 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index 91c9989..f11045d 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -107,11 +107,13 @@ struct acpi_button {
> unsigned int type;
> struct input_dev *input;
> struct timer_list lid_timer;
> + bool lid_reliable;
> char phys[32]; /* for input device */
> unsigned long pushed;
> bool suspended;
> };
>
> +static DEFINE_MUTEX(lid_input_lock);
> static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> static struct acpi_device *lid_device;
> static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> @@ -128,6 +130,10 @@ static unsigned long lid_update_interval __read_mostly = 1 * MSEC_PER_SEC;
> module_param(lid_update_interval, ulong, 0644);
> MODULE_PARM_DESC(lid_update_interval, "Interval (ms) between lid periodic updates");
>
> +static bool lid_unreliable __read_mostly = false;
> +module_param(lid_unreliable, bool, 0644);
> +MODULE_PARM_DESC(lid_unreliable, "Dynamically adding/removing input node for unreliable lids");
> +
> /* --------------------------------------------------------------------------
> FS Interface (/proc)
> -------------------------------------------------------------------------- */
> @@ -152,6 +158,9 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
> struct acpi_button *button = acpi_driver_data(device);
> int ret;
>
> + if (!button->input)
> + return -EINVAL;
> +
> if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE)
> input_report_switch(button->input, SW_LID, 0);
>
> @@ -306,6 +315,8 @@ static void acpi_button_remove_input(struct acpi_device *device)
> {
> struct acpi_button *button = acpi_driver_data(device);
>
> + if (!button->input)
> + return;
> input_unregister_device(button->input);
> button->input = NULL;
> }
> @@ -316,6 +327,9 @@ static int acpi_button_add_input(struct acpi_device *device)
> struct input_dev *input;
> int error;
>
> + if (button->input)
> + return 0;
> +
> input = input_allocate_device();
> if (!input) {
> error = -ENOMEM;
> @@ -378,8 +392,10 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> break;
> case ACPI_BUTTON_LID_INIT_METHOD:
> (void)acpi_lid_update_state(device);
> + mutex_unlock(&lid_input_lock);
> if (lid_periodic_update)
> acpi_lid_start_timer(device, lid_update_interval);
> + mutex_lock(&lid_input_lock);
> break;
> case ACPI_BUTTON_LID_INIT_IGNORE:
> default:
> @@ -391,7 +407,9 @@ static void acpi_lid_timeout(ulong arg)
> {
> struct acpi_device *device = (struct acpi_device *)arg;
>
> + mutex_lock(&lid_input_lock);
> acpi_lid_initialize_state(device);
> + mutex_unlock(&lid_input_lock);
> }
>
> static void acpi_button_notify(struct acpi_device *device, u32 event)
> @@ -406,7 +424,11 @@ static void acpi_button_notify(struct acpi_device *device, u32 event)
> case ACPI_BUTTON_NOTIFY_STATUS:
> input = button->input;
> if (button->type == ACPI_BUTTON_TYPE_LID) {
> + mutex_lock(&lid_input_lock);
> + if (!button->input)
> + acpi_button_add_input(device);
> acpi_lid_update_state(device);
> + mutex_unlock(&lid_input_lock);
> } else {
> int keycode;
>
> @@ -440,8 +462,13 @@ static int acpi_button_suspend(struct device *dev)
> struct acpi_device *device = to_acpi_device(dev);
> struct acpi_button *button = acpi_driver_data(device);
>
> - if (button->type == ACPI_BUTTON_TYPE_LID)
> + if (button->type == ACPI_BUTTON_TYPE_LID) {
> + mutex_lock(&lid_input_lock);
> + if (!button->lid_reliable)
> + acpi_button_remove_input(device);
> + mutex_unlock(&lid_input_lock);
> del_timer(&button->lid_timer);
> + }
> button->suspended = true;
> return 0;
> }
> @@ -459,6 +486,38 @@ static int acpi_button_resume(struct device *dev)
> }
> #endif
>
> +static ssize_t lid_reliable_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct acpi_device *device = to_acpi_device(dev);
> + struct acpi_button *button = acpi_driver_data(device);
> +
> + return snprintf(buf, PAGE_SIZE, "%d\n", button->lid_reliable);
> +}
> +static ssize_t lid_reliable_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct acpi_device *device = to_acpi_device(dev);
> + struct acpi_button *button = acpi_driver_data(device);
> +
> + mutex_lock(&lid_input_lock);
> + if (strtobool(buf, &button->lid_reliable) < 0) {
> + mutex_unlock(&lid_input_lock);
> + return -EINVAL;
> + }
> + if (button->lid_reliable)
> + acpi_button_add_input(device);
> + else {
> + lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;

Why would you link the "ignore" option to those unreliable switches?
When the input node is registered, we know that the _LID method gets
reliable, so why would you not query its state?

> + acpi_button_remove_input(device);
> + }
> + acpi_lid_initialize_state(device);
> + mutex_unlock(&lid_input_lock);
> + return count;
> +}
> +static DEVICE_ATTR_RW(lid_reliable);
> +
> static int acpi_button_add(struct acpi_device *device)
> {
> struct acpi_button *button;
> @@ -507,21 +566,37 @@ static int acpi_button_add(struct acpi_device *device)
>
> snprintf(button->phys, sizeof(button->phys), "%s/button/input0", hid);
>
> - error = acpi_button_add_input(device);
> - if (error)
> - goto err_remove_fs;
> -
> if (button->type == ACPI_BUTTON_TYPE_LID) {
> /*
> * This assumes there's only one lid device, or if there are
> * more we only care about the last one...
> */
> lid_device = device;
> + device_create_file(&device->dev, &dev_attr_lid_reliable);
> + mutex_lock(&lid_input_lock);
> + error = acpi_button_add_input(device);
> + if (error) {
> + mutex_unlock(&lid_input_lock);
> + goto err_remove_fs;
> + }
> + if (lid_unreliable) {
> + lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> + button->lid_reliable = false;
> + } else
> + button->lid_reliable = true;

You can not add the input node if you mark it later on as unreliable.
You are presenting an unknown state to user space, which is bad.

Cheers,
Benjamin

> if (lid_periodic_update)
> acpi_lid_initialize_state(device);
> - else
> + else {
> + mutex_unlock(&lid_input_lock);
> acpi_lid_start_timer(device,
> lid_notify_timeout * MSEC_PER_SEC);
> + mutex_lock(&lid_input_lock);
> + }
> + mutex_unlock(&lid_input_lock);
> + } else {
> + error = acpi_button_add_input(device);
> + if (error)
> + goto err_remove_fs;
> }
>
> printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
> @@ -539,9 +614,14 @@ static int acpi_button_remove(struct acpi_device *device)
> struct acpi_button *button = acpi_driver_data(device);
>
> acpi_button_remove_fs(device);
> - if (button->type == ACPI_BUTTON_TYPE_LID)
> + if (button->type == ACPI_BUTTON_TYPE_LID) {
> + mutex_lock(&lid_input_lock);
> + acpi_button_remove_input(device);
> + mutex_unlock(&lid_input_lock);
> del_timer_sync(&button->lid_timer);
> - acpi_button_remove_input(device);
> + device_remove_file(&device->dev, &dev_attr_lid_reliable);
> + } else
> + acpi_button_remove_input(device);
> kfree(button);
> return 0;
> }
> --
> 2.7.4
>

2017-06-30 01:34:32

by kernel test robot

[permalink] [raw]
Subject: [lkp-robot] [ACPI] c4a9ff748c: BUG:scheduling_while_atomic


FYI, we noticed the following commit:

commit: c4a9ff748c1ca2fe81e012ac32e1daa70702778b ("ACPI: button: Add a workaround to fix an order issue for old userspace")
url: https://github.com/0day-ci/linux/commits/Lv-Zheng/ACPI-button-Fix-button-lid_init_state-method-mode/20170622-143854


in testcase: netperf
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 1
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance

test-description: Netperf is a benchmark that can be use to measure various aspect of networking performance.
test-url: http://www.netperf.org/netperf/


on test machine: 4 threads Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz with 4G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------------------------------------------------------+-----------+------------+
| | v4.12-rc6 | c4a9ff748c |
+-----------------------------------------------------------------------------------------------------+-----------+------------+
| boot_successes | 9404 | 3 |
| boot_failures | 2560 | 9 |
| ACPI_Error:Field[CPB3]at#exceeds_Buffer[NULL]size#(bits)(#/dsopcode-#) | 365 | |
| ACPI_Error:Method_parse/execution_failed[~_SB._OSC](Node#),AE_AML_BUFFER_LIMIT(#/psparse-#) | 365 | |
| cpu_clock_throttled | 362 | |
| Kernel_panic-not_syncing:Fatal_hardware_error | 2 | |
| Mem-Info | 560 | |
| invoked_oom-killer:gfp_mask=0x | 528 | |
| Out_of_memory:Kill_process | 80 | |
| BUG:Bad_page_state_in_process | 25 | |
| BUG:Bad_rss-counter_state_mm:#idx:#val | 18 | |
| BUG:Bad_page_map_in_process | 7 | |
| page_allocation_failure:order:#,mode:#(GFP_ATOMIC|__GFP_COMP|__GFP_NOTRACK),nodemask=(null) | 3 | |
| WARNING:at_lib/list_debug.c:#__list_add_valid | 2 | |
| INFO:rcu_sched_detected_stalls_on_CPUs/tasks | 2 | |
| BUG:Bad_page_state:#messages_suppressed | 1 | |
| WARNING:at_drivers/gpu/drm/drm_irq.c:#drm_wait_one_vblank[drm] | 19 | |
| WARNING:at_lib/list_debug.c:#__list_del_entry_valid | 2 | |
| general_protection_fault:#[##] | 2 | |
| Kernel_panic-not_syncing:Fatal_exception | 3 | |
| page_allocation_failure:order:#,mode:#(GFP_ATOMIC|__GFP_COMP|__GFP_ZERO),nodemask=(null) | 18 | |
| WARNING:at_drivers/gpu/drm/drm_vblank.c:#drm_wait_one_vblank[drm] | 5 | |
| page_allocation_failure:order:#,mode:#(GFP_KERNEL|__GFP_NORETRY),nodemask=(null) | 11 | |
| BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c | 4 | |
| BUG:non-zero_nr_ptes_on_freeing_mm | 13 | |
| ACPI_Error:Field[CPB3]at_bit_offset/length#exceeds_size_of_target_Buffer(#bits)(#/dsopcode-#) | 116 | |
| ACPI_Error:Method_parse/execution_failed~_SB._OSC,AE_AML_BUFFER_LIMIT(#/psparse-#) | 116 | |
| ACPI_Error:Method_parse/execution_failed~_SB.PC00.IOTR._CRS,AE_AML_NO_RESOURCE_END_TAG(#/psparse-#) | 129 | |
| ACPI_Error:Method_execution_failed~_SB.PC00.IOTR._CRS,AE_AML_NO_RESOURCE_END_TAG(#/uteval-#) | 129 | |
| Kernel_panic-not_syncing:Hard_LOCKUP | 103 | |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 20 | |
| ACPI_Error:[~_SB_.PCI0.XHC_.RHUB.HS11]Namespace_lookup_failure,AE_NOT_FOUND(#/dswload-#) | 15 | |
| ACPI_Error:#table_load_failures,#successful(#/tbxfload-#) | 15 | |
| WARNING:at_drivers/gpu/drm/i915/intel_display.c:#intel_modeset_init[i915] | 259 | 9 |
| BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c | 3 | 1 |
| INFO:creating/lkp/benchmarks/ltp/output_directory | 351 | |
| INFO:creating/lkp/benchmarks/ltp/results_directory | 351 | |
| INFO:ltp-pan_reported_some_tests_FAIL | 88 | |
| INFO:ltp-pan_reported_all_tests_PASS | 208 | |
| page_allocation_failure:order:#,mode:#(GFP_HIGHUSER_MOVABLE|__GFP_ZERO),nodemask=(null) | 2 | |
| WARNING:at_kernel/events/core.c:#add_event_to_ctx | 5 | |
| WARNING:at_kernel/events/core.c:#event_sched_out | 5 | |
| BUG:sleeping_function_called_from_invalid_context_at_kernel/irq/manage.c | 1 | |
| calltrace:init_netconsole | 1 | |
| calltrace:SyS_finit_module | 2 | |
| calltrace:SyS_write | 1 | |
| BUG:unable_to_handle_kernel | 1 | |
| Oops:#[##] | 1 | |
| WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup | 1 | |
| calltrace:parport_pc_init | 1 | |
| WARNING:at_lib/kobject.c:#kobject_add_internal | 1 | |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 447 | |
| BUG:kernel_hang_in_test_stage | 263 | |
| BUG:soft_lockup-CPU##stuck_for#s | 1 | |
| Kernel_panic-not_syncing:softlockup:hung_tasks | 1 | |
| BUG:kernel_reboot-without-warning_in_boot_stage | 1 | |
| BUG:kernel_reboot-without-warning_in_test_stage | 12 | |
| WARNING:at_kernel/workqueue.c:#process_one_work | 1 | |
| BUG:scheduling_while_atomic | 0 | 8 |
| WARNING:at_kernel/time/timer.c:#call_timer_fn | 0 | 8 |
+-----------------------------------------------------------------------------------------------------+-----------+------------+



[ 14.828015] BUG: scheduling while atomic: swapper/0/0/0x00000102
[ 14.828552] Modules linked in:
[ 14.828846] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc6-00001-gc4a9ff74 #1
[ 14.829454] Hardware name: LENOVO IdeaPad U410 /Lenovo , BIOS 65CN15WW 06/05/2012
[ 14.830113] Call Trace:
[ 14.830363] <IRQ>
[ 14.831578] dump_stack+0x63/0x86
[ 14.831887] __schedule_bug+0x54/0x70
[ 14.832212] __schedule+0x633/0x850
[ 14.832530] schedule+0x3d/0x90
[ 14.832826] schedule_timeout+0x174/0x310
[ 14.833176] ? call_timer_fn+0x150/0x150
[ 14.833518] ec_guard+0x1c6/0x1d0
[ 14.833821] ? ec_guard+0x1c6/0x1d0
[ 14.834135] ? remove_wait_queue+0x70/0x70
[ 14.834487] acpi_ec_transaction+0x181/0x3b0
[ 14.834850] acpi_ec_read+0x42/0x50
[ 14.835166] acpi_ec_space_handler+0xa5/0x240
[ 14.835534] ? __update_load_avg_se+0x15b/0x180
[ 14.835955] ? acpi_os_signal_semaphore+0x4d/0x90
[ 14.836345] ? acpi_ut_release_mutex+0x11c/0x127
[ 14.836727] acpi_ev_address_space_dispatch+0x2ce/0x33a
[ 14.837147] ? ec_transaction+0x50/0x50
[ 14.837483] acpi_ex_access_region+0x414/0x4ca
[ 14.837858] ? __might_sleep+0x4a/0x80
[ 14.838187] acpi_ex_field_datum_io+0x178/0x3fa
[ 14.838564] acpi_ex_extract_from_field+0x151/0x2bb
[ 14.838964] acpi_ex_read_data_from_field+0x380/0x3d0
[ 14.839374] acpi_ex_resolve_node_to_value+0x326/0x443
[ 14.839790] acpi_ex_resolve_to_value+0x374/0x40e
[ 14.840179] acpi_ds_evaluate_name_path+0xa3/0x143
[ 14.840573] acpi_ds_exec_end_op+0xd1/0x6c6
[ 14.840930] acpi_ps_parse_loop+0x819/0x8b5
[ 14.841287] ? acpi_ut_trace+0x26/0x66
[ 14.841617] acpi_ps_parse_aml+0x1ac/0x4a1
[ 14.841968] acpi_ps_execute_method+0x1f4/0x2b6
[ 14.842349] acpi_ns_evaluate+0x2ee/0x435
[ 14.842696] acpi_evaluate_object+0x178/0x38e
[ 14.843065] ? acpi_lid_update_state+0x50/0x50
[ 14.843439] acpi_evaluate_integer+0x3e/0xd0
[ 14.843802] acpi_lid_update_state+0x27/0x50
[ 14.844164] acpi_lid_timeout+0x1d/0x30
[ 14.844501] call_timer_fn+0x35/0x150
[ 14.844828] run_timer_softirq+0x1e8/0x460
[ 14.845180] ? ktime_get+0x41/0xa0
[ 14.845490] ? lapic_next_deadline+0x26/0x30
[ 14.845851] ? clockevents_program_event+0x7a/0xf0
[ 14.846246] __do_softirq+0x104/0x2ab
[ 14.846572] irq_exit+0xf1/0x100
[ 14.846872] smp_apic_timer_interrupt+0x3d/0x50
[ 14.847249] apic_timer_interrupt+0x93/0xa0
[ 14.847608] RIP: 0010:cpuidle_enter_state+0x121/0x2d0
[ 14.848013] RSP: 0018:ffffffff81e03df0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 14.848618] RAX: 0000000000000000 RBX: 0000000000000005 RCX: 000000000000001f
[ 14.849151] RDX: 0000000373cf2cc5 RSI: ffff88011f2195d8 RDI: 0000000000000000
[ 14.849682] RBP: ffffffff81e03e28 R08: 00000000000f41c9 R09: 0000000000000018
[ 14.850214] R10: ffffffff81e03dc0 R11: 000000000007cf4d R12: ffff88011f225800
[ 14.850747] R13: ffffffff81fa9c58 R14: 0000000373cf2cc5 R15: ffffffff81fa9c40
[ 14.851281] </IRQ>
[ 14.851516] cpuidle_enter+0x17/0x20
[ 14.851837] call_cpuidle+0x23/0x40
[ 14.852152] do_idle+0x189/0x1e0
[ 14.852451] cpu_startup_entry+0x1d/0x20
[ 14.852791] rest_init+0x85/0x90
[ 14.853093] start_kernel+0x3f8/0x405
[ 14.853419] ? early_idt_handler_array+0x120/0x120
[ 14.853812] x86_64_start_reservations+0x2f/0x31
[ 14.854195] x86_64_start_kernel+0x143/0x152
[ 14.854557] secondary_startup_64+0x9f/0x9f
[ 14.854990] ACPI : button: The lid device is not compliant to SW_LID.
[ 14.855491] timer: acpi_lid_timeout+0x0/0x30 preempt leak: 00000101 -> 00000000
[ 14.856086] ------------[ cut here ]------------
[ 14.856473] WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1275 call_timer_fn+0x13c/0x150
[ 14.857188] Modules linked in:
[ 14.857479] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.12.0-rc6-00001-gc4a9ff74 #1
[ 14.858158] Hardware name: LENOVO IdeaPad U410 /Lenovo , BIOS 65CN15WW 06/05/2012
[ 14.858818] task: ffffffff81e104c0 task.stack: ffffffff81e00000
[ 14.859279] RIP: 0010:call_timer_fn+0x13c/0x150
[ 14.859655] RSP: 0018:ffff88011f203e98 EFLAGS: 00010292
[ 14.860073] RAX: 0000000000000043 RBX: 0000000000000001 RCX: ffffffff81e61488
[ 14.860607] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000246
[ 14.861138] RBP: ffff88011f203ec0 R08: 0000000000000000 R09: 000000000000035e
[ 14.861672] R10: ffffea0004714b40 R11: 000000008237d701 R12: 0000000000000101
[ 14.862203] R13: ffff8800c7535990 R14: ffffffff815734e0 R15: ffff880101cb6800
[ 14.862735] FS: 0000000000000000(0000) GS:ffff88011f200000(0000) knlGS:0000000000000000
[ 14.863371] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 14.863819] CR2: 00007fc0dcc49267 CR3: 000000011de09000 CR4: 00000000001406f0
[ 14.864354] Call Trace:
[ 14.864605] <IRQ>
[ 14.864831] run_timer_softirq+0x1e8/0x460
[ 14.865185] ? ktime_get+0x41/0xa0
[ 14.865493] ? lapic_next_deadline+0x26/0x30
[ 14.865854] ? clockevents_program_event+0x7a/0xf0
[ 14.866248] __do_softirq+0x104/0x2ab
[ 14.866573] irq_exit+0xf1/0x100
[ 14.866874] smp_apic_timer_interrupt+0x3d/0x50
[ 14.867252] apic_timer_interrupt+0x93/0xa0
[ 14.867609] RIP: 0010:cpuidle_enter_state+0x121/0x2d0
[ 14.868017] RSP: 0018:ffffffff81e03df0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 14.868619] RAX: 0000000000000000 RBX: 0000000000000005 RCX: 000000000000001f
[ 14.869151] RDX: 0000000373cf2cc5 RSI: ffff88011f2195d8 RDI: 0000000000000000
[ 14.869682] RBP: ffffffff81e03e28 R08: 00000000000f41c9 R09: 0000000000000018
[ 14.870216] R10: ffffffff81e03dc0 R11: 000000000007cf4d R12: ffff88011f225800
[ 14.870747] R13: ffffffff81fa9c58 R14: 0000000373cf2cc5 R15: ffffffff81fa9c40
[ 14.871278] </IRQ>
[ 14.871513] cpuidle_enter+0x17/0x20
[ 14.871831] call_cpuidle+0x23/0x40
[ 14.872143] do_idle+0x189/0x1e0
[ 14.872443] cpu_startup_entry+0x1d/0x20
[ 14.872783] rest_init+0x85/0x90
[ 14.873083] start_kernel+0x3f8/0x405
[ 14.873411] ? early_idt_handler_array+0x120/0x120
[ 14.873807] x86_64_start_reservations+0x2f/0x31
[ 14.874192] x86_64_start_kernel+0x143/0x152
[ 14.874554] secondary_startup_64+0x9f/0x9f
[ 14.874910] Code: 03 48 85 c0 75 eb 65 ff 0d e2 9f f1 7e e9 0c ff ff ff 44 89 e2 4c 89 f6 48 c7 c7 08 5d ca 81 c6 05 d1 24 e2 00 01 e8 81 66 0a 00 <0f> ff e9 16 ff ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 14.876305] ---[ end trace dfc99d55faf65cc8 ]---


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Xiaolong


Attachments:
(No filename) (15.36 kB)
config-4.12.0-rc6-00001-gc4a9ff74 (155.59 kB)
job-script (6.97 kB)
dmesg.xz (21.05 kB)
netperf (544.00 B)
job.yaml (4.56 kB)
reproduce (328.00 B)
Download all attachments