2019-03-25 12:58:32

by Vladislav Dalechyn

[permalink] [raw]
Subject: [PATCH] ELAN touchpad i2c_hid bugs fix

From: Vladislav Dalechyn <[email protected]>

Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
caused by an error setting the correct IRQ_TRIGGER flag:
- i2c_hid incoplete error flood in journalctl;
- Five finger tap kill's module so you have to restart it;
- Two finger scoll is working incorrect and sometimes even when you
raised one of two finger still thinks that you are scrolling.

Fix all of these with a new quirk that corrects the trigger flag
announced by the ACPI tables. (edge-falling).

Reason behind moving i2c_hid_init_irq section described below:
i2c_hid_init_irq now checks for a quirk, so we must setup the quirks
before we init the irq, and we cannot setup the quirk earlier, so we must
init the irq later.

Co-developed-by: Hans De Goede <[email protected]>
Signed-off-by: Vladislav Dalechyn <[email protected]>
---
drivers/hid/hid-ids.h | 1 +
drivers/hid/i2c-hid/i2c-hid-core.c | 25 +++++++++++++++----------
2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index b6d93f4ad037..660b4e0e912e 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -389,6 +389,7 @@
#define USB_DEVICE_ID_TOSHIBA_CLICK_L9W 0x0401
#define USB_DEVICE_ID_HP_X2 0x074d
#define USB_DEVICE_ID_HP_X2_10_COVER 0x0755
+#define I2C_DEVICE_ID_ELAN_TOUCHPAD 0x303e

#define USB_VENDOR_ID_ELECOM 0x056e
#define USB_DEVICE_ID_ELECOM_BM084 0x0061
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index 90164fed08d3..9b417914411f 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -51,6 +51,7 @@
#define I2C_HID_QUIRK_NO_RUNTIME_PM BIT(2)
#define I2C_HID_QUIRK_DELAY_AFTER_SLEEP BIT(3)
#define I2C_HID_QUIRK_BOGUS_IRQ BIT(4)
+#define I2C_HID_QUIRK_FORCE_TRIGGER_FALLING BIT(5)

/* flags */
#define I2C_HID_STARTED 0
@@ -182,8 +183,10 @@ static const struct i2c_hid_quirks {
I2C_HID_QUIRK_NO_RUNTIME_PM },
{ I2C_VENDOR_ID_GOODIX, I2C_DEVICE_ID_GOODIX_01F0,
I2C_HID_QUIRK_NO_RUNTIME_PM },
+ { USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_ELAN_TOUCHPAD,
+ I2C_HID_QUIRK_BOGUS_IRQ | I2C_HID_QUIRK_FORCE_TRIGGER_FALLING },
{ USB_VENDOR_ID_ELAN, HID_ANY_ID,
- I2C_HID_QUIRK_BOGUS_IRQ },
+ I2C_HID_QUIRK_BOGUS_IRQ },
{ 0, 0 }
};

@@ -854,6 +857,8 @@ static int i2c_hid_init_irq(struct i2c_client *client)

if (!irq_get_trigger_type(client->irq))
irqflags = IRQF_TRIGGER_LOW;
+ if (ihid->quirks & I2C_HID_QUIRK_FORCE_TRIGGER_FALLING)
+ irqflags = IRQF_TRIGGER_FALLING;

ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
irqflags | IRQF_ONESHOT, client->name, ihid);
@@ -1123,14 +1128,10 @@ static int i2c_hid_probe(struct i2c_client *client,
if (ret < 0)
goto err_pm;

- ret = i2c_hid_init_irq(client);
- if (ret < 0)
- goto err_pm;
-
hid = hid_allocate_device();
if (IS_ERR(hid)) {
ret = PTR_ERR(hid);
- goto err_irq;
+ goto err_pm;
}

ihid->hid = hid;
@@ -1149,11 +1150,15 @@ static int i2c_hid_probe(struct i2c_client *client,

ihid->quirks = i2c_hid_lookup_quirk(hid->vendor, hid->product);

+ ret = i2c_hid_init_irq(client);
+ if (ret < 0)
+ goto err_mem_free;
+
ret = hid_add_device(hid);
if (ret) {
if (ret != -ENODEV)
hid_err(client, "can't add hid device: %d\n", ret);
- goto err_mem_free;
+ goto err_irq;
}

if (!(ihid->quirks & I2C_HID_QUIRK_NO_RUNTIME_PM))
@@ -1161,12 +1166,12 @@ static int i2c_hid_probe(struct i2c_client *client,

return 0;

+err_irq:
+ free_irq(client->irq, ihid);
+
err_mem_free:
hid_destroy_device(hid);

-err_irq:
- free_irq(client->irq, ihid);
-
err_pm:
pm_runtime_put_noidle(&client->dev);
pm_runtime_disable(&client->dev);
--
2.20.1



2019-03-25 16:04:53

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi Vladislav,

On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
<[email protected]> wrote:
>
> From: Vladislav Dalechyn <[email protected]>
>
> Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
> caused by an error setting the correct IRQ_TRIGGER flag:
> - i2c_hid incoplete error flood in journalctl;
> - Five finger tap kill's module so you have to restart it;
> - Two finger scoll is working incorrect and sometimes even when you
> raised one of two finger still thinks that you are scrolling.
>
> Fix all of these with a new quirk that corrects the trigger flag
> announced by the ACPI tables. (edge-falling).

I do not believe this is right solution. The driver makes liberal use
of disable_irq() and enable_irq() which may lead to lost edges and
touchpad stopping working altogether.

Usually the "extra" report is caused by GPIO controller clearing
interrupt condition at the wrong time (too early), or in unsafe or
racy fashion. You need to look there instead of adding quirk to
i2c-hid.

Thanks.

--
Dmitry

2019-03-25 16:41:00

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi Dmitry,

On 25-03-19 17:02, Dmitry Torokhov wrote:
> Hi Vladislav,
>
> On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
> <[email protected]> wrote:
>>
>> From: Vladislav Dalechyn <[email protected]>
>>
>> Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
>> caused by an error setting the correct IRQ_TRIGGER flag:
>> - i2c_hid incoplete error flood in journalctl;
>> - Five finger tap kill's module so you have to restart it;
>> - Two finger scoll is working incorrect and sometimes even when you
>> raised one of two finger still thinks that you are scrolling.
>>
>> Fix all of these with a new quirk that corrects the trigger flag
>> announced by the ACPI tables. (edge-falling).
>
> I do not believe this is right solution. The driver makes liberal use
> of disable_irq() and enable_irq() which may lead to lost edges and
> touchpad stopping working altogether.
>
> Usually the "extra" report is caused by GPIO controller clearing
> interrupt condition at the wrong time (too early), or in unsafe or
> racy fashion. You need to look there instead of adding quirk to
> i2c-hid.

The falling-edge solution was proposed by Elan themselves.

Also if you look at: https://bugzilla.redhat.com/show_bug.cgi?id=1543769

And esp. the "cat /proc/interrupts" output there, then you will see
that the interrupt seems to be stuck at low level, which according
to the ACPI tables is its active level.

As for this being a GPIO chip driver problem, this is using standard
Intel pinctrl stuff, which is not showing this same issue with many
other i2c-hid touchpads.

Regards,

Hans


2019-03-25 16:58:59

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi Hans,

On Mon, Mar 25, 2019 at 9:38 AM Hans de Goede <[email protected]> wrote:
>
> Hi Dmitry,
>
> On 25-03-19 17:02, Dmitry Torokhov wrote:
> > Hi Vladislav,
> >
> > On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
> > <[email protected]> wrote:
> >>
> >> From: Vladislav Dalechyn <[email protected]>
> >>
> >> Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
> >> caused by an error setting the correct IRQ_TRIGGER flag:
> >> - i2c_hid incoplete error flood in journalctl;
> >> - Five finger tap kill's module so you have to restart it;
> >> - Two finger scoll is working incorrect and sometimes even when you
> >> raised one of two finger still thinks that you are scrolling.
> >>
> >> Fix all of these with a new quirk that corrects the trigger flag
> >> announced by the ACPI tables. (edge-falling).
> >
> > I do not believe this is right solution. The driver makes liberal use
> > of disable_irq() and enable_irq() which may lead to lost edges and
> > touchpad stopping working altogether.
> >
> > Usually the "extra" report is caused by GPIO controller clearing
> > interrupt condition at the wrong time (too early), or in unsafe or
> > racy fashion. You need to look there instead of adding quirk to
> > i2c-hid.
>
> The falling-edge solution was proposed by Elan themselves.
>
> Also if you look at: https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>
> And esp. the "cat /proc/interrupts" output there, then you will see
> that the interrupt seems to be stuck at low level, which according
> to the ACPI tables is its active level.

So how does it generate a new edge if it is stuck at low?

Is it bad touchpad firmware that does not deassert interrupt quickly
enough? I scrolled through the bug but I do not see if it had been
confirmed that original windows installation actually uses edge (it
may very well be using it; Elan engineers pushed us to use edge in a
few cases, but they all boiled down to an issue with pin control/GPIO
implementation).

>
> As for this being a GPIO chip driver problem, this is using standard
> Intel pinctrl stuff, which is not showing this same issue with many
> other i2c-hid touchpads.

Well, there have been plenty of issues in intel drivers, coupled with
"interesting" things done by firmware and boards.

If you want to keep on using edge you need to make sure that i2c-hid
never loses edge, as replaying of previously disabled interrupts in
not at all reliable. So you need to "kick" the device after
enable_irq() by initiating read from it and be ready to not get any
data or get valid data and process accordingly.

Thanks.

--
Dmitry

2019-03-25 18:31:47

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

On Mon, Mar 25, 2019 at 11:23 AM <[email protected]> wrote:
>
> Hi.
>
> Mar 25, 2019, 6:56 PM by [email protected]:
>
> If you want to keep on using edge you need to make sure that i2c-hid
> never loses edge, as replaying of previously disabled interrupts in
> not at all reliable. So you need to "kick" the device after
> enable_irq() by initiating read from it and be ready to not get any
> data or get valid data and process accordingly.
>
> I'm sorry, but how edge can be loosed? There is device ID and quirk's which are associated with that ID, and quirk will always trigger edge irq.

If edge arrives while interrupt is disabled and replay of the
interrupt does not work when it is reenabled (and I do not think it
ever works for GPIO, you can hunt for Thomas Gleixner emails to that
effect on LKML), then ISR in the driver will never be called, so
driver will never read from the device to reset the interrupt
condition and interrupt will never be reasserted again -> edge lost.

Thanks.

--
Dmitry

2019-03-29 12:19:17

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi,

On 3/25/19 5:56 PM, Dmitry Torokhov wrote:
> Hi Hans,
>
> On Mon, Mar 25, 2019 at 9:38 AM Hans de Goede <[email protected]> wrote:
>>
>> Hi Dmitry,
>>
>> On 25-03-19 17:02, Dmitry Torokhov wrote:
>>> Hi Vladislav,
>>>
>>> On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
>>> <[email protected]> wrote:
>>>>
>>>> From: Vladislav Dalechyn <[email protected]>
>>>>
>>>> Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
>>>> caused by an error setting the correct IRQ_TRIGGER flag:
>>>> - i2c_hid incoplete error flood in journalctl;
>>>> - Five finger tap kill's module so you have to restart it;
>>>> - Two finger scoll is working incorrect and sometimes even when you
>>>> raised one of two finger still thinks that you are scrolling.
>>>>
>>>> Fix all of these with a new quirk that corrects the trigger flag
>>>> announced by the ACPI tables. (edge-falling).
>>>
>>> I do not believe this is right solution. The driver makes liberal use
>>> of disable_irq() and enable_irq() which may lead to lost edges and
>>> touchpad stopping working altogether.
>>>
>>> Usually the "extra" report is caused by GPIO controller clearing
>>> interrupt condition at the wrong time (too early), or in unsafe or
>>> racy fashion. You need to look there instead of adding quirk to
>>> i2c-hid.
>>
>> The falling-edge solution was proposed by Elan themselves.
>>
>> Also if you look at: https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>>
>> And esp. the "cat /proc/interrupts" output there, then you will see
>> that the interrupt seems to be stuck at low level, which according
>> to the ACPI tables is its active level.
>
> So how does it generate a new edge if it is stuck at low?
>
> Is it bad touchpad firmware that does not deassert interrupt quickly
> enough?

I do not believe it is not de-asserting it quick enough (I believe
the amount of interrups is too high for that.

It seems to simply be low most of the time, or it is really really
slow with de-asserting.

Vladislav can you check the output of /cat/interrupts on a kernel
without the patch and while *not* using the touchpad; and check
if the amount of touchpads-interrupts still keeps increasing in this
case?

Also I believe that you had contact with Elan about this and they
told you to change the interrupt type to falling-edge as work-around,
right? Can you ask them why?

This is quite unususal, I've collected quite a few DSDTs over time
and I've just checked about 40 of them all with a PNP0C50 in
some form (and in many cases multiple such devices) and NONE of
them is using edged-interrupts in the ACPI config.

<speculation>

I think that the Elan touchpad firmware supports a mode to
work on devices which only support edge interrupts and that
this mode is accidentally enabled in this firmware.

I think that the interrupt line is simply low all the time
and gets pulsed high then low again when the touchpad detects
a finger. Hopefully it does this pulsing on every event and not
only when its event "fifo" is empty.

</speculation>

> I scrolled through the bug but I do not see if it had been
> confirmed that original windows installation actually uses edge (it
> may very well be using it; Elan engineers pushed us to use edge in a
> few cases, but they all boiled down to an issue with pin control/GPIO
> implementation).

This has not been checked on Windows AFAIK.

>> As for this being a GPIO chip driver problem, this is using standard
>> Intel pinctrl stuff, which is not showing this same issue with many
>> other i2c-hid touchpads.
>
> Well, there have been plenty of issues in intel drivers, coupled with
> "interesting" things done by firmware and boards.
>
> If you want to keep on using edge you need to make sure that i2c-hid
> never loses edge, as replaying of previously disabled interrupts in
> not at all reliable. So you need to "kick" the device after
> enable_irq() by initiating read from it and be ready to not get any
> data or get valid data and process accordingly.

That is a good point and I agree.

Vladislav, let me explain this a bit. Normally the touchpad
driver the IRQ line low when it has touch-data to respond, which means
that if touch-data is reported before the driver loads (or while
the driver has the irq disabled during e.g. suspend), it will immediately
see an interrupt. If we use edge mode then the IRQ will only trigger
when the IRQ line goes from high to low, if this happens when the driver
is not listening then we do not see the edge. And since we never read the
pending touch-data, the IRQ line never goes high again (which it does
when we have read all available data), so we will never see a negative-edge
and then things are stuck.

It would be good, if running a kernel with your patch, you can try
to trigger this by:

1) Suspending the machine by selecting suspend from a menu in your
desktop environment, or by briefly pressing the power-button, do
not close the lid
2) As soon as the system starts suspending and while it is suspended, move
your finger around the touchpad
3) Wake the system up with the powerbutton while moving your finger around
4) Check if the touchpad still works after this

Or by:

1) Using ctrl + alt + f3 to switch to a text console
2) Move finger around on touchpad, keep moving it around
3) Switch back to X11 with alt + f2 or alt + f7, while still moving the finger
4) Check if the touchpad still works after this

If neither causes the touchpad to stop working, then at least the problem Dmitry
fears is not easy to trigger, but we should probably still prepare to deal
with it; and we really should try to better understand the problem here, so
if you can answer my questions above, then that would be great.

Regards,

Hans


2019-03-29 18:24:55

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

On Fri, Mar 29, 2019 at 5:18 AM Hans de Goede <[email protected]> wrote:
>
> Hi,
>
> On 3/25/19 5:56 PM, Dmitry Torokhov wrote:
> > Hi Hans,
> >
> > On Mon, Mar 25, 2019 at 9:38 AM Hans de Goede <[email protected]> wrote:
> >>
> >> Hi Dmitry,
> >>
> >> On 25-03-19 17:02, Dmitry Torokhov wrote:
> >>> Hi Vladislav,
> >>>
> >>> On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
> >>> <[email protected]> wrote:
> >>>>
> >>>> From: Vladislav Dalechyn <[email protected]>
> >>>>
> >>>> Description: The ELAN1200:04F3:303E touchpad exposes several issues, all
> >>>> caused by an error setting the correct IRQ_TRIGGER flag:
> >>>> - i2c_hid incoplete error flood in journalctl;
> >>>> - Five finger tap kill's module so you have to restart it;
> >>>> - Two finger scoll is working incorrect and sometimes even when you
> >>>> raised one of two finger still thinks that you are scrolling.
> >>>>
> >>>> Fix all of these with a new quirk that corrects the trigger flag
> >>>> announced by the ACPI tables. (edge-falling).
> >>>
> >>> I do not believe this is right solution. The driver makes liberal use
> >>> of disable_irq() and enable_irq() which may lead to lost edges and
> >>> touchpad stopping working altogether.
> >>>
> >>> Usually the "extra" report is caused by GPIO controller clearing
> >>> interrupt condition at the wrong time (too early), or in unsafe or
> >>> racy fashion. You need to look there instead of adding quirk to
> >>> i2c-hid.
> >>
> >> The falling-edge solution was proposed by Elan themselves.
> >>
> >> Also if you look at: https://bugzilla.redhat.com/show_bug.cgi?id=1543769
> >>
> >> And esp. the "cat /proc/interrupts" output there, then you will see
> >> that the interrupt seems to be stuck at low level, which according
> >> to the ACPI tables is its active level.
> >
> > So how does it generate a new edge if it is stuck at low?
> >
> > Is it bad touchpad firmware that does not deassert interrupt quickly
> > enough?
>
> I do not believe it is not de-asserting it quick enough (I believe
> the amount of interrups is too high for that.
>
> It seems to simply be low most of the time, or it is really really
> slow with de-asserting.
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check
> if the amount of touchpads-interrupts still keeps increasing in this
> case?
>
> Also I believe that you had contact with Elan about this and they
> told you to change the interrupt type to falling-edge as work-around,
> right? Can you ask them why?

KT, do you know anything about using edge interrupts with your
hid-over-i2c products?

>
> This is quite unususal, I've collected quite a few DSDTs over time
> and I've just checked about 40 of them all with a PNP0C50 in
> some form (and in many cases multiple such devices) and NONE of
> them is using edged-interrupts in the ACPI config.

That is because MS spec for HID over I2C requires level interrupts:

"7.4 Interrupt Line Management

DEVICE is responsible to assert the interrupt line (level trigger
interrupt required) when it has data.

DEVICE must keep the interrupt line asserted until the data has been
fully read by the HOST. After this point, the DEVICE must de-assert
the interrupt line if it has no more data to report to the HOST.
When the DEVICE has new data to report, it must repeat the process above."

See https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn642101(v=vs.85)

>
> <speculation>
>
> I think that the Elan touchpad firmware supports a mode to
> work on devices which only support edge interrupts and that
> this mode is accidentally enabled in this firmware.
>
> I think that the interrupt line is simply low all the time
> and gets pulsed high then low again when the touchpad detects
> a finger. Hopefully it does this pulsing on every event and not
> only when its event "fifo" is empty.

This behavior would violate HID-over-I2C spec though.

>
> </speculation>
>
> > I scrolled through the bug but I do not see if it had been
> > confirmed that original windows installation actually uses edge (it
> > may very well be using it; Elan engineers pushed us to use edge in a
> > few cases, but they all boiled down to an issue with pin control/GPIO
> > implementation).
>
> This has not been checked on Windows AFAIK.
>
> >> As for this being a GPIO chip driver problem, this is using standard
> >> Intel pinctrl stuff, which is not showing this same issue with many
> >> other i2c-hid touchpads.
> >
> > Well, there have been plenty of issues in intel drivers, coupled with
> > "interesting" things done by firmware and boards.
> >
> > If you want to keep on using edge you need to make sure that i2c-hid
> > never loses edge, as replaying of previously disabled interrupts in
> > not at all reliable. So you need to "kick" the device after
> > enable_irq() by initiating read from it and be ready to not get any
> > data or get valid data and process accordingly.
>
> That is a good point and I agree.
>
> Vladislav, let me explain this a bit. Normally the touchpad
> driver the IRQ line low when it has touch-data to respond, which means
> that if touch-data is reported before the driver loads (or while
> the driver has the irq disabled during e.g. suspend), it will immediately
> see an interrupt. If we use edge mode then the IRQ will only trigger
> when the IRQ line goes from high to low, if this happens when the driver
> is not listening then we do not see the edge. And since we never read the
> pending touch-data, the IRQ line never goes high again (which it does
> when we have read all available data), so we will never see a negative-edge
> and then things are stuck.
>
> It would be good, if running a kernel with your patch, you can try
> to trigger this by:
>
> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do
> not close the lid
> 2) As soon as the system starts suspending and while it is suspended, move
> your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger around
> 4) Check if the touchpad still works after this
>
> Or by:
>
> 1) Using ctrl + alt + f3 to switch to a text console
> 2) Move finger around on touchpad, keep moving it around
> 3) Switch back to X11 with alt + f2 or alt + f7, while still moving the finger
> 4) Check if the touchpad still works after this
>
> If neither causes the touchpad to stop working, then at least the problem Dmitry
> fears is not easy to trigger, but we should probably still prepare to deal
> with it; and we really should try to better understand the problem here, so
> if you can answer my questions above, then that would be great.

Thanks.

--
Dmitry

2019-04-01 12:27:39

by 廖崇榮

[permalink] [raw]
Subject: RE: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi Dmitry,

-----Original Message-----
From: Dmitry Torokhov [mailto:[email protected]]
Sent: Saturday, March 30, 2019 2:24 AM
To: Hans de Goede; 廖崇榮
Cc: Vladislav Dalechyn; Benjamin Tissoires; Jiri Kosina; [email protected]; [email protected]; [email protected]; open list:HID CORE LAYER; lkml; [email protected]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

On Fri, Mar 29, 2019 at 5:18 AM Hans de Goede <[email protected]> wrote:
>
> Hi,
>
> On 3/25/19 5:56 PM, Dmitry Torokhov wrote:
> > Hi Hans,
> >
> > On Mon, Mar 25, 2019 at 9:38 AM Hans de Goede <[email protected]> wrote:
> >>
> >> Hi Dmitry,
> >>
> >> On 25-03-19 17:02, Dmitry Torokhov wrote:
> >>> Hi Vladislav,
> >>>
> >>> On Mon, Mar 25, 2019 at 5:57 AM Vladislav Dalechyn
> >>> <[email protected]> wrote:
> >>>>
> >>>> From: Vladislav Dalechyn <[email protected]>
> >>>>
> >>>> Description: The ELAN1200:04F3:303E touchpad exposes several
> >>>> issues, all caused by an error setting the correct IRQ_TRIGGER flag:
> >>>> - i2c_hid incoplete error flood in journalctl;
> >>>> - Five finger tap kill's module so you have to restart it;
> >>>> - Two finger scoll is working incorrect and sometimes even when
> >>>> you raised one of two finger still thinks that you are scrolling.
> >>>>
> >>>> Fix all of these with a new quirk that corrects the trigger flag
> >>>> announced by the ACPI tables. (edge-falling).
> >>>
> >>> I do not believe this is right solution. The driver makes liberal
> >>> use of disable_irq() and enable_irq() which may lead to lost edges
> >>> and touchpad stopping working altogether.
> >>>
> >>> Usually the "extra" report is caused by GPIO controller clearing
> >>> interrupt condition at the wrong time (too early), or in unsafe or
> >>> racy fashion. You need to look there instead of adding quirk to
> >>> i2c-hid.
> >>
> >> The falling-edge solution was proposed by Elan themselves.
> >>
> >> Also if you look at:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1543769
> >>
> >> And esp. the "cat /proc/interrupts" output there, then you will see
> >> that the interrupt seems to be stuck at low level, which according
> >> to the ACPI tables is its active level.
> >
> > So how does it generate a new edge if it is stuck at low?
> >
> > Is it bad touchpad firmware that does not deassert interrupt quickly
> > enough?
>
> I do not believe it is not de-asserting it quick enough (I believe the
> amount of interrups is too high for that.
>
> It seems to simply be low most of the time, or it is really really
> slow with de-asserting.
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check if the
> amount of touchpads-interrupts still keeps increasing in this case?
>
> Also I believe that you had contact with Elan about this and they told
> you to change the interrupt type to falling-edge as work-around,
> right? Can you ask them why?

KT, do you know anything about using edge interrupts with your hid-over-i2c products?

Ya, I sent the patch to Vladislav for testing his 5 finger-tap issue.

The patch is created by Dell/Intel for debugging Elan PTP's incomplete report message.
Host tried to get report again while touchpad finish report transmission.

I excerpt from Dell's mail
===========Begin=========================
We have worked with Intel for a while to revise Linux kernel driver code to prevent Touch I2C error message show up,
And Intel dig out there’re 2 issues to cause this i2c error on Linux kernel 4.18.

Therefore Intel suggested to revised SW code into “trigger falling edge + pm_disabled + IRQF_SHARED | IRQF_COND_SUSPEND(i2c-designware-master.c)”
and then we can get positive result that i2c error message disappeared when Touch Panel is working.

Since this issue only happened on ELAN touch pad + Intel GLK platform (Linux) and other Intel reference platform (Whisky lake) won’t, would you help check the solution (as patch attached) and let us know if any question/concern for this modification?
===========End=========================

I don't know the root cause of 5-finger issue with level trigger so far.
I will check it once I get the touchpad with the same issue.

From previous issue list, it seems that some touchpad's crush issue will be fixed by edge trigger.

I have discussed with FW engineer, both level and edge trigger should be OK for our PTP(HID touchpad).
I summary the behavior of our HID touchpad's interrupt.
1. Assert for Every finger's report, which means 5 ISR for 5 finger operation.
2. We will de-assert INT pin after the last byte's NACK signal.
I checked LA scope, 2nd finger's assert will happen after 10us if two finger touch.
SYNAPTICS's touchpad will de-assert it after address byte sent.

I am not sure if SYNAPTICE's earlier de-asserting is a better way for level-trigger?
At least we never see issue happen on our PTP in Windows 10.
>
> This is quite unususal, I've collected quite a few DSDTs over time and
> I've just checked about 40 of them all with a PNP0C50 in some form
> (and in many cases multiple such devices) and NONE of them is using
> edged-interrupts in the ACPI config.

That is because MS spec for HID over I2C requires level interrupts:

"7.4 Interrupt Line Management

DEVICE is responsible to assert the interrupt line (level trigger interrupt required) when it has data.

DEVICE must keep the interrupt line asserted until the data has been fully read by the HOST. After this point, the DEVICE must de-assert the interrupt line if it has no more data to report to the HOST.
When the DEVICE has new data to report, it must repeat the process above."

See https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn642101(v=vs.85)

>
> <speculation>
>
> I think that the Elan touchpad firmware supports a mode to work on
> devices which only support edge interrupts and that this mode is
> accidentally enabled in this firmware.
>
> I think that the interrupt line is simply low all the time and gets
> pulsed high then low again when the touchpad detects a finger.
> Hopefully it does this pulsing on every event and not only when its
> event "fifo" is empty.

This behavior would violate HID-over-I2C spec though.

>
> </speculation>
>
> > I scrolled through the bug but I do not see if it had been confirmed
> > that original windows installation actually uses edge (it may very
> > well be using it; Elan engineers pushed us to use edge in a few
> > cases, but they all boiled down to an issue with pin control/GPIO
> > implementation).
>
> This has not been checked on Windows AFAIK.
>
> >> As for this being a GPIO chip driver problem, this is using
> >> standard Intel pinctrl stuff, which is not showing this same issue
> >> with many other i2c-hid touchpads.
> >
> > Well, there have been plenty of issues in intel drivers, coupled
> > with "interesting" things done by firmware and boards.
> >
> > If you want to keep on using edge you need to make sure that i2c-hid
> > never loses edge, as replaying of previously disabled interrupts in
> > not at all reliable. So you need to "kick" the device after
> > enable_irq() by initiating read from it and be ready to not get any
> > data or get valid data and process accordingly.
>
> That is a good point and I agree.
>
> Vladislav, let me explain this a bit. Normally the touchpad driver the
> IRQ line low when it has touch-data to respond, which means that if
> touch-data is reported before the driver loads (or while the driver
> has the irq disabled during e.g. suspend), it will immediately see an
> interrupt. If we use edge mode then the IRQ will only trigger when the
> IRQ line goes from high to low, if this happens when the driver is not
> listening then we do not see the edge. And since we never read the
> pending touch-data, the IRQ line never goes high again (which it does
> when we have read all available data), so we will never see a
> negative-edge and then things are stuck.
>
> It would be good, if running a kernel with your patch, you can try to
> trigger this by:
>
> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do not
> close the lid
> 2) As soon as the system starts suspending and while it is suspended,
> move your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger
> around
> 4) Check if the touchpad still works after this
>
> Or by:
>
> 1) Using ctrl + alt + f3 to switch to a text console
> 2) Move finger around on touchpad, keep moving it around
> 3) Switch back to X11 with alt + f2 or alt + f7, while still moving
> the finger
> 4) Check if the touchpad still works after this
>
> If neither causes the touchpad to stop working, then at least the
> problem Dmitry fears is not easy to trigger, but we should probably
> still prepare to deal with it; and we really should try to better
> understand the problem here, so if you can answer my questions above, then that would be great.

Thanks.

--
Dmitry

2019-04-03 11:20:45

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi,

On 31-03-19 11:50, [email protected] wrote:
> Hi. I've done everything you said, here are results:
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check
> if the amount of touchpads-interrupts still keeps increasing in this
> case?
>
> IWI or IRQ work interrupts keep increasing with speed at least 3 interrupts/s.

I'm really only interested in the touchpad related IRQs, so e.g. the line
about "intel-gpio 129 ELAN1200:00", if you're seeing 3 interrupts/s on
some others that is fine, so I take it the ELAN1200:00 interrupt count
does not increase on an *unpatched* kernel, unless you use the touchpad?

> Also when I am moving touchpad IR-IO-APIC 14-fasteoi INT345D:00 get's triggered and increased.

That is the GPIO controller interrupt, so that one increasing is normal.

If I understand things correctly then this all means that the IRQ indeed
is a normal level IRQ and Dmitry is likely correct that there is an
pinctrl / gpiochip driver problem here.

Can you try the following with an *unpatched* kernel? :

1) Run "cat /proc/interrupts | grep ELAN" , note down the value
2) Very quickly/briefly touch the touchpad once
3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
4) Subtract result from 1. from result from 3, this difference is
the value we are interested in. E.g. my testing got 254 and 257, so
a difference of 3.

The goal here is to get an as low as possible difference. Feel free
to repeat this a couple of times.

On an Apollo Lake laptop with an I2C hid mt touchpad I can get the
amount of interrupts triggered for a single touch down to 3,
given the huge interrupt counts of 130000+ reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=1543769

I expect you to get a much bigger smallest possible difference
between 2 "cat /proc/interrupts | grep ELAN" commands, note a
difference of 0 means your touch did not register.

Assuming you indeed see much more interrupts for a very quick
touch + release, then we indeed have an interrupt handling problem
we need to investigate further.

> I don't know if it's important or not, but for some reason these interrupts keep popping only on CPU2 (i have 4cpu processor).

That does not matter.

> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do
> not close the lid
> 2) As soon as the system starts suspending and while it is suspended, move
> your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger around
> 4) Check if the touchpad still works after this
>
> It works, but as it seems, looses edge. JournalCTL is being flooded with i2c_hid_get_input: incomplete report (16/65535)

That is probably a different issue. If you loose the edge IRQ, then the touchpad
would stop working without any messages. I believe that the suspend / resume
issue may be fixed by:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=52cf93e63ee672a92f349edc6ddad86ec8808fd8

Does your kernel have this commit? (please always use the latest kernel while
testing).

> Also a thing to notice, that after manually removing and modprobing i2c_hid module, it says next in journalctl:
>
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using dummy regulator
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using dummy regulator

Those messages can safely be ignored.

Regards,

Hans

2019-04-11 16:19:07

by Kai-Heng Feng

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi,

at 05:18, <[email protected]> <[email protected]> wrote:

> Hi.
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
> I've tested that, main diffs are 30, 24, 16 (the most frequent), 2 (the
> least frequent).
>
> I was using 4.19.13 kernel, because I use ParrotOS (which happens to be
> Debian distribution).
> But I've installed experimental 5.0.0 kernel and I can't say right now if
> suspend problem is resolved (i have to rebuild latest kernel with patch).

Can you try below fix?

This can solve what commit 1475af255e18 ("HID: i2c-hid: Ignore input report
if there's no data present on Elan touchpanels”) tries to workaround.

diff --git a/drivers/pinctrl/intel/pinctrl-intel.c
b/drivers/pinctrl/intel/pinctrl-intel.c
index c19a4c45f7bb..30e3664f1ae5 100644
--- a/drivers/pinctrl/intel/pinctrl-intel.c
+++ b/drivers/pinctrl/intel/pinctrl-intel.c
@@ -957,6 +957,10 @@ static void intel_gpio_irq_mask_unmask(struct irq_data
*d, bool mask)
reg = community->regs + community->ie_offset + gpp * 4;

raw_spin_lock_irqsave(&pctrl->lock, flags);
+
+ if (!mask)
+ writel(BIT(gpp_offset), community->regs +
community->is_offset + gpp * 4);
+
value = readl(reg);
if (mask)
value &= ~BIT(gpp_offset);


>
> Regards,
> Vladislav.
>
> Apr 3, 2019, 2:18 PM by [email protected]:
> Hi,
>
> On 31-03-19 11:50, [email protected] wrote:
> Hi. I've done everything you said, here are results:
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check
> if the amount of touchpads-interrupts still keeps increasing in this
> case?
>
> IWI or IRQ work interrupts keep increasing with speed at least 3
> interrupts/s.
>
> I'm really only interested in the touchpad related IRQs, so e.g. the line
> about "intel-gpio 129 ELAN1200:00", if you're seeing 3 interrupts/s on
> some others that is fine, so I take it the ELAN1200:00 interrupt count
> does not increase on an *unpatched* kernel, unless you use the touchpad?
> Also when I am moving touchpad IR-IO-APIC 14-fasteoi INT345D:00 get's
> triggered and increased.
>
> That is the GPIO controller interrupt, so that one increasing is normal.
>
> If I understand things correctly then this all means that the IRQ indeed
> is a normal level IRQ and Dmitry is likely correct that there is an
> pinctrl / gpiochip driver problem here.
>
> Can you try the following with an *unpatched* kernel? :
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
>
> The goal here is to get an as low as possible difference. Feel free
> to repeat this a couple of times.
>
> On an Apollo Lake laptop with an I2C hid mt touchpad I can get the
> amount of interrupts triggered for a single touch down to 3,
> given the huge interrupt counts of 130000+ reported in:
> https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>
> I expect you to get a much bigger smallest possible difference
> between 2 "cat /proc/interrupts | grep ELAN" commands, note a
> difference of 0 means your touch did not register.
>
> Assuming you indeed see much more interrupts for a very quick
> touch + release, then we indeed have an interrupt handling problem
> we need to investigate further.
> I don't know if it's important or not, but for some reason these
> interrupts keep popping only on CPU2 (i have 4cpu processor).
>
> That does not matter.
> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do
> not close the lid
> 2) As soon as the system starts suspending and while it is suspended, move
> your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger around
> 4) Check if the touchpad still works after this
>
> It works, but as it seems, looses edge. JournalCTL is being flooded with
> i2c_hid_get_input: incomplete report (16/65535)
>
> That is probably a different issue. If you loose the edge IRQ, then the
> touchpad
> would stop working without any messages. I believe that the suspend /
> resume
> issue may be fixed by:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=52cf93e63ee672a92f349edc6ddad86ec8808fd8
>
> Does your kernel have this commit? (please always use the latest kernel
> while
> testing).
> Also a thing to notice, that after manually removing and modprobing
> i2c_hid module, it says next in journalctl:
>
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using
> dummy regulator
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using
> dummy regulator
>
> Those messages can safely be ignored.
>
> Regards,
>
> Hans

2019-04-13 08:44:26

by Kai-Heng Feng

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

at 16:40, <[email protected]> <[email protected]> wrote:

> Hi.
>
> I've applied this patch, but still getting incomplete report messages.

Does the patch fix the other two issues:
- Five finger tap kill's module so you have to restart it;
- Two finger scoll is working incorrect and sometimes even when you
raised one of two finger still thinks that you are scrolling.

Kai-Heng

>
> Regards,
> Vladislav
>
> Apr 11, 2019, 7:17 PM by [email protected]:
> Hi,
>
> at 05:18, <[email protected]> <[email protected]> wrote:
> Hi.
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
> I've tested that, main diffs are 30, 24, 16 (the most frequent), 2 (the
> least frequent).
>
> I was using 4.19.13 kernel, because I use ParrotOS (which happens to be
> Debian distribution).
> But I've installed experimental 5.0.0 kernel and I can't say right now if
> suspend problem is resolved (i have to rebuild latest kernel with patch).
>
> Can you try below fix?
>
> This can solve what commit 1475af255e18 ("HID: i2c-hid: Ignore input
> report if there's no data present on Elan touchpanels”) tries to
> workaround.
>
> diff --git a/drivers/pinctrl/intel/pinctrl-intel.c
> b/drivers/pinctrl/intel/pinctrl-intel.c
> index c19a4c45f7bb..30e3664f1ae5 100644
> --- a/drivers/pinctrl/intel/pinctrl-intel.c
> +++ b/drivers/pinctrl/intel/pinctrl-intel.c
> @@ -957,6 +957,10 @@ static void intel_gpio_irq_mask_unmask(struct
> irq_data *d, bool mask)
> reg = community->regs + community->ie_offset + gpp * 4;
>
> raw_spin_lock_irqsave(&pctrl->lock, flags);
> +
> + if (!mask)
> + writel(BIT(gpp_offset), community->regs + community->is_offset + gpp *
> 4);
> +
> value = readl(reg);
> if (mask)
> value &= ~BIT(gpp_offset);
>
>
> Regards,
> Vladislav.
>
> Apr 3, 2019, 2:18 PM by [email protected]:
> Hi,
>
> On 31-03-19 11:50, [email protected] wrote:
> Hi. I've done everything you said, here are results:
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check
> if the amount of touchpads-interrupts still keeps increasing in this
> case?
>
> IWI or IRQ work interrupts keep increasing with speed at least 3
> interrupts/s.
>
> I'm really only interested in the touchpad related IRQs, so e.g. the line
> about "intel-gpio 129 ELAN1200:00", if you're seeing 3 interrupts/s on
> some others that is fine, so I take it the ELAN1200:00 interrupt count
> does not increase on an *unpatched* kernel, unless you use the touchpad?
> Also when I am moving touchpad IR-IO-APIC 14-fasteoi INT345D:00 get's
> triggered and increased.
>
> That is the GPIO controller interrupt, so that one increasing is normal.
>
> If I understand things correctly then this all means that the IRQ indeed
> is a normal level IRQ and Dmitry is likely correct that there is an
> pinctrl / gpiochip driver problem here.
>
> Can you try the following with an *unpatched* kernel? :
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
>
> The goal here is to get an as low as possible difference. Feel free
> to repeat this a couple of times.
>
> On an Apollo Lake laptop with an I2C hid mt touchpad I can get the
> amount of interrupts triggered for a single touch down to 3,
> given the huge interrupt counts of 130000+ reported in:
> https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>
> I expect you to get a much bigger smallest possible difference
> between 2 "cat /proc/interrupts | grep ELAN" commands, note a
> difference of 0 means your touch did not register.
>
> Assuming you indeed see much more interrupts for a very quick
> touch + release, then we indeed have an interrupt handling problem
> we need to investigate further.
> I don't know if it's important or not, but for some reason these
> interrupts keep popping only on CPU2 (i have 4cpu processor).
>
> That does not matter.
> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do
> not close the lid
> 2) As soon as the system starts suspending and while it is suspended, move
> your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger around
> 4) Check if the touchpad still works after this
>
> It works, but as it seems, looses edge. JournalCTL is being flooded with
> i2c_hid_get_input: incomplete report (16/65535)
>
> That is probably a different issue. If you loose the edge IRQ, then the
> touchpad
> would stop working without any messages. I believe that the suspend /
> resume
> issue may be fixed by:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=52cf93e63ee672a92f349edc6ddad86ec8808fd8
>
> Does your kernel have this commit? (please always use the latest kernel
> while
> testing).
> Also a thing to notice, that after manually removing and modprobing
> i2c_hid module, it says next in journalctl:
>
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using
> dummy regulator
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using
> dummy regulator
>
> Those messages can safely be ignored.
>
> Regards,
>
> Hans


2019-04-15 11:44:48

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

Hi,

On 15-04-19 13:36, [email protected] wrote:
> Sorry for the delay.
> By applying this patch I get next results:
> Five finger tap and two finger scroll issues disappear, but after suspend touchpad dies. Restarting module doesn't help.

So bascally the same results as with the edge-triggered interrupt patch/hack,
right?

Are you still using the edge-triggered interrupt patch, or just the new
patch Kai-Heng Feng provided.

To me it sounds like the patch Kai-Heng Feng provided at least removes
the need for the edge-triggered interrupt patch/hack and what remains to
be solved is the suspend/resume issues.

Regards,

Hans



> Here's the log:
>
> Apr 15 14:35:54 parrot sudo[3473]: h0tw4t3r : TTY=pts/1 ; PWD=/home/h0tw4t3r ; USER=root ; COMMAND=/sbin/rmmod i2c_hid
> Apr 15 14:35:54 parrot sudo[3478]: h0tw4t3r : TTY=pts/1 ; PWD=/home/h0tw4t3r ; USER=root ; COMMAND=/sbin/modprobe i2c_hid
> Apr 15 14:35:54 parrot kernel: i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using dummy regulator
> Apr 15 14:35:54 parrot kernel: i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using dummy regulator
>
> Could you please explain what this patch does?
>
> Regards,
> Vladislav.
>
> Apr 13, 2019, 11:42 AM by [email protected]:
>
> at 16:40, <[email protected] <mailto:[email protected]>> <[email protected] <mailto:[email protected]>> wrote:
>
> Hi.
>
> I've applied this patch, but still getting incomplete report messages.
>
>
> Does the patch fix the other two issues:
> - Five finger tap kill's module so you have to restart it;
> - Two finger scoll is working incorrect and sometimes even when you
> raised one of two finger still thinks that you are scrolling.
>
> Kai-Heng
>
>
> Regards,
> Vladislav
>
> Apr 11, 2019, 7:17 PM by [email protected] <mailto:[email protected]>:
> Hi,
>
> at 05:18, <[email protected] <mailto:[email protected]>> <[email protected] <mailto:[email protected]>> wrote:
> Hi.
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
> I've tested that, main diffs are 30, 24, 16 (the most frequent), 2 (the least frequent).
>
> I was using 4.19.13 kernel, because I use ParrotOS (which happens to be Debian distribution).
> But I've installed experimental 5.0.0 kernel and I can't say right now if suspend problem is resolved (i have to rebuild latest kernel with patch).
>
> Can you try below fix?
>
> This can solve what commit 1475af255e18 ("HID: i2c-hid: Ignore input report if there's no data present on Elan touchpanels”) tries to workaround.
>
> diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c
> index c19a4c45f7bb..30e3664f1ae5 100644
> --- a/drivers/pinctrl/intel/pinctrl-intel.c
> +++ b/drivers/pinctrl/intel/pinctrl-intel.c
> @@ -957,6 +957,10 @@ static void intel_gpio_irq_mask_unmask(struct irq_data *d, bool mask)
> reg = community->regs + community->ie_offset + gpp * 4;
>
> raw_spin_lock_irqsave(&pctrl->lock, flags);
> +
> + if (!mask)
> + writel(BIT(gpp_offset), community->regs + community->is_offset + gpp * 4);
> +
> value = readl(reg);
> if (mask)
> value &= ~BIT(gpp_offset);
>
>
> Regards,
> Vladislav.
>
> Apr 3, 2019, 2:18 PM by [email protected] <mailto:[email protected]>:
> Hi,
>
> On 31-03-19 11:50, [email protected] <mailto:[email protected]> wrote:
> Hi. I've done everything you said, here are results:
>
> Vladislav can you check the output of /cat/interrupts on a kernel
> without the patch and while *not* using the touchpad; and check
> if the amount of touchpads-interrupts still keeps increasing in this
> case?
>
> IWI or IRQ work interrupts keep increasing with speed at least 3 interrupts/s.
>
> I'm really only interested in the touchpad related IRQs, so e.g. the line
> about "intel-gpio 129 ELAN1200:00", if you're seeing 3 interrupts/s on
> some others that is fine, so I take it the ELAN1200:00 interrupt count
> does not increase on an *unpatched* kernel, unless you use the touchpad?
> Also when I am moving touchpad IR-IO-APIC 14-fasteoi INT345D:00 get's triggered and increased.
>
> That is the GPIO controller interrupt, so that one increasing is normal.
>
> If I understand things correctly then this all means that the IRQ indeed
> is a normal level IRQ and Dmitry is likely correct that there is an
> pinctrl / gpiochip driver problem here.
>
> Can you try the following with an *unpatched* kernel? :
>
> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
> 2) Very quickly/briefly touch the touchpad once
> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
> 4) Subtract result from 1. from result from 3, this difference is
> the value we are interested in. E.g. my testing got 254 and 257, so
> a difference of 3.
>
> The goal here is to get an as low as possible difference. Feel free
> to repeat this a couple of times.
>
> On an Apollo Lake laptop with an I2C hid mt touchpad I can get the
> amount of interrupts triggered for a single touch down to 3,
> given the huge interrupt counts of 130000+ reported in:
> https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>
> I expect you to get a much bigger smallest possible difference
> between 2 "cat /proc/interrupts | grep ELAN" commands, note a
> difference of 0 means your touch did not register.
>
> Assuming you indeed see much more interrupts for a very quick
> touch + release, then we indeed have an interrupt handling problem
> we need to investigate further.
> I don't know if it's important or not, but for some reason these interrupts keep popping only on CPU2 (i have 4cpu processor).
>
> That does not matter.
> 1) Suspending the machine by selecting suspend from a menu in your
> desktop environment, or by briefly pressing the power-button, do
> not close the lid
> 2) As soon as the system starts suspending and while it is suspended, move
> your finger around the touchpad
> 3) Wake the system up with the powerbutton while moving your finger around
> 4) Check if the touchpad still works after this
>
> It works, but as it seems, looses edge. JournalCTL is being flooded with i2c_hid_get_input: incomplete report (16/65535)
>
> That is probably a different issue. If you loose the edge IRQ, then the touchpad
> would stop working without any messages. I believe that the suspend / resume
> issue may be fixed by:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=52cf93e63ee672a92f349edc6ddad86ec8808fd8
>
> Does your kernel have this commit? (please always use the latest kernel while
> testing).
> Also a thing to notice, that after manually removing and modprobing i2c_hid module, it says next in journalctl:
>
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using dummy regulator
> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using dummy regulator
>
> Those messages can safely be ignored.
>
> Regards,
>
> Hans
>
>

2019-04-16 04:00:43

by Kai-Heng Feng

[permalink] [raw]
Subject: Re: [PATCH] ELAN touchpad i2c_hid bugs fix

at 19:42, Hans de Goede <[email protected]> wrote:

> Hi,
>
> On 15-04-19 13:36, [email protected] wrote:
>> Sorry for the delay.
>> By applying this patch I get next results:
>> Five finger tap and two finger scroll issues disappear, but after
>> suspend touchpad dies. Restarting module doesn't help.
>
> So bascally the same results as with the edge-triggered interrupt
> patch/hack,
> right?
>
> Are you still using the edge-triggered interrupt patch, or just the new
> patch Kai-Heng Feng provided.
>
> To me it sounds like the patch Kai-Heng Feng provided at least removes
> the need for the edge-triggered interrupt patch/hack and what remains to
> be solved is the suspend/resume issues.

Great! I’ll send a patch to address this issue.

Kai-Heng

>
> Regards,
>
> Hans
>
>
>
>> Here's the log:
>> Apr 15 14:35:54 parrot sudo[3473]: h0tw4t3r : TTY=pts/1 ;
>> PWD=/home/h0tw4t3r ; USER=root ; COMMAND=/sbin/rmmod i2c_hid
>> Apr 15 14:35:54 parrot sudo[3478]: h0tw4t3r : TTY=pts/1 ;
>> PWD=/home/h0tw4t3r ; USER=root ; COMMAND=/sbin/modprobe i2c_hid
>> Apr 15 14:35:54 parrot kernel: i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00
>> supply vdd not found, using dummy regulator
>> Apr 15 14:35:54 parrot kernel: i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00
>> supply vddl not found, using dummy regulator
>> Could you please explain what this patch does?
>> Regards,
>> Vladislav.
>> Apr 13, 2019, 11:42 AM by [email protected]:
>> at 16:40, <[email protected] <mailto:[email protected]>> <[email protected] <mailto:[email protected]>> wrote:
>> Hi.
>> I've applied this patch, but still getting incomplete report messages.
>> Does the patch fix the other two issues:
>> - Five finger tap kill's module so you have to restart it;
>> - Two finger scoll is working incorrect and sometimes even when you
>> raised one of two finger still thinks that you are scrolling.
>> Kai-Heng
>> Regards,
>> Vladislav
>> Apr 11, 2019, 7:17 PM by [email protected] <mailto:[email protected]>:
>> Hi,
>> at 05:18, <[email protected] <mailto:[email protected]>> <[email protected] <mailto:[email protected]>> wrote:
>> Hi.
>> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
>> 2) Very quickly/briefly touch the touchpad once
>> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
>> 4) Subtract result from 1. from result from 3, this difference is
>> the value we are interested in. E.g. my testing got 254 and 257, so
>> a difference of 3.
>> I've tested that, main diffs are 30, 24, 16 (the most frequent), 2 (the least frequent).
>> I was using 4.19.13 kernel, because I use ParrotOS (which happens to be Debian distribution).
>> But I've installed experimental 5.0.0 kernel and I can't say right now if suspend problem is resolved (i have to rebuild latest kernel with patch).
>> Can you try below fix?
>> This can solve what commit 1475af255e18 ("HID: i2c-hid: Ignore input report if there's no data present on Elan touchpanels”) tries to workaround.
>> diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c
>> index c19a4c45f7bb..30e3664f1ae5 100644
>> --- a/drivers/pinctrl/intel/pinctrl-intel.c
>> +++ b/drivers/pinctrl/intel/pinctrl-intel.c
>> @@ -957,6 +957,10 @@ static void intel_gpio_irq_mask_unmask(struct irq_data *d, bool mask)
>> reg = community->regs + community->ie_offset + gpp * 4;
>> raw_spin_lock_irqsave(&pctrl->lock, flags);
>> +
>> + if (!mask)
>> + writel(BIT(gpp_offset), community->regs + community->is_offset + gpp * 4);
>> +
>> value = readl(reg);
>> if (mask)
>> value &= ~BIT(gpp_offset);
>> Regards,
>> Vladislav.
>> Apr 3, 2019, 2:18 PM by [email protected] <mailto:[email protected]>:
>> Hi,
>> On 31-03-19 11:50, [email protected] <mailto:[email protected]> wrote:
>> Hi. I've done everything you said, here are results:
>> Vladislav can you check the output of /cat/interrupts on a kernel
>> without the patch and while *not* using the touchpad; and check
>> if the amount of touchpads-interrupts still keeps increasing in this
>> case?
>> IWI or IRQ work interrupts keep increasing with speed at least 3 interrupts/s.
>> I'm really only interested in the touchpad related IRQs, so e.g. the line
>> about "intel-gpio 129 ELAN1200:00", if you're seeing 3 interrupts/s on
>> some others that is fine, so I take it the ELAN1200:00 interrupt count
>> does not increase on an *unpatched* kernel, unless you use the touchpad?
>> Also when I am moving touchpad IR-IO-APIC 14-fasteoi INT345D:00 get's triggered and increased.
>> That is the GPIO controller interrupt, so that one increasing is normal.
>> If I understand things correctly then this all means that the IRQ indeed
>> is a normal level IRQ and Dmitry is likely correct that there is an
>> pinctrl / gpiochip driver problem here.
>> Can you try the following with an *unpatched* kernel? :
>> 1) Run "cat /proc/interrupts | grep ELAN" , note down the value
>> 2) Very quickly/briefly touch the touchpad once
>> 3) Run "cat /proc/interrupts | grep ELAN" , note down the value again
>> 4) Subtract result from 1. from result from 3, this difference is
>> the value we are interested in. E.g. my testing got 254 and 257, so
>> a difference of 3.
>> The goal here is to get an as low as possible difference. Feel free
>> to repeat this a couple of times.
>> On an Apollo Lake laptop with an I2C hid mt touchpad I can get the
>> amount of interrupts triggered for a single touch down to 3,
>> given the huge interrupt counts of 130000+ reported in:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1543769
>> I expect you to get a much bigger smallest possible difference
>> between 2 "cat /proc/interrupts | grep ELAN" commands, note a
>> difference of 0 means your touch did not register.
>> Assuming you indeed see much more interrupts for a very quick
>> touch + release, then we indeed have an interrupt handling problem
>> we need to investigate further.
>> I don't know if it's important or not, but for some reason these interrupts keep popping only on CPU2 (i have 4cpu processor).
>> That does not matter.
>> 1) Suspending the machine by selecting suspend from a menu in your
>> desktop environment, or by briefly pressing the power-button, do
>> not close the lid
>> 2) As soon as the system starts suspending and while it is suspended, move
>> your finger around the touchpad
>> 3) Wake the system up with the powerbutton while moving your finger around
>> 4) Check if the touchpad still works after this
>> It works, but as it seems, looses edge. JournalCTL is being flooded with i2c_hid_get_input: incomplete report (16/65535)
>> That is probably a different issue. If you loose the edge IRQ, then the touchpad
>> would stop working without any messages. I believe that the suspend / resume
>> issue may be fixed by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=52cf93e63ee672a92f349edc6ddad86ec8808fd8
>> Does your kernel have this commit? (please always use the latest kernel while
>> testing).
>> Also a thing to notice, that after manually removing and modprobing i2c_hid module, it says next in journalctl:
>> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vdd not found, using dummy regulator
>> i2c_hid i2c-ELAN1200:00: i2c-ELAN1200:00 supply vddl not found, using dummy regulator
>> Those messages can safely be ignored.
>> Regards,
>> Hans