The previous iteration of this was a bit naïve about bus error code
consistency from i2c drivers. Instead of trying to differentiate between
them, we now handle all bus errors during HID descriptor fetch the same
with a single debug log, as suggested by Doug[0].
As the change is relatively minor, I have carried over Łukasz' Tested-by
and Reviewed-by tags.
Third time's the charm?
[1] https://lore.kernel.org/all/CAD=FV=Xr6NsW085Sc+NhVmGDOn-zCCQ65CMNce_DsHxtXUgm9w@mail.gmail.com/
To avoid error messages when a device is not present, b3a81b6c4fc6 added
an initial bus probe using a dummy i2c_smbus_read_byte() call.
Without this probe, i2c_hid_fetch_hid_descriptor() will fail internally
on a bus error and log. Treat the bus error as a missing device and
remove the error log so we can do away with the probe.
Tested-by: Lukasz Majczak <[email protected]>
Reviewed-by: Lukasz Majczak <[email protected]>
Signed-off-by: Kenny Levinsen <[email protected]>
---
drivers/hid/i2c-hid/i2c-hid-core.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index d965382196c6..6ffa43d245b4 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -872,12 +872,11 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid)
ihid->wHIDDescRegister,
&ihid->hdesc,
sizeof(ihid->hdesc));
- if (error) {
- dev_err(&ihid->client->dev,
- "failed to fetch HID descriptor: %d\n",
- error);
- return -ENODEV;
- }
+
+ /* The i2c drivers are a bit inconsistent with their error
+ * codes, so treat everything as -ENXIO for now. */
+ if (error)
+ return -ENXIO;
}
/* Validate the length of HID descriptor, the 4 first bytes:
@@ -992,17 +991,9 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid)
struct hid_device *hid = ihid->hid;
int ret;
- /* Make sure there is something at this address */
- ret = i2c_smbus_read_byte(client);
- if (ret < 0) {
- i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret);
- return -ENXIO;
- }
-
ret = i2c_hid_fetch_hid_descriptor(ihid);
if (ret < 0) {
- dev_err(&client->dev,
- "Failed to fetch the HID Descriptor\n");
+ i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret);
return ret;
}
--
2.44.0
The retry for HID descriptor and for power commands deals with the same
device quirk, so align the two.
Tested-by: Lukasz Majczak <[email protected]>
Reviewed-by: Lukasz Majczak <[email protected]>
Signed-off-by: Kenny Levinsen <[email protected]>
---
drivers/hid/i2c-hid/i2c-hid-core.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index 6ac1b11fb675..4ec12c083714 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -385,25 +385,21 @@ static int i2c_hid_set_power(struct i2c_hid *ihid, int power_state)
i2c_hid_dbg(ihid, "%s\n", __func__);
/*
- * Some devices require to send a command to wakeup before power on.
- * The call will get a return value (EREMOTEIO) but device will be
- * triggered and activated. After that, it goes like a normal device.
+ * Some STM-based devices need 400µs after a rising clock edge to wake
+ * from deep sleep, which in turn means that our first command will
+ * fail on a bus error. Certain Weida Tech devices also need this
+ * wake-up. Retry the command in this case.
*/
- if (power_state == I2C_HID_PWR_ON) {
+ ret = i2c_hid_set_power_command(ihid, power_state);
+ if (ret && power_state == I2C_HID_PWR_ON) {
+ usleep_range(400, 500);
ret = i2c_hid_set_power_command(ihid, I2C_HID_PWR_ON);
-
- /* Device was already activated */
- if (!ret)
- goto set_pwr_exit;
}
- ret = i2c_hid_set_power_command(ihid, power_state);
if (ret)
dev_err(&ihid->client->dev,
"failed to change power setting.\n");
-set_pwr_exit:
-
/*
* The HID over I2C specification states that if a DEVICE needs time
* after the PWR_ON request, it should utilise CLOCK stretching.
--
2.44.0
Some STM microcontrollers need 400µs after rising clock edge in order to
come out of their deep sleep state. This in turn means that the first
command sent to them will fail on a bus error.
Retry once on bus error to see if the device came alive, otherwise treat
the error as if no device was present like before.
Link: https://lore.kernel.org/all/[email protected]/#t
Co-developed-by: Radoslaw Biernacki <[email protected]>
Co-developed-by: Lukasz Majczak <[email protected]>
Tested-by: Lukasz Majczak <[email protected]>
Reviewed-by: Lukasz Majczak <[email protected]>
Signed-off-by: Kenny Levinsen <[email protected]>
---
drivers/hid/i2c-hid/i2c-hid-core.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
index 6ffa43d245b4..6ac1b11fb675 100644
--- a/drivers/hid/i2c-hid/i2c-hid-core.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -991,8 +991,17 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid)
struct hid_device *hid = ihid->hid;
int ret;
+ /*
+ * Some STM-based devices need 400µs after a rising clock edge to wake
+ * from deep sleep, which in turn means that our first command will
+ * fail on a bus error. Retry the command in this case.
+ */
ret = i2c_hid_fetch_hid_descriptor(ihid);
- if (ret < 0) {
+ if (ret == -ENXIO) {
+ usleep_range(400, 500);
+ ret = i2c_hid_fetch_hid_descriptor(ihid);
+ }
+ if (ret) {
i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret);
return ret;
}
--
2.44.0
Hi Kenny, Lukasz,
On Sat, Apr 27, 2024 at 12:47:07AM +0200, Kenny Levinsen wrote:
> To avoid error messages when a device is not present, b3a81b6c4fc6 added
> an initial bus probe using a dummy i2c_smbus_read_byte() call.
>
> Without this probe, i2c_hid_fetch_hid_descriptor() will fail internally
> on a bus error and log. Treat the bus error as a missing device and
> remove the error log so we can do away with the probe.
>
> Tested-by: Lukasz Majczak <[email protected]>
> Reviewed-by: Lukasz Majczak <[email protected]>
> Signed-off-by: Kenny Levinsen <[email protected]>
> ---
> drivers/hid/i2c-hid/i2c-hid-core.c | 21 ++++++---------------
> 1 file changed, 6 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
> index d965382196c6..6ffa43d245b4 100644
> --- a/drivers/hid/i2c-hid/i2c-hid-core.c
> +++ b/drivers/hid/i2c-hid/i2c-hid-core.c
> @@ -872,12 +872,11 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid)
> ihid->wHIDDescRegister,
> &ihid->hdesc,
> sizeof(ihid->hdesc));
> - if (error) {
> - dev_err(&ihid->client->dev,
> - "failed to fetch HID descriptor: %d\n",
> - error);
> - return -ENODEV;
> - }
> +
> + /* The i2c drivers are a bit inconsistent with their error
> + * codes, so treat everything as -ENXIO for now. */
> + if (error)
> + return -ENXIO;
I really think we should differentiate the cases "we do not know if
there is a device" vs "we do known that there is a device and we have
strong expectation of what that device is, and we do not expect
communication to fail".
Because of that I think we should have separate retry for the original
i2c_smbus_read_byte() (if you want you can make a wrapper around it,
something like i2c_hid_check_device_present()", and if there is a
concern that we will run into similar issue on resume (either from
suspend or from hibernate) then we can have similar retry in
i2c_hid_fetch_hid_descriptor().
But I do think that i2c_hid_fetch_hid_descriptor() should not try to
clobber the error but rather log the true one.
> }
>
> /* Validate the length of HID descriptor, the 4 first bytes:
> @@ -992,17 +991,9 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid)
> struct hid_device *hid = ihid->hid;
> int ret;
>
> - /* Make sure there is something at this address */
> - ret = i2c_smbus_read_byte(client);
> - if (ret < 0) {
> - i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret);
> - return -ENXIO;
> - }
> -
> ret = i2c_hid_fetch_hid_descriptor(ihid);
> if (ret < 0) {
> - dev_err(&client->dev,
> - "Failed to fetch the HID Descriptor\n");
> + i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret);
> return ret;
> }
>
> --
> 2.44.0
>
>
Thanks.
--
Dmitry
On 4/27/24 5:21 AM, Dmitry Torokhov wrote:
> I really think we should differentiate the cases "we do not know if
> there is a device" vs "we do known that there is a device and we have
> strong expectation of what that device is, and we do not expect
> communication to fail".
My reasoning was that there is no difference between looking for address
acknowledge on a probe read vs. a real command. Unfortunately, I ran
into some issues with error code consistency that Doug highlighted...
Considering that the smbus probe bails on *any* error, it's really only
ACK'd address + NACK'd register that remains, and I thought it maybe
wouldn't be too harmful to just always have a debug log as suggested.
However, I would still like *more* good errors by being specific about
the error condition, and I plan to send some patches to get the number
of drivers sending ENXIO up so we can comfortably rely on it in a future
i2c-hid patch.
If you don't think it's acceptable to leave this as a pure debug print
for now, I'll send a patch with just a minor clean-up and Łukasz' delays
- then I'll try this again later when the driver situation has improved.
I've been rapid-firing revisions, so I'll await an ACK this time... :)
---
For some context, I started looking at the i2c-hid driver after a recent
regression around assumed Windows behavior, and found that the actual
behavior differed a lot from our assumptions. Windows gets the job done
notably quicker, with fewer messages and with shorter albeit differently
placed delays.
My plan is to send patches that clean up and aligns our traffic more,
speeding things up and hopefully deprecating some quirks. To that end, I
would also like to get rid of the dummy read once we're comfortable with it.
On Sat, Apr 27, 2024 at 03:20:07PM +0200, Kenny Levinsen wrote:
> On 4/27/24 5:21 AM, Dmitry Torokhov wrote:
> > I really think we should differentiate the cases "we do not know if
> > there is a device" vs "we do known that there is a device and we have
> > strong expectation of what that device is, and we do not expect
> > communication to fail".
>
> My reasoning was that there is no difference between looking for address
> acknowledge on a probe read vs. a real command. Unfortunately, I ran into
> some issues with error code consistency that Doug highlighted...
I actually believe there is. On Chromebooks we may source components
from several vendors and use them in our devices. The components
are electrically compatible with each other, have exactly the same
connector, and therefore interchangeable. Because of that at probe time
we do not quite know if the device is there at given address, or not
(i.e. the touchpad could be from a different vendor and listening on
another address) and we need to make a quick determination whether we
should continue with probe or not.
Once we decided that the probe should continue we commit to it and all
subsequent errors from the device should be treated as unexpected errors
worthy of logging. On resume we do not expect hardware configuration to
change, so again when resuming we also treat errors as errors and log
them and fail resume.
>
> Considering that the smbus probe bails on *any* error, it's really only
> ACK'd address + NACK'd register that remains, and I thought it maybe
> wouldn't be too harmful to just always have a debug log as suggested.
> However, I would still like *more* good errors by being specific about the
> error condition, and I plan to send some patches to get the number of
> drivers sending ENXIO up so we can comfortably rely on it in a future
> i2c-hid patch.
>
> If you don't think it's acceptable to leave this as a pure debug print for
> now, I'll send a patch with just a minor clean-up and Łukasz' delays - then
> I'll try this again later when the driver situation has improved. I've been
> rapid-firing revisions, so I'll await an ACK this time... :)
>
> ---
>
> For some context, I started looking at the i2c-hid driver after a recent
> regression around assumed Windows behavior, and found that the actual
> behavior differed a lot from our assumptions. Windows gets the job done
> notably quicker, with fewer messages and with shorter albeit differently
> placed delays.
I am not sure we can fully unify what Windows does and what Linux does,
mainly because our firmwares are different (I think Windows devices do a
lot more device discovery in firmware, Chrome OS historically tried to
limit amount of code in its firmware). We also need to make sure it
works on non-ACPI systems/ARM.
Thanks.
--
Dmitry
On 4/30/24 11:41 PM, Dmitry Torokhov wrote:
> I actually believe there is. On Chromebooks we may source components
> from several vendors and use them in our devices. The components
> are electrically compatible with each other, have exactly the same
> connector, and therefore interchangeable. Because of that at probe time
> we do not quite know if the device is there at given address, or not
> (i.e. the touchpad could be from a different vendor and listening on
> another address) and we need to make a quick determination whether we
> should continue with probe or not.
Maybe I should clarify what I meant: All I2C operations start with the
master writing the slave address to the bus. When a slave reads its own
address off the bus, it pulls the data line low to ACK. If no device is
present on the bus with the specified address, the line stays high which
is a NACK. This means that on the bus level, we have a clear error
condition specifically for no device with the specified address being
present on the bus.
Whether the operation used is a dummy read or our first actual write
should not matter - if the address is not acknowledged, the device is
not present (or able to talk I2C). The problem lies in whether this "no
device present on bus" error is reported clearly to us: Some drivers use
-ENXIO specifically for this, some use it also for NACKs on written
data, some report it but use other return codes for it, etc.
Even if we stick to the smbus probe in the long run, if we get to the
point where we can rely on the error codes from I2C drivers we would be
able to correctly log and propagate other error classes like bus errors
or I2C driver issues which are all currently silenced as "nothing at
address" by the smbus probe.
> I am not sure we can fully unify what Windows does and what Linux does,
> mainly because our firmwares are different (I think Windows devices do a
> lot more device discovery in firmware, Chrome OS historically tried to
> limit amount of code in its firmware). We also need to make sure it
> works on non-ACPI systems/ARM.
Good point. My main focus is also quirky behaviors we have added to
replicate Windows behavior, the smbus probe just stood out in my bus traces.
I already sent
https://lore.kernel.org/all/[email protected]/ which goes
back to improving the bus probe.
On Wed, May 01, 2024 at 07:24:08AM +0200, Kenny Levinsen wrote:
> On 4/30/24 11:41 PM, Dmitry Torokhov wrote:
> > I actually believe there is. On Chromebooks we may source components
> > from several vendors and use them in our devices. The components
> > are electrically compatible with each other, have exactly the same
> > connector, and therefore interchangeable. Because of that at probe time
> > we do not quite know if the device is there at given address, or not
> > (i.e. the touchpad could be from a different vendor and listening on
> > another address) and we need to make a quick determination whether we
> > should continue with probe or not.
>
> Maybe I should clarify what I meant: All I2C operations start with the
> master writing the slave address to the bus. When a slave reads its own
> address off the bus, it pulls the data line low to ACK. If no device is
> present on the bus with the specified address, the line stays high which is
> a NACK. This means that on the bus level, we have a clear error condition
> specifically for no device with the specified address being present on the
> bus.
>
> Whether the operation used is a dummy read or our first actual write should
> not matter - if the address is not acknowledged, the device is not present
> (or able to talk I2C).
Is it possible for a device to be wedged so hard that it refuses to
acknowledge the address?
> The problem lies in whether this "no device present
> on bus" error is reported clearly to us: Some drivers use -ENXIO
> specifically for this, some use it also for NACKs on written data, some
> report it but use other return codes for it, etc.
>
> Even if we stick to the smbus probe in the long run, if we get to the point
> where we can rely on the error codes from I2C drivers we would be able to
> correctly log and propagate other error classes like bus errors or I2C
> driver issues which are all currently silenced as "nothing at address" by
> the smbus probe.
I think this depends on the answer to the question above. If there is
potential that the chip may stop responding, I still see benefit in
differentiating initial "soft touch" poke vs. hard errors once we
established that there is/was a device and it started misbehaving.
Thanks.
--
Dmitry
On 5/1/24 9:09 PM, Dmitry Torokhov wrote:
> Is it possible for a device to be wedged so hard that it refuses to
> acknowledge the address?
A slave is allowed to not acknowledge if not able (e.g., "because it's
performing some real time function"), but a slave that does not
acknowledge its address is electrically indistinguishable from a
disconnected device. In such case the device is impossible to detect
through I2C operations, and neither smbus probe nor a "real" command
will see it.
Any logic we have to silence missing devices will also silence entirely
unresponsive or extremely non-cooperate devices. That is the price to
pay for avoiding the log message unfortunately.
No other errors from the smbus probe or a real command would be related
to device presence, and some of them even suggest a device is present
but broken (arbitration loss, assuming no shorts).