2024-03-20 07:48:54

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH 0/5] Fix various races in UCSI

Fix various races in UCSI code:
- The EVENT_PENDING bit should be cleared under the PPM lock to
avoid spurious re-checking of the connector status.
- The initial connector change notification during init may be
lost which can cause a stuck UCSI controller. Observed by me
and others during resume or after module reload.
- Unsupported commands must be ACKed. This was uncovered by the
recent change from Jameson Thies that did sent unsupported commands.
- The DELL quirk still isn't quite complete and I've found a more
elegant way to handle this. A connector change ack _is_ accepted
on affected systems if it is bundled with a command ack.
- If we do two consecutive resets or the controller is already
reset at boog the second reset might complete early because the
reset complete bit is already set. ucsi_ccg.c has a work around
for this but it looks like an more general issue to me.

NOTE:
As a result of these individual fixes we could think about the
question if there are additional cases where we send some type
of command to the PPM while the bit that indicates its completion
is already set in CCI. And in fact there is one more case where
this can happen: The ack command that clears the connector change
is sent directly after the ack command for the previous command.
It might be possible to simply ack the connector change along with
the first command ucsi_handle_connector_change() and not at the
end. AFAICS the connector lock should protect us from races that
might arise out of this.

Christian A. Ehrhardt (5):
usb: typec: ucsi: Clear EVENT_PENDING under PPM lock
usb: typec: ucsi: Check for notifications after init
usb: typec: ucsi: Ack unsupported commands
usb: typec: ucsi_acpi: Refactor and fix DELL quirk
usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset

drivers/usb/typec/ucsi/ucsi.c | 56 ++++++++++++++++++++--
drivers/usb/typec/ucsi/ucsi_acpi.c | 75 +++++++++++++-----------------
2 files changed, 84 insertions(+), 47 deletions(-)

--
2.40.1



2024-03-20 07:48:55

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH 1/5] usb: typec: ucsi: Clear EVENT_PENDING under PPM lock

Suppose we sleep on the PPM lock after clearing the EVENT_PENDING
bit because the thread for another connector is executing a command.
In this case the command completion of the other command will still
report the connector change for our connector.

Clear the EVENT_PENDING bit under the PPM lock to avoid another
useless call to ucsi_handle_connector_change() in this case.

Fixes: c9aed03a0a68 ("usb: ucsi: Add missing ppm_lock")
Signed-off-by: Christian A. Ehrhardt <[email protected]>
---
drivers/usb/typec/ucsi/ucsi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index cf52cb34d285..8a6645ffd938 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1215,11 +1215,11 @@ static void ucsi_handle_connector_change(struct work_struct *work)
if (con->status.change & UCSI_CONSTAT_CAM_CHANGE)
ucsi_partner_task(con, ucsi_check_altmodes, 1, 0);

- clear_bit(EVENT_PENDING, &con->ucsi->flags);
-
mutex_lock(&ucsi->ppm_lock);
+ clear_bit(EVENT_PENDING, &con->ucsi->flags);
ret = ucsi_acknowledge_connector_change(ucsi);
mutex_unlock(&ucsi->ppm_lock);
+
if (ret)
dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);

--
2.40.1


2024-03-20 07:49:38

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH 2/5] usb: typec: ucsi: Check for notifications after init

The completion notification for the final SET_NOTIFICATION_ENABLE
command during initialization can include a connector change
notification. However, at the time this completion notification is
processed, the ucsi struct is not ready to handle this notification.
As a result the notification is ignored and the controller
never sends an interrupt again.

Re-check CCI for a pending connector state change after
initialization is complete. Adjust the corresponding debug
message accordingly.

Fixes: 71a1fa0df2a3 ("usb: typec: ucsi: Store the notification mask")
Cc: [email protected]
Signed-off-by: Christian A. Ehrhardt <[email protected]>
---
drivers/usb/typec/ucsi/ucsi.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 8a6645ffd938..dceeed207569 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1237,7 +1237,7 @@ void ucsi_connector_change(struct ucsi *ucsi, u8 num)
struct ucsi_connector *con = &ucsi->connector[num - 1];

if (!(ucsi->ntfy & UCSI_ENABLE_NTFY_CONNECTOR_CHANGE)) {
- dev_dbg(ucsi->dev, "Bogus connector change event\n");
+ dev_dbg(ucsi->dev, "Early connector change event\n");
return;
}

@@ -1636,6 +1636,7 @@ static int ucsi_init(struct ucsi *ucsi)
{
struct ucsi_connector *con, *connector;
u64 command, ntfy;
+ u32 cci;
int ret;
int i;

@@ -1688,6 +1689,13 @@ static int ucsi_init(struct ucsi *ucsi)

ucsi->connector = connector;
ucsi->ntfy = ntfy;
+
+ ret = ucsi->ops->read(ucsi, UCSI_CCI, &cci, sizeof(cci));
+ if (ret)
+ return ret;
+ if (UCSI_CCI_CONNECTOR(READ_ONCE(cci)))
+ ucsi_connector_change(ucsi, cci);
+
return 0;

err_unregister:
--
2.40.1


2024-03-20 11:16:49

by Kenneth Crudup

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix various races in UCSI


Applied (cleanly) onto 6.8.1; I'll be testing over the next few days,
but a few connects/disconnects mixed in with suspend/resume cycles shows
no obvious issues (and everything seems to work).

Dell XPS 9320, BIOS 2.10.0

-K

On 3/20/24 00:39, Christian A. Ehrhardt wrote:
> Fix various races in UCSI code:
> - The EVENT_PENDING bit should be cleared under the PPM lock to
> avoid spurious re-checking of the connector status.
> - The initial connector change notification during init may be
> lost which can cause a stuck UCSI controller. Observed by me
> and others during resume or after module reload.
> - Unsupported commands must be ACKed. This was uncovered by the
> recent change from Jameson Thies that did sent unsupported commands.
> - The DELL quirk still isn't quite complete and I've found a more
> elegant way to handle this. A connector change ack _is_ accepted
> on affected systems if it is bundled with a command ack.
> - If we do two consecutive resets or the controller is already
> reset at boog the second reset might complete early because the
> reset complete bit is already set. ucsi_ccg.c has a work around
> for this but it looks like an more general issue to me.
>
> NOTE:
> As a result of these individual fixes we could think about the
> question if there are additional cases where we send some type
> of command to the PPM while the bit that indicates its completion
> is already set in CCI. And in fact there is one more case where
> this can happen: The ack command that clears the connector change
> is sent directly after the ack command for the previous command.
> It might be possible to simply ack the connector change along with
> the first command ucsi_handle_connector_change() and not at the
> end. AFAICS the connector lock should protect us from races that
> might arise out of this.
>
> Christian A. Ehrhardt (5):
> usb: typec: ucsi: Clear EVENT_PENDING under PPM lock
> usb: typec: ucsi: Check for notifications after init
> usb: typec: ucsi: Ack unsupported commands
> usb: typec: ucsi_acpi: Refactor and fix DELL quirk
> usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset
>
> drivers/usb/typec/ucsi/ucsi.c | 56 ++++++++++++++++++++--
> drivers/usb/typec/ucsi/ucsi_acpi.c | 75 +++++++++++++-----------------
> 2 files changed, 84 insertions(+), 47 deletions(-)
>

--
Kenneth R. Crudup / Sr. SW Engineer, Scott County Consulting, Orange
County CA

2024-03-20 07:40:50

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH 4/5] usb: typec: ucsi_acpi: Refactor and fix DELL quirk

Some DELL systems don't like UCSI_ACK_CC_CI commands with the
UCSI_ACK_CONNECTOR_CHANGE but not the UCSI_ACK_COMMAND_COMPLETE
bit set. The current quirk still leaves room for races because
it requires two consecutive ACK commands to be sent.

Refactor and significantly simplify the quirk to fix this:
Send a dummy command and bundle the connector change ack with the
command completion ack in a single UCSI_ACK_CC_CI command.
This removes the need to probe for the quirk.

While there define flag bits for struct ucsi_acpi->flags in ucsi_acpi.c
and don't re-use definitions from ucsi.h for struct ucsi->flags.

Fixes: f3be347ea42d ("usb: ucsi_acpi: Quirk to ack a connector change ack cmd")
Cc: [email protected]
Signed-off-by: Christian A. Ehrhardt <[email protected]>
---
drivers/usb/typec/ucsi/ucsi_acpi.c | 75 +++++++++++++-----------------
1 file changed, 33 insertions(+), 42 deletions(-)

diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c
index 928eacbeb21a..7b3ac133ef86 100644
--- a/drivers/usb/typec/ucsi/ucsi_acpi.c
+++ b/drivers/usb/typec/ucsi/ucsi_acpi.c
@@ -23,10 +23,11 @@ struct ucsi_acpi {
void *base;
struct completion complete;
unsigned long flags;
+#define UCSI_ACPI_SUPPRESS_EVENT 0
+#define UCSI_ACPI_COMMAND_PENDING 1
+#define UCSI_ACPI_ACK_PENDING 2
guid_t guid;
u64 cmd;
- bool dell_quirk_probed;
- bool dell_quirk_active;
};

static int ucsi_acpi_dsm(struct ucsi_acpi *ua, int func)
@@ -79,9 +80,9 @@ static int ucsi_acpi_sync_write(struct ucsi *ucsi, unsigned int offset,
int ret;

if (ack)
- set_bit(ACK_PENDING, &ua->flags);
+ set_bit(UCSI_ACPI_ACK_PENDING, &ua->flags);
else
- set_bit(COMMAND_PENDING, &ua->flags);
+ set_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags);

ret = ucsi_acpi_async_write(ucsi, offset, val, val_len);
if (ret)
@@ -92,9 +93,9 @@ static int ucsi_acpi_sync_write(struct ucsi *ucsi, unsigned int offset,

out_clear_bit:
if (ack)
- clear_bit(ACK_PENDING, &ua->flags);
+ clear_bit(UCSI_ACPI_ACK_PENDING, &ua->flags);
else
- clear_bit(COMMAND_PENDING, &ua->flags);
+ clear_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags);

return ret;
}
@@ -129,51 +130,40 @@ static const struct ucsi_operations ucsi_zenbook_ops = {
};

/*
- * Some Dell laptops expect that an ACK command with the
- * UCSI_ACK_CONNECTOR_CHANGE bit set is followed by a (separate)
- * ACK command that only has the UCSI_ACK_COMMAND_COMPLETE bit set.
- * If this is not done events are not delivered to OSPM and
- * subsequent commands will timeout.
+ * Some Dell laptops don't like ACK commands with the
+ * UCSI_ACK_CONNECTOR_CHANGE but not the UCSI_ACK_COMMAND_COMPLETE
+ * bit set. To work around this send a dummy command and bundle the
+ * UCSI_ACK_CONNECTOR_CHANGE with the UCSI_ACK_COMMAND_COMPLETE
+ * for the dummy command.
*/
static int
ucsi_dell_sync_write(struct ucsi *ucsi, unsigned int offset,
const void *val, size_t val_len)
{
struct ucsi_acpi *ua = ucsi_get_drvdata(ucsi);
- u64 cmd = *(u64 *)val, ack = 0;
+ u64 cmd = *(u64 *)val;
+ u64 dummycmd = UCSI_GET_CAPABILITY;
int ret;

- if (UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI &&
- cmd & UCSI_ACK_CONNECTOR_CHANGE)
- ack = UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE;
-
- ret = ucsi_acpi_sync_write(ucsi, offset, val, val_len);
- if (ret != 0)
- return ret;
- if (ack == 0)
- return ret;
-
- if (!ua->dell_quirk_probed) {
- ua->dell_quirk_probed = true;
-
- cmd = UCSI_GET_CAPABILITY;
- ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &cmd,
- sizeof(cmd));
- if (ret == 0)
- return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL,
- &ack, sizeof(ack));
- if (ret != -ETIMEDOUT)
+ if (cmd == (UCSI_ACK_CC_CI | UCSI_ACK_CONNECTOR_CHANGE)) {
+ cmd |= UCSI_ACK_COMMAND_COMPLETE;
+
+ /*
+ * The UCSI core thinks it is sending a connector change ack
+ * and will accept new connector change events. We don't want
+ * this to happen for the dummy command as its response will
+ * still report the very event that the core is trying to clear.
+ */
+ set_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags);
+ ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &dummycmd,
+ sizeof(dummycmd));
+ clear_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags);
+
+ if (ret < 0)
return ret;
-
- ua->dell_quirk_active = true;
- dev_err(ua->dev, "Firmware bug: Additional ACK required after ACKing a connector change.\n");
- dev_err(ua->dev, "Firmware bug: Enabling workaround\n");
}

- if (!ua->dell_quirk_active)
- return ret;
-
- return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &ack, sizeof(ack));
+ return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &cmd, sizeof(cmd));
}

static const struct ucsi_operations ucsi_dell_ops = {
@@ -209,13 +199,14 @@ static void ucsi_acpi_notify(acpi_handle handle, u32 event, void *data)
if (ret)
return;

- if (UCSI_CCI_CONNECTOR(cci))
+ if (UCSI_CCI_CONNECTOR(cci) &&
+ !test_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags))
ucsi_connector_change(ua->ucsi, UCSI_CCI_CONNECTOR(cci));

if (cci & UCSI_CCI_ACK_COMPLETE && test_bit(ACK_PENDING, &ua->flags))
complete(&ua->complete);
if (cci & UCSI_CCI_COMMAND_COMPLETE &&
- test_bit(COMMAND_PENDING, &ua->flags))
+ test_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags))
complete(&ua->complete);
}

--
2.40.1


2024-03-22 10:03:15

by Heikki Krogerus

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix various races in UCSI

On Wed, Mar 20, 2024 at 08:39:21AM +0100, Christian A. Ehrhardt wrote:
> Fix various races in UCSI code:
> - The EVENT_PENDING bit should be cleared under the PPM lock to
> avoid spurious re-checking of the connector status.
> - The initial connector change notification during init may be
> lost which can cause a stuck UCSI controller. Observed by me
> and others during resume or after module reload.
> - Unsupported commands must be ACKed. This was uncovered by the
> recent change from Jameson Thies that did sent unsupported commands.
> - The DELL quirk still isn't quite complete and I've found a more
> elegant way to handle this. A connector change ack _is_ accepted
> on affected systems if it is bundled with a command ack.
> - If we do two consecutive resets or the controller is already
> reset at boog the second reset might complete early because the
> reset complete bit is already set. ucsi_ccg.c has a work around
> for this but it looks like an more general issue to me.
>
> NOTE:
> As a result of these individual fixes we could think about the
> question if there are additional cases where we send some type
> of command to the PPM while the bit that indicates its completion
> is already set in CCI. And in fact there is one more case where
> this can happen: The ack command that clears the connector change
> is sent directly after the ack command for the previous command.
> It might be possible to simply ack the connector change along with
> the first command ucsi_handle_connector_change() and not at the
> end. AFAICS the connector lock should protect us from races that
> might arise out of this.

That sounds good to me.

thanks,

--
heikki

2024-03-22 11:03:25

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix various races in UCSI

On 20/03/2024 11:53, Kenneth Crudup wrote:
>
> Applied (cleanly) onto 6.8.1; I'll be testing over the next few days, but a few connects/disconnects mixed in with suspend/resume cycles shows no obvious issues (and everything seems to work).
>
> Dell XPS 9320, BIOS 2.10.0
>
> -K
>
> On 3/20/24 00:39, Christian A. Ehrhardt wrote:
>> Fix various races in UCSI code:
>> - The EVENT_PENDING bit should be cleared under the PPM lock to
>>    avoid spurious re-checking of the connector status.
>> - The initial connector change notification during init may be
>>    lost which can cause a stuck UCSI controller. Observed by me
>>    and others during resume or after module reload.
>> - Unsupported commands must be ACKed. This was uncovered by the
>>    recent change from Jameson Thies that did sent unsupported commands.
>> - The DELL quirk still isn't quite complete and I've found a more
>>    elegant way to handle this. A connector change ack _is_ accepted
>>    on affected systems if it is bundled with a command ack.
>> - If we do two consecutive resets or the controller is already
>>    reset at boog the second reset might complete early because the
>>    reset complete bit is already set. ucsi_ccg.c has a work around
>>    for this but it looks like an more general issue to me.
>>
>> NOTE:
>> As a result of these individual fixes we could think about the
>> question if there are additional cases where we send some type
>> of command to the PPM while the bit that indicates its completion
>> is already set in CCI. And in fact there is one more case where
>> this can happen: The ack command that clears the connector change
>> is sent directly after the ack command for the previous command.
>> It might be possible to simply ack the connector change along with
>> the first command ucsi_handle_connector_change() and not at the
>> end. AFAICS the connector lock should protect us from races that
>> might arise out of this.
>>
>> Christian A. Ehrhardt (5):
>>    usb: typec: ucsi: Clear EVENT_PENDING under PPM lock
>>    usb: typec: ucsi: Check for notifications after init
>>    usb: typec: ucsi: Ack unsupported commands
>>    usb: typec: ucsi_acpi: Refactor and fix DELL quirk
>>    usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset
>>
>>   drivers/usb/typec/ucsi/ucsi.c      | 56 ++++++++++++++++++++--
>>   drivers/usb/typec/ucsi/ucsi_acpi.c | 75 +++++++++++++-----------------
>>   2 files changed, 84 insertions(+), 47 deletions(-)
>>
>

with [1] applied:

Tested-by: Neil Armstrong <[email protected]> # on SM8550-QRD
Tested-by: Neil Armstrong <[email protected]> # on SM8650-QRD
Tested-by: Neil Armstrong <[email protected]> # on SM8550-HDK

[1] https://lore.kernel.org/all/[email protected]/

Thanks,
Neil

2024-03-22 12:21:56

by Heikki Krogerus

[permalink] [raw]
Subject: Re: [PATCH 2/5] usb: typec: ucsi: Check for notifications after init

On Wed, Mar 20, 2024 at 08:39:23AM +0100, Christian A. Ehrhardt wrote:
> The completion notification for the final SET_NOTIFICATION_ENABLE
> command during initialization can include a connector change
> notification. However, at the time this completion notification is
> processed, the ucsi struct is not ready to handle this notification.
> As a result the notification is ignored and the controller
> never sends an interrupt again.
>
> Re-check CCI for a pending connector state change after
> initialization is complete. Adjust the corresponding debug
> message accordingly.
>
> Fixes: 71a1fa0df2a3 ("usb: typec: ucsi: Store the notification mask")
> Cc: [email protected]
> Signed-off-by: Christian A. Ehrhardt <[email protected]>

Reviewed-by: Heikki Krogerus <[email protected]>

> ---
> drivers/usb/typec/ucsi/ucsi.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index 8a6645ffd938..dceeed207569 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1237,7 +1237,7 @@ void ucsi_connector_change(struct ucsi *ucsi, u8 num)
> struct ucsi_connector *con = &ucsi->connector[num - 1];
>
> if (!(ucsi->ntfy & UCSI_ENABLE_NTFY_CONNECTOR_CHANGE)) {
> - dev_dbg(ucsi->dev, "Bogus connector change event\n");
> + dev_dbg(ucsi->dev, "Early connector change event\n");
> return;
> }
>
> @@ -1636,6 +1636,7 @@ static int ucsi_init(struct ucsi *ucsi)
> {
> struct ucsi_connector *con, *connector;
> u64 command, ntfy;
> + u32 cci;
> int ret;
> int i;
>
> @@ -1688,6 +1689,13 @@ static int ucsi_init(struct ucsi *ucsi)
>
> ucsi->connector = connector;
> ucsi->ntfy = ntfy;
> +
> + ret = ucsi->ops->read(ucsi, UCSI_CCI, &cci, sizeof(cci));
> + if (ret)
> + return ret;
> + if (UCSI_CCI_CONNECTOR(READ_ONCE(cci)))
> + ucsi_connector_change(ucsi, cci);
> +
> return 0;
>
> err_unregister:
> --
> 2.40.1

--
heikki

2024-03-22 13:59:02

by Heikki Krogerus

[permalink] [raw]
Subject: Re: [PATCH 1/5] usb: typec: ucsi: Clear EVENT_PENDING under PPM lock

On Wed, Mar 20, 2024 at 08:39:22AM +0100, Christian A. Ehrhardt wrote:
> Suppose we sleep on the PPM lock after clearing the EVENT_PENDING
> bit because the thread for another connector is executing a command.
> In this case the command completion of the other command will still
> report the connector change for our connector.
>
> Clear the EVENT_PENDING bit under the PPM lock to avoid another
> useless call to ucsi_handle_connector_change() in this case.
>
> Fixes: c9aed03a0a68 ("usb: ucsi: Add missing ppm_lock")
> Signed-off-by: Christian A. Ehrhardt <[email protected]>

Reviewed-by: Heikki Krogerus <[email protected]>

> ---
> drivers/usb/typec/ucsi/ucsi.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index cf52cb34d285..8a6645ffd938 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1215,11 +1215,11 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> if (con->status.change & UCSI_CONSTAT_CAM_CHANGE)
> ucsi_partner_task(con, ucsi_check_altmodes, 1, 0);
>
> - clear_bit(EVENT_PENDING, &con->ucsi->flags);
> -
> mutex_lock(&ucsi->ppm_lock);
> + clear_bit(EVENT_PENDING, &con->ucsi->flags);
> ret = ucsi_acknowledge_connector_change(ucsi);
> mutex_unlock(&ucsi->ppm_lock);
> +
> if (ret)
> dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
>
> --
> 2.40.1

--
heikki

2024-03-22 15:57:03

by Heikki Krogerus

[permalink] [raw]
Subject: Re: [PATCH 4/5] usb: typec: ucsi_acpi: Refactor and fix DELL quirk

On Wed, Mar 20, 2024 at 08:39:25AM +0100, Christian A. Ehrhardt wrote:
> Some DELL systems don't like UCSI_ACK_CC_CI commands with the
> UCSI_ACK_CONNECTOR_CHANGE but not the UCSI_ACK_COMMAND_COMPLETE
> bit set. The current quirk still leaves room for races because
> it requires two consecutive ACK commands to be sent.
>
> Refactor and significantly simplify the quirk to fix this:
> Send a dummy command and bundle the connector change ack with the
> command completion ack in a single UCSI_ACK_CC_CI command.
> This removes the need to probe for the quirk.
>
> While there define flag bits for struct ucsi_acpi->flags in ucsi_acpi.c
> and don't re-use definitions from ucsi.h for struct ucsi->flags.
>
> Fixes: f3be347ea42d ("usb: ucsi_acpi: Quirk to ack a connector change ack cmd")
> Cc: [email protected]
> Signed-off-by: Christian A. Ehrhardt <[email protected]>

Reviewed-by: Heikki Krogerus <[email protected]>

> ---
> drivers/usb/typec/ucsi/ucsi_acpi.c | 75 +++++++++++++-----------------
> 1 file changed, 33 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c
> index 928eacbeb21a..7b3ac133ef86 100644
> --- a/drivers/usb/typec/ucsi/ucsi_acpi.c
> +++ b/drivers/usb/typec/ucsi/ucsi_acpi.c
> @@ -23,10 +23,11 @@ struct ucsi_acpi {
> void *base;
> struct completion complete;
> unsigned long flags;
> +#define UCSI_ACPI_SUPPRESS_EVENT 0
> +#define UCSI_ACPI_COMMAND_PENDING 1
> +#define UCSI_ACPI_ACK_PENDING 2
> guid_t guid;
> u64 cmd;
> - bool dell_quirk_probed;
> - bool dell_quirk_active;
> };
>
> static int ucsi_acpi_dsm(struct ucsi_acpi *ua, int func)
> @@ -79,9 +80,9 @@ static int ucsi_acpi_sync_write(struct ucsi *ucsi, unsigned int offset,
> int ret;
>
> if (ack)
> - set_bit(ACK_PENDING, &ua->flags);
> + set_bit(UCSI_ACPI_ACK_PENDING, &ua->flags);
> else
> - set_bit(COMMAND_PENDING, &ua->flags);
> + set_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags);
>
> ret = ucsi_acpi_async_write(ucsi, offset, val, val_len);
> if (ret)
> @@ -92,9 +93,9 @@ static int ucsi_acpi_sync_write(struct ucsi *ucsi, unsigned int offset,
>
> out_clear_bit:
> if (ack)
> - clear_bit(ACK_PENDING, &ua->flags);
> + clear_bit(UCSI_ACPI_ACK_PENDING, &ua->flags);
> else
> - clear_bit(COMMAND_PENDING, &ua->flags);
> + clear_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags);
>
> return ret;
> }
> @@ -129,51 +130,40 @@ static const struct ucsi_operations ucsi_zenbook_ops = {
> };
>
> /*
> - * Some Dell laptops expect that an ACK command with the
> - * UCSI_ACK_CONNECTOR_CHANGE bit set is followed by a (separate)
> - * ACK command that only has the UCSI_ACK_COMMAND_COMPLETE bit set.
> - * If this is not done events are not delivered to OSPM and
> - * subsequent commands will timeout.
> + * Some Dell laptops don't like ACK commands with the
> + * UCSI_ACK_CONNECTOR_CHANGE but not the UCSI_ACK_COMMAND_COMPLETE
> + * bit set. To work around this send a dummy command and bundle the
> + * UCSI_ACK_CONNECTOR_CHANGE with the UCSI_ACK_COMMAND_COMPLETE
> + * for the dummy command.
> */
> static int
> ucsi_dell_sync_write(struct ucsi *ucsi, unsigned int offset,
> const void *val, size_t val_len)
> {
> struct ucsi_acpi *ua = ucsi_get_drvdata(ucsi);
> - u64 cmd = *(u64 *)val, ack = 0;
> + u64 cmd = *(u64 *)val;
> + u64 dummycmd = UCSI_GET_CAPABILITY;
> int ret;
>
> - if (UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI &&
> - cmd & UCSI_ACK_CONNECTOR_CHANGE)
> - ack = UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE;
> -
> - ret = ucsi_acpi_sync_write(ucsi, offset, val, val_len);
> - if (ret != 0)
> - return ret;
> - if (ack == 0)
> - return ret;
> -
> - if (!ua->dell_quirk_probed) {
> - ua->dell_quirk_probed = true;
> -
> - cmd = UCSI_GET_CAPABILITY;
> - ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &cmd,
> - sizeof(cmd));
> - if (ret == 0)
> - return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL,
> - &ack, sizeof(ack));
> - if (ret != -ETIMEDOUT)
> + if (cmd == (UCSI_ACK_CC_CI | UCSI_ACK_CONNECTOR_CHANGE)) {
> + cmd |= UCSI_ACK_COMMAND_COMPLETE;
> +
> + /*
> + * The UCSI core thinks it is sending a connector change ack
> + * and will accept new connector change events. We don't want
> + * this to happen for the dummy command as its response will
> + * still report the very event that the core is trying to clear.
> + */
> + set_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags);
> + ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &dummycmd,
> + sizeof(dummycmd));
> + clear_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags);
> +
> + if (ret < 0)
> return ret;
> -
> - ua->dell_quirk_active = true;
> - dev_err(ua->dev, "Firmware bug: Additional ACK required after ACKing a connector change.\n");
> - dev_err(ua->dev, "Firmware bug: Enabling workaround\n");
> }
>
> - if (!ua->dell_quirk_active)
> - return ret;
> -
> - return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &ack, sizeof(ack));
> + return ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &cmd, sizeof(cmd));
> }
>
> static const struct ucsi_operations ucsi_dell_ops = {
> @@ -209,13 +199,14 @@ static void ucsi_acpi_notify(acpi_handle handle, u32 event, void *data)
> if (ret)
> return;
>
> - if (UCSI_CCI_CONNECTOR(cci))
> + if (UCSI_CCI_CONNECTOR(cci) &&
> + !test_bit(UCSI_ACPI_SUPPRESS_EVENT, &ua->flags))
> ucsi_connector_change(ua->ucsi, UCSI_CCI_CONNECTOR(cci));
>
> if (cci & UCSI_CCI_ACK_COMPLETE && test_bit(ACK_PENDING, &ua->flags))
> complete(&ua->complete);
> if (cci & UCSI_CCI_COMMAND_COMPLETE &&
> - test_bit(COMMAND_PENDING, &ua->flags))
> + test_bit(UCSI_ACPI_COMMAND_PENDING, &ua->flags))
> complete(&ua->complete);
> }
>
> --
> 2.40.1

--
heikki

2024-03-29 16:21:24

by Dmitry Baryshkov

[permalink] [raw]
Subject: Re: [PATCH 2/5] usb: typec: ucsi: Check for notifications after init

On Wed, Mar 20, 2024 at 08:39:23AM +0100, Christian A. Ehrhardt wrote:
> The completion notification for the final SET_NOTIFICATION_ENABLE
> command during initialization can include a connector change
> notification. However, at the time this completion notification is
> processed, the ucsi struct is not ready to handle this notification.
> As a result the notification is ignored and the controller
> never sends an interrupt again.
>
> Re-check CCI for a pending connector state change after
> initialization is complete. Adjust the corresponding debug
> message accordingly.
>
> Fixes: 71a1fa0df2a3 ("usb: typec: ucsi: Store the notification mask")
> Cc: [email protected]
> Signed-off-by: Christian A. Ehrhardt <[email protected]>
> ---
> drivers/usb/typec/ucsi/ucsi.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index 8a6645ffd938..dceeed207569 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1237,7 +1237,7 @@ void ucsi_connector_change(struct ucsi *ucsi, u8 num)
> struct ucsi_connector *con = &ucsi->connector[num - 1];
>
> if (!(ucsi->ntfy & UCSI_ENABLE_NTFY_CONNECTOR_CHANGE)) {
> - dev_dbg(ucsi->dev, "Bogus connector change event\n");
> + dev_dbg(ucsi->dev, "Early connector change event\n");
> return;
> }
>
> @@ -1636,6 +1636,7 @@ static int ucsi_init(struct ucsi *ucsi)
> {
> struct ucsi_connector *con, *connector;
> u64 command, ntfy;
> + u32 cci;
> int ret;
> int i;
>
> @@ -1688,6 +1689,13 @@ static int ucsi_init(struct ucsi *ucsi)
>
> ucsi->connector = connector;
> ucsi->ntfy = ntfy;
> +
> + ret = ucsi->ops->read(ucsi, UCSI_CCI, &cci, sizeof(cci));
> + if (ret)
> + return ret;
> + if (UCSI_CCI_CONNECTOR(READ_ONCE(cci)))
> + ucsi_connector_change(ucsi, cci);
> +

I think this leaves place for the race. With this patchset + "Ack
connector change early" in place Neil triggered the following backtrace
on sm8550 HDK while testing my UCSI-qcom-fixes patchset:
What happens:

[ 10.421640] write: 00000000: 05 00 e7 db 00 00 00 00

SET_NOTIFICATION_ENABLE

[ 10.432359] read: 00000000: 10 01 00 00 00 00 00 80 00 00 00 00 00 00 00 00
[ 10.469553] read: 00000010: 04 58 29 20 00 00 00 00 00 00 00 00 00 03 30 01
[ 10.476783] read: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.489552] notify: 00000000: 00 00 00 80

COMMAND_COMPLETE

[ 10.494194] read: 00000000: 10 01 00 00 00 00 00 80 00 00 00 00 00 00 00 00
[ 10.501370] read: 00000010: 04 58 29 20 00 00 00 00 00 00 00 00 00 03 30 01
[ 10.508578] read: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.515757] write: 00000000: 04 00 02 00 00 00 00 00

ACK_CC_CI(command completed)

[ 10.521100] read: 00000000: 10 01 00 00 00 00 00 20 00 00 00 00 00 00 00 00
[ 10.528363] read: 00000010: 04 58 29 20 00 00 00 00 00 00 00 00 00 03 30 01
[ 10.535603] read: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 10.549549] notify: 00000000: 00 00 00 20

ACK_COMPLETE

[Here ucsi->connector and ucsi->ntfy are being set by ucsi_init()

[ 10.566654] read: 00000010: 04 58 29 20 00 00 00 00 00 00 00 00 00 03 30 01
[ 10.593553] notify: 00000000: 02 00 00 20

Event with CONNECTION_CHANGE. It also schedules connector_change work,
because ucsi->ntfy is already set

[ 10.595796] read: 00000000: 10 01 00 00 02 00 00 20 00 00 00 00 00 00 00 00
[ 10.595798] read: 00000010: 04 58 29 20 00 00 00 00 00 00 00 00 00 03 30 01
[ 10.595799] read: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The CCI read coming from ucsi_init()

[ 10.595807] ------------[ cut here ]------------
[ 10.595808] WARNING: CPU: 6 PID: 101 at kernel/workqueue.c:2384 __queue_work+0x374/0x474

[skipped the register dump]

[ 10.595953] __queue_work+0x374/0x474
[ 10.595956] queue_work_on+0x68/0x84
[ 10.595959] ucsi_connector_change+0x54/0x88 [typec_ucsi]
[ 10.595963] ucsi_init_work+0x834/0x85c [typec_ucsi]
[ 10.595968] process_one_work+0x148/0x29c
[ 10.595971] worker_thread+0x2fc/0x40c
[ 10.595974] kthread+0x110/0x114
[ 10.595978] ret_from_fork+0x10/0x20
[ 10.595985] ---[ end trace 0000000000000000 ]---

Warning, because the work is already scheduled.


> return 0;
>
> err_unregister:
> --
> 2.40.1
>

2024-04-01 20:11:30

by Christian A. Ehrhardt

[permalink] [raw]
Subject: Re: [PATCH 2/5] usb: typec: ucsi: Check for notifications after init


Hi,

On Fri, Mar 29, 2024 at 06:21:08PM +0200, Dmitry Baryshkov wrote:
> On Wed, Mar 20, 2024 at 08:39:23AM +0100, Christian A. Ehrhardt wrote:
> > The completion notification for the final SET_NOTIFICATION_ENABLE
> > command during initialization can include a connector change
> > notification. However, at the time this completion notification is
> > processed, the ucsi struct is not ready to handle this notification.
> > As a result the notification is ignored and the controller
> > never sends an interrupt again.
> >
> > Re-check CCI for a pending connector state change after
> > initialization is complete. Adjust the corresponding debug
> > message accordingly.
> >
> > Fixes: 71a1fa0df2a3 ("usb: typec: ucsi: Store the notification mask")
> > Cc: [email protected]
> > Signed-off-by: Christian A. Ehrhardt <[email protected]>
> > ---
> > drivers/usb/typec/ucsi/ucsi.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > index 8a6645ffd938..dceeed207569 100644
> > --- a/drivers/usb/typec/ucsi/ucsi.c
> > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > @@ -1237,7 +1237,7 @@ void ucsi_connector_change(struct ucsi *ucsi, u8 num)
> > struct ucsi_connector *con = &ucsi->connector[num - 1];
> >
> > if (!(ucsi->ntfy & UCSI_ENABLE_NTFY_CONNECTOR_CHANGE)) {
> > - dev_dbg(ucsi->dev, "Bogus connector change event\n");
> > + dev_dbg(ucsi->dev, "Early connector change event\n");
> > return;
> > }
> >
> > @@ -1636,6 +1636,7 @@ static int ucsi_init(struct ucsi *ucsi)
> > {
> > struct ucsi_connector *con, *connector;
> > u64 command, ntfy;
> > + u32 cci;
> > int ret;
> > int i;
> >
> > @@ -1688,6 +1689,13 @@ static int ucsi_init(struct ucsi *ucsi)
> >
> > ucsi->connector = connector;
> > ucsi->ntfy = ntfy;
> > +
> > + ret = ucsi->ops->read(ucsi, UCSI_CCI, &cci, sizeof(cci));
> > + if (ret)
> > + return ret;
> > + if (UCSI_CCI_CONNECTOR(READ_ONCE(cci)))
> > + ucsi_connector_change(ucsi, cci);
> > +
>
> I think this leaves place for the race. With this patchset + "Ack
> connector change early" in place Neil triggered the following backtrace
> on sm8550 HDK while testing my UCSI-qcom-fixes patchset:

Sorry, but this seems to be a brown paper bag change.
- The READ_ONCE is bogus and a remnant of a prevoius verion of the
change.
- Calling ->read should probably be done with the PPM lock held.
- The argument to ucsi_connector_change() must be
UCSI_CCI_CONNECTOR(cci) instead of the plain cci.
I'll send a fix.

> What happens:
[ ... ]
>
> [ 10.595807] ------------[ cut here ]------------
> [ 10.595808] WARNING: CPU: 6 PID: 101 at kernel/workqueue.c:2384 __queue_work+0x374/0x474
>
> [skipped the register dump]
>
> [ 10.595953] __queue_work+0x374/0x474
> [ 10.595956] queue_work_on+0x68/0x84
> [ 10.595959] ucsi_connector_change+0x54/0x88 [typec_ucsi]
> [ 10.595963] ucsi_init_work+0x834/0x85c [typec_ucsi]
> [ 10.595968] process_one_work+0x148/0x29c
> [ 10.595971] worker_thread+0x2fc/0x40c
> [ 10.595974] kthread+0x110/0x114
> [ 10.595978] ret_from_fork+0x10/0x20
> [ 10.595985] ---[ end trace 0000000000000000 ]---
>
> Warning, because the work is already scheduled.

No, the reason is the wrong connector number. Scheduling a work that
is already scheduled is fine.

Best regards
Christian