2023-11-28 07:53:17

by Quan Nguyen

[permalink] [raw]
Subject: [PATCH v2 RESEND 0/2] i2c: aspeed: Late ack Tx done irqs and fix unhandled Tx done with NAK

Rebase on the v6.7-rc3 and resend the series as per discussion at
https://lore.kernel.org/all/d0773df55a6fe8a5c9b1a3d7c8dd2e1343643272.camel@codeconstruct.com.au/

The prior discussion could be found at
https://lore.kernel.org/all/[email protected]/

This series consists of two patches to fix the below issues observed
when testing with slave mode:
+ Unhandled Tx done with NAK
+ Early ack'ed of Tx done (ACK and NAK) causing "Unexpected Ack on
read request".

v2:
+ Split these patches to separate series [Joel]
+ Added the Fixes lines [Joel]
+ Fixed multiline comment [Joel]
+ Refactor irq clearing code [Joel, Guenter]
+ Revised commit message [Joel, Quan]

v1:
+ These patches are first introduced from this disscusstion
https://lore.kernel.org/all/[email protected]/

Quan Nguyen (2):
i2c: aspeed: Fix unhandled Tx done with NAK
i2c: aspeed: Acknowledge Tx done with and without ACK irq late

drivers/i2c/busses/i2c-aspeed.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

--
2.35.1


2023-11-28 07:53:38

by Quan Nguyen

[permalink] [raw]
Subject: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late

Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
interrupt handler") acknowledges most interrupts early before the slave
irq handler is executed, except for the "Receive Done Interrupt status"
which is acknowledged late in the interrupt.
However, it is observed that the early acknowledgment of "Transmit Done
Interrupt Status" (with ACK or NACK) often causes the interrupt to be
raised in READ REQUEST state, resulting in "Unexpected ACK on read
request." complaint messages.

Assuming that the "Transmit Done" interrupt should only be acknowledged
once it is truly processed, this commit fixes this issue by acknowledging
this interrupt for both ACK and NACK cases late in the interrupt handler
also.

Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
Signed-off-by: Quan Nguyen <[email protected]>
---
v2:
+ Split to separate series [Joel]
+ Added the Fixes line [Joel]
+ Fixed multiline comment [Joel]
+ Refactor irq clearing code [Joel, Guenter]
+ Revised commit message [Joel]
+ Revised commit message [Quan]
+ About a note to remind why the readl() should immediately follow the
writel() to fix the race condition when clearing irq status from commit
c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
condition"), I think it looks straight forward in this patch and decided
not to add that note. [Joel]

v1:
+ First introduced in
https://lore.kernel.org/all/[email protected]/
---
drivers/i2c/busses/i2c-aspeed.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index 79476b46285b..3231f430e335 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)

spin_lock(&bus->lock);
irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
- /* Ack all interrupts except for Rx done */
- writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
+ /* Ack all interrupts except for Rx done and Tx done with/without ACK */
+ writel(irq_received &
+ ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
bus->base + ASPEED_I2C_INTR_STS_REG);
readl(bus->base + ASPEED_I2C_INTR_STS_REG);
irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
@@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
"irq handled != irq. expected 0x%08x, but was 0x%08x\n",
irq_received, irq_handled);

- /* Ack Rx done */
- if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
- writel(ASPEED_I2CD_INTR_RX_DONE,
- bus->base + ASPEED_I2C_INTR_STS_REG);
- readl(bus->base + ASPEED_I2C_INTR_STS_REG);
- }
+ /* Ack Rx done and Tx done with/without ACK */
+ writel(irq_received &
+ (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
+ bus->base + ASPEED_I2C_INTR_STS_REG);
+ readl(bus->base + ASPEED_I2C_INTR_STS_REG);
+
spin_unlock(&bus->lock);
return irq_remaining ? IRQ_NONE : IRQ_HANDLED;
}
--
2.35.1

2023-11-28 07:53:59

by Quan Nguyen

[permalink] [raw]
Subject: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK

Under normal conditions, after the last byte is sent by the Slave, the
TX_NAK interrupt is raised. However, it is also observed that
sometimes the Master issues the next transaction too quickly while the
Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
last byte of the previous READ_PROCESSED state has not been ack’ed.
This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
and RX_DONE interrupt of the next coming transaction from Master. The
Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
ignores the TX_NAK, causing complaints such as
"aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
0x00000086, but was 0x00000084"

This commit adds code to handle this case by emitting a SLAVE_STOP event
for the TX_NAK before processing the RX_DONE for the coming transaction
from the Master.

Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
Signed-off-by: Quan Nguyen <[email protected]>
---
v2:
+ Split to separate series [Joel]
+ Added the Fixes line [Joel]
+ Revised commit message [Quan]

v1:
+ First introduced in
https://lore.kernel.org/all/[email protected]/
---
drivers/i2c/busses/i2c-aspeed.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index 28e2a5fc4528..79476b46285b 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)

/* Slave was requested, restart state machine. */
if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
+ if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
+ bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
+ irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
+ i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
+ }
irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
bus->slave_state = ASPEED_I2C_SLAVE_START;
}
--
2.35.1

2023-11-29 00:20:15

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK

On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
> Under normal conditions, after the last byte is sent by the Slave, the
> TX_NAK interrupt is raised. However, it is also observed that
> sometimes the Master issues the next transaction too quickly while the
> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
> last byte of the previous READ_PROCESSED state has not been ack’ed.
> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
> and RX_DONE interrupt of the next coming transaction from Master. The
> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
> ignores the TX_NAK, causing complaints such as
> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
> 0x00000086, but was 0x00000084"
>
> This commit adds code to handle this case by emitting a SLAVE_STOP event
> for the TX_NAK before processing the RX_DONE for the coming transaction
> from the Master.
>
> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
> Signed-off-by: Quan Nguyen <[email protected]>
> ---
> v2:
> + Split to separate series [Joel]
> + Added the Fixes line [Joel]
> + Revised commit message [Quan]
>
> v1:
> + First introduced in
> https://lore.kernel.org/all/[email protected]/
> ---
> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index 28e2a5fc4528..79476b46285b 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>
> /* Slave was requested, restart state machine. */
> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> + }

So we're already (partially) processing this a bit later on line 287:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287

From the description of the problem in the commit message it sounds
like the ordering of the interrupt processing is incorrect. Prior to
this patch we have the following abstract ordering of interrupt
processing:

1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
2. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED

With this patch we have:

1. If ASPEED_I2CD_INTR_SLAVE_MATCH then process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
2. Process ASPEED_I2CD_INTR_SLAVE_MATCH
3. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED

That feels a bit complex and redundant. What I think we can have is:

1. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
1. Process ASPEED_I2CD_INTR_SLAVE_MATCH

Moving back from the abstract to the concrete, implementing what I
believe we need would look something like this patch:

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index 28e2a5fc4528..98dd0f35c9d3 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -251,6 +251,14 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)

command = readl(bus->base + ASPEED_I2C_CMD_REG);

+ /* Complete any active read */
+ if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
+ bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
+ irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
+ i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
+ bus->slave_state = ASPEED_I2C_SLAVE_STOP;
+ }
+
/* Slave was requested, restart state machine. */
if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
@@ -284,11 +292,6 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
bus->slave_state = ASPEED_I2C_SLAVE_STOP;
}
- if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
- bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
- irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
- bus->slave_state = ASPEED_I2C_SLAVE_STOP;
- }

switch (bus->slave_state) {
case ASPEED_I2C_SLAVE_READ_REQUESTED:

Thoughts? I haven't tested it, it's just something to throw darts at.

Andrew

2023-11-29 00:33:59

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late

On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
> interrupt handler") acknowledges most interrupts early before the slave
> irq handler is executed, except for the "Receive Done Interrupt status"
> which is acknowledged late in the interrupt.
> However, it is observed that the early acknowledgment of "Transmit Done
> Interrupt Status" (with ACK or NACK) often causes the interrupt to be
> raised in READ REQUEST state, resulting in "Unexpected ACK on read
> request." complaint messages.
>
> Assuming that the "Transmit Done" interrupt should only be acknowledged
> once it is truly processed, this commit fixes this issue by acknowledging
> this interrupt for both ACK and NACK cases late in the interrupt handler
> also.
>
> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
> Signed-off-by: Quan Nguyen <[email protected]>
> ---
> v2:
> + Split to separate series [Joel]
> + Added the Fixes line [Joel]
> + Fixed multiline comment [Joel]
> + Refactor irq clearing code [Joel, Guenter]
> + Revised commit message [Joel]
> + Revised commit message [Quan]
> + About a note to remind why the readl() should immediately follow the
> writel() to fix the race condition when clearing irq status from commit
> c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
> condition"), I think it looks straight forward in this patch and decided
> not to add that note. [Joel]
>
> v1:
> + First introduced in
> https://lore.kernel.org/all/[email protected]/
> ---
> drivers/i2c/busses/i2c-aspeed.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index 79476b46285b..3231f430e335 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>
> spin_lock(&bus->lock);
> irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> - /* Ack all interrupts except for Rx done */
> - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
> + /* Ack all interrupts except for Rx done and Tx done with/without ACK */

I'm not a huge fan of this comment, it just says what the code does. It
would be much better to explain *why* the code does what it does.

I realise describing what the code does was already the gist of the
comment and that you're just updating it to match the change to the
code, but that's my entire problem with it. We'd be better off deleting
it if we're not going to explain why the masking is necessary.

> + writel(irq_received &
> + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> bus->base + ASPEED_I2C_INTR_STS_REG);
> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
> @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
> irq_received, irq_handled);
>
> - /* Ack Rx done */
> - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
> - writel(ASPEED_I2CD_INTR_RX_DONE,
> - bus->base + ASPEED_I2C_INTR_STS_REG);
> - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> - }
> + /* Ack Rx done and Tx done with/without ACK */
> + writel(irq_received &
> + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> + bus->base + ASPEED_I2C_INTR_STS_REG);
> + readl(bus->base + ASPEED_I2C_INTR_STS_REG);

I'm not sure why the write was conditional, but I'm not sure that
making it unconditional is valid either? Why the change? Why not add
the extra interrupt bits to the condition in addition to the value mask
for the write?

Andrew

2023-11-29 00:36:31

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK

Hi Quan,

On Tue, Nov 28, 2023 at 02:52:35PM +0700, Quan Nguyen wrote:
> Under normal conditions, after the last byte is sent by the Slave, the
> TX_NAK interrupt is raised. However, it is also observed that
> sometimes the Master issues the next transaction too quickly while the
> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
> last byte of the previous READ_PROCESSED state has not been ack’ed.
> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
> and RX_DONE interrupt of the next coming transaction from Master. The
> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
> ignores the TX_NAK, causing complaints such as
> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
> 0x00000086, but was 0x00000084"
>
> This commit adds code to handle this case by emitting a SLAVE_STOP event
> for the TX_NAK before processing the RX_DONE for the coming transaction
> from the Master.
>
> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
> Signed-off-by: Quan Nguyen <[email protected]>
> ---
> v2:
> + Split to separate series [Joel]
> + Added the Fixes line [Joel]
> + Revised commit message [Quan]
>
> v1:
> + First introduced in
> https://lore.kernel.org/all/[email protected]/
> ---
> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index 28e2a5fc4528..79476b46285b 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>
> /* Slave was requested, restart state machine. */
> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> + }

this is a duplicate of a later "if (...)" satement. What is the
need for having them both?

Andi

> irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
> bus->slave_state = ASPEED_I2C_SLAVE_START;
> }
> --
> 2.35.1
>

2023-11-29 00:47:02

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late

Hi Quan,

On Tue, Nov 28, 2023 at 02:52:36PM +0700, Quan Nguyen wrote:
> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
> interrupt handler") acknowledges most interrupts early before the slave
> irq handler is executed, except for the "Receive Done Interrupt status"
> which is acknowledged late in the interrupt.
> However, it is observed that the early acknowledgment of "Transmit Done
> Interrupt Status" (with ACK or NACK) often causes the interrupt to be
> raised in READ REQUEST state, resulting in "Unexpected ACK on read
> request." complaint messages.
>
> Assuming that the "Transmit Done" interrupt should only be acknowledged
> once it is truly processed, this commit fixes this issue by acknowledging
> this interrupt for both ACK and NACK cases late in the interrupt handler
> also.
>
> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
> Signed-off-by: Quan Nguyen <[email protected]>
> ---
> v2:
> + Split to separate series [Joel]
> + Added the Fixes line [Joel]
> + Fixed multiline comment [Joel]
> + Refactor irq clearing code [Joel, Guenter]
> + Revised commit message [Joel]
> + Revised commit message [Quan]
> + About a note to remind why the readl() should immediately follow the
> writel() to fix the race condition when clearing irq status from commit
> c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
> condition"), I think it looks straight forward in this patch and decided
> not to add that note. [Joel]
>
> v1:
> + First introduced in
> https://lore.kernel.org/all/[email protected]/
> ---
> drivers/i2c/busses/i2c-aspeed.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index 79476b46285b..3231f430e335 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>
> spin_lock(&bus->lock);
> irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> - /* Ack all interrupts except for Rx done */
> - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
> + /* Ack all interrupts except for Rx done and Tx done with/without ACK */
> + writel(irq_received &
> + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> bus->base + ASPEED_I2C_INTR_STS_REG);
> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
> @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
> irq_received, irq_handled);
>
> - /* Ack Rx done */
> - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
> - writel(ASPEED_I2CD_INTR_RX_DONE,
> - bus->base + ASPEED_I2C_INTR_STS_REG);
> - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> - }
> + /* Ack Rx done and Tx done with/without ACK */
> + writel(irq_received &
> + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> + bus->base + ASPEED_I2C_INTR_STS_REG);
> + readl(bus->base + ASPEED_I2C_INTR_STS_REG);

So, you are acknowledging everything here. Why wasn’t it done
this way in the first place?

I would appreciate a comment here from Guenter, whose commit you
are fixing.

Thanks,
Andi

> spin_unlock(&bus->lock);
> return irq_remaining ? IRQ_NONE : IRQ_HANDLED;
> }
> --
> 2.35.1
>

2023-11-29 09:03:23

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late



On 29/11/2023 07:33, Andrew Jeffery wrote:
> On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
>> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
>> interrupt handler") acknowledges most interrupts early before the slave
>> irq handler is executed, except for the "Receive Done Interrupt status"
>> which is acknowledged late in the interrupt.
>> However, it is observed that the early acknowledgment of "Transmit Done
>> Interrupt Status" (with ACK or NACK) often causes the interrupt to be
>> raised in READ REQUEST state, resulting in "Unexpected ACK on read
>> request." complaint messages.
>>
>> Assuming that the "Transmit Done" interrupt should only be acknowledged
>> once it is truly processed, this commit fixes this issue by acknowledging
>> this interrupt for both ACK and NACK cases late in the interrupt handler
>> also.
>>
>> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
>> Signed-off-by: Quan Nguyen <[email protected]>
>> ---
>> v2:
>> + Split to separate series [Joel]
>> + Added the Fixes line [Joel]
>> + Fixed multiline comment [Joel]
>> + Refactor irq clearing code [Joel, Guenter]
>> + Revised commit message [Joel]
>> + Revised commit message [Quan]
>> + About a note to remind why the readl() should immediately follow the
>> writel() to fix the race condition when clearing irq status from commit
>> c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
>> condition"), I think it looks straight forward in this patch and decided
>> not to add that note. [Joel]
>>
>> v1:
>> + First introduced in
>> https://lore.kernel.org/all/[email protected]/
>> ---
>> drivers/i2c/busses/i2c-aspeed.c | 17 +++++++++--------
>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>> index 79476b46285b..3231f430e335 100644
>> --- a/drivers/i2c/busses/i2c-aspeed.c
>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>> @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>>
>> spin_lock(&bus->lock);
>> irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> - /* Ack all interrupts except for Rx done */
>> - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
>> + /* Ack all interrupts except for Rx done and Tx done with/without ACK */
>
> I'm not a huge fan of this comment, it just says what the code does. It
> would be much better to explain *why* the code does what it does.
>
> I realise describing what the code does was already the gist of the
> comment and that you're just updating it to match the change to the
> code, but that's my entire problem with it. We'd be better off deleting
> it if we're not going to explain why the masking is necessary.
>

Thanks for the comment Andrew.

I would prefer to delete it.

But if to put some comment, how about:

/* Early ack INTR_RX_DONE, INTR_TX_[ACK|NAK] would indicate HW to start
receiving/sending new data and may cause a race condition as irq handler
not yet to handle these irqs but being acked. Let ack them late in the
end of irq handler when those are truly processed */

>> + writel(irq_received &
>> + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>> bus->base + ASPEED_I2C_INTR_STS_REG);
>> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
>> @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>> "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
>> irq_received, irq_handled);
>>
>> - /* Ack Rx done */
>> - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
>> - writel(ASPEED_I2CD_INTR_RX_DONE,
>> - bus->base + ASPEED_I2C_INTR_STS_REG);
>> - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> - }
>> + /* Ack Rx done and Tx done with/without ACK */
>> + writel(irq_received &
>> + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>> + bus->base + ASPEED_I2C_INTR_STS_REG);
>> + readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>
> I'm not sure why the write was conditional, but I'm not sure that
> making it unconditional is valid either? Why the change? Why not add
> the extra interrupt bits to the condition in addition to the value mask
> for the write?
>

In original code, only INTR_RX_DONE was acked late. So the check
(irq_received & ASPEED_I2CD_INTR_RX_DONE) is need and that help to save
one write() then read() if there was no such irq.

In the new code, there is no such check and the drawback is that there
always be one write() and one read() for all cases, include the case
where there is no irq at all, ie writing 0 into ASPEED_I2C_INTR_STS_REG.

And yes, your concern maybe right, we can not say of writing 0 into
ASPEED_I2C_INTR_STS_REG is good or not.

I checked back my debug log and seeing that irq always come with at
least one of INTR_RX_DONE BIT(2), INTR_TX_ACK BIT(0), INTR_TX_NAK BIT(1)
raised. So it seems like the case of writing 0 into
ASPEED_I2C_INTR_STS_REG is indeed rarely to happen.

Do you think we should change it to:

if (irq_received & (INTR_RX_DONE | INTR_TX_ACK | INTR_TX_NAK)) {
writel( irq_received & (INTR_RX_DONE| INTR_TX_ACK| INTR_TX_NAK),
bus->base + ASPEED_I2C_INTR_STS_REG);
readl(bus->base + ASPEED_I2C_INTR_STS_REG);
}

Again, thanks a lot for the review.
- Quan

2023-11-29 09:03:37

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK



On 29/11/2023 07:35, Andi Shyti wrote:
> Hi Quan,
>
> On Tue, Nov 28, 2023 at 02:52:35PM +0700, Quan Nguyen wrote:
>> Under normal conditions, after the last byte is sent by the Slave, the
>> TX_NAK interrupt is raised. However, it is also observed that
>> sometimes the Master issues the next transaction too quickly while the
>> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
>> last byte of the previous READ_PROCESSED state has not been ack’ed.
>> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
>> and RX_DONE interrupt of the next coming transaction from Master. The
>> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
>> ignores the TX_NAK, causing complaints such as
>> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
>> 0x00000086, but was 0x00000084"
>>
>> This commit adds code to handle this case by emitting a SLAVE_STOP event
>> for the TX_NAK before processing the RX_DONE for the coming transaction
>> from the Master.
>>
>> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
>> Signed-off-by: Quan Nguyen <[email protected]>
>> ---
>> v2:
>> + Split to separate series [Joel]
>> + Added the Fixes line [Joel]
>> + Revised commit message [Quan]
>>
>> v1:
>> + First introduced in
>> https://lore.kernel.org/all/[email protected]/
>> ---
>> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>> index 28e2a5fc4528..79476b46285b 100644
>> --- a/drivers/i2c/busses/i2c-aspeed.c
>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>>
>> /* Slave was requested, restart state machine. */
>> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
>> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
>> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
>> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
>> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>> + }
>
> this is a duplicate of a later "if (...)" satement. What is the
> need for having them both?
>
Thanks Andi for the review.

I assumed the if statement you mentioned is here in [1]. If so, then
that is not duplicate.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287


The if statement is to process the case when Slave sending data to
Master but being NAK, the I2C_SLAVE_STOP event will emit later in
switch-case statement. But it is only for the case INTR_TX_NAK without
INTR_SLAVE_MATCH.

The new code is for the case of INTR_TX_NAK with INTR_SLAVE_MATCH. What
it does is to detect if there is a mix of INTR_TX_NAK of previous i2c
transaction and the start of new i2c transaction, indicate by
INTR_SLAVE_MATCH which is only raised when Slave found its address
matched on the first byte it received. If so, the new code will try to
emit the I2C_SLAVE_STOP first to complete the previous transaction and
process the rest as a new request.

So if this was the case (with INTR_SLAVE_MATCH), the INTR_RX_DONE should
always raise with INTR_SLAVE_MATCH because Slave did receive the data
which matched with its Slave address. And this will be translated into
either I2C_SLAVE_[READ|WRITE]_REQUESTED and that make the if statement
you mentioned [1] evaluate to false and skip.

So, in short, the new code is trying to handle the case of INTR_TX_NAK
with INTR_SLAVE_MATCH first before let the rest process as normal.

Thanks,
- Quan

2023-11-29 09:04:11

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK



On 29/11/2023 07:19, Andrew Jeffery wrote:
> On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
>> Under normal conditions, after the last byte is sent by the Slave, the
>> TX_NAK interrupt is raised. However, it is also observed that
>> sometimes the Master issues the next transaction too quickly while the
>> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
>> last byte of the previous READ_PROCESSED state has not been ack’ed.
>> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
>> and RX_DONE interrupt of the next coming transaction from Master. The
>> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
>> ignores the TX_NAK, causing complaints such as
>> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
>> 0x00000086, but was 0x00000084"
>>
>> This commit adds code to handle this case by emitting a SLAVE_STOP event
>> for the TX_NAK before processing the RX_DONE for the coming transaction
>> from the Master.
>>
>> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
>> Signed-off-by: Quan Nguyen <[email protected]>
>> ---
>> v2:
>> + Split to separate series [Joel]
>> + Added the Fixes line [Joel]
>> + Revised commit message [Quan]
>>
>> v1:
>> + First introduced in
>> https://lore.kernel.org/all/[email protected]/
>> ---
>> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>> index 28e2a5fc4528..79476b46285b 100644
>> --- a/drivers/i2c/busses/i2c-aspeed.c
>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>>
>> /* Slave was requested, restart state machine. */
>> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
>> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
>> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
>> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
>> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>> + }
>
> So we're already (partially) processing this a bit later on line 287:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287
>

Thanks Andrew for the review.

I think it's worth noting that the byte mode is used in this case.

About the code you mentioned, it is for the general process of single
Slave transmission with NAK which should be interpret as I2C_SLAVE_STOP
event.

In this case, there is a mix of Slave events:

+ I2C_SLAVE_STOP (due to INTR_TX_NAK, BIT(1) of previous transaction)
+ Followed by I2C_SLAVE_[READ|WRITE]_REQUESTED (due to
INTR_SLAVE_MATCH and INTR_RX_DONE, BIT(7) and BIT(2), of next transaction)

That is the reason we need to emit the I2C_SLAVE_STOP first for Slave
backend to process.

> From the description of the problem in the commit message it sounds
> like the ordering of the interrupt processing is incorrect.

Yes, this is correct as per my explanation above.

Prior to
> this patch we have the following abstract ordering of interrupt
> processing:
>
> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
> 2. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>

From my test, the flow is as below:

1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
ASPEED_I2C_SLAVE_START
2. As there is INTR_RX_DONE and slave_state is
ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
3. When reach to the if statement to process INTR_TX_NAK, slave_state
is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluate as
false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.

> With this patch we have:
>
> 1. If ASPEED_I2CD_INTR_SLAVE_MATCH then process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> 2. Process ASPEED_I2CD_INTR_SLAVE_MATCH
> 3. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>

With this patch:

0. The I2C_SLAVE_STOP is emitted to backend Slave driver first to
complete the previous transaction. And let the rest process as before
this patch.

1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
ASPEED_I2C_SLAVE_START
2. As there is INTR_RX_DONE and slave_state is
ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
3. When reach to the if statement to process INTR_TX_NAK, slave_state
is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluated as
false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.

> That feels a bit complex and redundant. What I think we can have is:
>
> 1. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
>
> Moving back from the abstract to the concrete, implementing what I
> believe we need would look something like this patch:
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index 28e2a5fc4528..98dd0f35c9d3 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -251,6 +251,14 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>
> command = readl(bus->base + ASPEED_I2C_CMD_REG);
>
> + /* Complete any active read */
> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> + bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> + }
> +

It is not confirmed through test yet but I'm afraid the switch case part
will emit another I2C_SLAVE_STOP event in case there is no mix of
interrupts.

> /* Slave was requested, restart state machine. */
> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
> @@ -284,11 +292,6 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
> bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> }
> - if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> - bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> - irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> - bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> - }
>
> switch (bus->slave_state) {
> case ASPEED_I2C_SLAVE_READ_REQUESTED:
>
> Thoughts? I haven't tested it, it's just something to throw darts at.
>
> Andrew
>

2023-11-29 09:06:09

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late



On 29/11/2023 07:45, Andi Shyti wrote:
> Hi Quan,
>
> On Tue, Nov 28, 2023 at 02:52:36PM +0700, Quan Nguyen wrote:
>> Commit 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in
>> interrupt handler") acknowledges most interrupts early before the slave
>> irq handler is executed, except for the "Receive Done Interrupt status"
>> which is acknowledged late in the interrupt.
>> However, it is observed that the early acknowledgment of "Transmit Done
>> Interrupt Status" (with ACK or NACK) often causes the interrupt to be
>> raised in READ REQUEST state, resulting in "Unexpected ACK on read
>> request." complaint messages.
>>
>> Assuming that the "Transmit Done" interrupt should only be acknowledged
>> once it is truly processed, this commit fixes this issue by acknowledging
>> this interrupt for both ACK and NACK cases late in the interrupt handler
>> also.
>>
>> Fixes: 2be6b47211e1 ("i2c: aspeed: Acknowledge most interrupts early in interrupt handler")
>> Signed-off-by: Quan Nguyen <[email protected]>
>> ---
>> v2:
>> + Split to separate series [Joel]
>> + Added the Fixes line [Joel]
>> + Fixed multiline comment [Joel]
>> + Refactor irq clearing code [Joel, Guenter]
>> + Revised commit message [Joel]
>> + Revised commit message [Quan]
>> + About a note to remind why the readl() should immediately follow the
>> writel() to fix the race condition when clearing irq status from commit
>> c926c87b8e36 ("i2c: aspeed: Avoid i2c interrupt status clear race
>> condition"), I think it looks straight forward in this patch and decided
>> not to add that note. [Joel]
>>
>> v1:
>> + First introduced in
>> https://lore.kernel.org/all/[email protected]/
>> ---
>> drivers/i2c/busses/i2c-aspeed.c | 17 +++++++++--------
>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>> index 79476b46285b..3231f430e335 100644
>> --- a/drivers/i2c/busses/i2c-aspeed.c
>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>> @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>>
>> spin_lock(&bus->lock);
>> irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> - /* Ack all interrupts except for Rx done */
>> - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
>> + /* Ack all interrupts except for Rx done and Tx done with/without ACK */
>> + writel(irq_received &
>> + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>> bus->base + ASPEED_I2C_INTR_STS_REG);
>> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
>> @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>> "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
>> irq_received, irq_handled);
>>
>> - /* Ack Rx done */
>> - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
>> - writel(ASPEED_I2CD_INTR_RX_DONE,
>> - bus->base + ASPEED_I2C_INTR_STS_REG);
>> - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> - }
>> + /* Ack Rx done and Tx done with/without ACK */
>> + writel(irq_received &
>> + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>> + bus->base + ASPEED_I2C_INTR_STS_REG);
>> + readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>
> So, you are acknowledging everything here. Why wasn’t it done
> this way in the first place?
>
> I would appreciate a comment here from Guenter, whose commit you
> are fixing.
>

Thanks Andi for the comment.

This base on my observation that HW may proceed to start
transmit/receive new date as soon as those irqs are early ack. This may
cause a race condition because SW was not actually process that irq yet.

I've also put some explanation in my reply to Andrew in the other mail
for this part as well.

And of course, I definitively love to hear from Guenter as well as these
code is just based on my observation through debug only.

Thanks a lot for the comment.
- Quan

2023-11-29 21:26:05

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK

Hi Quan,

> On 29/11/2023 07:35, Andi Shyti wrote:
> > Hi Quan,
> >
> > On Tue, Nov 28, 2023 at 02:52:35PM +0700, Quan Nguyen wrote:
> > > Under normal conditions, after the last byte is sent by the Slave, the
> > > TX_NAK interrupt is raised. However, it is also observed that
> > > sometimes the Master issues the next transaction too quickly while the
> > > Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
> > > last byte of the previous READ_PROCESSED state has not been ack’ed.
> > > This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
> > > and RX_DONE interrupt of the next coming transaction from Master. The
> > > Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
> > > ignores the TX_NAK, causing complaints such as
> > > "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
> > > 0x00000086, but was 0x00000084"
> > >
> > > This commit adds code to handle this case by emitting a SLAVE_STOP event
> > > for the TX_NAK before processing the RX_DONE for the coming transaction
> > > from the Master.
> > >
> > > Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
> > > Signed-off-by: Quan Nguyen <[email protected]>
> > > ---
> > > v2:
> > > + Split to separate series [Joel]
> > > + Added the Fixes line [Joel]
> > > + Revised commit message [Quan]
> > >
> > > v1:
> > > + First introduced in
> > > https://lore.kernel.org/all/[email protected]/
> > > ---
> > > drivers/i2c/busses/i2c-aspeed.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > > index 28e2a5fc4528..79476b46285b 100644
> > > --- a/drivers/i2c/busses/i2c-aspeed.c
> > > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > > @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > > /* Slave was requested, restart state machine. */
> > > if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> > > + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> > > + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> > > + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> > > + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> > > + }
> >
> > this is a duplicate of a later "if (...)" satement. What is the
> > need for having them both?
> >
> Thanks Andi for the review.
>
> I assumed the if statement you mentioned is here in [1]. If so, then that is
> not duplicate.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287
>
>
> The if statement is to process the case when Slave sending data to Master
> but being NAK, the I2C_SLAVE_STOP event will emit later in switch-case
> statement. But it is only for the case INTR_TX_NAK without INTR_SLAVE_MATCH.
>
> The new code is for the case of INTR_TX_NAK with INTR_SLAVE_MATCH. What it
> does is to detect if there is a mix of INTR_TX_NAK of previous i2c
> transaction and the start of new i2c transaction, indicate by
> INTR_SLAVE_MATCH which is only raised when Slave found its address matched
> on the first byte it received. If so, the new code will try to emit the
> I2C_SLAVE_STOP first to complete the previous transaction and process the
> rest as a new request.
>
> So if this was the case (with INTR_SLAVE_MATCH), the INTR_RX_DONE should
> always raise with INTR_SLAVE_MATCH because Slave did receive the data which
> matched with its Slave address. And this will be translated into either
> I2C_SLAVE_[READ|WRITE]_REQUESTED and that make the if statement you
> mentioned [1] evaluate to false and skip.
>
> So, in short, the new code is trying to handle the case of INTR_TX_NAK with
> INTR_SLAVE_MATCH first before let the rest process as normal.

yes, I saw that, but wasn't it easier to do something like this:

if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
bus->slave_state = ASPEED_I2C_SLAVE_STOP;

if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH)
i2c_slave_event(slave, I2C_SLAVE_STOP, &value);

}

But I see that Andrew has done some similar comment, also for
patch 2. You can answer both in the same mail, not to duplicate
the answer :-)

We can wait for him to reply.

Andi

2023-11-29 22:46:07

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late

On Wed, 2023-11-29 at 16:02 +0700, Quan Nguyen wrote:
>
> On 29/11/2023 07:33, Andrew Jeffery wrote:
> > On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
> > > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > > index 79476b46285b..3231f430e335 100644
> > > --- a/drivers/i2c/busses/i2c-aspeed.c
> > > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > > @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> > >
> > > spin_lock(&bus->lock);
> > > irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> > > - /* Ack all interrupts except for Rx done */
> > > - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
> > > + /* Ack all interrupts except for Rx done and Tx done with/without ACK */
> >
> > I'm not a huge fan of this comment, it just says what the code does. It
> > would be much better to explain *why* the code does what it does.
> >
> > I realise describing what the code does was already the gist of the
> > comment and that you're just updating it to match the change to the
> > code, but that's my entire problem with it. We'd be better off deleting
> > it if we're not going to explain why the masking is necessary.
> >
>
> Thanks for the comment Andrew.
>
> I would prefer to delete it.
>
> But if to put some comment, how about:
>
> /* Early ack INTR_RX_DONE, INTR_TX_[ACK|NAK] would indicate HW to start
> receiving/sending new data and may cause a race condition as irq handler
> not yet to handle these irqs but being acked. Let ack them late in the
> end of irq handler when those are truly processed */

Please update the patch with this comment. It at least goes some way to
explain why.

>
> > > + writel(irq_received &
> > > + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> > > bus->base + ASPEED_I2C_INTR_STS_REG);
> > > readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> > > irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
> > > @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
> > > "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
> > > irq_received, irq_handled);
> > >
> > > - /* Ack Rx done */
> > > - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
> > > - writel(ASPEED_I2CD_INTR_RX_DONE,
> > > - bus->base + ASPEED_I2C_INTR_STS_REG);
> > > - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> > > - }
> > > + /* Ack Rx done and Tx done with/without ACK */
> > > + writel(irq_received &
> > > + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
> > > + bus->base + ASPEED_I2C_INTR_STS_REG);
> > > + readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> >
> > I'm not sure why the write was conditional, but I'm not sure that
> > making it unconditional is valid either? Why the change? Why not add
> > the extra interrupt bits to the condition in addition to the value mask
> > for the write?
> >
>
> In original code, only INTR_RX_DONE was acked late. So the check
> (irq_received & ASPEED_I2CD_INTR_RX_DONE) is need and that help to save
> one write() then read() if there was no such irq.
>
> In the new code, there is no such check and the drawback is that there
> always be one write() and one read() for all cases, include the case
> where there is no irq at all, ie writing 0 into ASPEED_I2C_INTR_STS_REG.
>
> And yes, your concern maybe right, we can not say of writing 0 into
> ASPEED_I2C_INTR_STS_REG is good or not.
>
> I checked back my debug log and seeing that irq always come with at
> least one of INTR_RX_DONE BIT(2), INTR_TX_ACK BIT(0), INTR_TX_NAK BIT(1)
> raised. So it seems like the case of writing 0 into
> ASPEED_I2C_INTR_STS_REG is indeed rarely to happen.
>
> Do you think we should change it to:
>
> if (irq_received & (INTR_RX_DONE | INTR_TX_ACK | INTR_TX_NAK)) {
> writel( irq_received & (INTR_RX_DONE| INTR_TX_ACK| INTR_TX_NAK),
> bus->base + ASPEED_I2C_INTR_STS_REG);
> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
> }

This is less different from the existing strategy and doesn't require
any explanation beyond what you're already trying to achieve in the
patch, so I think you should switch to this approach.

If someone wants to work out why it was done conditionally and argue
for its removal then they can do that separately.

Cheers,

Andrew

2023-11-30 01:21:59

by Andrew Jeffery

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK

On Wed, 2023-11-29 at 16:03 +0700, Quan Nguyen wrote:
>
> On 29/11/2023 07:19, Andrew Jeffery wrote:
> > On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
> > > Under normal conditions, after the last byte is sent by the Slave, the
> > > TX_NAK interrupt is raised. However, it is also observed that
> > > sometimes the Master issues the next transaction too quickly while the
> > > Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
> > > last byte of the previous READ_PROCESSED state has not been ack’ed.
> > > This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
> > > and RX_DONE interrupt of the next coming transaction from Master. The
> > > Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
> > > ignores the TX_NAK, causing complaints such as
> > > "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
> > > 0x00000086, but was 0x00000084"
> > >
> > > This commit adds code to handle this case by emitting a SLAVE_STOP event
> > > for the TX_NAK before processing the RX_DONE for the coming transaction
> > > from the Master.
> > >
> > > Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
> > > Signed-off-by: Quan Nguyen <[email protected]>
> > > ---
> > > v2:
> > > + Split to separate series [Joel]
> > > + Added the Fixes line [Joel]
> > > + Revised commit message [Quan]
> > >
> > > v1:
> > > + First introduced in
> > > https://lore.kernel.org/all/[email protected]/
> > > ---
> > > drivers/i2c/busses/i2c-aspeed.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > > index 28e2a5fc4528..79476b46285b 100644
> > > --- a/drivers/i2c/busses/i2c-aspeed.c
> > > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > > @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> > >
> > > /* Slave was requested, restart state machine. */
> > > if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> > > + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> > > + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> > > + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> > > + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> > > + }
> >
> > So we're already (partially) processing this a bit later on line 287:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287
> >
>
> Thanks Andrew for the review.
>
> I think it's worth noting that the byte mode is used in this case.
>
> About the code you mentioned, it is for the general process of single
> Slave transmission with NAK which should be interpret as I2C_SLAVE_STOP
> event.
>
> In this case, there is a mix of Slave events:
>
> + I2C_SLAVE_STOP (due to INTR_TX_NAK, BIT(1) of previous transaction)
> + Followed by I2C_SLAVE_[READ|WRITE]_REQUESTED (due to
> INTR_SLAVE_MATCH and INTR_RX_DONE, BIT(7) and BIT(2), of next transaction)
>
> That is the reason we need to emit the I2C_SLAVE_STOP first for Slave
> backend to process.
>
> > From the description of the problem in the commit message it sounds
> > like the ordering of the interrupt processing is incorrect.
>
> Yes, this is correct as per my explanation above.
>
> Prior to
> > this patch we have the following abstract ordering of interrupt
> > processing:
> >
> > 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
> > 2. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> >
>
> From my test, the flow is as below:
>
> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
> ASPEED_I2C_SLAVE_START
> 2. As there is INTR_RX_DONE and slave_state is
> ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
> moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
> 3. When reach to the if statement to process INTR_TX_NAK, slave_state
> is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
> in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluate as
> false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.
>
> > With this patch we have:
> >
> > 1. If ASPEED_I2CD_INTR_SLAVE_MATCH then process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> > 2. Process ASPEED_I2CD_INTR_SLAVE_MATCH
> > 3. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> >
>
> With this patch:
>
> 0. The I2C_SLAVE_STOP is emitted to backend Slave driver first to
> complete the previous transaction. And let the rest process as before
> this patch.
>
> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
> ASPEED_I2C_SLAVE_START
> 2. As there is INTR_RX_DONE and slave_state is
> ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
> moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
> 3. When reach to the if statement to process INTR_TX_NAK, slave_state
> is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
> in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluated as
> false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.
>
> > That feels a bit complex and redundant. What I think we can have is:
> >
> > 1. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
> > 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
> >
> > Moving back from the abstract to the concrete, implementing what I
> > believe we need would look something like this patch:
> >
> > diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> > index 28e2a5fc4528..98dd0f35c9d3 100644
> > --- a/drivers/i2c/busses/i2c-aspeed.c
> > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > @@ -251,6 +251,14 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> >
> > command = readl(bus->base + ASPEED_I2C_CMD_REG);
> >
> > + /* Complete any active read */
> > + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> > + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> > + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> > + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> > + bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> > + }
> > +
>
> It is not confirmed through test yet but I'm afraid the switch case part
> will emit another I2C_SLAVE_STOP event in case there is no mix of
> interrupts.

Ah, good catch. I think we can rework things a bit to rationalise the
logic at the expense a bigger diff. What do you think about this? I've
boot tested it on an ast2600-evb and poked at some NVMe drives over
MCTP to exercise the slave path.

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index aec8966bceab..3c9333a12967 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -249,18 +249,47 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
if (!slave)
return 0;

- command = readl(bus->base + ASPEED_I2C_CMD_REG);
+ /*
+ * Handle stop conditions early, prior to SLAVE_MATCH. Some masters may drive
+ * transfers with low enough latency between the nak/stop phase of the current
+ * command and the start/address phase of the following command that the
+ * interrupts are coalesced by the time we process them.
+ */
+
+ if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
+ irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
+ bus->slave_state = ASPEED_I2C_SLAVE_STOP;
+ }
+
+ if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
+ bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
+ irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
+ bus->slave_state = ASPEED_I2C_SLAVE_STOP;
+ }
+
+ /* Propagate any stop conditions to the slave implementation */
+ if (bus->slave_state == ASPEED_I2C_SLAVE_STOP) {
+ i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
+ bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
+ }

- /* Slave was requested, restart state machine. */
+ /*
+ * Now that we've dealt with any potentially coalesced stop conditions,
+ * address any start conditions.
+ */
if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
bus->slave_state = ASPEED_I2C_SLAVE_START;
}

- /* Slave is not currently active, irq was for someone else. */
+ /*
+ * If the slave has been stopped and not started then slave interrupt handling
+ * is complete.
+ */
if (bus->slave_state == ASPEED_I2C_SLAVE_INACTIVE)
return irq_handled;

+ command = readl(bus->base + ASPEED_I2C_CMD_REG);
dev_dbg(bus->dev, "slave irq status 0x%08x, cmd 0x%08x\n",
irq_status, command);

@@ -279,17 +308,6 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
irq_handled |= ASPEED_I2CD_INTR_RX_DONE;
}

- /* Slave was asked to stop. */
- if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
- irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
- bus->slave_state = ASPEED_I2C_SLAVE_STOP;
- }
- if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
- bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
- irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
- bus->slave_state = ASPEED_I2C_SLAVE_STOP;
- }
-
switch (bus->slave_state) {
case ASPEED_I2C_SLAVE_READ_REQUESTED:
if (unlikely(irq_status & ASPEED_I2CD_INTR_TX_ACK))
@@ -324,8 +342,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED, &value);
break;
case ASPEED_I2C_SLAVE_STOP:
- i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
- bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
+ /* Stop event handling is done early. Unreachable. */
break;
case ASPEED_I2C_SLAVE_START:
/* Slave was just started. Waiting for the next event. */;

2023-11-30 06:53:27

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK



On 30/11/2023 08:20, Andrew Jeffery wrote:
> On Wed, 2023-11-29 at 16:03 +0700, Quan Nguyen wrote:
>>
>> On 29/11/2023 07:19, Andrew Jeffery wrote:
>>> On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
>>>> Under normal conditions, after the last byte is sent by the Slave, the
>>>> TX_NAK interrupt is raised. However, it is also observed that
>>>> sometimes the Master issues the next transaction too quickly while the
>>>> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
>>>> last byte of the previous READ_PROCESSED state has not been ack’ed.
>>>> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
>>>> and RX_DONE interrupt of the next coming transaction from Master. The
>>>> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
>>>> ignores the TX_NAK, causing complaints such as
>>>> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
>>>> 0x00000086, but was 0x00000084"
>>>>
>>>> This commit adds code to handle this case by emitting a SLAVE_STOP event
>>>> for the TX_NAK before processing the RX_DONE for the coming transaction
>>>> from the Master.
>>>>
>>>> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
>>>> Signed-off-by: Quan Nguyen <[email protected]>
>>>> ---
>>>> v2:
>>>> + Split to separate series [Joel]
>>>> + Added the Fixes line [Joel]
>>>> + Revised commit message [Quan]
>>>>
>>>> v1:
>>>> + First introduced in
>>>> https://lore.kernel.org/all/[email protected]/
>>>> ---
>>>> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
>>>> 1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>>>> index 28e2a5fc4528..79476b46285b 100644
>>>> --- a/drivers/i2c/busses/i2c-aspeed.c
>>>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>>>> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>>>>
>>>> /* Slave was requested, restart state machine. */
>>>> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
>>>> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
>>>> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
>>>> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
>>>> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>>>> + }
>>>
>>> So we're already (partially) processing this a bit later on line 287:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287
>>>
>>
>> Thanks Andrew for the review.
>>
>> I think it's worth noting that the byte mode is used in this case.
>>
>> About the code you mentioned, it is for the general process of single
>> Slave transmission with NAK which should be interpret as I2C_SLAVE_STOP
>> event.
>>
>> In this case, there is a mix of Slave events:
>>
>> + I2C_SLAVE_STOP (due to INTR_TX_NAK, BIT(1) of previous transaction)
>> + Followed by I2C_SLAVE_[READ|WRITE]_REQUESTED (due to
>> INTR_SLAVE_MATCH and INTR_RX_DONE, BIT(7) and BIT(2), of next transaction)
>>
>> That is the reason we need to emit the I2C_SLAVE_STOP first for Slave
>> backend to process.
>>
>>> From the description of the problem in the commit message it sounds
>>> like the ordering of the interrupt processing is incorrect.
>>
>> Yes, this is correct as per my explanation above.
>>
>> Prior to
>>> this patch we have the following abstract ordering of interrupt
>>> processing:
>>>
>>> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
>>> 2. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>>>
>>
>> From my test, the flow is as below:
>>
>> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
>> ASPEED_I2C_SLAVE_START
>> 2. As there is INTR_RX_DONE and slave_state is
>> ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
>> moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
>> 3. When reach to the if statement to process INTR_TX_NAK, slave_state
>> is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
>> in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluate as
>> false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.
>>
>>> With this patch we have:
>>>
>>> 1. If ASPEED_I2CD_INTR_SLAVE_MATCH then process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>>> 2. Process ASPEED_I2CD_INTR_SLAVE_MATCH
>>> 3. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>>>
>>
>> With this patch:
>>
>> 0. The I2C_SLAVE_STOP is emitted to backend Slave driver first to
>> complete the previous transaction. And let the rest process as before
>> this patch.
>>
>> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH, slave_state is set to
>> ASPEED_I2C_SLAVE_START
>> 2. As there is INTR_RX_DONE and slave_state is
>> ASPEED_I2C_SLAVE_START, depends on the data received, the slave_state
>> moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED.
>> 3. When reach to the if statement to process INTR_TX_NAK, slave_state
>> is already moves to either ASPEED_I2C_SLAVE_[READ|WRITE]_REQUESTED, not
>> in ASPEED_I2C_SLAVE_READ_PROCESSED anymore. This eventually evaluated as
>> false and the if statement is bypass. IOW, this INTR_TX_NAK is not process.
>>
>>> That feels a bit complex and redundant. What I think we can have is:
>>>
>>> 1. Process ASPEED_I2CD_INTR_TX_NAK when in ASPEED_I2C_SLAVE_READ_PROCESSED
>>> 1. Process ASPEED_I2CD_INTR_SLAVE_MATCH
>>>
>>> Moving back from the abstract to the concrete, implementing what I
>>> believe we need would look something like this patch:
>>>
>>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>>> index 28e2a5fc4528..98dd0f35c9d3 100644
>>> --- a/drivers/i2c/busses/i2c-aspeed.c
>>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>>> @@ -251,6 +251,14 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>>>
>>> command = readl(bus->base + ASPEED_I2C_CMD_REG);
>>>
>>> + /* Complete any active read */
>>> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
>>> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
>>> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
>>> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>>> + bus->slave_state = ASPEED_I2C_SLAVE_STOP;
>>> + }
>>> +
>>
>> It is not confirmed through test yet but I'm afraid the switch case part
>> will emit another I2C_SLAVE_STOP event in case there is no mix of
>> interrupts.
>
> Ah, good catch. I think we can rework things a bit to rationalise the
> logic at the expense a bigger diff. What do you think about this? I've
> boot tested it on an ast2600-evb and poked at some NVMe drives over
> MCTP to exercise the slave path.
>
> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
> index aec8966bceab..3c9333a12967 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -249,18 +249,47 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> if (!slave)
> return 0;
>
> - command = readl(bus->base + ASPEED_I2C_CMD_REG);
> + /*
> + * Handle stop conditions early, prior to SLAVE_MATCH. Some masters may drive
> + * transfers with low enough latency between the nak/stop phase of the current
> + * command and the start/address phase of the following command that the
> + * interrupts are coalesced by the time we process them.
> + */
> +
> + if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
> + irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
> + bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> + }
> +
> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> + bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> + }
> +
> + /* Propagate any stop conditions to the slave implementation */
> + if (bus->slave_state == ASPEED_I2C_SLAVE_STOP) {
> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> + bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
> + }
>
> - /* Slave was requested, restart state machine. */
> + /*
> + * Now that we've dealt with any potentially coalesced stop conditions,
> + * address any start conditions.
> + */
> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
> irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
> bus->slave_state = ASPEED_I2C_SLAVE_START;
> }
>
> - /* Slave is not currently active, irq was for someone else. */
> + /*
> + * If the slave has been stopped and not started then slave interrupt handling
> + * is complete.
> + */
> if (bus->slave_state == ASPEED_I2C_SLAVE_INACTIVE)
> return irq_handled;
>
> + command = readl(bus->base + ASPEED_I2C_CMD_REG);
> dev_dbg(bus->dev, "slave irq status 0x%08x, cmd 0x%08x\n",
> irq_status, command);
>
> @@ -279,17 +308,6 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> irq_handled |= ASPEED_I2CD_INTR_RX_DONE;
> }
>
> - /* Slave was asked to stop. */
> - if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
> - irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
> - bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> - }
> - if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> - bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> - irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> - bus->slave_state = ASPEED_I2C_SLAVE_STOP;
> - }
> -
> switch (bus->slave_state) {
> case ASPEED_I2C_SLAVE_READ_REQUESTED:
> if (unlikely(irq_status & ASPEED_I2CD_INTR_TX_ACK))
> @@ -324,8 +342,7 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
> i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED, &value);
> break;
> case ASPEED_I2C_SLAVE_STOP:
> - i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
> - bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
> + /* Stop event handling is done early. Unreachable. */
> break;
> case ASPEED_I2C_SLAVE_START:
> /* Slave was just started. Waiting for the next event. */;

Yes, this looks much better. I'll confirm these changes through test and
will post the next version ASAP.

Thanks a lot for the review,
- Quan

2023-11-30 06:53:41

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 2/2] i2c: aspeed: Acknowledge Tx done with and without ACK irq late



On 30/11/2023 05:44, Andrew Jeffery wrote:
> On Wed, 2023-11-29 at 16:02 +0700, Quan Nguyen wrote:
>>
>> On 29/11/2023 07:33, Andrew Jeffery wrote:
>>> On Tue, 2023-11-28 at 14:52 +0700, Quan Nguyen wrote:
>>>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>>>> index 79476b46285b..3231f430e335 100644
>>>> --- a/drivers/i2c/busses/i2c-aspeed.c
>>>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>>>> @@ -611,8 +611,9 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>>>>
>>>> spin_lock(&bus->lock);
>>>> irq_received = readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> - /* Ack all interrupts except for Rx done */
>>>> - writel(irq_received & ~ASPEED_I2CD_INTR_RX_DONE,
>>>> + /* Ack all interrupts except for Rx done and Tx done with/without ACK */
>>>
>>> I'm not a huge fan of this comment, it just says what the code does. It
>>> would be much better to explain *why* the code does what it does.
>>>
>>> I realise describing what the code does was already the gist of the
>>> comment and that you're just updating it to match the change to the
>>> code, but that's my entire problem with it. We'd be better off deleting
>>> it if we're not going to explain why the masking is necessary.
>>>
>>
>> Thanks for the comment Andrew.
>>
>> I would prefer to delete it.
>>
>> But if to put some comment, how about:
>>
>> /* Early ack INTR_RX_DONE, INTR_TX_[ACK|NAK] would indicate HW to start
>> receiving/sending new data and may cause a race condition as irq handler
>> not yet to handle these irqs but being acked. Let ack them late in the
>> end of irq handler when those are truly processed */
>
> Please update the patch with this comment. It at least goes some way to
> explain why.
>

Yes, will do in next version.

>>
>>>> + writel(irq_received &
>>>> + ~(ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>>>> bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> irq_received &= ASPEED_I2CD_INTR_RECV_MASK;
>>>> @@ -657,12 +658,12 @@ static irqreturn_t aspeed_i2c_bus_irq(int irq, void *dev_id)
>>>> "irq handled != irq. expected 0x%08x, but was 0x%08x\n",
>>>> irq_received, irq_handled);
>>>>
>>>> - /* Ack Rx done */
>>>> - if (irq_received & ASPEED_I2CD_INTR_RX_DONE) {
>>>> - writel(ASPEED_I2CD_INTR_RX_DONE,
>>>> - bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> - readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> - }
>>>> + /* Ack Rx done and Tx done with/without ACK */
>>>> + writel(irq_received &
>>>> + (ASPEED_I2CD_INTR_RX_DONE | ASPEED_I2CD_INTR_TX_ACK | ASPEED_I2CD_INTR_TX_NAK),
>>>> + bus->base + ASPEED_I2C_INTR_STS_REG);
>>>> + readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>>>
>>> I'm not sure why the write was conditional, but I'm not sure that
>>> making it unconditional is valid either? Why the change? Why not add
>>> the extra interrupt bits to the condition in addition to the value mask
>>> for the write?
>>>
>>
>> In original code, only INTR_RX_DONE was acked late. So the check
>> (irq_received & ASPEED_I2CD_INTR_RX_DONE) is need and that help to save
>> one write() then read() if there was no such irq.
>>
>> In the new code, there is no such check and the drawback is that there
>> always be one write() and one read() for all cases, include the case
>> where there is no irq at all, ie writing 0 into ASPEED_I2C_INTR_STS_REG.
>>
>> And yes, your concern maybe right, we can not say of writing 0 into
>> ASPEED_I2C_INTR_STS_REG is good or not.
>>
>> I checked back my debug log and seeing that irq always come with at
>> least one of INTR_RX_DONE BIT(2), INTR_TX_ACK BIT(0), INTR_TX_NAK BIT(1)
>> raised. So it seems like the case of writing 0 into
>> ASPEED_I2C_INTR_STS_REG is indeed rarely to happen.
>>
>> Do you think we should change it to:
>>
>> if (irq_received & (INTR_RX_DONE | INTR_TX_ACK | INTR_TX_NAK)) {
>> writel( irq_received & (INTR_RX_DONE| INTR_TX_ACK| INTR_TX_NAK),
>> bus->base + ASPEED_I2C_INTR_STS_REG);
>> readl(bus->base + ASPEED_I2C_INTR_STS_REG);
>> }
>
> This is less different from the existing strategy and doesn't require
> any explanation beyond what you're already trying to achieve in the
> patch, so I think you should switch to this approach.
>
> If someone wants to work out why it was done conditionally and argue
> for its removal then they can do that separately.
>

I agree, will update in next version.

Thanks
- Quan

2023-11-30 06:53:51

by Quan Nguyen

[permalink] [raw]
Subject: Re: [PATCH v2 RESEND 1/2] i2c: aspeed: Fix unhandled Tx done with NAK



On 30/11/2023 04:25, Andi Shyti wrote:
> Hi Quan,
>
>> On 29/11/2023 07:35, Andi Shyti wrote:
>>> Hi Quan,
>>>
>>> On Tue, Nov 28, 2023 at 02:52:35PM +0700, Quan Nguyen wrote:
>>>> Under normal conditions, after the last byte is sent by the Slave, the
>>>> TX_NAK interrupt is raised. However, it is also observed that
>>>> sometimes the Master issues the next transaction too quickly while the
>>>> Slave IRQ handler is not yet invoked and the TX_NAK interrupt for the
>>>> last byte of the previous READ_PROCESSED state has not been ack’ed.
>>>> This TX_NAK interrupt is then raised together with SLAVE_MATCH interrupt
>>>> and RX_DONE interrupt of the next coming transaction from Master. The
>>>> Slave IRQ handler currently handles the SLAVE_MATCH and RX_DONE, but
>>>> ignores the TX_NAK, causing complaints such as
>>>> "aspeed-i2c-bus 1e78a040.i2c-bus: irq handled != irq. Expected
>>>> 0x00000086, but was 0x00000084"
>>>>
>>>> This commit adds code to handle this case by emitting a SLAVE_STOP event
>>>> for the TX_NAK before processing the RX_DONE for the coming transaction
>>>> from the Master.
>>>>
>>>> Fixes: f9eb91350bb2 ("i2c: aspeed: added slave support for Aspeed I2C driver")
>>>> Signed-off-by: Quan Nguyen <[email protected]>
>>>> ---
>>>> v2:
>>>> + Split to separate series [Joel]
>>>> + Added the Fixes line [Joel]
>>>> + Revised commit message [Quan]
>>>>
>>>> v1:
>>>> + First introduced in
>>>> https://lore.kernel.org/all/[email protected]/
>>>> ---
>>>> drivers/i2c/busses/i2c-aspeed.c | 5 +++++
>>>> 1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
>>>> index 28e2a5fc4528..79476b46285b 100644
>>>> --- a/drivers/i2c/busses/i2c-aspeed.c
>>>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>>>> @@ -253,6 +253,11 @@ static u32 aspeed_i2c_slave_irq(struct aspeed_i2c_bus *bus, u32 irq_status)
>>>> /* Slave was requested, restart state machine. */
>>>> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
>>>> + if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
>>>> + bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
>>>> + irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
>>>> + i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>>>> + }
>>>
>>> this is a duplicate of a later "if (...)" satement. What is the
>>> need for having them both?
>>>
>> Thanks Andi for the review.
>>
>> I assumed the if statement you mentioned is here in [1]. If so, then that is
>> not duplicate.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-aspeed.c?h=v6.7-rc3#n287
>>
>>
>> The if statement is to process the case when Slave sending data to Master
>> but being NAK, the I2C_SLAVE_STOP event will emit later in switch-case
>> statement. But it is only for the case INTR_TX_NAK without INTR_SLAVE_MATCH.
>>
>> The new code is for the case of INTR_TX_NAK with INTR_SLAVE_MATCH. What it
>> does is to detect if there is a mix of INTR_TX_NAK of previous i2c
>> transaction and the start of new i2c transaction, indicate by
>> INTR_SLAVE_MATCH which is only raised when Slave found its address matched
>> on the first byte it received. If so, the new code will try to emit the
>> I2C_SLAVE_STOP first to complete the previous transaction and process the
>> rest as a new request.
>>
>> So if this was the case (with INTR_SLAVE_MATCH), the INTR_RX_DONE should
>> always raise with INTR_SLAVE_MATCH because Slave did receive the data which
>> matched with its Slave address. And this will be translated into either
>> I2C_SLAVE_[READ|WRITE]_REQUESTED and that make the if statement you
>> mentioned [1] evaluate to false and skip.
>>
>> So, in short, the new code is trying to handle the case of INTR_TX_NAK with
>> INTR_SLAVE_MATCH first before let the rest process as normal.
>
> yes, I saw that, but wasn't it easier to do something like this:
>
> if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
> bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
> irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
> bus->slave_state = ASPEED_I2C_SLAVE_STOP;
>
> if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH)
> i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
>
> }
>
> But I see that Andrew has done some similar comment, also for
> patch 2. You can answer both in the same mail, not to duplicate
> the answer :-)
>
> We can wait for him to reply.
>

I think Andrew's idea to handle the STOP conditions prior is much
better. Will test and post the next version ASAP.

Thanks a lot for the review
- Quan