2021-08-05 21:36:04

by Dario Binacchi

Subject: [PATCH v2 0/4] can: c_can: cache frames to operate as a true FIFO


Performance tests of the c_can driver led to the patch that gives the
series its name. I also added two patches not really related to the topic
of the series.

Changes in v2:
- Move c_can_get_tx_free() from c_can_main.c to c_can.h.

Dario Binacchi (4):
can: c_can: remove struct c_can_priv::priv field
can: c_can: exit c_can_do_tx() early if no frames have been sent
can: c_can: support tx ring algorithm
can: c_can: cache frames to operate as a true FIFO

drivers/net/can/c_can/c_can.h | 26 ++++++-
drivers/net/can/c_can/c_can_main.c | 100 +++++++++++++++++++------
drivers/net/can/c_can/c_can_platform.c | 1 -
3 files changed, 101 insertions(+), 26 deletions(-)

--
2.17.1


2021-08-05 21:36:09

by Dario Binacchi

Subject: [PATCH v2 2/4] can: c_can: exit c_can_do_tx() early if no frames have been sent

c_can_poll() handles RX/TX events unconditionally. As a result,
c_can_do_tx() may be called unnecessarily when the interrupt was
triggered by the reception of a frame. In that case, skip the
remaining statements and return immediately.

Signed-off-by: Dario Binacchi <[email protected]>
---

(no changes since v1)

drivers/net/can/c_can/c_can_main.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index 7588f70ca0fe..fec0e3416970 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -720,17 +720,18 @@ static void c_can_do_tx(struct net_device *dev)
pkts++;
}

+ if (!pkts)
+ return;
+
/* Clear the bits in the tx_active mask */
atomic_sub(clr, &priv->tx_active);

if (clr & BIT(priv->msg_obj_tx_num - 1))
netif_wake_queue(dev);

- if (pkts) {
- stats->tx_bytes += bytes;
- stats->tx_packets += pkts;
- can_led_event(dev, CAN_LED_EVENT_TX);
- }
+ stats->tx_bytes += bytes;
+ stats->tx_packets += pkts;
+ can_led_event(dev, CAN_LED_EVENT_TX);
}

/* If we have a gap in the pending bits, that means we either
--
2.17.1

2021-08-05 21:36:12

by Dario Binacchi

Subject: [PATCH v2 3/4] can: c_can: support tx ring algorithm

The algorithm is already used successfully by other CAN drivers
(e.g. mcp251xfd). Its implementation was kindly suggested to me by
Marc Kleine-Budde following a patch I had previously submitted. You can
find every detail at https://lore.kernel.org/patchwork/patch/1422929/.

The idea is that after this patch, it will be easier to patch the driver
to use the message object memory as a true FIFO.
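
For readers unfamiliar with the scheme, here is a minimal user-space
sketch of the ring bookkeeping, assuming a power-of-two number of TX
objects. The names tx_ring, tx_ring_init(), tx_ring_head(),
tx_ring_tail() and tx_ring_free() are hypothetical stand-ins chosen for
illustration; they are not the driver's symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's tx ring; illustration only. */
struct tx_ring {
	unsigned int head;    /* free-running count of frames queued */
	unsigned int tail;    /* free-running count of frames completed */
	unsigned int obj_num; /* number of TX objects, assumed power of two */
};

static void tx_ring_init(struct tx_ring *ring, unsigned int obj_num)
{
	/* Masking with (obj_num - 1) only works for powers of two. */
	assert(obj_num != 0 && (obj_num & (obj_num - 1)) == 0);
	ring->head = 0;
	ring->tail = 0;
	ring->obj_num = obj_num;
}

/* Index of the next TX object to fill. */
static uint8_t tx_ring_head(const struct tx_ring *ring)
{
	return ring->head & (ring->obj_num - 1);
}

/* Index of the oldest TX object still in flight. */
static uint8_t tx_ring_tail(const struct tx_ring *ring)
{
	return ring->tail & (ring->obj_num - 1);
}

/* Free slots when the hardware sends the lowest object number first:
 * once head has wrapped below tail, nothing may be queued until the
 * in-flight frames drain, otherwise a newer frame in a low slot could
 * overtake an older frame in a high slot.
 */
static uint8_t tx_ring_free(const struct tx_ring *ring)
{
	uint8_t head = tx_ring_head(ring);
	uint8_t tail = tx_ring_tail(ring);

	if (head < tail)
		return 0;

	return ring->obj_num - head;
}
```

With obj_num = 8, head = 8 and tail = 2, this sketch reports 0 free
slots even though objects 0 and 1 are physically empty. That is the
limitation a true FIFO would remove.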

Suggested-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Dario Binacchi <[email protected]>

---

Changes in v2:
- Move c_can_get_tx_free() from c_can_main.c to c_can.h.

drivers/net/can/c_can/c_can.h | 33 ++++++++++++++-
drivers/net/can/c_can/c_can_main.c | 67 ++++++++++++++++++++++--------
2 files changed, 82 insertions(+), 18 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 8f23e9c83c84..9b4e54c950a6 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -176,6 +176,13 @@ struct c_can_raminit {
bool needs_pulse;
};

+/* c_can tx ring structure */
+struct c_can_tx_ring {
+ unsigned int head;
+ unsigned int tail;
+ unsigned int obj_num;
+};
+
/* c_can private data structure */
struct c_can_priv {
struct can_priv can; /* must be the first member */
@@ -190,10 +197,10 @@ struct c_can_priv {
unsigned int msg_obj_tx_first;
unsigned int msg_obj_tx_last;
u32 msg_obj_rx_mask;
- atomic_t tx_active;
atomic_t sie_pending;
unsigned long tx_dir;
int last_status;
+ struct c_can_tx_ring tx;
u16 (*read_reg)(const struct c_can_priv *priv, enum reg index);
void (*write_reg)(const struct c_can_priv *priv, enum reg index, u16 val);
u32 (*read_reg32)(const struct c_can_priv *priv, enum reg index);
@@ -219,4 +226,28 @@ int c_can_power_down(struct net_device *dev);

void c_can_set_ethtool_ops(struct net_device *dev);

+static inline u8 c_can_get_tx_head(const struct c_can_tx_ring *ring)
+{
+ return ring->head & (ring->obj_num - 1);
+}
+
+static inline u8 c_can_get_tx_tail(const struct c_can_tx_ring *ring)
+{
+ return ring->tail & (ring->obj_num - 1);
+}
+
+static inline u8 c_can_get_tx_free(const struct c_can_tx_ring *ring)
+{
+ u8 head = c_can_get_tx_head(ring);
+ u8 tail = c_can_get_tx_tail(ring);
+
+ /* This is not a FIFO. C/D_CAN sends out the buffers
+ * prioritized. The lowest buffer number wins.
+ */
+ if (head < tail)
+ return 0;
+
+ return ring->obj_num - head;
+}
+
#endif /* C_CAN_H */
diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index fec0e3416970..80a6196a8d7a 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -427,24 +427,50 @@ static void c_can_setup_receive_object(struct net_device *dev, int iface,
c_can_object_put(dev, iface, obj, IF_COMM_RCV_SETUP);
}

+static bool c_can_tx_busy(const struct c_can_priv *priv,
+ const struct c_can_tx_ring *tx_ring)
+{
+ if (c_can_get_tx_free(tx_ring) > 0)
+ return false;
+
+ netif_stop_queue(priv->dev);
+
+ /* Memory barrier before checking tx_free (head and tail) */
+ smp_mb();
+
+ if (c_can_get_tx_free(tx_ring) == 0) {
+ netdev_dbg(priv->dev,
+ "Stopping tx-queue (tx_head=0x%08x, tx_tail=0x%08x, len=%d).\n",
+ tx_ring->head, tx_ring->tail,
+ tx_ring->head - tx_ring->tail);
+ return true;
+ }
+
+ netif_start_queue(priv->dev);
+ return false;
+}
+
static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
struct can_frame *frame = (struct can_frame *)skb->data;
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
u32 idx, obj;

if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
- /* This is not a FIFO. C/D_CAN sends out the buffers
- * prioritized. The lowest buffer number wins.
- */
- idx = fls(atomic_read(&priv->tx_active));
- obj = idx + priv->msg_obj_tx_first;

- /* If this is the last buffer, stop the xmit queue */
- if (idx == priv->msg_obj_tx_num - 1)
+ if (c_can_tx_busy(priv, tx_ring))
+ return NETDEV_TX_BUSY;
+
+ idx = c_can_get_tx_head(tx_ring);
+ tx_ring->head++;
+ if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);
+
+ obj = idx + priv->msg_obj_tx_first;
+
/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
* transmit as we might race against do_tx().
@@ -453,8 +479,6 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
priv->dlc[idx] = frame->len;
can_put_echo_skb(skb, dev, idx, 0);

- /* Update the active bits */
- atomic_add(BIT(idx), &priv->tx_active);
/* Start transmission */
c_can_object_put(dev, IF_TX, obj, IF_COMM_TX);

@@ -567,6 +591,7 @@ static int c_can_software_reset(struct net_device *dev)
static int c_can_chip_config(struct net_device *dev)
{
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
int err;

err = c_can_software_reset(dev);
@@ -598,7 +623,8 @@ static int c_can_chip_config(struct net_device *dev)
priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);

/* Clear all internal status */
- atomic_set(&priv->tx_active, 0);
+ tx_ring->head = 0;
+ tx_ring->tail = 0;
priv->tx_dir = 0;

/* set bittiming params */
@@ -696,14 +722,14 @@ static int c_can_get_berr_counter(const struct net_device *dev,
static void c_can_do_tx(struct net_device *dev)
{
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
struct net_device_stats *stats = &dev->stats;
- u32 idx, obj, pkts = 0, bytes = 0, pend, clr;
+ u32 idx, obj, pkts = 0, bytes = 0, pend;

if (priv->msg_obj_tx_last > 32)
pend = priv->read_reg32(priv, C_CAN_INTPND3_REG);
else
pend = priv->read_reg(priv, C_CAN_INTPND2_REG);
- clr = pend;

while ((idx = ffs(pend))) {
idx--;
@@ -723,11 +749,14 @@ static void c_can_do_tx(struct net_device *dev)
if (!pkts)
return;

- /* Clear the bits in the tx_active mask */
- atomic_sub(clr, &priv->tx_active);
-
- if (clr & BIT(priv->msg_obj_tx_num - 1))
- netif_wake_queue(dev);
+ tx_ring->tail += pkts;
+ if (c_can_get_tx_free(tx_ring)) {
+ /* Make sure that anybody stopping the queue after
+ * this sees the new tx_ring->tail.
+ */
+ smp_mb();
+ netif_wake_queue(priv->dev);
+ }

stats->tx_bytes += bytes;
stats->tx_packets += pkts;
@@ -1206,6 +1235,10 @@ struct net_device *alloc_c_can_dev(int msg_obj_num)
priv->msg_obj_tx_last =
priv->msg_obj_tx_first + priv->msg_obj_tx_num - 1;

+ priv->tx.head = 0;
+ priv->tx.tail = 0;
+ priv->tx.obj_num = msg_obj_tx_num;
+
netif_napi_add(dev, &priv->napi, c_can_poll, priv->msg_obj_rx_num);

priv->dev = dev;
--
2.17.1

2021-08-05 21:36:21

by Dario Binacchi

Subject: [PATCH v2 4/4] can: c_can: cache frames to operate as a true FIFO

As noted by a comment in c_can_start_xmit(), the TX path was not a
FIFO. The C/D_CAN controller sends out the buffers by priority, so the
lowest buffer number wins.

What did c_can_start_xmit() do if head was less than tail in the tx
ring? It waited until all the frames queued in the FIFO were actually
transmitted by the controller before accepting a new CAN frame to
transmit, even if the FIFO was not full, to ensure that the messages
were transmitted in the order in which they were loaded.

By storing the frames in the FIFO without requesting their
transmission, we can use the full size of the FIFO even in cases such
as the one described above. The transmission interrupt will trigger
their transmission only when all the messages previously loaded but
stored in lower-priority positions of the buffers have been
transmitted.
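
As a rough illustration of the bookkeeping described above, here is a
minimal user-space sketch, assuming free-running unsigned head/tail
counters and a power-of-two ring size. The names tx_ring, fifo_free()
and fifo_must_cache() are hypothetical and used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's tx ring; illustration only. */
struct tx_ring {
	unsigned int head;    /* free-running count of frames queued */
	unsigned int tail;    /* free-running count of frames completed */
	unsigned int obj_num; /* number of TX objects, assumed power of two */
};

/* Free slots of a true FIFO. Unsigned subtraction keeps head - tail
 * correct even after the counters wrap around.
 */
static unsigned int fifo_free(const struct tx_ring *ring)
{
	unsigned int used = ring->head - ring->tail;

	assert(used <= ring->obj_num);
	return ring->obj_num - used;
}

/* True when the next slot has a lower index than the oldest frame
 * still in flight. Such a frame is stored without a transmit request,
 * so that it cannot overtake older frames in higher-numbered objects.
 */
static bool fifo_must_cache(const struct tx_ring *ring)
{
	unsigned int mask = ring->obj_num - 1;

	return (ring->head & mask) < (ring->tail & mask);
}
```

With obj_num = 8, head = 8 and tail = 2, fifo_free() reports 2 and
fifo_must_cache() returns true: the next frame may be loaded into
object 0, but its transmission has to be requested later.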

Suggested-by: Gianluca Falavigna <[email protected]>
Signed-off-by: Dario Binacchi <[email protected]>

---

(no changes since v1)

drivers/net/can/c_can/c_can.h | 12 ++----------
drivers/net/can/c_can/c_can_main.c | 28 ++++++++++++++++++++++++----
2 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 9b4e54c950a6..fc499a70b797 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -200,6 +200,7 @@ struct c_can_priv {
atomic_t sie_pending;
unsigned long tx_dir;
int last_status;
+ spinlock_t tx_lock;
struct c_can_tx_ring tx;
u16 (*read_reg)(const struct c_can_priv *priv, enum reg index);
void (*write_reg)(const struct c_can_priv *priv, enum reg index, u16 val);
@@ -238,16 +239,7 @@ static inline u8 c_can_get_tx_tail(const struct c_can_tx_ring *ring)

static inline u8 c_can_get_tx_free(const struct c_can_tx_ring *ring)
{
- u8 head = c_can_get_tx_head(ring);
- u8 tail = c_can_get_tx_tail(ring);
-
- /* This is not a FIFO. C/D_CAN sends out the buffers
- * prioritized. The lowest buffer number wins.
- */
- if (head < tail)
- return 0;
-
- return ring->obj_num - head;
+ return ring->obj_num - (ring->head - ring->tail);
}

#endif /* C_CAN_H */
diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index 80a6196a8d7a..4c061fef002c 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -456,7 +456,7 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
struct can_frame *frame = (struct can_frame *)skb->data;
struct c_can_priv *priv = netdev_priv(dev);
struct c_can_tx_ring *tx_ring = &priv->tx;
- u32 idx, obj;
+ u32 idx, obj, cmd = IF_COMM_TX;

if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
@@ -469,7 +469,11 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);

- obj = idx + priv->msg_obj_tx_first;
+ spin_lock_bh(&priv->tx_lock);
+ if (idx < c_can_get_tx_tail(tx_ring))
+ cmd &= ~IF_COMM_TXRQST; /* Cache the message */
+ else
+ spin_unlock_bh(&priv->tx_lock);

/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
@@ -478,9 +482,11 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
c_can_setup_tx_object(dev, IF_TX, frame, idx);
priv->dlc[idx] = frame->len;
can_put_echo_skb(skb, dev, idx, 0);
+ obj = idx + priv->msg_obj_tx_first;
+ c_can_object_put(dev, IF_TX, obj, cmd);

- /* Start transmission */
- c_can_object_put(dev, IF_TX, obj, IF_COMM_TX);
+ if (spin_is_locked(&priv->tx_lock))
+ spin_unlock_bh(&priv->tx_lock);

return NETDEV_TX_OK;
}
@@ -725,6 +731,7 @@ static void c_can_do_tx(struct net_device *dev)
struct c_can_tx_ring *tx_ring = &priv->tx;
struct net_device_stats *stats = &dev->stats;
u32 idx, obj, pkts = 0, bytes = 0, pend;
+ u8 tail;

if (priv->msg_obj_tx_last > 32)
pend = priv->read_reg32(priv, C_CAN_INTPND3_REG);
@@ -761,6 +768,18 @@ static void c_can_do_tx(struct net_device *dev)
stats->tx_bytes += bytes;
stats->tx_packets += pkts;
can_led_event(dev, CAN_LED_EVENT_TX);
+
+ tail = c_can_get_tx_tail(tx_ring);
+
+ if (tail == 0) {
+ u8 head = c_can_get_tx_head(tx_ring);
+
+ /* Start transmission for all cached messages */
+ for (idx = tail; idx < head; idx++) {
+ obj = idx + priv->msg_obj_tx_first;
+ c_can_object_put(dev, IF_TX, obj, IF_COMM_TXRQST);
+ }
+ }
}

/* If we have a gap in the pending bits, that means we either
@@ -1223,6 +1242,7 @@ struct net_device *alloc_c_can_dev(int msg_obj_num)
return NULL;

priv = netdev_priv(dev);
+ spin_lock_init(&priv->tx_lock);
priv->msg_obj_num = msg_obj_num;
priv->msg_obj_rx_num = msg_obj_num - msg_obj_tx_num;
priv->msg_obj_rx_first = 1;
--
2.17.1

2021-08-06 14:26:03

by Marc Kleine-Budde

Subject: Re: [PATCH v2 4/4] can: c_can: cache frames to operate as a true FIFO

On 05.08.2021 22:19:00, Dario Binacchi wrote:
> As noted by a comment in c_can_start_xmit(), the TX path was not a
> FIFO. The C/D_CAN controller sends out the buffers by priority, so the
> lowest buffer number wins.
>
> What did c_can_start_xmit() do if head was less than tail in the tx
> ring? It waited until all the frames queued in the FIFO were actually
> transmitted by the controller before accepting a new CAN frame to
> transmit, even if the FIFO was not full, to ensure that the messages
> were transmitted in the order in which they were loaded.
>
> By storing the frames in the FIFO without requesting their
> transmission, we can use the full size of the FIFO even in cases such
> as the one described above. The transmission interrupt will trigger
> their transmission only when all the messages previously loaded but
> stored in lower-priority positions of the buffers have been
> transmitted.
>
> Suggested-by: Gianluca Falavigna <[email protected]>
> Signed-off-by: Dario Binacchi <[email protected]>

My review from
https://lore.kernel.org/linux-can/[email protected]/
applies here, too.

Please use IF_RX in c_can_do_tx(), remove the spin_lock and test. After
applying your series, I'll send a patch that changes IF_RX into IF_NAPI
to avoid any further confusion.

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

