2021-08-07 13:09:14

by Dario Binacchi

Subject: [PATCH v3 0/4] can: c_can: cache frames to operate as a true FIFO


Performance tests of the c_can driver led to the patch that gives the
series its name. I also added two patches not really related to the topic
of the series.

I ran the tests successfully on a custom board with two CAN ports.
I connected the CAN1 port to the CAN2 port with a cable, then
opened two terminals. On one I ran the dump command and on the
other the transmit command used in the tests described in
https://marc.info/?l=linux-can&m=139746476821294&w=2.

Terminal-1:
$ ip link set can1 type can bitrate <bitrate>
$ ip link set up can1
$ candump can1 >/tmp/can-test-<bitrate>

Terminal-2:
$ ip link set can0 type can bitrate <bitrate>
$ ip link set up can0
$ time cangen can0 -g0 -p1 -I5A5 -L0 -x -n 1000000

Then I applied the following commands to the file generated by the dump
command:
$ wc -l </tmp/can-test-<bitrate> # ca
$ egrep -v ' can1 5A5 \[0\]' /tmp/can-test-<bitrate> | wc -l # cb

I repeated the tests for 1000000, 500000, 250000 and 125000 bitrates,
before and after applying the series.
Here are the results:

Before applying the series:
bitrate   time        ca        cb
125000    6m 42.71s   1000000   0
250000    3m 23.28s   1000000   0
500000    1m 44.04s   1000000   0
1000000   1m  8.44s   1000000   0

After applying ring-FIFO series:
bitrate   time        ca        cb
125000    6m 40.48s   1000000   0
250000    3m 20.80s   1000000   0
500000    1m 42.56s   1000000   0
1000000   1m  7.89s   1000000   0


Changes in v3:
- Remove the transmission spin_lock.
- Use IF_RX in c_can_do_tx().

Changes in v2:
- Move c_can_get_tx_free() from c_can_main.c to c_can.h.

Dario Binacchi (4):
can: c_can: remove struct c_can_priv::priv field
can: c_can: exit c_can_do_tx() early if no frames have been sent
can: c_can: support tx ring algorithm
can: c_can: cache frames to operate as a true FIFO

drivers/net/can/c_can/c_can.h | 25 ++++++-
drivers/net/can/c_can/c_can_main.c | 95 +++++++++++++++++++-------
drivers/net/can/c_can/c_can_platform.c | 1 -
3 files changed, 94 insertions(+), 27 deletions(-)

--
2.17.1


2021-08-07 13:10:08

by Dario Binacchi

Subject: [PATCH v3 2/4] can: c_can: exit c_can_do_tx() early if no frames have been sent

c_can_poll() handles RX/TX events unconditionally, so c_can_do_tx() may
be called unnecessarily when the interrupt was triggered by the
reception of a frame. In that case, skip the unnecessary statements and
exit immediately.
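
As a rough illustration only (a userspace model, not driver code), the
early exit can be sketched as follows. The names complete_tx(),
struct tx_stats and bookkeeping_runs are hypothetical; the pending
bitmask stands in for the value read from the interrupt-pending
register, and at most 32 message objects are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the statistics touched after the loop. */
struct tx_stats {
	unsigned long tx_packets;
	unsigned long tx_bytes;
	unsigned int bookkeeping_runs; /* passes through the post-loop code */
};

/* Account for every tx object whose bit is set in 'pend'.
 * Returns the number of completed frames. When nothing completed,
 * return before any of the bookkeeping is executed.
 */
static unsigned int complete_tx(unsigned int pend, const unsigned char *dlc,
				size_t num_obj, struct tx_stats *stats)
{
	unsigned int pkts = 0, bytes = 0;
	size_t idx;

	assert(dlc && stats && num_obj <= 32);

	for (idx = 0; idx < num_obj; idx++) {
		if (pend & (1u << idx)) {
			bytes += dlc[idx];
			pkts++;
		}
	}

	/* RX-only interrupt: nothing was transmitted, so leave now. */
	if (!pkts)
		return 0;

	stats->bookkeeping_runs++;
	stats->tx_bytes += bytes;
	stats->tx_packets += pkts;
	return pkts;
}
```

With an empty pending mask the function returns 0 and leaves the
statistics untouched, which is the behaviour this patch aims for.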

Signed-off-by: Dario Binacchi <[email protected]>
---

(no changes since v1)

drivers/net/can/c_can/c_can_main.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index 7588f70ca0fe..fec0e3416970 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -720,17 +720,18 @@ static void c_can_do_tx(struct net_device *dev)
pkts++;
}

+ if (!pkts)
+ return;
+
/* Clear the bits in the tx_active mask */
atomic_sub(clr, &priv->tx_active);

if (clr & BIT(priv->msg_obj_tx_num - 1))
netif_wake_queue(dev);

- if (pkts) {
- stats->tx_bytes += bytes;
- stats->tx_packets += pkts;
- can_led_event(dev, CAN_LED_EVENT_TX);
- }
+ stats->tx_bytes += bytes;
+ stats->tx_packets += pkts;
+ can_led_event(dev, CAN_LED_EVENT_TX);
}

/* If we have a gap in the pending bits, that means we either
--
2.17.1

2021-08-07 13:10:22

by Dario Binacchi

Subject: [PATCH v3 4/4] can: c_can: cache frames to operate as a true FIFO

As noted in a comment in c_can_start_xmit(), the tx ring was not a FIFO.
The C/D_CAN controller sends out the buffers by priority, so the lowest
buffer number wins.

What did c_can_start_xmit() do if head was less than tail in the tx
ring? It waited until all the frames queued in the FIFO had actually
been transmitted by the controller before accepting a new CAN frame to
transmit, even if the FIFO was not full, to ensure that the messages
were transmitted in the order in which they were loaded.

By storing the frames in the FIFO without requesting their transmission,
we can use the full size of the FIFO even in cases such as the one
described above. The transmission interrupt triggers their transmission
only when all the messages previously loaded, but stored in
lower-priority positions of the buffers, have been transmitted.
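
The caching scheme can be modelled in userspace roughly as below. This
is a sketch under assumptions, not the driver code: struct tx_ring
mirrors the head/tail/obj_num fields of the patch, while ring_slot(),
ring_free(), ring_queue() and ring_complete() are hypothetical helpers,
and the number of tx objects is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>

/* head and tail are free-running counters, masked only when used as
 * slot indices.
 */
struct tx_ring {
	unsigned int head;    /* frames queued so far */
	unsigned int tail;    /* frames completed so far */
	unsigned int obj_num; /* number of tx objects (power of two) */
};

static unsigned int ring_slot(const struct tx_ring *ring, unsigned int counter)
{
	return counter & (ring->obj_num - 1);
}

/* Capacity minus frames in flight; unsigned subtraction keeps this
 * correct when the counters wrap.
 */
static unsigned int ring_free(const struct tx_ring *ring)
{
	return ring->obj_num - (ring->head - ring->tail);
}

/* Queue one frame. Returns the slot used, or -1 if the ring is full.
 * *cached is set when the slot lies below the tail slot, meaning the
 * transmit request must be deferred to preserve ordering.
 */
static int ring_queue(struct tx_ring *ring, bool *cached)
{
	unsigned int idx;

	assert(cached);
	if (ring_free(ring) == 0)
		return -1;

	idx = ring_slot(ring, ring->head);
	ring->head++;
	*cached = idx < ring_slot(ring, ring->tail);
	return (int)idx;
}

/* Account for 'pkts' completed frames. Returns how many cached slots
 * (starting at slot 0) would now get their transmit request.
 */
static unsigned int ring_complete(struct tx_ring *ring, unsigned int pkts)
{
	ring->tail += pkts;
	if (ring_slot(ring, ring->tail) != 0)
		return 0;
	return ring_slot(ring, ring->head);
}
```

In this model, once 8 frames are queued and 3 have completed, the next
frame lands in slot 0 and is cached; it is released only when the tail
wraps back to slot 0.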

Suggested-by: Gianluca Falavigna <[email protected]>
Signed-off-by: Dario Binacchi <[email protected]>

---

Changes in v3:
- Remove the transmission spin_lock.
- Use IF_RX in c_can_do_tx().

drivers/net/can/c_can/c_can.h | 11 +----------
drivers/net/can/c_can/c_can_main.c | 23 ++++++++++++++++++-----
2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 9b4e54c950a6..08b6efa7a1a7 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -238,16 +238,7 @@ static inline u8 c_can_get_tx_tail(const struct c_can_tx_ring *ring)

static inline u8 c_can_get_tx_free(const struct c_can_tx_ring *ring)
{
- u8 head = c_can_get_tx_head(ring);
- u8 tail = c_can_get_tx_tail(ring);
-
- /* This is not a FIFO. C/D_CAN sends out the buffers
- * prioritized. The lowest buffer number wins.
- */
- if (head < tail)
- return 0;
-
- return ring->obj_num - head;
+ return ring->obj_num - (ring->head - ring->tail);
}

#endif /* C_CAN_H */
diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index 80a6196a8d7a..e26b097b11ff 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -456,7 +456,7 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
struct can_frame *frame = (struct can_frame *)skb->data;
struct c_can_priv *priv = netdev_priv(dev);
struct c_can_tx_ring *tx_ring = &priv->tx;
- u32 idx, obj;
+ u32 idx, obj, cmd = IF_COMM_TX;

if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
@@ -469,7 +469,8 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);

- obj = idx + priv->msg_obj_tx_first;
+ if (idx < c_can_get_tx_tail(tx_ring))
+ cmd &= ~IF_COMM_TXRQST; /* Cache the message */

/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
@@ -478,9 +479,8 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
c_can_setup_tx_object(dev, IF_TX, frame, idx);
priv->dlc[idx] = frame->len;
can_put_echo_skb(skb, dev, idx, 0);
-
- /* Start transmission */
- c_can_object_put(dev, IF_TX, obj, IF_COMM_TX);
+ obj = idx + priv->msg_obj_tx_first;
+ c_can_object_put(dev, IF_TX, obj, cmd);

return NETDEV_TX_OK;
}
@@ -725,6 +725,7 @@ static void c_can_do_tx(struct net_device *dev)
struct c_can_tx_ring *tx_ring = &priv->tx;
struct net_device_stats *stats = &dev->stats;
u32 idx, obj, pkts = 0, bytes = 0, pend;
+ u8 tail;

if (priv->msg_obj_tx_last > 32)
pend = priv->read_reg32(priv, C_CAN_INTPND3_REG);
@@ -761,6 +762,18 @@ static void c_can_do_tx(struct net_device *dev)
stats->tx_bytes += bytes;
stats->tx_packets += pkts;
can_led_event(dev, CAN_LED_EVENT_TX);
+
+ tail = c_can_get_tx_tail(tx_ring);
+
+ if (tail == 0) {
+ u8 head = c_can_get_tx_head(tx_ring);
+
+ /* Start transmission for all cached messages */
+ for (idx = tail; idx < head; idx++) {
+ obj = idx + priv->msg_obj_tx_first;
+ c_can_object_put(dev, IF_RX, obj, IF_COMM_TXRQST);
+ }
+ }
}

/* If we have a gap in the pending bits, that means we either
--
2.17.1

2021-08-07 13:10:24

by Dario Binacchi

Subject: [PATCH v3 1/4] can: c_can: remove struct c_can_priv::priv field

It references the clock but is never used, so remove it.

Signed-off-by: Dario Binacchi <[email protected]>
---

(no changes since v1)

drivers/net/can/c_can/c_can.h | 1 -
drivers/net/can/c_can/c_can_platform.c | 1 -
2 files changed, 2 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 4247ff80a29c..8f23e9c83c84 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -200,7 +200,6 @@ struct c_can_priv {
void (*write_reg32)(const struct c_can_priv *priv, enum reg index, u32 val);
void __iomem *base;
const u16 *regs;
- void *priv; /* for board-specific data */
enum c_can_dev_id type;
struct c_can_raminit raminit_sys; /* RAMINIT via syscon regmap */
void (*raminit)(const struct c_can_priv *priv, bool enable);
diff --git a/drivers/net/can/c_can/c_can_platform.c b/drivers/net/can/c_can/c_can_platform.c
index 36950363682f..86e95e9d6533 100644
--- a/drivers/net/can/c_can/c_can_platform.c
+++ b/drivers/net/can/c_can/c_can_platform.c
@@ -385,7 +385,6 @@ static int c_can_plat_probe(struct platform_device *pdev)
priv->base = addr;
priv->device = &pdev->dev;
priv->can.clock.freq = clk_get_rate(clk);
- priv->priv = clk;
priv->type = drvdata->id;

platform_set_drvdata(pdev, dev);
--
2.17.1

2021-08-07 13:10:49

by Dario Binacchi

Subject: [PATCH v3 3/4] can: c_can: support tx ring algorithm

The algorithm is already used successfully by other CAN drivers
(e.g. mcp251xfd). Its implementation was kindly suggested to me by
Marc Kleine-Budde following a patch I had previously submitted. You can
find every detail at https://lore.kernel.org/patchwork/patch/1422929/.

The idea is that after this patch, it will be easier to patch the driver
to use the message object memory as a true FIFO.
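
For readers unfamiliar with the scheme, here is a minimal userspace
sketch of the ring as this patch introduces it. It assumes the number
of tx objects is a power of two; ring_slot() and
ring_free_prioritized() are hypothetical names that model the masking
and the free-slot computation, they are not the driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* head and tail only ever increase; a mask turns them into slots. */
struct tx_ring {
	unsigned int head;    /* frames queued so far */
	unsigned int tail;    /* frames completed so far */
	unsigned int obj_num; /* number of tx objects (power of two) */
};

static uint8_t ring_slot(const struct tx_ring *ring, unsigned int counter)
{
	/* The mask only works for a power-of-two object count. */
	assert(ring->obj_num && (ring->obj_num & (ring->obj_num - 1)) == 0);
	return counter & (ring->obj_num - 1);
}

/* Free slots while the controller still picks the lowest buffer
 * number first: once the head slot has wrapped below the tail slot,
 * report no free slot so that ordering is preserved.
 */
static uint8_t ring_free_prioritized(const struct tx_ring *ring)
{
	uint8_t head = ring_slot(ring, ring->head);
	uint8_t tail = ring_slot(ring, ring->tail);

	if (head < tail)
		return 0;

	return ring->obj_num - head;
}
```

For example, with 8 objects, head = 8 and tail = 5, only three frames
are in flight, yet the function reports 0 free slots because the head
slot (0) is below the tail slot (5).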

Suggested-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Dario Binacchi <[email protected]>

---

(no changes since v2)

Changes in v2:
- Move c_can_get_tx_free() from c_can_main.c to c_can.h.

drivers/net/can/c_can/c_can.h | 33 ++++++++++++++-
drivers/net/can/c_can/c_can_main.c | 67 ++++++++++++++++++++++--------
2 files changed, 82 insertions(+), 18 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 8f23e9c83c84..9b4e54c950a6 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -176,6 +176,13 @@ struct c_can_raminit {
bool needs_pulse;
};

+/* c_can tx ring structure */
+struct c_can_tx_ring {
+ unsigned int head;
+ unsigned int tail;
+ unsigned int obj_num;
+};
+
/* c_can private data structure */
struct c_can_priv {
struct can_priv can; /* must be the first member */
@@ -190,10 +197,10 @@ struct c_can_priv {
unsigned int msg_obj_tx_first;
unsigned int msg_obj_tx_last;
u32 msg_obj_rx_mask;
- atomic_t tx_active;
atomic_t sie_pending;
unsigned long tx_dir;
int last_status;
+ struct c_can_tx_ring tx;
u16 (*read_reg)(const struct c_can_priv *priv, enum reg index);
void (*write_reg)(const struct c_can_priv *priv, enum reg index, u16 val);
u32 (*read_reg32)(const struct c_can_priv *priv, enum reg index);
@@ -219,4 +226,28 @@ int c_can_power_down(struct net_device *dev);

void c_can_set_ethtool_ops(struct net_device *dev);

+static inline u8 c_can_get_tx_head(const struct c_can_tx_ring *ring)
+{
+ return ring->head & (ring->obj_num - 1);
+}
+
+static inline u8 c_can_get_tx_tail(const struct c_can_tx_ring *ring)
+{
+ return ring->tail & (ring->obj_num - 1);
+}
+
+static inline u8 c_can_get_tx_free(const struct c_can_tx_ring *ring)
+{
+ u8 head = c_can_get_tx_head(ring);
+ u8 tail = c_can_get_tx_tail(ring);
+
+ /* This is not a FIFO. C/D_CAN sends out the buffers
+ * prioritized. The lowest buffer number wins.
+ */
+ if (head < tail)
+ return 0;
+
+ return ring->obj_num - head;
+}
+
#endif /* C_CAN_H */
diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index fec0e3416970..80a6196a8d7a 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -427,24 +427,50 @@ static void c_can_setup_receive_object(struct net_device *dev, int iface,
c_can_object_put(dev, iface, obj, IF_COMM_RCV_SETUP);
}

+static bool c_can_tx_busy(const struct c_can_priv *priv,
+ const struct c_can_tx_ring *tx_ring)
+{
+ if (c_can_get_tx_free(tx_ring) > 0)
+ return false;
+
+ netif_stop_queue(priv->dev);
+
+ /* Memory barrier before checking tx_free (head and tail) */
+ smp_mb();
+
+ if (c_can_get_tx_free(tx_ring) == 0) {
+ netdev_dbg(priv->dev,
+ "Stopping tx-queue (tx_head=0x%08x, tx_tail=0x%08x, len=%d).\n",
+ tx_ring->head, tx_ring->tail,
+ tx_ring->head - tx_ring->tail);
+ return true;
+ }
+
+ netif_start_queue(priv->dev);
+ return false;
+}
+
static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
struct net_device *dev)
{
struct can_frame *frame = (struct can_frame *)skb->data;
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
u32 idx, obj;

if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
- /* This is not a FIFO. C/D_CAN sends out the buffers
- * prioritized. The lowest buffer number wins.
- */
- idx = fls(atomic_read(&priv->tx_active));
- obj = idx + priv->msg_obj_tx_first;

- /* If this is the last buffer, stop the xmit queue */
- if (idx == priv->msg_obj_tx_num - 1)
+ if (c_can_tx_busy(priv, tx_ring))
+ return NETDEV_TX_BUSY;
+
+ idx = c_can_get_tx_head(tx_ring);
+ tx_ring->head++;
+ if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);
+
+ obj = idx + priv->msg_obj_tx_first;
+
/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
* transmit as we might race against do_tx().
@@ -453,8 +479,6 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
priv->dlc[idx] = frame->len;
can_put_echo_skb(skb, dev, idx, 0);

- /* Update the active bits */
- atomic_add(BIT(idx), &priv->tx_active);
/* Start transmission */
c_can_object_put(dev, IF_TX, obj, IF_COMM_TX);

@@ -567,6 +591,7 @@ static int c_can_software_reset(struct net_device *dev)
static int c_can_chip_config(struct net_device *dev)
{
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
int err;

err = c_can_software_reset(dev);
@@ -598,7 +623,8 @@ static int c_can_chip_config(struct net_device *dev)
priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);

/* Clear all internal status */
- atomic_set(&priv->tx_active, 0);
+ tx_ring->head = 0;
+ tx_ring->tail = 0;
priv->tx_dir = 0;

/* set bittiming params */
@@ -696,14 +722,14 @@ static int c_can_get_berr_counter(const struct net_device *dev,
static void c_can_do_tx(struct net_device *dev)
{
struct c_can_priv *priv = netdev_priv(dev);
+ struct c_can_tx_ring *tx_ring = &priv->tx;
struct net_device_stats *stats = &dev->stats;
- u32 idx, obj, pkts = 0, bytes = 0, pend, clr;
+ u32 idx, obj, pkts = 0, bytes = 0, pend;

if (priv->msg_obj_tx_last > 32)
pend = priv->read_reg32(priv, C_CAN_INTPND3_REG);
else
pend = priv->read_reg(priv, C_CAN_INTPND2_REG);
- clr = pend;

while ((idx = ffs(pend))) {
idx--;
@@ -723,11 +749,14 @@ static void c_can_do_tx(struct net_device *dev)
if (!pkts)
return;

- /* Clear the bits in the tx_active mask */
- atomic_sub(clr, &priv->tx_active);
-
- if (clr & BIT(priv->msg_obj_tx_num - 1))
- netif_wake_queue(dev);
+ tx_ring->tail += pkts;
+ if (c_can_get_tx_free(tx_ring)) {
+ /* Make sure that anybody stopping the queue after
+ * this sees the new tx_ring->tail.
+ */
+ smp_mb();
+ netif_wake_queue(priv->dev);
+ }

stats->tx_bytes += bytes;
stats->tx_packets += pkts;
@@ -1206,6 +1235,10 @@ struct net_device *alloc_c_can_dev(int msg_obj_num)
priv->msg_obj_tx_last =
priv->msg_obj_tx_first + priv->msg_obj_tx_num - 1;

+ priv->tx.head = 0;
+ priv->tx.tail = 0;
+ priv->tx.obj_num = msg_obj_tx_num;
+
netif_napi_add(dev, &priv->napi, c_can_poll, priv->msg_obj_rx_num);

priv->dev = dev;
--
2.17.1

2021-08-09 07:12:57

by Marc Kleine-Budde

Subject: Re: [PATCH v3 0/4] can: c_can: cache frames to operate as a true FIFO

On 07.08.2021 15:07:56, Dario Binacchi wrote:
>
> Performance tests of the c_can driver led to the patch that gives the
> series its name. I also added two patches not really related to the topic
> of the series.
>
> Run test succesfully on a custom board having two CAN ports.
> I connected the CAN1 port to the CAN2 port with a cable. Then I
> opened two terminals. On one I issued a dump command and on the
> other the transmit command used in the tests described in
> https://marc.info/?l=linux-can&m=139746476821294&w=2.

Thanks! Applied to linux-can-next/testing.

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

