Subject: [PATCH v5 00/12] can: m_can: Optimizations for m_can/tcan part 2

Hi Marc, Simon and everyone,

v5 is rebased on v6.5-rc1 and includes a few small style fixes that were
pointed out in the v4 review.

It is tested on tcan455x, but I don't have hardware with an m_can core
on the SoC myself, so any testing is appreciated.

The series implements many small and bigger throughput improvements and
adds rx/tx coalescing at the end.

Based on v6.5-rc1. Also available at
https://gitlab.baylibre.com/msp8/linux/-/tree/topic/mcan-optimization/v6.5?ref_type=heads

Best,
Markus

Changes in v5:
- Add back parentheses in m_can_set_coalesce(). This will make
checkpatch unhappy but gcc happy.
- Remove unused fifo_header variable in m_can_tx_handler().
- Rebased to v6.5-rc1

Changes in v4:
- Create and use struct m_can_fifo_element in m_can_tx_handler
- Fix memcpy_and_pad to copy the full buffer
- Fixed a few checkpatch warnings
- Change putidx to be unsigned
- Print hard_xmit error only once when TX FIFO is full

Changes in v3:
- Remove parentheses in error messages
- Use memcpy_and_pad for buffer copy in 'can: m_can: Write transmit
header and data in one transaction'.
- Replace spin_lock with spin_lock_irqsave. I got a report of an
interrupt that was calling start_xmit just after the netqueue was
woken up, before the locked region was exited. spin_lock_irqsave should
fix this. I attached the full stack at the end of the mail in case
someone wants to see it.
- Rebased to v6.3-rc1.
- Removed tcan4x5x patches from this series.

Changes in v2:
- Rebased on v6.2-rc5
- Fixed missing/broken accounting for non peripheral m_can devices.

previous versions:
v1 - https://lore.kernel.org/lkml/[email protected]
v2 - https://lore.kernel.org/lkml/[email protected]
v3 - https://lore.kernel.org/lkml/[email protected]/
v4 - https://lore.kernel.org/lkml/[email protected]/

Markus Schneider-Pargmann (12):
can: m_can: Write transmit header and data in one transaction
can: m_can: Implement receive coalescing
can: m_can: Implement transmit coalescing
can: m_can: Add rx coalescing ethtool support
can: m_can: Add tx coalescing ethtool support
can: m_can: Use u32 for putidx
can: m_can: Cache tx putidx
can: m_can: Use the workqueue as queue
can: m_can: Introduce a tx_fifo_in_flight counter
can: m_can: Use tx_fifo_in_flight for netif_queue control
can: m_can: Implement BQL
can: m_can: Implement transmit submission coalescing

drivers/net/can/m_can/m_can.c | 517 +++++++++++++++++++++++++---------
drivers/net/can/m_can/m_can.h | 35 ++-
2 files changed, 418 insertions(+), 134 deletions(-)


base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
--
2.40.1



Subject: [PATCH v5 03/12] can: m_can: Implement transmit coalescing

Extend the coalescing implementation for transmits.

In normal mode the chip raises an interrupt for every finished transmit.
This implementation switches to coalescing mode as soon as an interrupt
has handled a transmit. In coalescing mode the watermark level interrupt
is used to interrupt exactly after x frames were sent. The driver
switches back to normal mode once an interrupt arrives with no finished
transmit while the timer is inactive.

The timer is shared with receive coalescing. The receive and transmit
coalescing times have to be the same for that to work. The benefit is
that only a single timer has to run.

Signed-off-by: Markus Schneider-Pargmann <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
---
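To illustrate the mode switching described above, here is a minimal
userspace sketch of the decision logic in m_can_coalescing_update(). It
is not driver code: struct coalesce_state, coalescing_update() and the
timer_active flag are hypothetical stand-ins for the classdev fields and
hrtimer_active(), and the IR_* bit positions are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the driver defines the real IR_* values. */
#define IR_RF0N (1u << 0)	/* rx: new message */
#define IR_RF0W (1u << 1)	/* rx: watermark reached */
#define IR_TEFN (1u << 12)	/* tx event: new entry */
#define IR_TEFW (1u << 13)	/* tx event: watermark reached */

/* Hypothetical model of the coalescing-related classdev state. */
struct coalesce_state {
	uint32_t active_interrupts;
	uint32_t rx_usecs_irq;	/* 0 means rx coalescing is off */
	uint32_t tx_usecs_irq;	/* 0 means tx coalescing is off */
	bool timer_active;	/* stands in for hrtimer_active() */
};

/*
 * Update the interrupt mask for the interrupt flags in 'ir'.
 * Returns true if the shared timer has to be (re)started.
 */
static bool coalescing_update(struct coalesce_state *s, uint32_t ir)
{
	uint32_t new_interrupts = s->active_interrupts;
	bool rx = false, tx = false;

	/* One shared timer: both times must match if both are enabled. */
	assert(!s->rx_usecs_irq || !s->tx_usecs_irq ||
	       s->rx_usecs_irq == s->tx_usecs_irq);

	if (s->rx_usecs_irq > 0 && (ir & (IR_RF0N | IR_RF0W))) {
		rx = true;
		new_interrupts &= ~IR_RF0N;	/* rely on watermark + timer */
	}
	if (s->tx_usecs_irq > 0 && (ir & (IR_TEFN | IR_TEFW))) {
		tx = true;
		new_interrupts &= ~IR_TEFN;	/* rely on watermark + timer */
	}
	/* Back to per-frame interrupts once idle and the timer expired. */
	if (!rx && !s->timer_active)
		new_interrupts |= IR_RF0N;
	if (!tx && !s->timer_active)
		new_interrupts |= IR_TEFN;

	s->active_interrupts = new_interrupts;
	if (rx || tx)
		s->timer_active = true;
	return rx || tx;
}
```

In this model a call with IR_TEFN set masks IR_TEFN and requests the
timer. A later call without a TX event unmasks IR_TEFN again only once
the timer is no longer active.
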
drivers/net/can/m_can/m_can.c | 33 ++++++++++++++++++++-------------
drivers/net/can/m_can/m_can.h | 3 +++
2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index dd0fa58660d7..e979aeb2ef13 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -255,6 +255,7 @@ enum m_can_reg {
#define TXESC_TBDS_64B 0x7

/* Tx Event FIFO Configuration (TXEFC) */
+#define TXEFC_EFWM_MASK GENMASK(29, 24)
#define TXEFC_EFS_MASK GENMASK(21, 16)

/* Tx Event FIFO Status (TXEFS) */
@@ -429,7 +430,7 @@ static void m_can_interrupt_enable(struct m_can_classdev *cdev, u32 interrupts)

static void m_can_coalescing_disable(struct m_can_classdev *cdev)
{
- u32 new_interrupts = cdev->active_interrupts | IR_RF0N;
+ u32 new_interrupts = cdev->active_interrupts | IR_RF0N | IR_TEFN;

hrtimer_cancel(&cdev->irq_timer);
m_can_interrupt_enable(cdev, new_interrupts);
@@ -1096,21 +1097,26 @@ static int m_can_echo_tx_event(struct net_device *dev)
static void m_can_coalescing_update(struct m_can_classdev *cdev, u32 ir)
{
u32 new_interrupts = cdev->active_interrupts;
- bool enable_timer = false;
+ bool enable_rx_timer = false;
+ bool enable_tx_timer = false;

if (cdev->rx_coalesce_usecs_irq > 0 && (ir & (IR_RF0N | IR_RF0W))) {
- enable_timer = true;
+ enable_rx_timer = true;
new_interrupts &= ~IR_RF0N;
- } else if (!hrtimer_active(&cdev->irq_timer)) {
- new_interrupts |= IR_RF0N;
}
+ if (cdev->tx_coalesce_usecs_irq > 0 && (ir & (IR_TEFN | IR_TEFW))) {
+ enable_tx_timer = true;
+ new_interrupts &= ~IR_TEFN;
+ }
+ if (!enable_rx_timer && !hrtimer_active(&cdev->irq_timer))
+ new_interrupts |= IR_RF0N;
+ if (!enable_tx_timer && !hrtimer_active(&cdev->irq_timer))
+ new_interrupts |= IR_TEFN;

m_can_interrupt_enable(cdev, new_interrupts);
- if (enable_timer) {
- hrtimer_start(&cdev->irq_timer,
- ns_to_ktime(cdev->rx_coalesce_usecs_irq * NSEC_PER_USEC),
+ if (enable_rx_timer || enable_tx_timer)
+ hrtimer_start(&cdev->irq_timer, cdev->irq_timer_wait,
HRTIMER_MODE_REL);
- }
}

static irqreturn_t m_can_isr(int irq, void *dev_id)
@@ -1165,7 +1171,7 @@ static irqreturn_t m_can_isr(int irq, void *dev_id)
netif_wake_queue(dev);
}
} else {
- if (ir & IR_TEFN) {
+ if (ir & (IR_TEFN | IR_TEFW)) {
/* New TX FIFO Element arrived */
if (m_can_echo_tx_event(dev) != 0)
goto out_fail;
@@ -1333,9 +1339,8 @@ static int m_can_chip_config(struct net_device *dev)
}

/* Disable unused interrupts */
- interrupts &= ~(IR_ARA | IR_ELO | IR_DRX | IR_TEFF | IR_TEFW | IR_TFE |
- IR_TCF | IR_HPM | IR_RF1F | IR_RF1W | IR_RF1N |
- IR_RF0F);
+ interrupts &= ~(IR_ARA | IR_ELO | IR_DRX | IR_TEFF | IR_TFE | IR_TCF |
+ IR_HPM | IR_RF1F | IR_RF1W | IR_RF1N | IR_RF0F);

m_can_config_endisable(cdev, true);

@@ -1372,6 +1377,8 @@ static int m_can_chip_config(struct net_device *dev)
} else {
/* Full TX Event FIFO is used */
m_can_write(cdev, M_CAN_TXEFC,
+ FIELD_PREP(TXEFC_EFWM_MASK,
+ cdev->tx_max_coalesced_frames_irq) |
FIELD_PREP(TXEFC_EFS_MASK,
cdev->mcfg[MRAM_TXE].num) |
cdev->mcfg[MRAM_TXE].off);
diff --git a/drivers/net/can/m_can/m_can.h b/drivers/net/can/m_can/m_can.h
index c59099d3f5b9..d0c21eddb6ec 100644
--- a/drivers/net/can/m_can/m_can.h
+++ b/drivers/net/can/m_can/m_can.h
@@ -85,6 +85,7 @@ struct m_can_classdev {
struct phy *transceiver;

struct hrtimer irq_timer;
+ ktime_t irq_timer_wait;

struct m_can_ops *ops;

@@ -98,6 +99,8 @@ struct m_can_classdev {
u32 active_interrupts;
u32 rx_max_coalesced_frames_irq;
u32 rx_coalesce_usecs_irq;
+ u32 tx_max_coalesced_frames_irq;
+ u32 tx_coalesce_usecs_irq;

struct mram_cfg mcfg[MRAM_CFG_NUM];
};
--
2.40.1


Subject: [PATCH v5 01/12] can: m_can: Write transmit header and data in one transaction

Combine header and data before writing to the transmit fifo to reduce
the overhead for peripheral chips.

Signed-off-by: Markus Schneider-Pargmann <[email protected]>
---
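As a rough illustration of the idea, the following userspace sketch
builds one contiguous FIFO element (two header words plus zero-padded
payload) and computes how many 32-bit words a single write would cover.
It is not the driver code: struct fifo_element, copy_and_pad() and
build_fifo_element() are hypothetical names, and copy_and_pad() only
stands in for the kernel's memcpy_and_pad().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DLEN 64	/* stands in for CANFD_MAX_DLEN */

/* Header and payload laid out back to back, as in the patch. */
struct fifo_element {
	uint32_t id;
	uint32_t dlc;
	uint8_t data[MAX_DLEN];
};

/* Userspace stand-in for memcpy_and_pad(): copy, then fill the rest. */
static void copy_and_pad(void *dst, size_t dst_len, const void *src,
			 size_t count, int pad)
{
	if (count > dst_len)
		count = dst_len;
	memcpy(dst, src, count);
	memset((uint8_t *)dst + count, pad, dst_len - count);
}

/*
 * Fill one element and return the number of 32-bit words that a single
 * FIFO write has to transfer: 2 header words plus the padded payload.
 */
static size_t build_fifo_element(struct fifo_element *e, uint32_t id,
				 uint32_t dlc, const uint8_t *payload,
				 size_t len)
{
	assert(len <= MAX_DLEN);

	e->id = id;
	e->dlc = dlc;
	copy_and_pad(e->data, sizeof(e->data), payload, len, 0);

	return 2 + (len + 3) / 4;	/* same as 2 + DIV_ROUND_UP(len, 4) */
}
```

For a 5-byte payload this gives 2 + 2 = 4 words, so header and data go
out in one transfer instead of two. The zero padding means the last
partially used word does not carry stale stack contents.
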
drivers/net/can/m_can/m_can.c | 35 +++++++++++++++++++++--------------
1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index c5af92bcc9c9..478e0670f0d1 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -317,6 +317,12 @@ struct id_and_dlc {
u32 dlc;
};

+struct m_can_fifo_element {
+ u32 id;
+ u32 dlc;
+ u8 data[CANFD_MAX_DLEN];
+};
+
static inline u32 m_can_read(struct m_can_classdev *cdev, enum m_can_reg reg)
{
return cdev->ops->read_reg(cdev, reg);
@@ -1622,9 +1628,10 @@ static int m_can_next_echo_skb_occupied(struct net_device *dev, int putidx)
static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
{
struct canfd_frame *cf = (struct canfd_frame *)cdev->tx_skb->data;
+ u8 len_padded = DIV_ROUND_UP(cf->len, 4);
+ struct m_can_fifo_element fifo_element;
struct net_device *dev = cdev->net;
struct sk_buff *skb = cdev->tx_skb;
- struct id_and_dlc fifo_header;
u32 cccr, fdflags;
u32 txfqs;
int err;
@@ -1635,27 +1642,27 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
/* Generate ID field for TX buffer Element */
/* Common to all supported M_CAN versions */
if (cf->can_id & CAN_EFF_FLAG) {
- fifo_header.id = cf->can_id & CAN_EFF_MASK;
- fifo_header.id |= TX_BUF_XTD;
+ fifo_element.id = cf->can_id & CAN_EFF_MASK;
+ fifo_element.id |= TX_BUF_XTD;
} else {
- fifo_header.id = ((cf->can_id & CAN_SFF_MASK) << 18);
+ fifo_element.id = ((cf->can_id & CAN_SFF_MASK) << 18);
}

if (cf->can_id & CAN_RTR_FLAG)
- fifo_header.id |= TX_BUF_RTR;
+ fifo_element.id |= TX_BUF_RTR;

if (cdev->version == 30) {
netif_stop_queue(dev);

- fifo_header.dlc = can_fd_len2dlc(cf->len) << 16;
+ fifo_element.dlc = can_fd_len2dlc(cf->len) << 16;

/* Write the frame ID, DLC, and payload to the FIFO element. */
- err = m_can_fifo_write(cdev, 0, M_CAN_FIFO_ID, &fifo_header, 2);
+ err = m_can_fifo_write(cdev, 0, M_CAN_FIFO_ID, &fifo_element, 2);
if (err)
goto out_fail;

err = m_can_fifo_write(cdev, 0, M_CAN_FIFO_DATA,
- cf->data, DIV_ROUND_UP(cf->len, 4));
+ cf->data, len_padded);
if (err)
goto out_fail;

@@ -1717,15 +1724,15 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
fdflags |= TX_BUF_BRS;
}

- fifo_header.dlc = FIELD_PREP(TX_BUF_MM_MASK, putidx) |
+ fifo_element.dlc = FIELD_PREP(TX_BUF_MM_MASK, putidx) |
FIELD_PREP(TX_BUF_DLC_MASK, can_fd_len2dlc(cf->len)) |
fdflags | TX_BUF_EFC;
- err = m_can_fifo_write(cdev, putidx, M_CAN_FIFO_ID, &fifo_header, 2);
- if (err)
- goto out_fail;

- err = m_can_fifo_write(cdev, putidx, M_CAN_FIFO_DATA,
- cf->data, DIV_ROUND_UP(cf->len, 4));
+ memcpy_and_pad(fifo_element.data, CANFD_MAX_DLEN, &cf->data,
+ cf->len, 0);
+
+ err = m_can_fifo_write(cdev, putidx, M_CAN_FIFO_ID,
+ &fifo_element, 2 + len_padded);
if (err)
goto out_fail;

--
2.40.1


Subject: [PATCH v5 12/12] can: m_can: Implement transmit submission coalescing

m_can supports submitting multiple transmits with one register write.
This is an interesting option to reduce the number of SPI transfers for
peripheral chips.

The m_can_tx_op is extended with a bool that signals if it is the last
transmission and the submit should be executed immediately.

The worker then writes the skb to the FIFO and submits it only if the
submit bool is set. If it isn't set, the worker will write the next skb
which is waiting in the workqueue to the FIFO, etc.

Signed-off-by: Markus Schneider-Pargmann <[email protected]>
---

Notes:
- I put this behind the tx-frames ethtool coalescing option as we do
wait before submitting packets, but it is something different from the
tx-frames-irq option. I am not sure if this is the correct option,
please let me know.
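
For readers who want to see the batching in isolation, here is a small
userspace model of the submit logic. It is only a sketch under
assumptions: struct tx_model, tx_should_submit() and tx_queue_frame()
are hypothetical names, the xmit_more argument stands in for
netdev_xmit_more(), and the TXBAR register write is simulated by a
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the submission-coalescing state. */
struct tx_model {
	uint32_t pending;		/* like tx_peripheral_submit */
	uint32_t last_txbar;		/* value of the last simulated write */
	unsigned int txbar_writes;	/* number of simulated TXBAR writes */
	unsigned int nr_without_submit;
	unsigned int max_coalesced;	/* like tx_max_coalesced_frames */
};

/* Decide in the xmit path whether this frame has to trigger a submit. */
static bool tx_should_submit(struct tx_model *m, bool xmit_more)
{
	++m->nr_without_submit;
	if (m->nr_without_submit >= m->max_coalesced || !xmit_more) {
		m->nr_without_submit = 0;
		return true;
	}
	return false;
}

/* Worker side: mark the FIFO slot, submit all marked slots if asked. */
static void tx_queue_frame(struct tx_model *m, unsigned int putidx,
			   bool submit)
{
	assert(putidx < 32);

	m->pending |= 1u << putidx;
	if (submit) {
		m->last_txbar = m->pending;	/* one write for all slots */
		m->txbar_writes++;
		m->pending = 0;
	}
}
```

With max_coalesced set to 4, three frames where only the last one has
xmit_more unset end up in a single simulated TXBAR write with three
bits set, instead of three separate writes.
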

drivers/net/can/m_can/m_can.c | 55 ++++++++++++++++++++++++++++++++---
drivers/net/can/m_can/m_can.h | 6 ++++
2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index b775ee8e5ff5..aa8c5a8445de 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1515,6 +1515,9 @@ static int m_can_start(struct net_device *dev)
if (ret)
return ret;

+ netdev_queue_set_dql_min_limit(netdev_get_tx_queue(cdev->net, 0),
+ cdev->tx_max_coalesced_frames);
+
cdev->can.state = CAN_STATE_ERROR_ACTIVE;

m_can_enable_all_interrupts(cdev);
@@ -1811,8 +1814,13 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev,
*/
can_put_echo_skb(skb, dev, putidx, frame_len);

- /* Enable TX FIFO element to start transfer */
- m_can_write(cdev, M_CAN_TXBAR, (1 << putidx));
+ if (cdev->is_peripheral) {
+ /* Delay enabling TX FIFO element */
+ cdev->tx_peripheral_submit |= BIT(putidx);
+ } else {
+ /* Enable TX FIFO element to start transfer */
+ m_can_write(cdev, M_CAN_TXBAR, BIT(putidx));
+ }
cdev->tx_fifo_putidx = (++cdev->tx_fifo_putidx >= cdev->can.echo_skb_max ?
0 : cdev->tx_fifo_putidx);
}
@@ -1825,6 +1833,17 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev,
return NETDEV_TX_BUSY;
}

+static void m_can_tx_submit(struct m_can_classdev *cdev)
+{
+ if (cdev->version == 30)
+ return;
+ if (!cdev->is_peripheral)
+ return;
+
+ m_can_write(cdev, M_CAN_TXBAR, cdev->tx_peripheral_submit);
+ cdev->tx_peripheral_submit = 0;
+}
+
static void m_can_tx_work_queue(struct work_struct *ws)
{
struct m_can_tx_op *op = container_of(ws, struct m_can_tx_op, work);
@@ -1833,11 +1852,15 @@ static void m_can_tx_work_queue(struct work_struct *ws)

op->skb = NULL;
m_can_tx_handler(cdev, skb);
+ if (op->submit)
+ m_can_tx_submit(cdev);
}

-static void m_can_tx_queue_skb(struct m_can_classdev *cdev, struct sk_buff *skb)
+static void m_can_tx_queue_skb(struct m_can_classdev *cdev, struct sk_buff *skb,
+ bool submit)
{
cdev->tx_ops[cdev->next_tx_op].skb = skb;
+ cdev->tx_ops[cdev->next_tx_op].submit = submit;
queue_work(cdev->tx_wq, &cdev->tx_ops[cdev->next_tx_op].work);

++cdev->next_tx_op;
@@ -1849,6 +1872,7 @@ static netdev_tx_t m_can_start_peripheral_xmit(struct m_can_classdev *cdev,
struct sk_buff *skb)
{
netdev_tx_t err;
+ bool submit;

if (cdev->can.state == CAN_STATE_BUS_OFF) {
m_can_clean(cdev->net);
@@ -1859,7 +1883,15 @@ static netdev_tx_t m_can_start_peripheral_xmit(struct m_can_classdev *cdev,
if (err != NETDEV_TX_OK)
return err;

- m_can_tx_queue_skb(cdev, skb);
+ ++cdev->nr_txs_without_submit;
+ if (cdev->nr_txs_without_submit >= cdev->tx_max_coalesced_frames ||
+ !netdev_xmit_more()) {
+ cdev->nr_txs_without_submit = 0;
+ submit = true;
+ } else {
+ submit = false;
+ }
+ m_can_tx_queue_skb(cdev, skb, submit);

return NETDEV_TX_OK;
}
@@ -1991,6 +2023,7 @@ static int m_can_get_coalesce(struct net_device *dev,

ec->rx_max_coalesced_frames_irq = cdev->rx_max_coalesced_frames_irq;
ec->rx_coalesce_usecs_irq = cdev->rx_coalesce_usecs_irq;
+ ec->tx_max_coalesced_frames = cdev->tx_max_coalesced_frames;
ec->tx_max_coalesced_frames_irq = cdev->tx_max_coalesced_frames_irq;
ec->tx_coalesce_usecs_irq = cdev->tx_coalesce_usecs_irq;

@@ -2035,6 +2068,18 @@ static int m_can_set_coalesce(struct net_device *dev,
netdev_err(dev, "tx-frames-irq and tx-usecs-irq can only be set together\n");
return -EINVAL;
}
+ if (ec->tx_max_coalesced_frames > cdev->mcfg[MRAM_TXE].num) {
+ netdev_err(dev, "tx-frames %u greater than the TX event FIFO %u\n",
+ ec->tx_max_coalesced_frames,
+ cdev->mcfg[MRAM_TXE].num);
+ return -EINVAL;
+ }
+ if (ec->tx_max_coalesced_frames > cdev->mcfg[MRAM_TXB].num) {
+ netdev_err(dev, "tx-frames %u greater than the TX FIFO %u\n",
+ ec->tx_max_coalesced_frames,
+ cdev->mcfg[MRAM_TXB].num);
+ return -EINVAL;
+ }
if (ec->rx_coalesce_usecs_irq != 0 && ec->tx_coalesce_usecs_irq != 0 &&
ec->rx_coalesce_usecs_irq != ec->tx_coalesce_usecs_irq) {
netdev_err(dev, "rx-usecs-irq %u needs to be equal to tx-usecs-irq %u if both are enabled\n",
@@ -2045,6 +2090,7 @@ static int m_can_set_coalesce(struct net_device *dev,

cdev->rx_max_coalesced_frames_irq = ec->rx_max_coalesced_frames_irq;
cdev->rx_coalesce_usecs_irq = ec->rx_coalesce_usecs_irq;
+ cdev->tx_max_coalesced_frames = ec->tx_max_coalesced_frames;
cdev->tx_max_coalesced_frames_irq = ec->tx_max_coalesced_frames_irq;
cdev->tx_coalesce_usecs_irq = ec->tx_coalesce_usecs_irq;

@@ -2062,6 +2108,7 @@ static const struct ethtool_ops m_can_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_RX_USECS_IRQ |
ETHTOOL_COALESCE_RX_MAX_FRAMES_IRQ |
ETHTOOL_COALESCE_TX_USECS_IRQ |
+ ETHTOOL_COALESCE_TX_MAX_FRAMES |
ETHTOOL_COALESCE_TX_MAX_FRAMES_IRQ,
.get_ts_info = ethtool_op_get_ts_info,
.get_coalesce = m_can_get_coalesce,
diff --git a/drivers/net/can/m_can/m_can.h b/drivers/net/can/m_can/m_can.h
index 5c182aece15c..54af26a94042 100644
--- a/drivers/net/can/m_can/m_can.h
+++ b/drivers/net/can/m_can/m_can.h
@@ -74,6 +74,7 @@ struct m_can_tx_op {
struct m_can_classdev *cdev;
struct work_struct work;
struct sk_buff *skb;
+ bool submit;
};

struct m_can_classdev {
@@ -103,6 +104,7 @@ struct m_can_classdev {
u32 active_interrupts;
u32 rx_max_coalesced_frames_irq;
u32 rx_coalesce_usecs_irq;
+ u32 tx_max_coalesced_frames;
u32 tx_max_coalesced_frames_irq;
u32 tx_coalesce_usecs_irq;

@@ -117,6 +119,10 @@ struct m_can_classdev {
int tx_fifo_size;
int next_tx_op;

+ int nr_txs_without_submit;
+ /* bitfield of fifo elements that will be submitted together */
+ u32 tx_peripheral_submit;
+
struct mram_cfg mcfg[MRAM_CFG_NUM];
};

--
2.40.1


2023-07-20 16:06:56

by Simon Horman

Subject: Re: [PATCH v5 01/12] can: m_can: Write transmit header and data in one transaction

On Tue, Jul 18, 2023 at 09:56:57AM +0200, Markus Schneider-Pargmann wrote:
> Combine header and data before writing to the transmit fifo to reduce
> the overhead for peripheral chips.
>
> Signed-off-by: Markus Schneider-Pargmann <[email protected]>

Reviewed-by: Simon Horman <[email protected]>

2023-07-20 16:33:04

by Simon Horman

Subject: Re: [PATCH v5 12/12] can: m_can: Implement transmit submission coalescing

On Tue, Jul 18, 2023 at 09:57:08AM +0200, Markus Schneider-Pargmann wrote:
> m_can supports submitting multiple transmits with one register write.
> This is an interesting option to reduce the number of SPI transfers for
> peripheral chips.
>
> The m_can_tx_op is extended with a bool that signals if it is the last
> transmission and the submit should be executed immediately.
>
> The worker then writes the skb to the FIFO and submits it only if the
> submit bool is set. If it isn't set, the worker will write the next skb
> which is waiting in the workqueue to the FIFO, etc.
>
> Signed-off-by: Markus Schneider-Pargmann <[email protected]>

Reviewed-by: Simon Horman <[email protected]>


Subject: Re: [PATCH v5 12/12] can: m_can: Implement transmit submission coalescing

Hi Simon,

On Thu, Jul 20, 2023 at 04:58:49PM +0100, Simon Horman wrote:
> On Tue, Jul 18, 2023 at 09:57:08AM +0200, Markus Schneider-Pargmann wrote:
> > m_can supports submitting multiple transmits with one register write.
> > This is an interesting option to reduce the number of SPI transfers for
> > peripheral chips.
> >
> > The m_can_tx_op is extended with a bool that signals if it is the last
> > transmission and the submit should be executed immediately.
> >
> > The worker then writes the skb to the FIFO and submits it only if the
> > submit bool is set. If it isn't set, the worker will write the next skb
> > which is waiting in the workqueue to the FIFO, etc.
> >
> > Signed-off-by: Markus Schneider-Pargmann <[email protected]>
>
> Reviewed-by: Simon Horman <[email protected]>

Thank you for your reviews!

Best,
Markus

Subject: Re: [PATCH v5 00/12] can: m_can: Optimizations for m_can/tcan part 2

Hi Marc,

Have you had time to review this series? Is there anything I should rework?

Thanks,
Markus

On Tue, Jul 18, 2023 at 09:56:56AM +0200, Markus Schneider-Pargmann wrote:
> Hi Marc, Simon and everyone,
>
> v5 got a rebase on v6.5 with some small style fixes as pointed out in v4.
>
> It is tested on tcan455x but I don't have hardware with mcan on the SoC
> myself so any testing is appreciated.
>
> The series implements many small and bigger throughput improvements and
> adds rx/tx coalescing at the end.
>
> Based on v6.5-rc1. Also available at
> https://gitlab.baylibre.com/msp8/linux/-/tree/topic/mcan-optimization/v6.5?ref_type=heads
>
> Best,
> Markus
>
> Changes in v5:
> - Add back parentheses in m_can_set_coalesce(). This will make
> checkpatch unhappy but gcc happy.
> - Remove unused fifo_header variable in m_can_tx_handler().
> - Rebased to v6.5-rc1
>
> Changes in v4:
> - Create and use struct m_can_fifo_element in m_can_tx_handler
> - Fix memcpy_and_pad to copy the full buffer
> - Fixed a few checkpatch warnings
> - Change putidx to be unsigned
> - Print hard_xmit error only once when TX FIFO is full
>
> Changes in v3:
> - Remove parentheses in error messages
> - Use memcpy_and_pad for buffer copy in 'can: m_can: Write transmit
> header and data in one transaction'.
> - Replace spin_lock with spin_lock_irqsave. I got a report of an
> interrupt that was calling start_xmit just after the netqueue was
> woken up, before the locked region was exited. spin_lock_irqsave should
> fix this. I attached the full stack at the end of the mail in case
> someone wants to see it.
> - Rebased to v6.3-rc1.
> - Removed tcan4x5x patches from this series.
>
> Changes in v2:
> - Rebased on v6.2-rc5
> - Fixed missing/broken accounting for non peripheral m_can devices.
>
> previous versions:
> v1 - https://lore.kernel.org/lkml/[email protected]
> v2 - https://lore.kernel.org/lkml/[email protected]
> v3 - https://lore.kernel.org/lkml/[email protected]/
> v4 - https://lore.kernel.org/lkml/[email protected]/
>
> Markus Schneider-Pargmann (12):
> can: m_can: Write transmit header and data in one transaction
> can: m_can: Implement receive coalescing
> can: m_can: Implement transmit coalescing
> can: m_can: Add rx coalescing ethtool support
> can: m_can: Add tx coalescing ethtool support
> can: m_can: Use u32 for putidx
> can: m_can: Cache tx putidx
> can: m_can: Use the workqueue as queue
> can: m_can: Introduce a tx_fifo_in_flight counter
> can: m_can: Use tx_fifo_in_flight for netif_queue control
> can: m_can: Implement BQL
> can: m_can: Implement transmit submission coalescing
>
> drivers/net/can/m_can/m_can.c | 517 +++++++++++++++++++++++++---------
> drivers/net/can/m_can/m_can.h | 35 ++-
> 2 files changed, 418 insertions(+), 134 deletions(-)
>
>
> base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
> --
> 2.40.1
>