2021-12-06 22:40:01

by Aleksander Jan Bajkowski

[permalink] [raw]
Subject: [PATCH net v2] net: lantiq_xrx200: increase buffer reservation

If the user sets a smaller mtu on the CPU port than on the switch,
then DMA inserts a few more bytes than expected. In the worst case,
it may exceed the size of the buffer. The experiments showed that
the buffer should be a multiple of the burst length value. This patch
rounds the length of the rx buffer upwards and fixes this bug.

Fixes: 998ac358019e ("net: lantiq: add support for jumbo frames")
Signed-off-by: Aleksander Jan Bajkowski <[email protected]>
---
drivers/net/ethernet/lantiq_xrx200.c | 38 ++++++++++++++++++++--------
1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/lantiq_xrx200.c b/drivers/net/ethernet/lantiq_xrx200.c
index 0da09ea81980..d3423764da75 100644
--- a/drivers/net/ethernet/lantiq_xrx200.c
+++ b/drivers/net/ethernet/lantiq_xrx200.c
@@ -71,6 +71,9 @@ struct xrx200_priv {
struct xrx200_chan chan_tx;
struct xrx200_chan chan_rx;

+ u16 max_frame_len;
+ u16 rx_buf_size;
+
struct net_device *net_dev;
struct device *dev;

@@ -97,6 +100,16 @@ static void xrx200_pmac_mask(struct xrx200_priv *priv, u32 clear, u32 set,
xrx200_pmac_w32(priv, val, offset);
}

+static int xrx200_max_frame_len(int mtu)
+{
+ return VLAN_ETH_HLEN + mtu + ETH_FCS_LEN;
+}
+
+static int xrx200_buffer_size(int mtu)
+{
+ return round_up(xrx200_max_frame_len(mtu) - 1, 4 * XRX200_DMA_BURST_LEN);
+}
+
/* drop all the packets from the DMA ring */
static void xrx200_flush_dma(struct xrx200_chan *ch)
{
@@ -109,8 +122,7 @@ static void xrx200_flush_dma(struct xrx200_chan *ch)
break;

desc->ctl = LTQ_DMA_OWN | LTQ_DMA_RX_OFFSET(NET_IP_ALIGN) |
- (ch->priv->net_dev->mtu + VLAN_ETH_HLEN +
- ETH_FCS_LEN);
+ ch->priv->max_frame_len;
ch->dma.desc++;
ch->dma.desc %= LTQ_DESC_NUM;
}
@@ -158,21 +170,21 @@ static int xrx200_close(struct net_device *net_dev)

static int xrx200_alloc_skb(struct xrx200_chan *ch)
{
- int len = ch->priv->net_dev->mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
struct sk_buff *skb = ch->skb[ch->dma.desc];
+ struct xrx200_priv *priv = ch->priv;
dma_addr_t mapping;
int ret = 0;

- ch->skb[ch->dma.desc] = netdev_alloc_skb_ip_align(ch->priv->net_dev,
- len);
+ ch->skb[ch->dma.desc] = netdev_alloc_skb_ip_align(priv->net_dev,
+ priv->rx_buf_size);
if (!ch->skb[ch->dma.desc]) {
ret = -ENOMEM;
goto skip;
}

- mapping = dma_map_single(ch->priv->dev, ch->skb[ch->dma.desc]->data,
- len, DMA_FROM_DEVICE);
- if (unlikely(dma_mapping_error(ch->priv->dev, mapping))) {
+ mapping = dma_map_single(priv->dev, ch->skb[ch->dma.desc]->data,
+ priv->rx_buf_size, DMA_FROM_DEVICE);
+ if (unlikely(dma_mapping_error(priv->dev, mapping))) {
dev_kfree_skb_any(ch->skb[ch->dma.desc]);
ch->skb[ch->dma.desc] = skb;
ret = -ENOMEM;
@@ -184,7 +196,7 @@ static int xrx200_alloc_skb(struct xrx200_chan *ch)
wmb();
skip:
ch->dma.desc_base[ch->dma.desc].ctl =
- LTQ_DMA_OWN | LTQ_DMA_RX_OFFSET(NET_IP_ALIGN) | len;
+ LTQ_DMA_OWN | LTQ_DMA_RX_OFFSET(NET_IP_ALIGN) | priv->max_frame_len;

return ret;
}
@@ -356,6 +368,8 @@ xrx200_change_mtu(struct net_device *net_dev, int new_mtu)
int ret = 0;

net_dev->mtu = new_mtu;
+ priv->rx_buf_size = xrx200_buffer_size(new_mtu);
+ priv->max_frame_len = xrx200_max_frame_len(new_mtu);

if (new_mtu <= old_mtu)
return ret;
@@ -375,6 +389,8 @@ xrx200_change_mtu(struct net_device *net_dev, int new_mtu)
ret = xrx200_alloc_skb(ch_rx);
if (ret) {
net_dev->mtu = old_mtu;
+ priv->rx_buf_size = xrx200_buffer_size(old_mtu);
+ priv->max_frame_len = xrx200_max_frame_len(old_mtu);
break;
}
dev_kfree_skb_any(skb);
@@ -505,7 +521,9 @@ static int xrx200_probe(struct platform_device *pdev)
net_dev->netdev_ops = &xrx200_netdev_ops;
SET_NETDEV_DEV(net_dev, dev);
net_dev->min_mtu = ETH_ZLEN;
- net_dev->max_mtu = XRX200_DMA_DATA_LEN - VLAN_ETH_HLEN - ETH_FCS_LEN;
+ net_dev->max_mtu = XRX200_DMA_DATA_LEN - xrx200_max_frame_len(0);
+ priv->rx_buf_size = xrx200_buffer_size(ETH_DATA_LEN);
+ priv->max_frame_len = xrx200_max_frame_len(ETH_DATA_LEN);

/* load the memory ranges */
priv->pmac_reg = devm_platform_get_and_ioremap_resource(pdev, 0, NULL);
--
2.30.2



2021-12-08 04:55:00

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net v2] net: lantiq_xrx200: increase buffer reservation

On Mon, 6 Dec 2021 23:39:09 +0100 Aleksander Jan Bajkowski wrote:
> +static int xrx200_max_frame_len(int mtu)
> +{
> + return VLAN_ETH_HLEN + mtu + ETH_FCS_LEN;

You sure the problem is not that this doesn't include ETH_HLEN?
MTU is the length of the L2 _payload_.

> +}
> +
> +static int xrx200_buffer_size(int mtu)
> +{
> + return round_up(xrx200_max_frame_len(mtu) - 1, 4 * XRX200_DMA_BURST_LEN);

Why the - 1 ? ????

For a frame size 101 => max_frame_len 109 you'll presumably want
the buffer to be 116, not 108?

> +}
> +

2021-12-12 23:05:21

by Aleksander Jan Bajkowski

[permalink] [raw]
Subject: Re: [PATCH net v2] net: lantiq_xrx200: increase buffer reservation

Hi,

Sorry for the late reply, but recently I haven't had access to
hardware to test different MTU values and packet lengths on the
hardware.

On 12/8/21 5:54 AM, Jakub Kicinski wrote:
> On Mon, 6 Dec 2021 23:39:09 +0100 Aleksander Jan Bajkowski wrote:
>> +static int xrx200_max_frame_len(int mtu)
>> +{
>> + return VLAN_ETH_HLEN + mtu + ETH_FCS_LEN;
>
> You sure the problem is not that this doesn't include ETH_HLEN?
> MTU is the length of the L2 _payload_.
>

VLAN_ETH_HLEN (14 + 4) contains ETH_HLEN (14). This function returns
the length of the frame that is written to the RX descriptor. Maybe
I don't understand the question and you are asking something else?

>> +}
>> +
>> +static int xrx200_buffer_size(int mtu)
>> +{
>> + return round_up(xrx200_max_frame_len(mtu) - 1, 4 * XRX200_DMA_BURST_LEN);
>
> Why the - 1 ? ????
>

This is how the hardware behaves. I don't really know where the -1
comes from. Unfortunately, I do not have access to TRM.

> For a frame size 101 => max_frame_len 109 you'll presumably want
> the buffer to be 116, not 108?
>


For a frame size 101 => max_frame_len is 123 (18 + 101 + 4). Infact, PMAC strips FCS and ETH_FCS_LEN may not be needed. This behavior
is controlled by the PMAC_HD_CTL_RC bit. This bit is enabled from
the beginning of this driver. Ethtool has the option to enable
FCS reception, but the ethtool interface is not yet supported
by this driver.

>> +}
>> +

Experiments show that the hardware starts to split the frame at
max_frame_len() - 1. Some examples:

pkt len MTU max_frame_size() buffer_size() desc1 desc2 desc3 desc4
----------------------------------------------------------------------------------------------
1506 1483 1505 1504 1502 4 X X
1505 1483 1505 1504 1502 3 X X
1504 1483 1505 1504 1504 X X X
1503 1483 1505 1504 1503 X X X
1502 1483 1505 1504 1502 X X X
1501 1483 1505 1504 1501 X X X
----------------------------------------------------------------------------------------------
1249 380 402 416 414 416 416 3
1248 380 402 416 414 416 416 2
1247 380 402 416 414 416 416 1
1246 380 402 416 414 416 416 X
1245 380 402 416 414 416 415 X
----------------------------------------------------------------------------------------------


In fact, this patch is a preparation for SG DMA support, which
I wrote some time ago.




2021-12-13 15:43:54

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net v2] net: lantiq_xrx200: increase buffer reservation

On Mon, 13 Dec 2021 00:05:16 +0100 Aleksander Bajkowski wrote:
> On 12/8/21 5:54 AM, Jakub Kicinski wrote:
> > On Mon, 6 Dec 2021 23:39:09 +0100 Aleksander Jan Bajkowski wrote:
> >> +static int xrx200_max_frame_len(int mtu)
> >> +{
> >> + return VLAN_ETH_HLEN + mtu + ETH_FCS_LEN;
> >
> > You sure the problem is not that this doesn't include ETH_HLEN?
> > MTU is the length of the L2 _payload_.
>
> VLAN_ETH_HLEN (14 + 4) contains ETH_HLEN (14). This function returns
> the length of the frame that is written to the RX descriptor. Maybe
> I don't understand the question and you are asking something else?

Ah, right, misread that as VLAN_HLEN.

> >> +}
> >> +
> >> +static int xrx200_buffer_size(int mtu)
> >> +{
> >> + return round_up(xrx200_max_frame_len(mtu) - 1, 4 * XRX200_DMA_BURST_LEN);
> >
> > Why the - 1 ? ????
> >
>
> This is how the hardware behaves. I don't really know where the -1
> comes from. Unfortunately, I do not have access to TRM.
>
> > For a frame size 101 => max_frame_len 109 you'll presumably want
> > the buffer to be 116, not 108?
>
> For a frame size 101 => max_frame_len is 123 (18 + 101 + 4).

You get my point, tho right?

> Infact, PMAC strips FCS and ETH_FCS_LEN may not be needed. This behavior
> is controlled by the PMAC_HD_CTL_RC bit. This bit is enabled from
> the beginning of this driver. Ethtool has the option to enable
> FCS reception, but the ethtool interface is not yet supported
> by this driver.
>
> >> +}
> >> +
>
> Experiments show that the hardware starts to split the frame at
> max_frame_len() - 1. Some examples:
>
> pkt len MTU max_frame_size() buffer_size() desc1 desc2 desc3 desc4
> ----------------------------------------------------------------------------------------------
> 1506 1483 1505 1504 1502 4 X X
> 1505 1483 1505 1504 1502 3 X X
> 1504 1483 1505 1504 1504 X X X
> 1503 1483 1505 1504 1503 X X X
> 1502 1483 1505 1504 1502 X X X
> 1501 1483 1505 1504 1501 X X X
> ----------------------------------------------------------------------------------------------
> 1249 380 402 416 414 416 416 3
> 1248 380 402 416 414 416 416 2
> 1247 380 402 416 414 416 416 1
> 1246 380 402 416 414 416 416 X
> 1245 380 402 416 414 416 415 X
> ----------------------------------------------------------------------------------------------

Hm, doesn't the former example prove that the calculation in this
patch is incorrect? Shouldn't we be able to receive a 1505B frame
into a single buffer for the MTU of 1483?

The HW doesn't split at max_frame_len - 1 in the latter example.
Maybe to save one bit of state the HW doesn't register the lowest
bit of the buffer length? I thinks the buffer is 1504.

I'd lean towards dropping the -1 and letting the DMA alignment
calculation round the whole length up.

Also should we not take NET_IP_ALIGN into account?

Judging by the length of the first buffer when split happens
NET_IP_ALIGN is 2, and HW rounds off (2 + pkt-len) to the DMA
burst size. This makes me think we overrun the end of the buffer
by NET_IP_ALIGN.

> In fact, this patch is a preparation for SG DMA support, which
> I wrote some time ago.