2023-09-25 10:31:29

by Linus Walleij

[permalink] [raw]
Subject: [PATCH net-next v2] net: ixp4xx_eth: Support changing the MTU

As we don't specify the MTU in the driver, the framework
will fall back to 1500 bytes and this doesn't work very
well when we try to attach a DSA switch:

eth1: mtu greater than device maximum
ixp4xx_eth c800a000.ethernet eth1: error -22 setting
MTU to 1504 to include DSA overhead

After locating an out-of-tree patch in OpenWrt I found
suitable code to set the MTU on the interface and ported
it and updated it. Now the MTU gets set properly.

Signed-off-by: Linus Walleij <[email protected]>
---
Changes in v2:
- Don't just set min/max MTU: implement the interface for actually
changing it as well.
- Link to v1: https://lore.kernel.org/r/[email protected]
---
drivers/net/ethernet/xscale/ixp4xx_eth.c | 64 +++++++++++++++++++++++++++++++-
1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/xscale/ixp4xx_eth.c b/drivers/net/ethernet/xscale/ixp4xx_eth.c
index 3b0c5f177447..18e52abc005b 100644
--- a/drivers/net/ethernet/xscale/ixp4xx_eth.c
+++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c
@@ -24,6 +24,7 @@
#include <linux/dma-mapping.h>
#include <linux/dmapool.h>
#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/net_tstamp.h>
@@ -63,7 +64,15 @@

#define POOL_ALLOC_SIZE (sizeof(struct desc) * (RX_DESCS + TX_DESCS))
#define REGS_SIZE 0x1000
-#define MAX_MRU 1536 /* 0x600 */
+
+/* MRU is said to be 14320 in a code dump, the SW manual says that
+ * MRU/MTU is 16320 and includes VLAN and ethernet headers.
+ * See "IXP400 Software Programmer's Guide" section 10.3.2, page 161.
+ *
+ * FIXME: we have chosen the safe default (14320) but if you can test
+ * jumboframes, experiment with 16320 and see what happens!
+ */
+#define MAX_MRU (14320 - VLAN_ETH_HLEN)
#define RX_BUFF_SIZE ALIGN((NET_IP_ALIGN) + MAX_MRU, 4)

#define NAPI_WEIGHT 16
@@ -1182,6 +1191,53 @@ static void destroy_queues(struct port *port)
}
}

+static int ixp4xx_do_change_mtu(struct net_device *dev, int new_mtu)
+{
+ struct port *port = netdev_priv(dev);
+ struct npe *npe = port->npe;
+ struct msg msg;
+ /* adjust for ethernet headers */
+ int framesize = new_mtu + VLAN_ETH_HLEN;
+ /* max rx/tx 64 byte chunks */
+ int chunks = DIV_ROUND_UP(framesize, 64);
+
+ memset(&msg, 0, sizeof(msg));
+ msg.cmd = NPE_SETMAXFRAMELENGTHS;
+ msg.eth_id = port->id;
+
+ /* Firmware wants to know buffer size in 64 byte chunks */
+ msg.byte2 = chunks << 8;
+ msg.byte3 = chunks << 8;
+
+ msg.byte4 = msg.byte6 = framesize >> 8;
+ msg.byte5 = msg.byte7 = framesize & 0xff;
+
+ if (npe_send_recv_message(npe, &msg, "ETH_SET_MAX_FRAME_LENGTH"))
+ return -EIO;
+ netdev_dbg(dev, "set MTU on NPE %s to %d bytes\n",
+ npe_name(npe), new_mtu);
+
+ return 0;
+}
+
+static int ixp4xx_eth_change_mtu(struct net_device *dev, int new_mtu)
+{
+ int ret;
+
+ /* MTU can only be changed when the interface is up. We also
+ * set the MTU from dev->mtu when opening the device.
+ */
+ if (dev->flags & IFF_UP) {
+ ret = ixp4xx_do_change_mtu(dev, new_mtu);
+ if (ret < 0)
+ return ret;
+ }
+
+ dev->mtu = new_mtu;
+
+ return 0;
+}
+
static int eth_open(struct net_device *dev)
{
struct port *port = netdev_priv(dev);
@@ -1232,6 +1288,8 @@ static int eth_open(struct net_device *dev)
if (npe_send_recv_message(port->npe, &msg, "ETH_SET_FIREWALL_MODE"))
return -EIO;

+ ixp4xx_do_change_mtu(dev, dev->mtu);
+
if ((err = request_queues(port)) != 0)
return err;

@@ -1374,6 +1432,7 @@ static int eth_close(struct net_device *dev)
static const struct net_device_ops ixp4xx_netdev_ops = {
.ndo_open = eth_open,
.ndo_stop = eth_close,
+ .ndo_change_mtu = ixp4xx_eth_change_mtu,
.ndo_start_xmit = eth_xmit,
.ndo_set_rx_mode = eth_set_mcast_list,
.ndo_eth_ioctl = eth_ioctl,
@@ -1488,6 +1547,9 @@ static int ixp4xx_eth_probe(struct platform_device *pdev)
ndev->dev.dma_mask = dev->dma_mask;
ndev->dev.coherent_dma_mask = dev->coherent_dma_mask;

+ ndev->min_mtu = ETH_MIN_MTU;
+ ndev->max_mtu = MAX_MRU;
+
netif_napi_add_weight(ndev, &port->napi, eth_poll, NAPI_WEIGHT);

if (!(port->npe = npe_request(NPE_ID(port->id))))

---
base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
change-id: 20230923-ixp4xx-eth-mtu-c041d7efe932

Best regards,
--
Linus Walleij <[email protected]>


2023-09-26 04:13:32

by Jacob Keller

[permalink] [raw]
Subject: Re: [PATCH net-next v2] net: ixp4xx_eth: Support changing the MTU



On 9/25/2023 12:09 AM, Linus Walleij wrote:
> As we don't specify the MTU in the driver, the framework
> will fall back to 1500 bytes and this doesn't work very
> well when we try to attach a DSA switch:
>
> eth1: mtu greater than device maximum
> ixp4xx_eth c800a000.ethernet eth1: error -22 setting
> MTU to 1504 to include DSA overhead
>
> After locating an out-of-tree patch in OpenWrt I found
> suitable code to set the MTU on the interface and ported
> it and updated it. Now the MTU gets set properly.
>

Nice!

> Signed-off-by: Linus Walleij <[email protected]>
> ---
> Changes in v2:
> - Don't just set min/max MTU: implement the interface for actually
> changing it as well.
> - Link to v1: https://lore.kernel.org/r/[email protected]
> ---
> drivers/net/ethernet/xscale/ixp4xx_eth.c | 64 +++++++++++++++++++++++++++++++-
> 1 file changed, 63 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/xscale/ixp4xx_eth.c b/drivers/net/ethernet/xscale/ixp4xx_eth.c
> index 3b0c5f177447..18e52abc005b 100644
> --- a/drivers/net/ethernet/xscale/ixp4xx_eth.c
> +++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c
> @@ -24,6 +24,7 @@
> #include <linux/dma-mapping.h>
> #include <linux/dmapool.h>
> #include <linux/etherdevice.h>
> +#include <linux/if_vlan.h>
> #include <linux/io.h>
> #include <linux/kernel.h>
> #include <linux/net_tstamp.h>
> @@ -63,7 +64,15 @@
>
> #define POOL_ALLOC_SIZE (sizeof(struct desc) * (RX_DESCS + TX_DESCS))
> #define REGS_SIZE 0x1000
> -#define MAX_MRU 1536 /* 0x600 */
> +
> +/* MRU is said to be 14320 in a code dump, the SW manual says that
> + * MRU/MTU is 16320 and includes VLAN and ethernet headers.
> + * See "IXP400 Software Programmer's Guide" section 10.3.2, page 161.
> + *
> + * FIXME: we have chosen the safe default (14320) but if you can test
> + * jumboframes, experiment with 16320 and see what happens!
> + */


Ok, so you're choosing a conservative upper limit that is known to work
while leaving the higher 16320 value for later if/when someone cares?

> +#define MAX_MRU (14320 - VLAN_ETH_HLEN)
> #define RX_BUFF_SIZE ALIGN((NET_IP_ALIGN) + MAX_MRU, 4)
>
> #define NAPI_WEIGHT 16
> @@ -1182,6 +1191,53 @@ static void destroy_queues(struct port *port)
> }
> }
>
> +static int ixp4xx_do_change_mtu(struct net_device *dev, int new_mtu)
> +{
> + struct port *port = netdev_priv(dev);
> + struct npe *npe = port->npe;
> + struct msg msg;
> + /* adjust for ethernet headers */
> + int framesize = new_mtu + VLAN_ETH_HLEN;
> + /* max rx/tx 64 byte chunks */
> + int chunks = DIV_ROUND_UP(framesize, 64);
> +

netdev coding style wants all of the declarations in "reverse christmas
tree" ordering. Assign to the local variables after the block if
necessary. Something like:

struct port *port = netdev_priv(dev);
struct npe *npe = port->npe;
int framesize, chunks;
struct msg msg;

/* adjust for ethernet headers */
framesize = new_mtu + VLAN_ETH_HLEN;
/* max rx/tx 64 byte chunks */
chunks = DIV_ROUND_UP(framesize, 64);


> + memset(&msg, 0, sizeof(msg));


You could also use "struct msg msg = {}" instead of memset here.

> + msg.cmd = NPE_SETMAXFRAMELENGTHS;
> + msg.eth_id = port->id;
> +
> + /* Firmware wants to know buffer size in 64 byte chunks */
> + msg.byte2 = chunks << 8;
> + msg.byte3 = chunks << 8;
> +

I am not sure I follow the "<< 8" here.

> + msg.byte4 = msg.byte6 = framesize >> 8;
> + msg.byte5 = msg.byte7 = framesize & 0xff;
> +
> + if (npe_send_recv_message(npe, &msg, "ETH_SET_MAX_FRAME_LENGTH"))
> + return -EIO;
> + netdev_dbg(dev, "set MTU on NPE %s to %d bytes\n",
> + npe_name(npe), new_mtu);
> +
> + return 0;
> +}
> +
> +static int ixp4xx_eth_change_mtu(struct net_device *dev, int new_mtu)
> +{
> + int ret;
> +
> + /* MTU can only be changed when the interface is up. We also
> + * set the MTU from dev->mtu when opening the device.
> + */
> + if (dev->flags & IFF_UP) {
> + ret = ixp4xx_do_change_mtu(dev, new_mtu);
> + if (ret < 0)
> + return ret;
> + }
> +
> + dev->mtu = new_mtu;
> +
> + return 0;
> +}
> +
> static int eth_open(struct net_device *dev)
> {
> struct port *port = netdev_priv(dev);
> @@ -1232,6 +1288,8 @@ static int eth_open(struct net_device *dev)
> if (npe_send_recv_message(port->npe, &msg, "ETH_SET_FIREWALL_MODE"))
> return -EIO;
>
> + ixp4xx_do_change_mtu(dev, dev->mtu);
> +
> if ((err = request_queues(port)) != 0)
> return err;
>
> @@ -1374,6 +1432,7 @@ static int eth_close(struct net_device *dev)
> static const struct net_device_ops ixp4xx_netdev_ops = {
> .ndo_open = eth_open,
> .ndo_stop = eth_close,
> + .ndo_change_mtu = ixp4xx_eth_change_mtu,
> .ndo_start_xmit = eth_xmit,
> .ndo_set_rx_mode = eth_set_mcast_list,
> .ndo_eth_ioctl = eth_ioctl,
> @@ -1488,6 +1547,9 @@ static int ixp4xx_eth_probe(struct platform_device *pdev)
> ndev->dev.dma_mask = dev->dma_mask;
> ndev->dev.coherent_dma_mask = dev->coherent_dma_mask;
>
> + ndev->min_mtu = ETH_MIN_MTU;
> + ndev->max_mtu = MAX_MRU;
> +
> netif_napi_add_weight(ndev, &port->napi, eth_poll, NAPI_WEIGHT);
>
> if (!(port->npe = npe_request(NPE_ID(port->id))))
>

Functionality-wise the patch seems fine to me, and properly implementing
the MTU changing is a great addition.

Minor nits on the coding style, but not really a huge issue to me. I had
some question about how the chunks work but I don't know the hardware so
I can't really evaluate whether its correct or not.

Reviewed-by: Jacob Keller <[email protected]>

> ---
> base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
> change-id: 20230923-ixp4xx-eth-mtu-c041d7efe932
>

Curious what this change-id thing represents I've never seen it before..
I know base-commit is used by git. Would be interested in an explanation
if you happen to know! :D

> Best regards,

2023-09-26 14:29:12

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH net-next v2] net: ixp4xx_eth: Support changing the MTU

On Tue, Sep 26, 2023 at 12:29 AM Jacob Keller <[email protected]> wrote:

> > +/* MRU is said to be 14320 in a code dump, the SW manual says that
> > + * MRU/MTU is 16320 and includes VLAN and ethernet headers.
> > + * See "IXP400 Software Programmer's Guide" section 10.3.2, page 161.
> > + *
> > + * FIXME: we have chosen the safe default (14320) but if you can test
> > + * jumboframes, experiment with 16320 and see what happens!
> > + */
>
> Ok, so you're choosing a conservative upper limit that is known to work
> while leaving the higher 16320 value for later if/when someone cares?

Mostly if someone can test it. But maybe I can have authoritative
information from Intel that the statement in the Software Programmers
Guide is correct? ;)

> > +static int ixp4xx_do_change_mtu(struct net_device *dev, int new_mtu)
> > +{
> > + struct port *port = netdev_priv(dev);
> > + struct npe *npe = port->npe;
> > + struct msg msg;
> > + /* adjust for ethernet headers */
> > + int framesize = new_mtu + VLAN_ETH_HLEN;
> > + /* max rx/tx 64 byte chunks */
> > + int chunks = DIV_ROUND_UP(framesize, 64);
> > +
>
> netdev coding style wants all of the declarations in "reverse christmas
> tree" ordering. Assign to the local variables after the block if
> necessary. Something like:
>
> struct port *port = netdev_priv(dev);
> struct npe *npe = port->npe;
> int framesize, chunks;
> struct msg msg;
>
> /* adjust for ethernet headers */
> framesize = new_mtu + VLAN_ETH_HLEN;
> /* max rx/tx 64 byte chunks */
> chunks = DIV_ROUND_UP(framesize, 64);

Right, I fix!

> > + memset(&msg, 0, sizeof(msg));
>
> You could also use "struct msg msg = {}" instead of memset here.

OK

> > + msg.cmd = NPE_SETMAXFRAMELENGTHS;
> > + msg.eth_id = port->id;
> > +
> > + /* Firmware wants to know buffer size in 64 byte chunks */
> > + msg.byte2 = chunks << 8;
> > + msg.byte3 = chunks << 8;
>
> I am not sure I follow the "<< 8" here.

I actually only have this vendor code, but clearly <<8 is not
"multiply by 256" in this case, rather that the number of 64 byte
chunks is in the second byte.

The software manual just describes a "OS abstraction layer"
used by both VXworks and Linux, and since that wasn't acceptable
in the Linux driver, someone has ripped out the code to
talk directly to the NPE firmware, and that is what we are seeing.
If you have the source code to the abstraction layer
"ixEthAcc" or the source code to the NPE microcode, I think the
answer is in there... (I have neither, maybe you can check internally,
hehe. Dan Williams used to work with this hardware!)

> Reviewed-by: Jacob Keller <[email protected]>

Thanks!

> > base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
> > change-id: 20230923-ixp4xx-eth-mtu-c041d7efe932
>
> Curious what this change-id thing represents I've never seen it before..
> I know base-commit is used by git. Would be interested in an explanation
> if you happen to know! :D

It's metadata generated by b4 which is the tool we use for kernel mailing
list handling:
https://people.kernel.org/monsieuricon/sending-a-kernel-patch-with-b4-part-1

I think this change ID cross-references mails I send with my
git branch, so it's easy to collect review tags etc.

Yours,
Linus Walleij

2023-09-27 01:33:25

by Jacob Keller

[permalink] [raw]
Subject: Re: [PATCH net-next v2] net: ixp4xx_eth: Support changing the MTU



On 9/25/2023 11:44 PM, Linus Walleij wrote:
> On Tue, Sep 26, 2023 at 12:29 AM Jacob Keller <[email protected]> wrote:
>
>>> +/* MRU is said to be 14320 in a code dump, the SW manual says that
>>> + * MRU/MTU is 16320 and includes VLAN and ethernet headers.
>>> + * See "IXP400 Software Programmer's Guide" section 10.3.2, page 161.
>>> + *
>>> + * FIXME: we have chosen the safe default (14320) but if you can test
>>> + * jumboframes, experiment with 16320 and see what happens!
>>> + */
>>
>> Ok, so you're choosing a conservative upper limit that is known to work
>> while leaving the higher 16320 value for later if/when someone cares?
>
> Mostly if someone can test it. But maybe I can have authoritative
> information from Intel that the statement in the Software Programmers
> Guide is correct? ;)
>

Unfortunately I'm one of the folks who worked on this driver/hardware,
so I can't comment.

>>> +static int ixp4xx_do_change_mtu(struct net_device *dev, int new_mtu)
>>> +{
>>> + struct port *port = netdev_priv(dev);
>>> + struct npe *npe = port->npe;
>>> + struct msg msg;
>>> + /* adjust for ethernet headers */
>>> + int framesize = new_mtu + VLAN_ETH_HLEN;
>>> + /* max rx/tx 64 byte chunks */
>>> + int chunks = DIV_ROUND_UP(framesize, 64);
>>> +
>>
>> netdev coding style wants all of the declarations in "reverse christmas
>> tree" ordering. Assign to the local variables after the block if
>> necessary. Something like:
>>
>> struct port *port = netdev_priv(dev);
>> struct npe *npe = port->npe;
>> int framesize, chunks;
>> struct msg msg;
>>
>> /* adjust for ethernet headers */
>> framesize = new_mtu + VLAN_ETH_HLEN;
>> /* max rx/tx 64 byte chunks */
>> chunks = DIV_ROUND_UP(framesize, 64);
>
> Right, I fix!
>
>>> + memset(&msg, 0, sizeof(msg));
>>
>> You could also use "struct msg msg = {}" instead of memset here.
>
> OK
>
>>> + msg.cmd = NPE_SETMAXFRAMELENGTHS;
>>> + msg.eth_id = port->id;
>>> +
>>> + /* Firmware wants to know buffer size in 64 byte chunks */
>>> + msg.byte2 = chunks << 8;
>>> + msg.byte3 = chunks << 8;
>>
>> I am not sure I follow the "<< 8" here.
>
> I actually only have this vendor code, but clearly <<8 is not
> "multiply by 256" in this case, rather that the number of 64 byte
> chunks is in the second byte.
>
> The software manual just describes a "OS abstraction layer"
> used by both VXworks and Linux, and since that wasn't acceptable
> in the Linux driver, someone has ripped out the code to
> talk directly to the NPE firmware, and that is what we are seeing.
> If you have the source code to the abstraction layer
> "ixEthAcc" or the source code to the NPE microcode, I think the
> answer is in there... (I have neither, maybe you can check internally,
> hehe. Dan Williams used to work with this hardware!)
>

Yea. We could use FIELD_PREP and some definitions, but I think its ok as-is.

>> Reviewed-by: Jacob Keller <[email protected]>
>
> Thanks!
>
>>> base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
>>> change-id: 20230923-ixp4xx-eth-mtu-c041d7efe932
>>
>> Curious what this change-id thing represents I've never seen it before..
>> I know base-commit is used by git. Would be interested in an explanation
>> if you happen to know! :D
>
> It's metadata generated by b4 which is the tool we use for kernel mailing
> list handling:
> https://people.kernel.org/monsieuricon/sending-a-kernel-patch-with-b4-part-1
>
> I think this change ID cross-references mails I send with my
> git branch, so it's easy to collect review tags etc.
>

Neat! thanks for explaining :) b4 seems quite excellent :D

> Yours,
> Linus Walleij