There is a crash when adding one of the lan966x interfaces under a lag
interface. The issue can be reproduced like this:
ip link add name bond0 type bond miimon 100 mode balance-xor
ip link set dev eth0 master bond0
The crash happens because, when an interface is added under the lag, the
driver goes through all the ports to figure out which other ports are under
that lag interface. On lan966x, ports that are not probed are NULL
pointers, so iterating over these ports dereferences a NULL pointer and
crashes.
The fix is to check for NULL pointers before accessing anything from the
ports, as is already done in other places.
Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
Signed-off-by: Horatiu Vultur <[email protected]>
---
drivers/net/ethernet/microchip/lan966x/lan966x_lag.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
index 41fa2523d91d3..89a2c3176f1da 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
@@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
/* Now, set PGIDs for each active LAG */
for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
- struct net_device *bond = lan966x->ports[lag]->bond;
+ struct lan966x_port *port = lan966x->ports[lag];
int num_active_ports = 0;
+ struct net_device *bond;
unsigned long bond_mask;
u8 aggr_idx[16];
- if (!bond || (visited & BIT(lag)))
+ if (!port || !port->bond || (visited & BIT(lag)))
continue;
+ bond = lan966x->ports[lag]->bond;
bond_mask = lan966x_lag_get_mask(lan966x, bond);
for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
struct lan966x_port *port = lan966x->ports[p];
+ if (!port)
+ continue;
+
lan_wr(ANA_PGID_PGID_SET(bond_mask),
lan966x, ANA_PGID(p));
if (port->lag_tx_active)
--
2.34.1
On Mon, Feb 05, 2024 at 09:07:56AM +0100, Horatiu Vultur wrote:
> There is a crash when adding one of the lan966x interfaces under a lag
> interface. The issue can be reproduced like this:
> ip link add name bond0 type bond miimon 100 mode balance-xor
> ip link set dev eth0 master bond0
>
> The crash happens because, when an interface is added under the lag, the
> driver goes through all the ports to figure out which other ports are under
> that lag interface. On lan966x, ports that are not probed are NULL
> pointers, so iterating over these ports dereferences a NULL pointer and
> crashes.
> The fix is to check for NULL pointers before accessing anything from the
> ports, as is already done in other places.
>
> Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
> Signed-off-by: Horatiu Vultur <[email protected]>
> ---
> drivers/net/ethernet/microchip/lan966x/lan966x_lag.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> index 41fa2523d91d3..89a2c3176f1da 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> @@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
>
> /* Now, set PGIDs for each active LAG */
> for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
> - struct net_device *bond = lan966x->ports[lag]->bond;
> + struct lan966x_port *port = lan966x->ports[lag];
> int num_active_ports = 0;
> + struct net_device *bond;
> unsigned long bond_mask;
> u8 aggr_idx[16];
>
> - if (!bond || (visited & BIT(lag)))
> + if (!port || !port->bond || (visited & BIT(lag)))
> continue;
>
> + bond = lan966x->ports[lag]->bond;
Why not bond = port->bond?
> bond_mask = lan966x_lag_get_mask(lan966x, bond);
>
> for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
> struct lan966x_port *port = lan966x->ports[p];
>
> + if (!port)
> + continue;
> +
> lan_wr(ANA_PGID_PGID_SET(bond_mask),
> lan966x, ANA_PGID(p));
> if (port->lag_tx_active)
> --
> 2.34.1
>
Only nit, otherwise:
Reviewed-by: Michal Swiatkowski <[email protected]>
Thanks,
Michal
The 02/05/2024 09:44, Michal Swiatkowski wrote:
Hi Michal,
>
> On Mon, Feb 05, 2024 at 09:07:56AM +0100, Horatiu Vultur wrote:
> > There is a crash when adding one of the lan966x interfaces under a lag
> > interface. The issue can be reproduced like this:
> > ip link add name bond0 type bond miimon 100 mode balance-xor
> > ip link set dev eth0 master bond0
> >
> > The crash happens because, when an interface is added under the lag, the
> > driver goes through all the ports to figure out which other ports are under
> > that lag interface. On lan966x, ports that are not probed are NULL
> > pointers, so iterating over these ports dereferences a NULL pointer and
> > crashes.
> > The fix is to check for NULL pointers before accessing anything from the
> > ports, as is already done in other places.
> >
> > Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
> > Signed-off-by: Horatiu Vultur <[email protected]>
> > ---
> > drivers/net/ethernet/microchip/lan966x/lan966x_lag.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > index 41fa2523d91d3..89a2c3176f1da 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > @@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
> >
> > /* Now, set PGIDs for each active LAG */
> > for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
> > - struct net_device *bond = lan966x->ports[lag]->bond;
> > + struct lan966x_port *port = lan966x->ports[lag];
> > int num_active_ports = 0;
> > + struct net_device *bond;
> > unsigned long bond_mask;
> > u8 aggr_idx[16];
> >
> > - if (!bond || (visited & BIT(lag)))
> > + if (!port || !port->bond || (visited & BIT(lag)))
> > continue;
> >
> > + bond = lan966x->ports[lag]->bond;
> Why not bond = port->bond?
That is also correct and clearer.
I think I just copied the line that I had removed and put it here, as it
has the same effect.
I can update this in the next version.
>
> > bond_mask = lan966x_lag_get_mask(lan966x, bond);
> >
> > for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
> > struct lan966x_port *port = lan966x->ports[p];
> >
> > + if (!port)
> > + continue;
> > +
> > lan_wr(ANA_PGID_PGID_SET(bond_mask),
> > lan966x, ANA_PGID(p));
> > if (port->lag_tx_active)
> > --
> > 2.34.1
> >
> Only nit, otherwise:
> Reviewed-by: Michal Swiatkowski <[email protected]>
>
> Thanks,
> Michal
--
/Horatiu
On Mon, Feb 05, 2024 at 10:44:34AM +0100, Horatiu Vultur wrote:
> The 02/05/2024 09:44, Michal Swiatkowski wrote:
>
> Hi Michal,
>
> >
> > On Mon, Feb 05, 2024 at 09:07:56AM +0100, Horatiu Vultur wrote:
> > > There is a crash when adding one of the lan966x interfaces under a lag
> > > interface. The issue can be reproduced like this:
> > > ip link add name bond0 type bond miimon 100 mode balance-xor
> > > ip link set dev eth0 master bond0
> > >
> > > The crash happens because, when an interface is added under the lag, the
> > > driver goes through all the ports to figure out which other ports are under
> > > that lag interface. On lan966x, ports that are not probed are NULL
> > > pointers, so iterating over these ports dereferences a NULL pointer and
> > > crashes.
> > > The fix is to check for NULL pointers before accessing anything from the
> > > ports, as is already done in other places.
> > >
> > > Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
> > > Signed-off-by: Horatiu Vultur <[email protected]>
> > > ---
> > > drivers/net/ethernet/microchip/lan966x/lan966x_lag.c | 9 +++++++--
> > > 1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > > index 41fa2523d91d3..89a2c3176f1da 100644
> > > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > > @@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
> > >
> > > /* Now, set PGIDs for each active LAG */
> > > for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
> > > - struct net_device *bond = lan966x->ports[lag]->bond;
> > > + struct lan966x_port *port = lan966x->ports[lag];
> > > int num_active_ports = 0;
> > > + struct net_device *bond;
> > > unsigned long bond_mask;
> > > u8 aggr_idx[16];
> > >
> > > - if (!bond || (visited & BIT(lag)))
> > > + if (!port || !port->bond || (visited & BIT(lag)))
> > > continue;
> > >
> > > + bond = lan966x->ports[lag]->bond;
> > Why not bond = port->bond?
>
> That is also correct and clearer.
> I think I just copied the line that I had removed and put it here, as it
> has the same effect.
> I can update this in the next version.
>
Great, thanks, feel free to add my reviewed-by tag in the next version.
Michal
> >
> > > bond_mask = lan966x_lag_get_mask(lan966x, bond);
> > >
> > > for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
> > > struct lan966x_port *port = lan966x->ports[p];
> > >
> > > + if (!port)
> > > + continue;
> > > +
> > > lan_wr(ANA_PGID_PGID_SET(bond_mask),
> > > lan966x, ANA_PGID(p));
> > > if (port->lag_tx_active)
> > > --
> > > 2.34.1
> > >
> > Only nit, otherwise:
> > Reviewed-by: Michal Swiatkowski <[email protected]>
> >
> > Thanks,
> > Michal
>
> --
> /Horatiu
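
For readers following the thread, the guard discussed above can be sketched
as a small standalone C model. This is only an illustration under stated
assumptions, not the driver code: struct sw, struct port, struct bond_dev,
NUM_PHYS_PORTS and count_bonded_ports() are hypothetical stand-ins for the
lan966x structures. It shows a sparse port array where unprobed entries are
NULL, and a loop that checks the port pointer before reading port->bond,
using the local port variable as suggested in the review.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PHYS_PORTS 4 /* hypothetical port count for this sketch */

/* Hypothetical stand-ins; these are not the real lan966x definitions. */
struct bond_dev {
	const char *name;
};

struct port {
	struct bond_dev *bond; /* NULL when the port is not under a lag */
};

struct sw {
	struct port *ports[NUM_PHYS_PORTS]; /* NULL entry == port not probed */
};

/* Count the ports that sit under some bond, skipping unprobed entries. */
static int count_bonded_ports(const struct sw *sw)
{
	int count = 0;
	int i;

	assert(sw != NULL);

	for (i = 0; i < NUM_PHYS_PORTS; i++) {
		const struct port *port = sw->ports[i];

		/* Check the port itself first; only then is port->bond safe. */
		if (!port || !port->bond)
			continue;

		count++;
	}

	return count;
}
```

With ports 0 and 2 probed, ports 1 and 3 left NULL, and only port 0 attached
to a bond, count_bonded_ports() returns 1 and never dereferences a NULL
entry. Without the !port check, the same walk would fault on the first
unprobed slot, which matches the crash described in the commit message.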