2024-05-09 11:00:33

by Dan Carpenter

Subject: [PATCH net] net/mlx5: Fix error handling in mlx5_init_one_light()

If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
devl_unlock() is a bug. It's not registered and it's not locked. That
will trigger a stack trace in this case because devl_unregister() checks
both those things at the start of the function.

If mlx5_devlink_params_register() fails then this code will call
devl_unregister() and devl_unlock() twice which will again lead to a
stack trace or possibly something worse as well.

Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
Signed-off-by: Dan Carpenter <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 331ce47f51a1..105c98160327 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_query_hca_caps_light(dev);
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
-		goto query_hca_caps_err;
+		goto err_function_disable;
 	}

 	devl_lock(devlink);
@@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_devlink_params_register(priv_to_devlink(dev));
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
-		goto params_reg_err;
+		goto err_unregister;
 	}

 	devl_unlock(devlink);
 	return 0;

-params_reg_err:
-	devl_unregister(devlink);
-	devl_unlock(devlink);
-query_hca_caps_err:
+err_unregister:
 	devl_unregister(devlink);
 	devl_unlock(devlink);
+err_function_disable:
 	mlx5_function_disable(dev, true);
 out:
 	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
--
2.43.0
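
For readers following the thread, the unwind ordering that the patch describes can be sketched as a small standalone C program. This is only an illustration under stated assumptions, not the driver code: `struct fake_dev`, `enum fail_point`, the `fake_*` helpers and `init_one_light_sketch()` are hypothetical stand-ins, and the assertions merely model the commit message's statement that devl_unregister() checks for "locked" and "registered" on entry. The step before the capability query is also an assumption; the patch only shows that mlx5_function_disable() has to run on both failure paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the error path. */
struct fake_dev {
	bool enabled;
	bool locked;
	bool registered;
	int unregister_calls;
	int unlock_calls;
	int disable_calls;
};

/* Which step should fail, so each unwind path can be exercised. */
enum fail_point { FAIL_NONE, FAIL_QUERY_CAPS, FAIL_PARAMS_REGISTER };

static void fake_unregister(struct fake_dev *d)
{
	/* Models the entry checks the commit message attributes to
	 * devl_unregister(): the lock is held and the instance is registered.
	 */
	assert(d->locked);
	assert(d->registered);
	d->registered = false;
	d->unregister_calls++;
}

static void fake_unlock(struct fake_dev *d)
{
	assert(d->locked);
	d->locked = false;
	d->unlock_calls++;
}

static void fake_function_disable(struct fake_dev *d)
{
	d->enabled = false;
	d->disable_calls++;
}

int init_one_light_sketch(struct fake_dev *d, enum fail_point fail)
{
	int err;

	/* Assumed setup step that the final label has to undo. */
	d->enabled = true;

	/* Stand-in for the capability query; nothing is locked yet. */
	err = (fail == FAIL_QUERY_CAPS) ? -1 : 0;
	if (err)
		goto err_function_disable;

	d->locked = true;
	d->registered = true;

	/* Stand-in for the devlink params registration. */
	err = (fail == FAIL_PARAMS_REGISTER) ? -1 : 0;
	if (err)
		goto err_unregister;

	fake_unlock(d);
	return 0;

	/* Labels unwind in reverse order of setup, each step exactly once. */
err_unregister:
	fake_unregister(d);
	fake_unlock(d);
err_function_disable:
	fake_function_disable(d);
	return err;
}
```

With the labels in this order, a failure before the lock is taken skips the unregister/unlock pair entirely, and a failure after it runs that pair once before falling through to the disable step. With the pre-patch layout, the first case would trip the assertions in `fake_unregister()` and the second would trip them on the repeated call.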



2024-05-10 06:45:05

by Larysa Zaremba

Subject: Re: [PATCH net] net/mlx5: Fix error handling in mlx5_init_one_light()

On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug. It's not registered and it's not locked. That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
>
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
>
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")

Reviewed-by: Larysa Zaremba <[email protected]>

> Signed-off-by: Dan Carpenter <[email protected]>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 331ce47f51a1..105c98160327 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_query_hca_caps_light(dev);
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
> -		goto query_hca_caps_err;
> +		goto err_function_disable;
>  	}
>
>  	devl_lock(devlink);
> @@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_devlink_params_register(priv_to_devlink(dev));
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
> -		goto params_reg_err;
> +		goto err_unregister;
>  	}
>
>  	devl_unlock(devlink);
>  	return 0;
>
> -params_reg_err:
> -	devl_unregister(devlink);
> -	devl_unlock(devlink);
> -query_hca_caps_err:
> +err_unregister:
>  	devl_unregister(devlink);
>  	devl_unlock(devlink);
> +err_function_disable:
>  	mlx5_function_disable(dev, true);
>  out:
>  	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
> --
> 2.43.0
>
>

2024-05-11 14:23:27

by Simon Horman

Subject: Re: [PATCH net] net/mlx5: Fix error handling in mlx5_init_one_light()

On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug. It's not registered and it's not locked. That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
>
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
>
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> Signed-off-by: Dan Carpenter <[email protected]>

Hi Dan,

I believe that after you posted this patch, a different fix for this was
added to net as:

3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")

--
pw-bot: rejected

2024-05-12 08:21:04

by Dan Carpenter

Subject: Re: [PATCH net] net/mlx5: Fix error handling in mlx5_init_one_light()

On Sat, May 11, 2024 at 03:23:04PM +0100, Simon Horman wrote:
> On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> > If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> > devl_unlock() is a bug. It's not registered and it's not locked. That
> > will trigger a stack trace in this case because devl_unregister() checks
> > both those things at the start of the function.
> >
> > If mlx5_devlink_params_register() fails then this code will call
> > devl_unregister() and devl_unlock() twice which will again lead to a
> > stack trace or possibly something worse as well.
> >
> > Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> > Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> > Signed-off-by: Dan Carpenter <[email protected]>
>
> Hi Dan,
>
> I believe that after you posted this patch, a different fix for this was
> added to net as:
>
> 3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")
>

Ah good. Plus that patch has been tested.

regards,
dan carpenter