2021-09-23 18:14:56

by Leon Romanovsky

Subject: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

From: Leon Romanovsky <[email protected]>

devlink is a software interface that doesn't depend on any hardware
capabilities. A failure in software therefore means memory issues, wrong
parameters, programmer error, etc.

Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.

Fixes: 4ab0c6a8ffd7 ("bnxt_en: add support to enable VF-representors")
Signed-off-by: Leon Romanovsky <[email protected]>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 13 ++++++-------
drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h | 13 -------------
3 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 037767b370d5..4c483fd91dbe 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -13370,7 +13370,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
}

bnxt_inv_fw_health_reg(bp);
- bnxt_dl_register(bp);
+ rc = bnxt_dl_register(bp);
+ if (rc)
+ goto init_err_dl;

rc = register_netdev(dev);
if (rc)
@@ -13390,6 +13392,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)

init_err_cleanup:
bnxt_dl_unregister(bp);
+init_err_dl:
bnxt_shutdown_tc(bp);
bnxt_clear_int_mode(bp);

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index bf7d3c17049b..dc0851f709f5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -134,7 +134,7 @@ void bnxt_dl_fw_reporters_create(struct bnxt *bp)
{
struct bnxt_fw_health *health = bp->fw_health;

- if (!bp->dl || !health)
+ if (!health)
return;

if (!(bp->fw_cap & BNXT_FW_CAP_HOT_RESET) || health->fw_reset_reporter)
@@ -188,7 +188,7 @@ void bnxt_dl_fw_reporters_destroy(struct bnxt *bp, bool all)
{
struct bnxt_fw_health *health = bp->fw_health;

- if (!bp->dl || !health)
+ if (!health)
return;

if ((all || !(bp->fw_cap & BNXT_FW_CAP_HOT_RESET)) &&
@@ -781,6 +781,7 @@ int bnxt_dl_register(struct bnxt *bp)
{
const struct devlink_ops *devlink_ops;
struct devlink_port_attrs attrs = {};
+ struct bnxt_dl *bp_dl;
struct devlink *dl;
int rc;

@@ -795,7 +796,9 @@ int bnxt_dl_register(struct bnxt *bp)
return -ENOMEM;
}

- bnxt_link_bp_to_dl(bp, dl);
+ bp->dl = dl;
+ bp_dl = devlink_priv(dl);
+ bp_dl->bp = bp;

/* Add switchdev eswitch mode setting, if SRIOV supported */
if (pci_find_ext_capability(bp->pdev, PCI_EXT_CAP_ID_SRIOV) &&
@@ -826,7 +829,6 @@ int bnxt_dl_register(struct bnxt *bp)
err_dl_port_unreg:
devlink_port_unregister(&bp->dl_port);
err_dl_free:
- bnxt_link_bp_to_dl(bp, NULL);
devlink_free(dl);
return rc;
}
@@ -835,9 +837,6 @@ void bnxt_dl_unregister(struct bnxt *bp)
{
struct devlink *dl = bp->dl;

- if (!dl)
- return;
-
if (BNXT_PF(bp)) {
bnxt_dl_params_unregister(bp);
devlink_port_unregister(&bp->dl_port);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
index d889f240da2b..406dc655a5fc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
@@ -20,19 +20,6 @@ static inline struct bnxt *bnxt_get_bp_from_dl(struct devlink *dl)
return ((struct bnxt_dl *)devlink_priv(dl))->bp;
}

-/* To clear devlink pointer from bp, pass NULL dl */
-static inline void bnxt_link_bp_to_dl(struct bnxt *bp, struct devlink *dl)
-{
- bp->dl = dl;
-
- /* add a back pointer in dl to bp */
- if (dl) {
- struct bnxt_dl *bp_dl = devlink_priv(dl);
-
- bp_dl->bp = bp;
- }
-}
-
#define NVM_OFF_MSIX_VEC_PER_PF_MAX 108
#define NVM_OFF_MSIX_VEC_PER_PF_MIN 114
#define NVM_OFF_IGNORE_ARI 164
--
2.31.1


2021-09-23 21:15:21

by Edwin Peer

Subject: Re: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

On Thu, Sep 23, 2021 at 11:13 AM Leon Romanovsky <[email protected]> wrote:
>
> From: Leon Romanovsky <[email protected]>
>
> devlink is a software interface that doesn't depend on any hardware
> capabilities. A failure in software therefore means memory issues, wrong
> parameters, programmer error, etc.
>
> Like any other such interface in the kernel, the returned status of
> devlink APIs should be checked and propagated further and not ignored.
>
> Fixes: 4ab0c6a8ffd7 ("bnxt_en: add support to enable VF-representors")
> Signed-off-by: Leon Romanovsky <[email protected]>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
> drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 13 ++++++-------
> drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h | 13 -------------
> 3 files changed, 10 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 037767b370d5..4c483fd91dbe 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -13370,7 +13370,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> }
>
> bnxt_inv_fw_health_reg(bp);
> - bnxt_dl_register(bp);
> + rc = bnxt_dl_register(bp);
> + if (rc)
> + goto init_err_dl;
>
> rc = register_netdev(dev);
> if (rc)
> @@ -13390,6 +13392,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> init_err_cleanup:
> bnxt_dl_unregister(bp);
> +init_err_dl:
> bnxt_shutdown_tc(bp);
> bnxt_clear_int_mode(bp);
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> index bf7d3c17049b..dc0851f709f5 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> @@ -134,7 +134,7 @@ void bnxt_dl_fw_reporters_create(struct bnxt *bp)
> {
> struct bnxt_fw_health *health = bp->fw_health;
>
> - if (!bp->dl || !health)
> + if (!health)
> return;
>
> if (!(bp->fw_cap & BNXT_FW_CAP_HOT_RESET) || health->fw_reset_reporter)
> @@ -188,7 +188,7 @@ void bnxt_dl_fw_reporters_destroy(struct bnxt *bp, bool all)
> {
> struct bnxt_fw_health *health = bp->fw_health;
>
> - if (!bp->dl || !health)
> + if (!health)
> return;
>
> if ((all || !(bp->fw_cap & BNXT_FW_CAP_HOT_RESET)) &&
> @@ -781,6 +781,7 @@ int bnxt_dl_register(struct bnxt *bp)
> {
> const struct devlink_ops *devlink_ops;
> struct devlink_port_attrs attrs = {};
> + struct bnxt_dl *bp_dl;
> struct devlink *dl;
> int rc;
>
> @@ -795,7 +796,9 @@ int bnxt_dl_register(struct bnxt *bp)
> return -ENOMEM;
> }
>
> - bnxt_link_bp_to_dl(bp, dl);
> + bp->dl = dl;
> + bp_dl = devlink_priv(dl);
> + bp_dl->bp = bp;
>
> /* Add switchdev eswitch mode setting, if SRIOV supported */
> if (pci_find_ext_capability(bp->pdev, PCI_EXT_CAP_ID_SRIOV) &&
> @@ -826,7 +829,6 @@ int bnxt_dl_register(struct bnxt *bp)
> err_dl_port_unreg:
> devlink_port_unregister(&bp->dl_port);
> err_dl_free:
> - bnxt_link_bp_to_dl(bp, NULL);
> devlink_free(dl);
> return rc;
> }
> @@ -835,9 +837,6 @@ void bnxt_dl_unregister(struct bnxt *bp)
> {
> struct devlink *dl = bp->dl;
>
> - if (!dl)
> - return;
> -

minor nit: There's obviously nothing incorrect about doing this (and
adding the additional error label in the cleanup code above), but bnxt
has generally adopted a style of making cleanup functions idempotent.
That generally makes error handling simpler and less error-prone.

> if (BNXT_PF(bp)) {
> bnxt_dl_params_unregister(bp);
> devlink_port_unregister(&bp->dl_port);
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
> index d889f240da2b..406dc655a5fc 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
> @@ -20,19 +20,6 @@ static inline struct bnxt *bnxt_get_bp_from_dl(struct devlink *dl)
> return ((struct bnxt_dl *)devlink_priv(dl))->bp;
> }
>
> -/* To clear devlink pointer from bp, pass NULL dl */
> -static inline void bnxt_link_bp_to_dl(struct bnxt *bp, struct devlink *dl)
> -{
> - bp->dl = dl;
> -
> - /* add a back pointer in dl to bp */
> - if (dl) {
> - struct bnxt_dl *bp_dl = devlink_priv(dl);
> -
> - bp_dl->bp = bp;
> - }
> -}
> -
> #define NVM_OFF_MSIX_VEC_PER_PF_MAX 108
> #define NVM_OFF_MSIX_VEC_PER_PF_MIN 114
> #define NVM_OFF_IGNORE_ARI 164
> --
> 2.31.1
>

Reviewed-by: Edwin Peer <[email protected]>

Regards,
Edwin Peer

2021-09-23 23:13:03

by Leon Romanovsky

Subject: Re: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

On Thu, Sep 23, 2021 at 02:11:40PM -0700, Edwin Peer wrote:
> On Thu, Sep 23, 2021 at 11:13 AM Leon Romanovsky <[email protected]> wrote:
> >
> > From: Leon Romanovsky <[email protected]>
> >
> > devlink is a software interface that doesn't depend on any hardware
> > capabilities. A failure in software therefore means memory issues, wrong
> > parameters, programmer error, etc.
> >
> > Like any other such interface in the kernel, the returned status of
> > devlink APIs should be checked and propagated further and not ignored.
> >
> > Fixes: 4ab0c6a8ffd7 ("bnxt_en: add support to enable VF-representors")
> > Signed-off-by: Leon Romanovsky <[email protected]>
> > ---
> > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
> > drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 13 ++++++-------
> > drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h | 13 -------------
> > 3 files changed, 10 insertions(+), 21 deletions(-)

<...>

> > @@ -835,9 +837,6 @@ void bnxt_dl_unregister(struct bnxt *bp)
> > {
> > struct devlink *dl = bp->dl;
> >
> > - if (!dl)
> > - return;
> > -
>
> minor nit: There's obviously nothing incorrect about doing this (and
> adding the additional error label in the cleanup code above), but bnxt
> has generally adopted a style of making cleanup functions idempotent.
> That generally makes error handling simpler and less error-prone.

I would argue that the opposite is true. Such "impossible" checks hide
unwind flow errors, missing releases, etc.
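
To make the point concrete, here is a minimal userspace sketch of the explicit unwind form the patch moves toward. It is not bnxt code: every demo_* name is hypothetical, malloc() merely stands in for devlink allocation, and the fail flags exist only to force the error paths. Each label undoes exactly the steps that succeeded, and the release function has no NULL check, so a wrong unwind order trips the assert instead of passing silently.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device with two init steps performed in order. */
struct demo_dev {
	int *dl;        /* stands in for the devlink instance */
	int registered; /* stands in for the netdev registration state */
};

/* Step 1: allocate; 'fail' forces the error path for illustration. */
static int demo_dl_register(struct demo_dev *d, int fail)
{
	if (fail)
		return -ENOMEM;
	d->dl = malloc(sizeof(*d->dl));
	return d->dl ? 0 : -ENOMEM;
}

/* No "if (!d->dl) return;" here: calling this without a prior
 * successful register is a caller bug, and the assert exposes it.
 */
static void demo_dl_unregister(struct demo_dev *d)
{
	assert(d->dl);
	free(d->dl);
	d->dl = NULL;
}

/* Step 2: 'fail' forces the error path for illustration. */
static int demo_netdev_register(struct demo_dev *d, int fail)
{
	if (fail)
		return -EIO;
	d->registered = 1;
	return 0;
}

static int demo_init_one(struct demo_dev *d, int fail_dl, int fail_netdev)
{
	int rc;

	rc = demo_dl_register(d, fail_dl);
	if (rc)
		goto err_dl;      /* nothing to undo yet */

	rc = demo_netdev_register(d, fail_netdev);
	if (rc)
		goto err_cleanup; /* undo step 1 only */

	return 0;

err_cleanup:
	demo_dl_unregister(d);
err_dl:
	return rc;
}
```

Calling demo_init_one(&d, 1, 0) on a zeroed struct returns -ENOMEM without touching the release path, while demo_init_one(&d, 0, 1) returns -EIO after releasing the allocation exactly once.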

<...>

> >
>
> Reviewed-by: Edwin Peer <[email protected]>

Thanks for the review.


>
> Regards,
> Edwin Peer

2021-09-24 01:41:35

by Jakub Kicinski

Subject: Re: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

On Fri, 24 Sep 2021 02:11:19 +0300 Leon Romanovsky wrote:
> > > @@ -835,9 +837,6 @@ void bnxt_dl_unregister(struct bnxt *bp)
> > > {
> > > struct devlink *dl = bp->dl;
> > >
> > > - if (!dl)
> > > - return;
> > > -
> >
> > minor nit: There's obviously nothing incorrect about doing this (and
> > adding the additional error label in the cleanup code above), but bnxt
> > has generally adopted a style of making cleanup functions idempotent.
> > That generally makes error handling simpler and less error-prone.
>
> I would argue that the opposite is true. Such "impossible" checks hide
> unwind flow errors, missing releases, etc.

+1, fwiw

2021-09-25 03:31:29

by Edwin Peer

Subject: Re: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

On Thu, Sep 23, 2021 at 6:39 PM Jakub Kicinski <[email protected]> wrote:

> On Fri, 24 Sep 2021 02:11:19 +0300 Leon Romanovsky wrote:
> > > minor nit: There's obviously nothing incorrect about doing this (and
> > > adding the additional error label in the cleanup code above), but bnxt
> > > has generally adopted a style of making cleanup functions idempotent.
> > > That generally makes error handling simpler and less error-prone.
> >
> > I would argue that the opposite is true. Such "impossible" checks hide
> > unwind flow errors, missing releases, etc.
>
> +1, fwiw

I appreciate that being more explicit can improve visibility, but it
does not make error handling inherently less error-prone, nor is it
simpler (i.e. the opposite isn't true). Idempotency is orthogonal to
unwind flow and to the presence or absence of a particular unwind
handler (one can still enforce either in review). But if release
handlers are independent (most in bnxt are), then permitting other
orderings can be perfectly valid and places less burden on achieving
the canonical form for correctness (i.e. usage is simpler and less
error-prone). That's not to say we should throw caution to the wind
and allow arbitrary unwind flows, but it does mean certain mistakes
don't result in actual bugs. There are other flexibility benefits too:
a single unwind-everything handler can be reused in more than one
context.
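
As a rough illustration of the idempotent style described above, consider the following userspace sketch. It is not taken from bnxt: the demo_* names are hypothetical, malloc() stands in for real resource acquisition, and fail_port exists only to force the error path. Each release helper checks and clears its pointer, so one teardown function can serve every error path and may safely be called more than once.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context holding two independent resources. */
struct demo_ctx {
	int *dl;
	int *port;
};

/* Idempotent release: a no-op when nothing is held. */
static void demo_free_dl(struct demo_ctx *c)
{
	if (!c->dl)
		return;
	free(c->dl);
	c->dl = NULL; /* clearing the pointer makes a second call harmless */
}

static void demo_free_port(struct demo_ctx *c)
{
	if (!c->port)
		return;
	free(c->port);
	c->port = NULL;
}

/* Single unwind-everything handler, usable from any context. */
static void demo_teardown(struct demo_ctx *c)
{
	demo_free_port(c);
	demo_free_dl(c);
}

/* 'fail_port' forces the second step to fail for illustration. */
static int demo_setup(struct demo_ctx *c, int fail_port)
{
	/* Setup expects a clean context. */
	assert(!c->dl && !c->port);

	c->dl = malloc(sizeof(*c->dl));
	if (!c->dl)
		goto err;

	if (fail_port)
		goto err;
	c->port = malloc(sizeof(*c->port));
	if (!c->port)
		goto err;

	return 0;

err:
	/* One label suffices because teardown tolerates partial state. */
	demo_teardown(c);
	return -1;
}
```

The trade-off debated in this thread is visible here: a caller that invokes demo_teardown() twice, or before setup, sees no failure, which is convenient but also means such a mistake goes unnoticed.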

That said, isn't the more important question what style and
assumptions the surrounding code has adopted? In this particular case,
I checked that this change wouldn't introduce the possibility of a
double unwind, but in other contexts in this driver code base,
changing error handling in this piecemeal way might actually introduce
a bug in contexts where the caller has assumed the overall function is
idempotent. Isn't local consistency of style a more important concern,
especially given that you are not predominantly responsible for
maintenance of this driver? Dealing with this exception to the norm in
our driver certainly places an additional burden on us to remember to
treat this particular case with special care. We should either rework
all of bnxt error handling to adopt the more accepted canonical form,
or we should adopt the surrounding conventions. What we shouldn't do
is mix approaches in one driver.

Regards,
Edwin Peer

2021-09-25 10:03:36

by Leon Romanovsky

Subject: Re: [PATCH net-next 1/6] bnxt_en: Check devlink allocation and registration status

On Fri, Sep 24, 2021 at 10:20:32AM -0700, Edwin Peer wrote:
> On Thu, Sep 23, 2021 at 6:39 PM Jakub Kicinski <[email protected]> wrote:
>
> > On Fri, 24 Sep 2021 02:11:19 +0300 Leon Romanovsky wrote:
> > > > minor nit: There's obviously nothing incorrect about doing this (and
> > > > adding the additional error label in the cleanup code above), but bnxt
> > > > has generally adopted a style of making cleanup functions idempotent.
> > > > That generally makes error handling simpler and less error-prone.
> > >
> > > I would argue that the opposite is true. Such "impossible" checks hide
> > > unwind flow errors, missing releases, etc.
> >
> > +1, fwiw
>
> I appreciate that being more explicit can improve visibility, but it
> does not make error handling inherently less error-prone, nor is it
> simpler (i.e. the opposite isn't true). Idempotency is orthogonal to
> unwind flow and to the presence or absence of a particular unwind
> handler (one can still enforce either in review). But if release
> handlers are independent (most in bnxt are), then permitting other
> orderings can be perfectly valid and places less burden on achieving
> the canonical form for correctness (i.e. usage is simpler and less
> error-prone). That's not to say we should throw caution to the wind
> and allow arbitrary unwind flows, but it does mean certain mistakes
> don't result in actual bugs. There are other flexibility benefits too:
> a single unwind-everything handler can be reused in more than one
> context.

And this is where the fun begins. A different context means different
lifetime expectations, possibly a need for locking, and unpredictable
flows from the reader's perspective.

For example, in this devlink case, it took me time to check the whole
driver to see that pf can't be null.

Adding code that might be used someday can be seen as an
anti-pattern.

Thanks