There are two cases where the current PF does not support RDMA
functionality. The first is if the NVM loaded on the device is set
to not support RDMA (func_caps.common_cap.rdma is false). The second
is if the kernel bonding driver has included the current PF in an
active link aggregate.

When the driver has determined that this PF does not support RDMA,
auxiliary devices should not be created on the auxiliary bus. Without
a device on the auxiliary bus, even if the irdma driver is present, there
will be no RDMA activity attempted on this PF.

Currently, in the reset flow, an attempt to create auxiliary devices is
performed without regard to the ability of the PF. There needs to be a
check in ice_plug_aux_dev (as the central point that creates auxiliary
devices) to see if the PF is in a state to support the functionality.

When disabling and re-enabling RDMA due to the inclusion/removal of the PF
in a link aggregate, we also need to set/clear the bit which controls
auxiliary device creation so that a reset recovery in a link aggregate
situation doesn't try to create auxiliary devices when it shouldn't.

Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Reported-by: Yongxin Liu <[email protected]>
Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
---
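For illustration only, the gating described above can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not driver code: every `model_*` and `MODEL_*` name is hypothetical, the flags are a plain bitmask rather than the kernel bitmap used by `pf->flags`, and `model_reset_rebuild()` merely stands in for a reset-recovery path that tries to re-plug unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ICE_FLAG_RDMA_ENA / ICE_FLAG_AUX_ENA. */
#define MODEL_FLAG_RDMA_ENA (1u << 0)
#define MODEL_FLAG_AUX_ENA  (1u << 1)

/* Hypothetical, heavily reduced model of struct ice_pf. */
struct model_pf {
	unsigned int flags;   /* models pf->flags */
	bool nvm_rdma_cap;    /* models func_caps.common_cap.rdma */
	int num_rdma_msix;    /* models pf->num_rdma_msix */
	bool aux_dev_present; /* true while an auxiliary device is plugged */
};

/* Models the new early exit: do nothing, successfully, if aux is disabled. */
static int model_plug_aux_dev(struct model_pf *pf)
{
	assert(pf != NULL);
	if (!(pf->flags & MODEL_FLAG_AUX_ENA))
		return 0;
	pf->aux_dev_present = true;
	return 0;
}

static void model_unplug_aux_dev(struct model_pf *pf)
{
	assert(pf != NULL);
	pf->aux_dev_present = false;
}

/* Models ice_set_rdma_cap(): both bits are set before plugging. */
static void model_set_rdma_cap(struct model_pf *pf)
{
	if (pf->nvm_rdma_cap && pf->num_rdma_msix) {
		pf->flags |= MODEL_FLAG_RDMA_ENA | MODEL_FLAG_AUX_ENA;
		model_plug_aux_dev(pf);
	}
}

/* Models ice_clear_rdma_cap(): unplug, then clear both bits. */
static void model_clear_rdma_cap(struct model_pf *pf)
{
	model_unplug_aux_dev(pf);
	pf->flags &= ~(MODEL_FLAG_RDMA_ENA | MODEL_FLAG_AUX_ENA);
}

/* Assumed shape of a reset recovery: tear down, then re-plug blindly. */
static int model_reset_rebuild(struct model_pf *pf)
{
	model_unplug_aux_dev(pf);
	return model_plug_aux_dev(pf);
}
```

In this model, calling `model_clear_rdma_cap()` (PF added to a bond) followed by `model_reset_rebuild()` leaves `aux_dev_present` false, while a PF whose capability was set again gets its device back after the same rebuild.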
drivers/net/ethernet/intel/ice/ice.h | 2 ++
drivers/net/ethernet/intel/ice/ice_idc.c | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index eadcb9958346..3c4f08d20414 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -695,6 +695,7 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
 {
 	if (pf->hw.func_caps.common_cap.rdma && pf->num_rdma_msix) {
 		set_bit(ICE_FLAG_RDMA_ENA, pf->flags);
+		set_bit(ICE_FLAG_AUX_ENA, pf->flags);
 		ice_plug_aux_dev(pf);
 	}
 }
@@ -707,5 +708,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 {
 	ice_unplug_aux_dev(pf);
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
+	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
 }
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
index 1f2afdf6cd48..adcc9a251595 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -271,6 +271,12 @@ int ice_plug_aux_dev(struct ice_pf *pf)
 	struct auxiliary_device *adev;
 	int ret;
 
+	/* if this PF doesn't support a technology that requires auxiliary
+	 * devices, then gracefully exit
+	 */
+	if (!ice_is_aux_ena(pf))
+		return 0;
+
 	iadev = kzalloc(sizeof(*iadev), GFP_KERNEL);
 	if (!iadev)
 		return -ENOMEM;
--
2.31.1
On Thu, 9 Sep 2021 01:56:12 -0700 Dave Ertman wrote:
> There are two cases where the current PF does not support RDMA
> functionality. The first is if the NVM loaded on the device is set
> to not support RDMA (common_caps.rdma is false). The second is if
> the kernel bonding driver has included the current PF in an active
> link aggregate.
>
> When the driver has determined that this PF does not support RDMA, then
> auxiliary devices should not be created on the auxiliary bus. Without
> a device on the auxiliary bus, even if the irdma driver is present, there
> will be no RDMA activity attempted on this PF.
>
> Currently, in the reset flow, an attempt to create auxiliary devices is
> performed without regard to the ability of the PF. There needs to be a
> check in ice_aux_plug_dev (as the central point that creates auxiliary
> devices) to see if the PF is in a state to support the functionality.
>
> When disabling and re-enabling RDMA due to the inclusion/removal of the PF
> in a link aggregate, we also need to set/clear the bit which controls
> auxiliary device creation so that a reset recovery in a link aggregate
> situation doesn't try to create auxiliary devices when it shouldn't.
>
> Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
> Reported-by: Yongxin Liu <[email protected]>
> Signed-off-by: Dave Ertman <[email protected]>
> Signed-off-by: Tony Nguyen <[email protected]>
Why CC lkml but not CC RDMA or Leon?
> -----Original Message-----
> From: Jakub Kicinski <[email protected]>
> Sent: Thursday, September 9, 2021 2:49 PM
> To: Ertman, David M <[email protected]>
> Cc: [email protected]; [email protected]; Saleem, Shiraz
> <[email protected]>; Nguyen, Anthony L
> <[email protected]>; [email protected]; linux-
> [email protected]; Brandeburg, Jesse <[email protected]>;
> [email protected]
> Subject: Re: [PATCH net] ice: Correctly deal with PFs that do not support
> RDMA
>
> On Thu, 9 Sep 2021 01:56:12 -0700 Dave Ertman wrote:
> > There are two cases where the current PF does not support RDMA
> > functionality. The first is if the NVM loaded on the device is set
> > to not support RDMA (common_caps.rdma is false). The second is if
> > the kernel bonding driver has included the current PF in an active
> > link aggregate.
> >
> > When the driver has determined that this PF does not support RDMA, then
> > auxiliary devices should not be created on the auxiliary bus. Without
> > a device on the auxiliary bus, even if the irdma driver is present, there
> > will be no RDMA activity attempted on this PF.
> >
> > Currently, in the reset flow, an attempt to create auxiliary devices is
> > performed without regard to the ability of the PF. There needs to be a
> > check in ice_aux_plug_dev (as the central point that creates auxiliary
> > devices) to see if the PF is in a state to support the functionality.
> >
> > When disabling and re-enabling RDMA due to the inclusion/removal of the PF
> > in a link aggregate, we also need to set/clear the bit which controls
> > auxiliary device creation so that a reset recovery in a link aggregate
> > situation doesn't try to create auxiliary devices when it shouldn't.
> >
> > Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
> > Reported-by: Yongxin Liu <[email protected]>
> > Signed-off-by: Dave Ertman <[email protected]>
> > Signed-off-by: Tony Nguyen <[email protected]>
>
> Why CC lkml but not CC RDMA or Leon?
Oversight on my part - I thought I had cut-and-pasted all of the addresses
into my git send-email command. Will resend with the CC list corrected.