Have run into a case where bond_option_mode_set() gets called before
hw_features has been filled in, and very bad things happen when
netdev_change_features() then gets called, because the empty hw_features
wipes out almost all features. Further reading of netdev feature flag
documentation suggests drivers aren't supposed to touch wanted_features,
so this changes bond_option_mode_set() to use netdev_increment_features()
and &= ~BOND_XFRM_FEATURES on mode changes and then only calling
netdev_features_change() if there was actually a change of features. This
specifically fixes bonding on top of mlxsw interfaces, and has been
regression-tested with ixgbe interfaces. This change also simplifies the
xfrm-specific code in bond_setup() a little bit as well.
Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Thomas Davis <[email protected]>
Cc: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/bonding/bond_main.c | 10 ++++------
drivers/net/bonding/bond_options.c | 14 +++++++++++---
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 71c9677d135f..b8e0cb4f9480 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4721,15 +4721,13 @@ void bond_setup(struct net_device *bond_dev)
NETIF_F_HW_VLAN_CTAG_FILTER;
bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
-#ifdef CONFIG_XFRM_OFFLOAD
- bond_dev->hw_features |= BOND_XFRM_FEATURES;
-#endif /* CONFIG_XFRM_OFFLOAD */
bond_dev->features |= bond_dev->hw_features;
bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
#ifdef CONFIG_XFRM_OFFLOAD
- /* Disable XFRM features if this isn't an active-backup config */
- if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
- bond_dev->features &= ~BOND_XFRM_FEATURES;
+ bond_dev->hw_features |= BOND_XFRM_FEATURES;
+ /* Only enable XFRM features if this is an active-backup config */
+ if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->features |= BOND_XFRM_FEATURES;
#endif /* CONFIG_XFRM_OFFLOAD */
}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..bce34648d97d 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -748,6 +748,9 @@ const struct bond_option *bond_opt_get(unsigned int option)
static int bond_option_mode_set(struct bonding *bond,
const struct bond_opt_value *newval)
{
+ netdev_features_t features = bond->dev->features;
+ netdev_features_t mask = features & BOND_XFRM_FEATURES;
+
if (!bond_mode_uses_arp(newval->value)) {
if (bond->params.arp_interval) {
netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
@@ -769,10 +772,15 @@ static int bond_option_mode_set(struct bonding *bond,
#ifdef CONFIG_XFRM_OFFLOAD
if (newval->value == BOND_MODE_ACTIVEBACKUP)
- bond->dev->wanted_features |= BOND_XFRM_FEATURES;
+ features = netdev_increment_features(features,
+ BOND_XFRM_FEATURES, mask);
else
- bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
- netdev_change_features(bond->dev);
+ features &= ~BOND_XFRM_FEATURES;
+
+ if (bond->dev->features != features) {
+ bond->dev->features = features;
+ netdev_features_change(bond->dev);
+ }
#endif /* CONFIG_XFRM_OFFLOAD */
/* don't cache arp_validate between modes */
--
2.28.0
On Sun, 22 Nov 2020 22:17:16 -0500
Jarod Wilson <[email protected]> wrote:
> Have run into a case where bond_option_mode_set() gets called before
> hw_features has been filled in, and very bad things happen when
> netdev_change_features() then gets called, because the empty hw_features
> wipes out almost all features. Further reading of netdev feature flag
> documentation suggests drivers aren't supposed to touch wanted_features,
> so this changes bond_option_mode_set() to use netdev_increment_features()
> and &= ~BOND_XFRM_FEATURES on mode changes and then only calling
> netdev_features_change() if there was actually a change of features. This
> specifically fixes bonding on top of mlxsw interfaces, and has been
> regression-tested with ixgbe interfaces. This change also simplifies the
> xfrm-specific code in bond_setup() a little bit as well.
Hi Jarod,
the reason is not correct... The problem is not with empty ->hw_features but
with empty ->wanted_features.
During bond device creation bond_newlink() is called. It calls bond_changelink()
first and afterwards register_netdevice(). The problem is that ->wanted_features
are initialized in register_netdevice() so during bond_changlink() call
->wanted_features is 0. So...
bond_newlink()
-> bond_changelink()
-> __bond_opt_set()
-> bond_option_mode_set()
-> netdev_change_features()
-> __netdev_update_features()
features = netdev_get_wanted_features()
{ dev->features & ~dev->hw_features) | dev->wanted_features }
dev->wanted_features is here zero so the rest of the expression clears a bunch of
bits from dev->features...
In case of mlxsw it is important that NETIF_F_HW_VLAN_CTAG_FILTER bit is cleared in
bonding device because in this case vlan_add_rx_filter_info() does not call
bond_vlan_rx_add_vid() so mlxsw_sp_port_add_vid() is not called as well.
Later this causes a WARN in mlxsw_sp_inetaddr_port_vlan_event() because
instance of mlxsw_sp_port_vlan does not exist as mlxsw_sp_port_add_vid() was not
called.
Btw. it should be enough to call existing snippet in bond_option_mode_set() only
when device is already registered?
E.g.:
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..ca4913fee5a9 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -768,11 +768,13 @@ static int bond_option_mode_set(struct bonding *bond,
bond->params.tlb_dynamic_lb = 1;
#ifdef CONFIG_XFRM_OFFLOAD
- if (newval->value == BOND_MODE_ACTIVEBACKUP)
- bond->dev->wanted_features |= BOND_XFRM_FEATURES;
- else
- bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
- netdev_change_features(bond->dev);
+ if (dev->reg_state == NETREG_REGISTERED) {
+ if (newval->value == BOND_MODE_ACTIVEBACKUP)
+ bond->dev->wanted_features |= BOND_XFRM_FEATURES;
+ else
+ bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
+ netdev_change_features(bond->dev);
+ }
#endif /* CONFIG_XFRM_OFFLOAD */
Thanks,
Ivan
Hi Jarod,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on net-next/master]
[also build test ERROR on next-20201123]
[cannot apply to net/master linux/master linus/master sparc-next/master v5.10-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Jarod-Wilson/bonding-fix-feature-flag-setting-at-init-time/20201123-111956
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git f9e425e99b0756c1479042afe761073779df2a30
config: x86_64-rhel (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/6d883c4c2b01573ba9dddcb9fe109f961a8b7f10
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Jarod-Wilson/bonding-fix-feature-flag-setting-at-init-time/20201123-111956
git checkout 6d883c4c2b01573ba9dddcb9fe109f961a8b7f10
# save the attached .config to linux build tree
make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
drivers/net/bonding/bond_options.c: In function 'bond_option_mode_set':
>> drivers/net/bonding/bond_options.c:752:38: error: 'BOND_XFRM_FEATURES' undeclared (first use in this function)
752 | netdev_features_t mask = features & BOND_XFRM_FEATURES;
| ^~~~~~~~~~~~~~~~~~
drivers/net/bonding/bond_options.c:752:38: note: each undeclared identifier is reported only once for each function it appears in
drivers/net/bonding/bond_options.c:752:20: warning: unused variable 'mask' [-Wunused-variable]
752 | netdev_features_t mask = features & BOND_XFRM_FEATURES;
| ^~~~
vim +/BOND_XFRM_FEATURES +752 drivers/net/bonding/bond_options.c
747
748 static int bond_option_mode_set(struct bonding *bond,
749 const struct bond_opt_value *newval)
750 {
751 netdev_features_t features = bond->dev->features;
> 752 netdev_features_t mask = features & BOND_XFRM_FEATURES;
753
754 if (!bond_mode_uses_arp(newval->value)) {
755 if (bond->params.arp_interval) {
756 netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
757 newval->string);
758 /* disable arp monitoring */
759 bond->params.arp_interval = 0;
760 }
761
762 if (!bond->params.miimon) {
763 /* set miimon to default value */
764 bond->params.miimon = BOND_DEFAULT_MIIMON;
765 netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
766 bond->params.miimon);
767 }
768 }
769
770 if (newval->value == BOND_MODE_ALB)
771 bond->params.tlb_dynamic_lb = 1;
772
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Don't try to adjust XFRM support flags if the bond device isn't yet
registered. Bad things can currently happen when netdev_change_features()
is called without having wanted_features fully filled in yet. Basically,
this code was racing against register_netdevice() filling in
wanted_features, and when it got there first, the empty wanted_features
led to features also getting emptied out, which was definitely not the
intended behavior, so prevent that from happening.
Originally, I'd hoped to stop adjusting wanted_features at all in the
bonding driver, as it's documented as being something only the network
core should touch, but we actually do need to do this to properly update
both the features and wanted_features fields when changing the bond type,
or we get to a situation where ethtool sees:
esp-hw-offload: off [requested on]
I do think we should be using netdev_update_features instead of
netdev_change_features here though, so we only send notifiers when the
features actually changed.
v2: rework based on further testing and suggestions from ivecera
Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <[email protected]>
Suggested-by: Ivan Vecera <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Thomas Davis <[email protected]>
Cc: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/bonding/bond_main.c | 10 ++++------
drivers/net/bonding/bond_options.c | 6 +++++-
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e0880a3840d7..5fe5232cc3f3 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4746,15 +4746,13 @@ void bond_setup(struct net_device *bond_dev)
NETIF_F_HW_VLAN_CTAG_FILTER;
bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
-#ifdef CONFIG_XFRM_OFFLOAD
- bond_dev->hw_features |= BOND_XFRM_FEATURES;
-#endif /* CONFIG_XFRM_OFFLOAD */
bond_dev->features |= bond_dev->hw_features;
bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
#ifdef CONFIG_XFRM_OFFLOAD
- /* Disable XFRM features if this isn't an active-backup config */
- if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
- bond_dev->features &= ~BOND_XFRM_FEATURES;
+ bond_dev->hw_features |= BOND_XFRM_FEATURES;
+ /* Only enable XFRM features if this is an active-backup config */
+ if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->features |= BOND_XFRM_FEATURES;
#endif /* CONFIG_XFRM_OFFLOAD */
}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..19205cfac751 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -768,11 +768,15 @@ static int bond_option_mode_set(struct bonding *bond,
bond->params.tlb_dynamic_lb = 1;
#ifdef CONFIG_XFRM_OFFLOAD
+ if (bond->dev->reg_state != NETREG_REGISTERED)
+ goto noreg;
+
if (newval->value == BOND_MODE_ACTIVEBACKUP)
bond->dev->wanted_features |= BOND_XFRM_FEATURES;
else
bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
- netdev_change_features(bond->dev);
+ netdev_update_features(bond->dev);
+noreg:
#endif /* CONFIG_XFRM_OFFLOAD */
/* don't cache arp_validate between modes */
--
2.28.0
On Wed, 2 Dec 2020 12:30:53 -0500
Jarod Wilson <[email protected]> wrote:
> Don't try to adjust XFRM support flags if the bond device isn't yet
> registered. Bad things can currently happen when netdev_change_features()
> is called without having wanted_features fully filled in yet. Basically,
> this code was racing against register_netdevice() filling in
> wanted_features, and when it got there first, the empty wanted_features
> led to features also getting emptied out, which was definitely not the
> intended behavior, so prevent that from happening.
>
> Originally, I'd hoped to stop adjusting wanted_features at all in the
> bonding driver, as it's documented as being something only the network
> core should touch, but we actually do need to do this to properly update
> both the features and wanted_features fields when changing the bond type,
> or we get to a situation where ethtool sees:
>
> esp-hw-offload: off [requested on]
>
> I do think we should be using netdev_update_features instead of
> netdev_change_features here though, so we only send notifiers when the
> features actually changed.
>
> v2: rework based on further testing and suggestions from ivecera
>
> Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
> Reported-by: Ivan Vecera <[email protected]>
> Suggested-by: Ivan Vecera <[email protected]>
> Cc: Jay Vosburgh <[email protected]>
> Cc: Veaceslav Falico <[email protected]>
> Cc: Andy Gospodarek <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Thomas Davis <[email protected]>
> Cc: [email protected]
> Signed-off-by: Jarod Wilson <[email protected]>
> ---
> drivers/net/bonding/bond_main.c | 10 ++++------
> drivers/net/bonding/bond_options.c | 6 +++++-
> 2 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index e0880a3840d7..5fe5232cc3f3 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4746,15 +4746,13 @@ void bond_setup(struct net_device *bond_dev)
> NETIF_F_HW_VLAN_CTAG_FILTER;
>
> bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
> -#ifdef CONFIG_XFRM_OFFLOAD
> - bond_dev->hw_features |= BOND_XFRM_FEATURES;
> -#endif /* CONFIG_XFRM_OFFLOAD */
> bond_dev->features |= bond_dev->hw_features;
> bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> #ifdef CONFIG_XFRM_OFFLOAD
> - /* Disable XFRM features if this isn't an active-backup config */
> - if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
> - bond_dev->features &= ~BOND_XFRM_FEATURES;
> + bond_dev->hw_features |= BOND_XFRM_FEATURES;
> + /* Only enable XFRM features if this is an active-backup config */
> + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->features |= BOND_XFRM_FEATURES;
> #endif /* CONFIG_XFRM_OFFLOAD */
> }
>
> diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> index 9abfaae1c6f7..19205cfac751 100644
> --- a/drivers/net/bonding/bond_options.c
> +++ b/drivers/net/bonding/bond_options.c
> @@ -768,11 +768,15 @@ static int bond_option_mode_set(struct bonding *bond,
> bond->params.tlb_dynamic_lb = 1;
>
> #ifdef CONFIG_XFRM_OFFLOAD
> + if (bond->dev->reg_state != NETREG_REGISTERED)
> + goto noreg;
> +
> if (newval->value == BOND_MODE_ACTIVEBACKUP)
> bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> else
> bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> - netdev_change_features(bond->dev);
> + netdev_update_features(bond->dev);
> +noreg:
> #endif /* CONFIG_XFRM_OFFLOAD */
>
> /* don't cache arp_validate between modes */
Tested-by: Ivan Vecera <[email protected]>
On Wed, 2 Dec 2020 12:30:53 -0500 Jarod Wilson wrote:
> + if (bond->dev->reg_state != NETREG_REGISTERED)
> + goto noreg;
> +
> if (newval->value == BOND_MODE_ACTIVEBACKUP)
> bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> else
> bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> - netdev_change_features(bond->dev);
> + netdev_update_features(bond->dev);
> +noreg:
Why the goto?
Jarod Wilson <[email protected]> wrote:
>Don't try to adjust XFRM support flags if the bond device isn't yet
>registered. Bad things can currently happen when netdev_change_features()
>is called without having wanted_features fully filled in yet. Basically,
>this code was racing against register_netdevice() filling in
>wanted_features, and when it got there first, the empty wanted_features
>led to features also getting emptied out, which was definitely not the
>intended behavior, so prevent that from happening.
Is this an actual race? Reading Ivan's prior message, it sounds
like it's an ordering problem (in that bond_newlink calls
register_netdevice after bond_changelink).
The change to bond_option_mode_set tests against reg_state, so
presumably it wants to skip the first(?) time through, before the
register_netdevice call; is that right?
-J
>Originally, I'd hoped to stop adjusting wanted_features at all in the
>bonding driver, as it's documented as being something only the network
>core should touch, but we actually do need to do this to properly update
>both the features and wanted_features fields when changing the bond type,
>or we get to a situation where ethtool sees:
>
> esp-hw-offload: off [requested on]
>
>I do think we should be using netdev_update_features instead of
>netdev_change_features here though, so we only send notifiers when the
>features actually changed.
>
>v2: rework based on further testing and suggestions from ivecera
>
>Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
>Reported-by: Ivan Vecera <[email protected]>
>Suggested-by: Ivan Vecera <[email protected]>
>Cc: Jay Vosburgh <[email protected]>
>Cc: Veaceslav Falico <[email protected]>
>Cc: Andy Gospodarek <[email protected]>
>Cc: "David S. Miller" <[email protected]>
>Cc: Jakub Kicinski <[email protected]>
>Cc: Thomas Davis <[email protected]>
>Cc: [email protected]
>Signed-off-by: Jarod Wilson <[email protected]>
>---
> drivers/net/bonding/bond_main.c | 10 ++++------
> drivers/net/bonding/bond_options.c | 6 +++++-
> 2 files changed, 9 insertions(+), 7 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index e0880a3840d7..5fe5232cc3f3 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -4746,15 +4746,13 @@ void bond_setup(struct net_device *bond_dev)
> NETIF_F_HW_VLAN_CTAG_FILTER;
>
> bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
>-#ifdef CONFIG_XFRM_OFFLOAD
>- bond_dev->hw_features |= BOND_XFRM_FEATURES;
>-#endif /* CONFIG_XFRM_OFFLOAD */
> bond_dev->features |= bond_dev->hw_features;
> bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> #ifdef CONFIG_XFRM_OFFLOAD
>- /* Disable XFRM features if this isn't an active-backup config */
>- if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
>- bond_dev->features &= ~BOND_XFRM_FEATURES;
>+ bond_dev->hw_features |= BOND_XFRM_FEATURES;
>+ /* Only enable XFRM features if this is an active-backup config */
>+ if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
>+ bond_dev->features |= BOND_XFRM_FEATURES;
> #endif /* CONFIG_XFRM_OFFLOAD */
> }
>
>diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
>index 9abfaae1c6f7..19205cfac751 100644
>--- a/drivers/net/bonding/bond_options.c
>+++ b/drivers/net/bonding/bond_options.c
>@@ -768,11 +768,15 @@ static int bond_option_mode_set(struct bonding *bond,
> bond->params.tlb_dynamic_lb = 1;
>
> #ifdef CONFIG_XFRM_OFFLOAD
>+ if (bond->dev->reg_state != NETREG_REGISTERED)
>+ goto noreg;
>+
> if (newval->value == BOND_MODE_ACTIVEBACKUP)
> bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> else
> bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
>- netdev_change_features(bond->dev);
>+ netdev_update_features(bond->dev);
>+noreg:
>
> #endif /* CONFIG_XFRM_OFFLOAD */
>
> /* don't cache arp_validate between modes */
>--
>2.28.0
>
---
-Jay Vosburgh, [email protected]
On Wed, Dec 2, 2020 at 12:53 PM Jakub Kicinski <[email protected]> wrote:
>
> On Wed, 2 Dec 2020 12:30:53 -0500 Jarod Wilson wrote:
> > + if (bond->dev->reg_state != NETREG_REGISTERED)
> > + goto noreg;
> > +
> > if (newval->value == BOND_MODE_ACTIVEBACKUP)
> > bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> > else
> > bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> > - netdev_change_features(bond->dev);
> > + netdev_update_features(bond->dev);
> > +noreg:
>
> Why the goto?
Seemed cleaner to prevent an extra level of indentation of the code
following the goto and before the label, but I'm not that attached to
it if it's not wanted for coding style reasons.
--
Jarod Wilson
[email protected]
On Wed, 2 Dec 2020 14:03:53 -0500 Jarod Wilson wrote:
> On Wed, Dec 2, 2020 at 12:53 PM Jakub Kicinski <[email protected]> wrote:
> >
> > On Wed, 2 Dec 2020 12:30:53 -0500 Jarod Wilson wrote:
> > > + if (bond->dev->reg_state != NETREG_REGISTERED)
> > > + goto noreg;
> > > +
> > > if (newval->value == BOND_MODE_ACTIVEBACKUP)
> > > bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> > > else
> > > bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> > > - netdev_change_features(bond->dev);
> > > + netdev_update_features(bond->dev);
> > > +noreg:
> >
> > Why the goto?
>
> Seemed cleaner to prevent an extra level of indentation of the code
> following the goto and before the label, but I'm not that attached to
> it if it's not wanted for coding style reasons.
Yes, please don't use gotos where a normal if statement is sufficient.
If you must avoid the indentation move the code to a helper.
Also - this patch did not apply to net, please make sure you're
developing on the correct base.
On Wed, Dec 2, 2020 at 12:55 PM Jay Vosburgh <[email protected]> wrote:
>
> Jarod Wilson <[email protected]> wrote:
>
> >Don't try to adjust XFRM support flags if the bond device isn't yet
> >registered. Bad things can currently happen when netdev_change_features()
> >is called without having wanted_features fully filled in yet. Basically,
> >this code was racing against register_netdevice() filling in
> >wanted_features, and when it got there first, the empty wanted_features
> >led to features also getting emptied out, which was definitely not the
> >intended behavior, so prevent that from happening.
>
> Is this an actual race? Reading Ivan's prior message, it sounds
> like it's an ordering problem (in that bond_newlink calls
> register_netdevice after bond_changelink).
Sorry, yeah, this is not actually a race condition, just an ordering
issue, bond_check_params() gets called at init time, which leads to
bond_option_mode_set() being called, and does so prior to
bond_create() running, which is where we actually call
register_netdevice().
> The change to bond_option_mode_set tests against reg_state, so
> presumably it wants to skip the first(?) time through, before the
> register_netdevice call; is that right?
Correct. Later on, when the bonding driver is already loaded, and
parameter changes are made, bond_option_mode_set() gets called and if
the mode changes to or from active-backup, we do need/want this code
to run to update wanted and features flags properly.
--
Jarod Wilson
[email protected]
On Wed, Dec 2, 2020 at 2:23 PM Jakub Kicinski <[email protected]> wrote:
>
> On Wed, 2 Dec 2020 14:03:53 -0500 Jarod Wilson wrote:
> > On Wed, Dec 2, 2020 at 12:53 PM Jakub Kicinski <[email protected]> wrote:
> > >
> > > On Wed, 2 Dec 2020 12:30:53 -0500 Jarod Wilson wrote:
> > > > + if (bond->dev->reg_state != NETREG_REGISTERED)
> > > > + goto noreg;
> > > > +
> > > > if (newval->value == BOND_MODE_ACTIVEBACKUP)
> > > > bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> > > > else
> > > > bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> > > > - netdev_change_features(bond->dev);
> > > > + netdev_update_features(bond->dev);
> > > > +noreg:
> > >
> > > Why the goto?
> >
> > Seemed cleaner to prevent an extra level of indentation of the code
> > following the goto and before the label, but I'm not that attached to
> > it if it's not wanted for coding style reasons.
>
> Yes, please don't use gotos where a normal if statement is sufficient.
> If you must avoid the indentation move the code to a helper.
>
> Also - this patch did not apply to net, please make sure you're
> developing on the correct base.
Argh, I must have been working in net-next instead of net, apologies.
Okay, I'll clarify the description per what Jay pointed out and adjust
the code to not include a goto, then make it on the right branch.
--
Jarod Wilson
[email protected]
Jarod Wilson <[email protected]> wrote:
>On Wed, Dec 2, 2020 at 12:55 PM Jay Vosburgh <[email protected]> wrote:
>>
>> Jarod Wilson <[email protected]> wrote:
>>
>> >Don't try to adjust XFRM support flags if the bond device isn't yet
>> >registered. Bad things can currently happen when netdev_change_features()
>> >is called without having wanted_features fully filled in yet. Basically,
>> >this code was racing against register_netdevice() filling in
>> >wanted_features, and when it got there first, the empty wanted_features
>> >led to features also getting emptied out, which was definitely not the
>> >intended behavior, so prevent that from happening.
>>
>> Is this an actual race? Reading Ivan's prior message, it sounds
>> like it's an ordering problem (in that bond_newlink calls
>> register_netdevice after bond_changelink).
>
>Sorry, yeah, this is not actually a race condition, just an ordering
>issue, bond_check_params() gets called at init time, which leads to
>bond_option_mode_set() being called, and does so prior to
>bond_create() running, which is where we actually call
>register_netdevice().
So this only happens if there's a "mode" module parameter? That
doesn't sound like the call path that Ivan described (coming in via
bond_newlink).
-J
>> The change to bond_option_mode_set tests against reg_state, so
>> presumably it wants to skip the first(?) time through, before the
>> register_netdevice call; is that right?
>
>Correct. Later on, when the bonding driver is already loaded, and
>parameter changes are made, bond_option_mode_set() gets called and if
>the mode changes to or from active-backup, we do need/want this code
>to run to update wanted and features flags properly.
>
>
>--
>Jarod Wilson
>[email protected]
---
-Jay Vosburgh, [email protected]
On Wed, Dec 2, 2020 at 3:17 PM Jay Vosburgh <[email protected]> wrote:
>
> Jarod Wilson <[email protected]> wrote:
>
> >On Wed, Dec 2, 2020 at 12:55 PM Jay Vosburgh <[email protected]> wrote:
> >>
> >> Jarod Wilson <[email protected]> wrote:
> >>
> >> >Don't try to adjust XFRM support flags if the bond device isn't yet
> >> >registered. Bad things can currently happen when netdev_change_features()
> >> >is called without having wanted_features fully filled in yet. Basically,
> >> >this code was racing against register_netdevice() filling in
> >> >wanted_features, and when it got there first, the empty wanted_features
> >> >led to features also getting emptied out, which was definitely not the
> >> >intended behavior, so prevent that from happening.
> >>
> >> Is this an actual race? Reading Ivan's prior message, it sounds
> >> like it's an ordering problem (in that bond_newlink calls
> >> register_netdevice after bond_changelink).
> >
> >Sorry, yeah, this is not actually a race condition, just an ordering
> >issue, bond_check_params() gets called at init time, which leads to
> >bond_option_mode_set() being called, and does so prior to
> >bond_create() running, which is where we actually call
> >register_netdevice().
>
> So this only happens if there's a "mode" module parameter? That
> doesn't sound like the call path that Ivan described (coming in via
> bond_newlink).
Ah. I think there's actually two different pathways that can trigger
this. The first is for bonds created at module load time, which I was
describing, the second is for a new bond created via bond_newlink()
after the bonding module is already loaded, as described by Ivan. Both
have the problem of bond_option_mode_set() running prior to
register_netdevice(). Of course, that would suggest every bond
currently comes up with unintentionally neutered flags, which I
neglected to catch in earlier testing and development.
--
Jarod Wilson
[email protected]
Don't try to adjust XFRM support flags if the bond device isn't yet
registered. Bad things can currently happen when netdev_change_features()
is called without having wanted_features fully filled in yet. This code
runs on post-module-load mode changes, as well as at module init time
and new bond creation time, and in the latter two scenarios, it is
running prior to register_netdevice() having been called and
subsequently filling in wanted_features. The empty wanted_features led
to features also getting emptied out, which was definitely not the
intended behavior, so prevent that from happening.
Originally, I'd hoped to stop adjusting wanted_features at all in the
bonding driver, as it's documented as being something only the network
core should touch, but we actually do need to do this to properly update
both the features and wanted_features fields when changing the bond type,
or we get to a situation where ethtool sees:
esp-hw-offload: off [requested on]
I do think we should be using netdev_update_features instead of
netdev_change_features here though, so we only send notifiers when the
features actually changed.
v2: rework based on further testing and suggestions from ivecera
v3: add helper function, remove goto, fix problem description
Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <[email protected]>
Suggested-by: Ivan Vecera <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Thomas Davis <[email protected]>
Cc: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/bonding/bond_main.c | 10 ++++------
drivers/net/bonding/bond_options.c | 19 ++++++++++++++-----
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 47afc5938c26..7905534a763b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4747,15 +4747,13 @@ void bond_setup(struct net_device *bond_dev)
NETIF_F_HW_VLAN_CTAG_FILTER;
bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
-#ifdef CONFIG_XFRM_OFFLOAD
- bond_dev->hw_features |= BOND_XFRM_FEATURES;
-#endif /* CONFIG_XFRM_OFFLOAD */
bond_dev->features |= bond_dev->hw_features;
bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
#ifdef CONFIG_XFRM_OFFLOAD
- /* Disable XFRM features if this isn't an active-backup config */
- if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
- bond_dev->features &= ~BOND_XFRM_FEATURES;
+ bond_dev->hw_features |= BOND_XFRM_FEATURES;
+ /* Only enable XFRM features if this is an active-backup config */
+ if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->features |= BOND_XFRM_FEATURES;
#endif /* CONFIG_XFRM_OFFLOAD */
}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..1ae0e5ab8c67 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -745,6 +745,18 @@ const struct bond_option *bond_opt_get(unsigned int option)
return &bond_opts[option];
}
+#ifdef CONFIG_XFRM_OFFLOAD
+static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
+{
+ if (mode == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->wanted_features |= BOND_XFRM_FEATURES;
+ else
+ bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
+
+ netdev_update_features(bond_dev);
+}
+#endif /* CONFIG_XFRM_OFFLOAD */
+
static int bond_option_mode_set(struct bonding *bond,
const struct bond_opt_value *newval)
{
@@ -768,11 +780,8 @@ static int bond_option_mode_set(struct bonding *bond,
bond->params.tlb_dynamic_lb = 1;
#ifdef CONFIG_XFRM_OFFLOAD
- if (newval->value == BOND_MODE_ACTIVEBACKUP)
- bond->dev->wanted_features |= BOND_XFRM_FEATURES;
- else
- bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
- netdev_change_features(bond->dev);
+ if (bond->dev->reg_state == NETREG_REGISTERED)
+ bond_set_xfrm_features(bond->dev, newval->value);
#endif /* CONFIG_XFRM_OFFLOAD */
/* don't cache arp_validate between modes */
--
2.28.0
On Wed, 2 Dec 2020 19:43:57 -0500 Jarod Wilson wrote:
> Don't try to adjust XFRM support flags if the bond device isn't yet
> registered. Bad things can currently happen when netdev_change_features()
> is called without having wanted_features fully filled in yet. This code
> runs on post-module-load mode changes, as well as at module init time
> and new bond creation time, and in the latter two scenarios, it is
> running prior to register_netdevice() having been called and
> subsequently filling in wanted_features. The empty wanted_features led
> to features also getting emptied out, which was definitely not the
> intended behavior, so prevent that from happening.
>
> Originally, I'd hoped to stop adjusting wanted_features at all in the
> bonding driver, as it's documented as being something only the network
> core should touch, but we actually do need to do this to properly update
> both the features and wanted_features fields when changing the bond type,
> or we get to a situation where ethtool sees:
>
> esp-hw-offload: off [requested on]
>
> I do think we should be using netdev_update_features instead of
> netdev_change_features here though, so we only send notifiers when the
> features actually changed.
>
> v2: rework based on further testing and suggestions from ivecera
> v3: add helper function, remove goto, fix problem description
>
> Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
> Reported-by: Ivan Vecera <[email protected]>
> Suggested-by: Ivan Vecera <[email protected]>
> Cc: Jay Vosburgh <[email protected]>
> Cc: Veaceslav Falico <[email protected]>
> Cc: Andy Gospodarek <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: Thomas Davis <[email protected]>
> Cc: [email protected]
> Signed-off-by: Jarod Wilson <[email protected]>
> ---
> drivers/net/bonding/bond_main.c | 10 ++++------
> drivers/net/bonding/bond_options.c | 19 ++++++++++++++-----
> 2 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 47afc5938c26..7905534a763b 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4747,15 +4747,13 @@ void bond_setup(struct net_device *bond_dev)
> NETIF_F_HW_VLAN_CTAG_FILTER;
>
> bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
> -#ifdef CONFIG_XFRM_OFFLOAD
> - bond_dev->hw_features |= BOND_XFRM_FEATURES;
> -#endif /* CONFIG_XFRM_OFFLOAD */
> bond_dev->features |= bond_dev->hw_features;
> bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> #ifdef CONFIG_XFRM_OFFLOAD
> - /* Disable XFRM features if this isn't an active-backup config */
> - if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
> - bond_dev->features &= ~BOND_XFRM_FEATURES;
> + bond_dev->hw_features |= BOND_XFRM_FEATURES;
> + /* Only enable XFRM features if this is an active-backup config */
> + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->features |= BOND_XFRM_FEATURES;
> #endif /* CONFIG_XFRM_OFFLOAD */
> }
>
> diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> index 9abfaae1c6f7..1ae0e5ab8c67 100644
> --- a/drivers/net/bonding/bond_options.c
> +++ b/drivers/net/bonding/bond_options.c
> @@ -745,6 +745,18 @@ const struct bond_option *bond_opt_get(unsigned int option)
> return &bond_opts[option];
> }
>
> +#ifdef CONFIG_XFRM_OFFLOAD
> +static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
> +{
> + if (mode == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->wanted_features |= BOND_XFRM_FEATURES;
> + else
> + bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
> +
> + netdev_update_features(bond_dev);
> +}
> +#endif /* CONFIG_XFRM_OFFLOAD */
> +
> static int bond_option_mode_set(struct bonding *bond,
> const struct bond_opt_value *newval)
> {
> @@ -768,11 +780,8 @@ static int bond_option_mode_set(struct bonding *bond,
> bond->params.tlb_dynamic_lb = 1;
>
> #ifdef CONFIG_XFRM_OFFLOAD
> - if (newval->value == BOND_MODE_ACTIVEBACKUP)
> - bond->dev->wanted_features |= BOND_XFRM_FEATURES;
> - else
> - bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
> - netdev_change_features(bond->dev);
> + if (bond->dev->reg_state == NETREG_REGISTERED)
> + bond_set_xfrm_features(bond->dev, newval->value);
> #endif /* CONFIG_XFRM_OFFLOAD */
nit: let's narrow down the ifdef-enery
no need for the ifdef here, if the helper looks like this:
+static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
+{
+#ifdef CONFIG_XFRM_OFFLOAD
+ if (mode == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->wanted_features |= BOND_XFRM_FEATURES;
+ else
+ bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
+
+ netdev_update_features(bond_dev);
+#endif /* CONFIG_XFRM_OFFLOAD */
+}
Even better:
+static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
+{
+ if (!IS_ENABLED(CONFIG_XFRM_OFFLOAD))
+ return;
+
+ if (mode == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->wanted_features |= BOND_XFRM_FEATURES;
+ else
+ bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
+
+ netdev_update_features(bond_dev);
+}
(Assuming BOND_XFRM_FEATURES doesn't itself hide under an ifdef.)
>
> /* don't cache arp_validate between modes */
On Wed, 2 Dec 2020 19:43:57 -0500 Jarod Wilson wrote:
> bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
> -#ifdef CONFIG_XFRM_OFFLOAD
> - bond_dev->hw_features |= BOND_XFRM_FEATURES;
> -#endif /* CONFIG_XFRM_OFFLOAD */
> bond_dev->features |= bond_dev->hw_features;
> bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> #ifdef CONFIG_XFRM_OFFLOAD
> - /* Disable XFRM features if this isn't an active-backup config */
> - if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
> - bond_dev->features &= ~BOND_XFRM_FEATURES;
> + bond_dev->hw_features |= BOND_XFRM_FEATURES;
> + /* Only enable XFRM features if this is an active-backup config */
> + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->features |= BOND_XFRM_FEATURES;
> #endif /* CONFIG_XFRM_OFFLOAD */
This makes no functional change, or am I reading it wrong?
On Thu, Dec 3, 2020 at 11:50 AM Jakub Kicinski <[email protected]> wrote:
>
> On Wed, 2 Dec 2020 19:43:57 -0500 Jarod Wilson wrote:
> > bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
> > -#ifdef CONFIG_XFRM_OFFLOAD
> > - bond_dev->hw_features |= BOND_XFRM_FEATURES;
> > -#endif /* CONFIG_XFRM_OFFLOAD */
> > bond_dev->features |= bond_dev->hw_features;
> > bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> > #ifdef CONFIG_XFRM_OFFLOAD
> > - /* Disable XFRM features if this isn't an active-backup config */
> > - if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
> > - bond_dev->features &= ~BOND_XFRM_FEATURES;
> > + bond_dev->hw_features |= BOND_XFRM_FEATURES;
> > + /* Only enable XFRM features if this is an active-backup config */
> > + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> > + bond_dev->features |= BOND_XFRM_FEATURES;
> > #endif /* CONFIG_XFRM_OFFLOAD */
>
> This makes no functional change, or am I reading it wrong?
You are correct, there's ultimately no functional change there, it
primarily just condenses the code down to a single #ifdef block, and
doesn't add and then remove BOND_XFRM_FEATURES from
bond_dev->features, instead omitting it initially and only adding it
when in AB mode. I'd poked at the code in that area while trying to
get to the bottom of this, thought it made it more understandable, so
I left it in, but ultimately, it's not necessary to fix the problem
here.
--
Jarod Wilson
[email protected]
On Thu, 3 Dec 2020 22:14:12 -0500 Jarod Wilson wrote:
> On Thu, Dec 3, 2020 at 11:50 AM Jakub Kicinski <[email protected]> wrote:
> >
> > On Wed, 2 Dec 2020 19:43:57 -0500 Jarod Wilson wrote:
> > > bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
> > > -#ifdef CONFIG_XFRM_OFFLOAD
> > > - bond_dev->hw_features |= BOND_XFRM_FEATURES;
> > > -#endif /* CONFIG_XFRM_OFFLOAD */
> > > bond_dev->features |= bond_dev->hw_features;
> > > bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> > > #ifdef CONFIG_XFRM_OFFLOAD
> > > - /* Disable XFRM features if this isn't an active-backup config */
> > > - if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)
> > > - bond_dev->features &= ~BOND_XFRM_FEATURES;
> > > + bond_dev->hw_features |= BOND_XFRM_FEATURES;
> > > + /* Only enable XFRM features if this is an active-backup config */
> > > + if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> > > + bond_dev->features |= BOND_XFRM_FEATURES;
> > > #endif /* CONFIG_XFRM_OFFLOAD */
> >
> > This makes no functional change, or am I reading it wrong?
>
> You are correct, there's ultimately no functional change there, it
> primarily just condenses the code down to a single #ifdef block, and
> doesn't add and then remove BOND_XFRM_FEATURES from
> bond_dev->features, instead omitting it initially and only adding it
> when in AB mode. I'd poked at the code in that area while trying to
> get to the bottom of this, thought it made it more understandable, so
> I left it in, but ultimately, it's not necessary to fix the problem
> here.
Makes sense, but please split it out and send separately to net-next.
Don't try to adjust XFRM support flags if the bond device isn't yet
registered. Bad things can currently happen when netdev_change_features()
is called without having wanted_features fully filled in yet. This code
runs both on post-module-load mode changes, as well as at module init
time, and when run at module init time, it is before register_netdevice()
has been called and filled in wanted_features. The empty wanted_features
led to features also getting emptied out, which was definitely not the
intended behavior, so prevent that from happening.
Originally, I'd hoped to stop adjusting wanted_features at all in the
bonding driver, as it's documented as being something only the network
core should touch, but we actually do need to do this to properly update
both the features and wanted_features fields when changing the bond type,
or we get to a situation where ethtool sees:
esp-hw-offload: off [requested on]
I do think we should be using netdev_update_features instead of
netdev_change_features here though, so we only send notifiers when the
features actually changed.
Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera <[email protected]>
Suggested-by: Ivan Vecera <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Thomas Davis <[email protected]>
Cc: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
v2: rework based on further testing and suggestions from ivecera
v3: add helper function, remove goto
v4: drop hunk not directly related to fix, clean up ifdeffery
drivers/net/bonding/bond_options.c | 22 +++++++++++++++-------
include/net/bonding.h | 2 --
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 9abfaae1c6f7..a4e4e15f574d 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -745,6 +745,19 @@ const struct bond_option *bond_opt_get(unsigned int option)
return &bond_opts[option];
}
+static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
+{
+ if (!IS_ENABLED(CONFIG_XFRM_OFFLOAD))
+ return;
+
+ if (mode == BOND_MODE_ACTIVEBACKUP)
+ bond_dev->wanted_features |= BOND_XFRM_FEATURES;
+ else
+ bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
+
+ netdev_update_features(bond_dev);
+}
+
static int bond_option_mode_set(struct bonding *bond,
const struct bond_opt_value *newval)
{
@@ -767,13 +780,8 @@ static int bond_option_mode_set(struct bonding *bond,
if (newval->value == BOND_MODE_ALB)
bond->params.tlb_dynamic_lb = 1;
-#ifdef CONFIG_XFRM_OFFLOAD
- if (newval->value == BOND_MODE_ACTIVEBACKUP)
- bond->dev->wanted_features |= BOND_XFRM_FEATURES;
- else
- bond->dev->wanted_features &= ~BOND_XFRM_FEATURES;
- netdev_change_features(bond->dev);
-#endif /* CONFIG_XFRM_OFFLOAD */
+ if (bond->dev->reg_state == NETREG_REGISTERED)
+ bond_set_xfrm_features(bond->dev, newval->value);
/* don't cache arp_validate between modes */
bond->params.arp_validate = BOND_ARP_VALIDATE_NONE;
diff --git a/include/net/bonding.h b/include/net/bonding.h
index d9d0ff3b0ad3..adc3da776970 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -86,10 +86,8 @@
#define bond_for_each_slave_rcu(bond, pos, iter) \
netdev_for_each_lower_private_rcu((bond)->dev, pos, iter)
-#ifdef CONFIG_XFRM_OFFLOAD
#define BOND_XFRM_FEATURES (NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
NETIF_F_GSO_ESP)
-#endif /* CONFIG_XFRM_OFFLOAD */
#ifdef CONFIG_NET_POLL_CONTROLLER
extern atomic_t netpoll_block_tx;
--
2.28.0
On Thu, Dec 3, 2020 at 11:45 AM Jakub Kicinski <[email protected]> wrote:
...
> nit: let's narrow down the ifdef-enery
>
> no need for the ifdef here, if the helper looks like this:
>
> +static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
> +{
> +#ifdef CONFIG_XFRM_OFFLOAD
> + if (mode == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->wanted_features |= BOND_XFRM_FEATURES;
> + else
> + bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
> +
> + netdev_update_features(bond_dev);
> +#endif /* CONFIG_XFRM_OFFLOAD */
> +}
>
> Even better:
>
> +static void bond_set_xfrm_features(struct net_device *bond_dev, u64 mode)
> +{
> + if (!IS_ENABLED(CONFIG_XFRM_OFFLOAD))
> + return;
> +
> + if (mode == BOND_MODE_ACTIVEBACKUP)
> + bond_dev->wanted_features |= BOND_XFRM_FEATURES;
> + else
> + bond_dev->wanted_features &= ~BOND_XFRM_FEATURES;
> +
> + netdev_update_features(bond_dev);
> +}
>
> (Assuming BOND_XFRM_FEATURES doesn't itself hide under an ifdef.)
It is, but doesn't need to be. I can mix these changes in as well.
--
Jarod Wilson
[email protected]
On Sat, 5 Dec 2020 12:22:29 -0500 Jarod Wilson wrote:
> Don't try to adjust XFRM support flags if the bond device isn't yet
> registered. Bad things can currently happen when netdev_change_features()
> is called without having wanted_features fully filled in yet. This code
> runs both on post-module-load mode changes, as well as at module init
> time, and when run at module init time, it is before register_netdevice()
> has been called and filled in wanted_features. The empty wanted_features
> led to features also getting emptied out, which was definitely not the
> intended behavior, so prevent that from happening.
>
> Originally, I'd hoped to stop adjusting wanted_features at all in the
> bonding driver, as it's documented as being something only the network
> core should touch, but we actually do need to do this to properly update
> both the features and wanted_features fields when changing the bond type,
> or we get to a situation where ethtool sees:
>
> esp-hw-offload: off [requested on]
>
> I do think we should be using netdev_update_features instead of
> netdev_change_features here though, so we only send notifiers when the
> features actually changed.
>
> Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
> Reported-by: Ivan Vecera <[email protected]>
> Suggested-by: Ivan Vecera <[email protected]>
Applied, thanks!