2020-08-27 18:44:40

by Lennart Sorensen

Subject: VRRP not working on i40e X722 S2600WFT

I have hit a new problem with the X722 chipset (Intel R1304WFT server).
VRRP simply does not work.

When keepalived registers a vmac interface and starts transmitting
multicast packets with the VRRP message, it never receives those packets
from the peers, so all nodes think they are the master. tcpdump shows
transmits, but no receives. If I stop keepalived, which deletes the
vmac interface, then I start to receive the multicast packets from the
other nodes. Even in promisc mode, tcpdump can't see those packets.

So it seems the hardware is dropping all packets with a source mac that
matches the source mac of the vmac interface, even when the destination
is a multicast address that was subscribed to. This is clearly not
proper behaviour.

I tried a stock 5.8 kernel to check if a driver update helped, and updated
the NVM firmware to the latest 4.10 (which appears to be over a year old),
and nothing changed the behaviour at all.

Seems other people have hit this problem too:
http://mails.dpdk.org/archives/users/2018-May/003128.html

Unless someone has a way to fix this, we will have to change away from
this hardware very quickly. The IPsec NAT RSS defect we could tolerate,
although we didn't like it, while this is just unworkable.

Quite frustrated by this. Intel network hardware was always great;
how did the X722 make it out in this state?

--
Len Sorensen


2020-08-28 15:58:26

by Lennart Sorensen

Subject: Re: VRRP not working on i40e X722 S2600WFT

On Thu, Aug 27, 2020 at 02:30:39PM -0400, Lennart Sorensen wrote:
> I have hit a new problem with the X722 chipset (Intel R1304WFT server).
> VRRP simply does not work.
>
> When keepalived registers a vmac interface, and starts transmitting
> multicast packets with the VRRP message, it never receives those packets
> from the peers, so all nodes think they are the master. tcpdump shows
> transmits, but no receives. If I stop keepalived, which deletes the
> vmac interface, then I start to receive the multicast packets from the
> other nodes. Even in promisc mode, tcpdump can't see those packets.
>
> So it seems the hardware is dropping all packets with a source mac that
> matches the source mac of the vmac interface, even when the destination
> is a multicast address that was subscribed to. This is clearly not
> proper behaviour.
>
> I tried a stock 5.8 kernel to check if a driver update helped, and updated
> the NVM firmware to the latest 4.10 (which appears to be over a year old),
> and nothing changes the behaviour at all.
>
> Seems other people have hit this problem too:
> http://mails.dpdk.org/archives/users/2018-May/003128.html
>
> Unless someone has a way to fix this, we will have to change away from
> this hardware very quickly. The IPsec NAT RSS defect we could tolerate
> although didn't like, while this is just unworkable.
>
> Quite frustrated by this. Intel network hardware was always great,
> how did the X722 make it out in this state.

Another case with the same problem on an X710:

https://www.talkend.net/post/13256.html

--
Len Sorensen

2020-08-31 20:33:19

by Jesse Brandeburg

Subject: Re: [Intel-wired-lan] VRRP not working on i40e X722 S2600WFT

Lennart Sorensen wrote:

> On Thu, Aug 27, 2020 at 02:30:39PM -0400, Lennart Sorensen wrote:
> > I have hit a new problem with the X722 chipset (Intel R1304WFT server).
> > VRRP simply does not work.
> >
> > When keepalived registers a vmac interface, and starts transmitting
> > multicast packets with the VRRP message, it never receives those packets
> > from the peers, so all nodes think they are the master. tcpdump shows
> > transmits, but no receives. If I stop keepalived, which deletes the
> > vmac interface, then I start to receive the multicast packets from the
> > other nodes. Even in promisc mode, tcpdump can't see those packets.
> >
> > So it seems the hardware is dropping all packets with a source mac that
> > matches the source mac of the vmac interface, even when the destination
> > is a multicast address that was subscribed to. This is clearly not
> > proper behaviour.

Thanks for the report Lennart, I understand your frustration, as this
should probably work without user configuration.

However, please give this command a try:
ethtool --set-priv-flags ethX disable-source-pruning on
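
As a minimal sketch of applying the command above safely: the helper
below first checks that the driver exposes the flag (older kernels do
not) and only then sets it. The function name is made up for
illustration, and ethX stands for whichever port keepalived uses.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of ethtool or i40e).
# Turns on the disable-source-pruning private flag for one interface,
# but only if the running driver actually lists that flag.
enable_disable_source_pruning() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: enable_disable_source_pruning IFACE" >&2
        return 2
    fi
    # --show-priv-flags prints one "name: on|off" line per private flag.
    if ethtool --show-priv-flags "$iface" 2>/dev/null \
            | grep -q '^disable-source-pruning'; then
        ethtool --set-priv-flags "$iface" disable-source-pruning on
    else
        echo "$iface: driver does not expose disable-source-pruning" >&2
        return 1
    fi
}
```

Calling it as "enable_disable_source_pruning ethX" returns non-zero when
the flag is missing, which makes an unsupported kernel easy to spot.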


> > I tried a stock 5.8 kernel to check if a driver update helped, and updated
> > the NVM firmware to the latest 4.10 (which appears to be over a year old),
> > and nothing changes the behaviour at all.
> >
> > Seems other people have hit this problem too:
> > http://mails.dpdk.org/archives/users/2018-May/003128.html
> >
> > Unless someone has a way to fix this, we will have to change away from
> > this hardware very quickly. The IPsec NAT RSS defect we could tolerate
> > although didn't like, while this is just unworkable.
> >
> > Quite frustrated by this. Intel network hardware was always great,
> > how did the X722 make it out in this state.
>
> Another case with the same problem on an X710:
>
> https://www.talkend.net/post/13256.html

I don't know how to reply to this other thread, but it is about DPDK,
which would require a code change or further investigation to issue the
right command to the hardware to disable source pruning.

2020-09-01 01:36:47

by Lennart Sorensen

Subject: Re: [Intel-wired-lan] VRRP not working on i40e X722 S2600WFT

On Mon, Aug 31, 2020 at 10:35:12AM -0700, Jesse Brandeburg wrote:
> Thanks for the report Lennart, I understand your frustration, as this
> should probably work without user configuration.
>
> However, please give this command a try:
> ethtool --set-priv-flags ethX disable-source-pruning on

Hmm, our 4.9 kernel is just a touch too old to support that. And yes
that really should not require a flag to be set, given the card has no
reason to ever do that pruning. There is no justification you could
have for doing it in the first place.

--
Len Sorensen

2020-09-02 14:08:25

by Lennart Sorensen

Subject: Re: [Intel-wired-lan] VRRP not working on i40e X722 S2600WFT

On Mon, Aug 31, 2020 at 09:35:19PM -0400, Lennart Sorensen wrote:
> On Mon, Aug 31, 2020 at 10:35:12AM -0700, Jesse Brandeburg wrote:
> > Thanks for the report Lennart, I understand your frustration, as this
> > should probably work without user configuration.
> >
> > However, please give this command a try:
> > ethtool --set-priv-flags ethX disable-source-pruning on
>
> Hmm, our 4.9 kernel is just a touch too old to support that. And yes
> that really should not require a flag to be set, given the card has no
> reason to ever do that pruning. There is no justification you could
> have for doing it in the first place.

So backporting the patch that enabled that flag does allow it to work.
Of course there isn't a particularly good place to put an ethtool command
in the boot up to make sure it runs before vrrp is started. This has to
be the default. I know I wasted about a week trying things to get this to
work, and clearly lots of other people have wasted a ton of time on this
"feature" too (calling it a feature is clearly wrong, it is a bug).

By default the NIC should work as expected. Any weird questionable
optimizations have to be turned on by the user explicitly when they
understand the consequences. I can't find any use case documented
anywhere for this bug; I can only find things it has broken (like
apparently ARP monitoring on bonding, and VRRP).
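
For the boot-ordering problem mentioned earlier in this message, one
possible workaround is a small script that sets the flag on every i40e
port and is run before keepalived starts. This is only a sketch: the
function names and the SYSFS_NET override are made up for illustration,
and where to hook it into the boot sequence depends on the init system.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the name of every network interface bound to the i40e driver.
# SYSFS_NET can be overridden so this can be tried without the hardware.
list_i40e_ifaces() {
    root="${SYSFS_NET:-/sys/class/net}"
    for dev in "$root"/*; do
        # Virtual interfaces (lo, vmac devices) have no device/driver link.
        [ -e "$dev/device/driver" ] || continue
        drv=$(basename "$(readlink -f "$dev/device/driver")")
        if [ "$drv" = "i40e" ]; then
            basename "$dev"
        fi
    done
    return 0
}

# Turn source pruning off on all i40e ports; non-zero if any port failed.
disable_source_pruning_all() {
    rc=0
    for iface in $(list_i40e_ifaces); do
        ethtool --set-priv-flags "$iface" disable-source-pruning on || rc=1
    done
    return "$rc"
}
```

On a systemd machine, an ExecStartPre= line in a drop-in for the
keepalived unit would be one place to call disable_source_pruning_all,
though that is an assumption about the setup and not something tested
here.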

So who should make the patch to change this to be the default? Clearly
the current behaviour is harming and confusing more people than could
possibly be impacted by changing the current default setting for the flag.
In fact I would just about be willing to bet there are no people that
want the current behaviour. After all, no other NIC does this, so clearly
there is no need for it to be done.

--
Len Sorensen