Dear Linux folks,
Occasionally, Linux outputs the message below on the workstation Dell
OptiPlex 5040 MT.
TCP: net00: Driver has suspect GRO implementation, TCP performance may be compromised.
Linux 4.14.55 and Linux 5.2-rc2 show the message, and the WWW also
gives some hits [1][2].
```
$ sudo ethtool -i net00
driver: e1000e
version: 3.2.6-k
firmware-version: 0.8-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
```
Can the driver e1000e be improved?
Any idea what triggers this? I do not see it on every boot. Could
downloading big files be the cause?
Kind regards,
Paul
[1]: https://access.redhat.com/solutions/3152971
"Why following error "TCP: ensX: Driver has suspect GRO implementation, TCP performance may be compromised" are seen in system log file ?"
[2]: https://patchwork.ozlabs.org/patch/723007/
On 5/28/19 8:42 AM, Paul Menzel wrote:
> Any idea what triggers this? I do not see it on every boot. Could
> downloading big files be the cause?
>
Maybe the driver/NIC can receive frames bigger than MTU, although this would be strange.
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c61edd023b352123e2a77465782e0d32689e96b0..cb0194f66125bcba427e6e7e3cacf0c93040ef61 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -150,8 +150,10 @@ static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb,
 		rcu_read_lock();
 		dev = dev_get_by_index_rcu(sock_net(sk), skb->skb_iif);
 		if (!dev || len >= dev->mtu)
-			pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n",
-				dev ? dev->name : "Unknown driver");
+			pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised."
+				" len %u mtu %u\n",
+				dev ? dev->name : "Unknown driver",
+				len, dev ? dev->mtu : 0);
 		rcu_read_unlock();
 	}
 }
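For readers following along, the condition this patch instruments can be modelled in userspace. The sketch below is only an illustration: `struct fake_dev` and `gro_warn_needed()` are hypothetical names, not kernel APIs, and the function merely mirrors the `!dev || len >= dev->mtu` test visible in the diff context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two net_device fields the check uses. */
struct fake_dev {
	const char *name;
	unsigned int mtu;
};

/*
 * Mirrors the condition in the diff context: warn when the receiving
 * device is unknown, or when the observed length is at least the MTU.
 */
static bool gro_warn_needed(const struct fake_dev *dev, unsigned int len)
{
	return !dev || len >= dev->mtu;
}
```

With `mtu = 1500`, a length of 1448 would not satisfy the condition, while any length of 1500 or more would.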
Dear Eric,
Thank you for the quick reply.
On 05/28/19 19:18, Eric Dumazet wrote:
> Maybe the driver/NIC can receive frames bigger than MTU, although this would be strange.
I applied your patch on commit 9fb67d643 (Merge tag 'pinctrl-v5.2-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl):
[ 5507.291769] TCP: net00: Driver has suspect GRO implementation, TCP performance may be compromised. len 1856 mtu 1500
Kind regards,
Paul
On Wed, May 29, 2019 at 7:49 AM Paul Menzel wrote:
> I applied your patch on commit 9fb67d643 (Merge tag 'pinctrl-v5.2-2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl):
>
> [ 5507.291769] TCP: net00: Driver has suspect GRO implementation, TCP performance may be compromised. len 1856 mtu 1500
The 'GRO' in the warning can probably be ignored, since this NIC does
not implement its own GRO.
You can confirm this with this debug patch:
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0e09bede42a2bd2c912366a68863a52a22def8ee..014a43ce77e09664bda0568dd118064b006acd67 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -561,6 +561,9 @@ static void e1000_receive_skb(struct e1000_adapter *adapter,
 	if (staterr & E1000_RXD_STAT_VP)
 		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), tag);

+	if (skb->len > netdev->mtu)
+		pr_err_ratelimited("received packet bigger (%u) than MTU (%u)\n",
+				   skb->len, netdev->mtu);
 	napi_gro_receive(&adapter->napi, skb);
 }
On Wed, 29 May 2019 09:00:54 -0700
Eric Dumazet wrote:
> The 'GRO' in the warning can probably be ignored, since this NIC does
> not implement its own GRO.
I think e1000 is one of those devices whose receive limit can only be a power of 2.
Therefore frames up to 2K can be received.
There is always some confusion in Linux about whether MTU is transmit only or whether devices
have to enforce it on receive.
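As a rough illustration of the power-of-2 point, assume the receive buffer is sized by rounding the maximum frame length up to the next power of two. The helper names and the choice of header constants here are assumptions for the sketch, not taken from the e1000 driver.

```c
#define ETH_HLEN	14	/* destination MAC, source MAC, EtherType */
#define VLAN_HLEN	4	/* one 802.1Q tag */
#define ETH_FCS_LEN	4	/* frame check sequence */

/* Hypothetical helper: smallest power of two that is >= n (n >= 1). */
static unsigned int round_up_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Assumed model: the hardware receive limit is the maximum on-wire
 * frame for this MTU, rounded up to a power of two.
 */
static unsigned int rx_buffer_len(unsigned int mtu)
{
	unsigned int max_frame = mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN;

	return round_up_pow2(max_frame);
}
```

Under these assumptions an MTU of 1500 gives a 1522-byte maximum frame and a 2048-byte buffer, which would leave room for frames well beyond 1500 bytes.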
On Wed, May 29, 2019 at 9:38 AM Stephen Hemminger wrote:
> I think e1000 is one of those devices whose receive limit can only be a power of 2.
> Therefore frames up to 2K can be received.
>
> There is always some confusion in Linux about whether MTU is transmit only or whether devices
> have to enforce it on receive.
Actually I think there are some parts that don't have any receive
limits that are supported by the e1000 part. What ends up happening is
that we only drop the packet if it spans more than one buffer if I
recall correctly, and buffer size is determined by MTU.
I always thought MTU only applied to transmit since it is kind of in
the name. As a result I am pretty sure igb and ixgbe will be able to
trigger this warning under certain circumstances as well. Also what
about the case where someone sets the MTU to less than 1500? I think
most NICs probably don't update their limits in such a case and
wouldn't it also trigger a similar error?
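To make the two cases above concrete, here is a small model. It assumes a frame is accepted whenever it fits in a single receive buffer, and that the buffer size does not shrink when the MTU is lowered. `buffers_spanned()` and `oversized_but_accepted()` are made-up names for this sketch, and 2048 is only an example buffer size.

```c
#include <stdbool.h>

#define ETH_HLEN	14	/* Ethernet header preceding the L3 packet */

/* Number of fixed-size receive buffers a frame of frame_len would occupy. */
static unsigned int buffers_spanned(unsigned int frame_len,
				    unsigned int buf_len)
{
	return (frame_len + buf_len - 1) / buf_len;
}

/*
 * Assumed model: the frame is dropped only when it spans more than one
 * buffer. Returns true when a frame is accepted even though its L3
 * length exceeds the configured MTU.
 */
static bool oversized_but_accepted(unsigned int mtu, unsigned int frame_len,
				   unsigned int buf_len)
{
	if (frame_len < ETH_HLEN)
		return false;
	if (buffers_spanned(frame_len, buf_len) != 1)
		return false;	/* dropped in this model */
	return frame_len - ETH_HLEN > mtu;
}
```

In this model, a standard 1514-byte frame arriving on an interface set to MTU 1400 with 2048-byte buffers is accepted although it is larger than the MTU, which is the "MTU less than 1500" case raised above.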
On 5/30/19 3:52 PM, Alexander Duyck wrote:
> Actually I think there are some parts that don't have any receive
> limits that are supported by the e1000 part. What ends up happening is
> that we only drop the packet if it spans more than one buffer if I
> recall correctly, and buffer size is determined by MTU.
>
> I always thought MTU only applied to transmit since it is kind of in
> the name. As a result I am pretty sure igb and ixgbe will be able to
> trigger this warning under certain circumstances as well. Also what
> about the case where someone sets the MTU to less than 1500? I think
> most NICs probably don't update their limits in such a case and
> wouldn't it also trigger a similar error?
>
Linux does not have a notion of MRU; the MTU is used for both TX and RX.
Most NIC drivers allocate skbs of the maximal size (derived from netdev->mtu)
and program the NIC to drop packets bigger than X bytes (X also derived from netdev->mtu).
Another interesting point is that Paul's host is receiving big packets,
which means that one host in his local network is overriding the 1500 MTU :)
Eventually we could add a netdev->mru and allow a few drivers to set
their maximal MRU, if bigger than netdev->mtu.
e1000e would probably set netdev->mru to 2048 - sizeof(ethernet headers), if
the driver is operating at MTU = 1500.
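A minimal sketch of that idea follows. It assumes "ethernet headers" means the 14-byte header plus one VLAN tag and the FCS (the mail does not spell this out), and it uses a hypothetical `mru` value, not any existing `net_device` field.

```c
#include <stdbool.h>

#define ETH_HLEN	14
#define VLAN_HLEN	4
#define ETH_FCS_LEN	4

/* Hypothetical: MRU a driver could advertise for a given RX buffer size. */
static unsigned int mru_for_rx_buffer(unsigned int rx_buf_len)
{
	return rx_buf_len - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN);
}

/*
 * Hypothetical variant of the "len >= mtu" test: compare against the
 * MRU when the driver reports one that is bigger than the MTU.
 * An mru of 0 means "not set", which falls back to the MTU.
 */
static bool rx_len_suspect(unsigned int len, unsigned int mtu,
			   unsigned int mru)
{
	unsigned int limit = mru > mtu ? mru : mtu;

	return len >= limit;
}
```

With a 2048-byte buffer this gives an MRU of 2026, so the `len 1856 mtu 1500` case reported earlier in the thread would no longer be flagged under this model.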
Dear Eric,
Sorry for the late reply.
On 5/29/19 6:00 PM, Eric Dumazet wrote:
> The 'GRO' in the warning can probably be ignored, since this NIC does
> not implement its own GRO.
>
> You can confirm this with this debug patch:
With this patch applied, I unfortunately could not trigger the condition
anymore. No idea why. Or is that expected?
(As a side note, plain Linux 5.2.2 still shows the warning.)
Kind regards,
Paul