Resend to netdev. LKML CCed in case anyone in the wider kernel community
can suggest a way forward. Please CC responses if replying only to LKML.
It seems that this 4+ year old regression in the r8169 driver (documented in
this thread on netdev beginning on 9 March 2013) will never be fixed,
despite the identification of the commit which broke it. Cards using this
driver will therefore remain unusable for certain workloads utilising UDP.
----- Original message from 14 Nov 2017 -----
Date: Tue, 14 Nov 2017 16:09:23 +1030
From: Jonathan Woithe
To: [email protected]
Subject: Re: r8169 regression: UDP packets dropped intermittently
As far as I am aware there were no follow-up comments to my last post on
this subject on 24 March 2017. The text of that post is included below for
reference. To summarise: a short test program which reliably triggered the
problem was written in the hope it would assist in the repair of this
regression.
Today I ran the tests on the 4.14 kernel. The problem is still present. If
the same machine is run under a 4.3 kernel with the hacked r8169 driver the
problem does not occur. Using the stock 4.3 r8169 driver triggers the problem.
It also works without trouble under 2.6.35.11 (the kernel we've stuck with
due to the problem affecting most newer kernels).
To recap the history of this thread, the misbehaviour of the r8169 driver in
the presence of small UDP packets affects kernels newer than 3.3. The
initial post in this thread was on 9 March 2013. The regression was
introduced with commit da78dbff2e05630921c551dbbc70a4b7981a8fff.
Since this regression has persisted for more than 4 years, is there any
chance that it will be fixed? The inability to run newer kernels has
prevented us from providing them as upgrades in our products. If this
problem in the r8169 driver will never be fixed, it seems we'll have to find
a supply of a PCI/PCIe NIC which doesn't utilise this driver. Of course
this won't help those whose systems in the field are fitted with the
r8169-based card.
Regards
jonathan
Post from Mar 24, 2017:
> On Thu, Jun 23, 2016 at 01:22:50AM +0200, Francois Romieu wrote:
> > Jonathan Woithe <[email protected]> :
> > [...]
> > > to mainline (in which case I'll keep watching out for it)? Or is the
> > > out-of-tree workaround mentioned above considered to be the long term
> > > fix for those who encounter the problem?
> >
> > It's a workaround. Nothing less, nothing more.
>
> Recently I have had a chance to revisit this issue. I have written a
> program (r8196-test, source is included below) which recreates the problem
> without requiring our external hardware devices. That is, this program
> triggers the fault when run between two networked computers. To use, two
> PCs are needed. One (the "master") has an rtl8169 network card fitted (ours
> has a Netgear GA311, but the problem has been seen with others too from
> memory). The network hardware of the other computer (the "slave") isn't
> important. First run
>
> ./r8196-test
>
> on the slave, followed by
>
> ./r8196-test <IPv4 address of slave>
>
> on the master. When running stock kernel version 4.3 the master stops
> reliably within a minute or so with a timeout, indicating (in this case)
> that the response packet never arrived within the 0.5 second timeout period.
> The ID whose response was never received by the master is reported as having
> been seen (and a response sent) by the slave.
>
> If I substitute the forward ported r8169 driver mentioned earlier in this
> thread into kernel 4.3, the above program sequence runs seemingly
> indefinitely without any timeouts (runtime is beyond two hours as of this
> writing, compared to tens of seconds with the standard driver).
>
> This demonstrates that the problem is independent of our custom network
> devices and allows the fault to be recreated using commodity hardware.
>
> Does this make it any easier to develop a mainline fix for the regression?
>
> Regards
> jonathan
>
> /*
>  * To test, the "master" mode is run on a PC with an RTL-8169 card.
>  * The "slave" mode is run on any other PC.  "Master" mode is activated
>  * by providing the IP of the slave PC on the command line.  The slave
>  * should be started before the master; without a running slave the master
>  * will time out.
>  *
>  * This code is in the public domain.
>  */
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <sys/time.h>
> #include <stdio.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> #include <string.h>
> #include <unistd.h>
>
> #include <errno.h>
>
> /* 6 byte request; byte 0 carries the sequence id */
> unsigned char ping_payload[] = {
>     0x00, 0x00,
>     0x00, 0x00, 0x00, 0x00,
> };
>
> #define PING_PAYLOAD_SIZE 6
>
> /* 14 byte response; byte 0 is overwritten with the id being acknowledged */
> unsigned char ack_payload[] = {
>     0x12, 0x34,
>     0x01, 0x01, 0x00, 0x00,
>     0x00, 0x00, 0x00, 0x00,
>     0x00, 0x00, 0x00, 0x00,
> };
>
> #define ACK_PAYLOAD_SIZE 14
>
> #define UDP_PORT 49491
>
> signed int open_udp(const char *target_addr)
> {
>     struct sockaddr_in local_addr;
>     struct timeval tv;
>     int sock;
>
>     sock = socket(PF_INET, SOCK_DGRAM, 0);
>     if (sock < 0) {
>         return -1;
>     }
>
>     /* 0.5 second timeout on both send and receive */
>     tv.tv_sec = 0;
>     tv.tv_usec = 500000;
>     setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, (char *)&tv, sizeof(tv));
>     setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(tv));
>
>     memset(&local_addr, 0, sizeof(local_addr));
>     local_addr.sin_family = AF_INET;
>     local_addr.sin_addr.s_addr = INADDR_ANY;
>     local_addr.sin_port = htons(UDP_PORT);
>     if (bind(sock, (struct sockaddr *)&local_addr,
>              sizeof(local_addr)) < 0) {
>         close(sock);
>         return -1;
>     }
>
>     if (target_addr != NULL) {
>         struct sockaddr_in dest_addr;
>         memset(&dest_addr, 0, sizeof(dest_addr));
>         dest_addr.sin_family = AF_INET;
>         dest_addr.sin_port = htons(UDP_PORT);
>         /* inet_aton() returns 0, not a negative value, on a bad address */
>         if (inet_aton(target_addr, &dest_addr.sin_addr) == 0) {
>             close(sock);
>             return -1;
>         }
>         if (connect(sock, (struct sockaddr *)&dest_addr,
>                     sizeof(dest_addr)) < 0) {
>             close(sock);
>             return -1;
>         }
>     }
>     return sock;
> }
>
> void master(const char *target_addr)
> {
>     signed int id = 0;
>     int sock = open_udp(target_addr);
>
>     printf("master()\n");
>     if (sock < 0) {
>         return;
>     }
>
>     for (;; id++) {
>         unsigned char buf[1024];
>         signed int n;
>         ping_payload[0] = id & 0xff;
>         if (send(sock, ping_payload, PING_PAYLOAD_SIZE, 0) < 0) {
>             break;
>         }
>         n = recv(sock, buf, sizeof(buf), 0);
>         if (n == -1) {
>             /* SO_RCVTIMEO expired: the response never arrived */
>             if (errno == EAGAIN) {
>                 printf("id 0x%02x: no response received (timeout)\n",
>                        ping_payload[0]);
>                 break;
>             }
>         } else {
>             printf("id 0x%02x: recv %d\n", buf[0], n);
>         }
>         usleep(10000);
>     }
>     close(sock);
> }
>
> void slave(void)
> {
>     int sock = open_udp(NULL);
>
>     printf("slave()\n");
>     if (sock < 0) {
>         return;
>     }
>
>     for ( ; ; ) {
>         struct sockaddr master_addr;
>         unsigned char buf[1024];
>         signed int n;
>
>         socklen_t len = sizeof(master_addr);
>         n = recvfrom(sock, buf, sizeof(buf), 0, &master_addr, &len);
>         if (n == PING_PAYLOAD_SIZE) {
>             printf("id 0x%02x: recv %d, sending %d\n", buf[0], n,
>                    ACK_PAYLOAD_SIZE);
>             /* echo the request id back in the response */
>             ack_payload[0] = buf[0];
>             sendto(sock, ack_payload, ACK_PAYLOAD_SIZE, 0, &master_addr, len);
>         }
>     }
>
>     close(sock);
> }
>
> int main(int argc, char *argv[]) {
>     if (argc > 1) {
>         master(argv[1]);
>     } else {
>         slave();
>     }
>     return 0;
> }
On 12/18/17 06:49, Jonathan Woithe wrote:
> Resend to netdev. LKML CCed in case anyone in the wider kernel community
> can suggest a way forward. Please CC responses if replying only to LKML.
>
> It seems that this 4+ year old regression in the r8169 driver (documented in
> this thread on netdev beginning on 9 March 2013) will never be fixed,
> despite the identification of the commit which broke it. Cards using this
> driver will therefore remain unusable for certain workloads utilising UDP.
(snip)
Hi,
Since I've seen your postings several times now with no comment or resolution
I've decided to try your reproducer on my own systems. In short, I cannot
reproduce any packet loss, despite having 2 (cheap) 1Gb switches between the
two machines. Both are running 4.14.7.
Both NICs are onboard PCIe and identify as:
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
$ ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl8168e-3_0.0.4 03/27/12
..
Both machines are from 2012, so quite dated already. Nevertheless your
reproducer runs forever and all I see is 6 bytes request, 14 bytes response,
with no drops. Not one. I tried in both directions - no difference.
I realize this doesn't actually solve your immediate problem, but it is
nevertheless an indicator that whatever you have been observing is caused
by something else.
regards,
Holger
Hi Holger
On Mon, Dec 18, 2017 at 02:38:53PM +0100, Holger Hoffstätte wrote:
> On 12/18/17 06:49, Jonathan Woithe wrote:
> > Resend to netdev. LKML CCed in case anyone in the wider kernel community
> > can suggest a way forward. Please CC responses if replying only to LKML.
> >
> > It seems that this 4+ year old regression in the r8169 driver (documented in
> > this thread on netdev beginning on 9 March 2013) will never be fixed,
> > despite the identification of the commit which broke it. Cards using this
> > driver will therefore remain unusable for certain workloads utilising UDP.
> (snip)
>
> Since I've seen your postings several times now with no comment or resolution
> I've decided to try your reproducer on my own systems. In short, I cannot
> reproduce any packet loss, despite having 2 (cheap) 1Gb switches between the
> two machines. Both are running 4.14.7.
Thanks for trying the test program on your system. The result indicates
that the problem might be specific to the behaviour of a particular
variant of the r8169 chip. The systems we use are all equipped with a
PCI Netgear GA311 card, which identifies as
05:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
Gigabit Ethernet (rev 10)
Subsystem: Netgear GA311
Respective IDs are
05:01.0 0200: 10ec:8169 (rev 10)
Subsystem: 1385:311a
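
For anyone comparing cards, the numeric IDs above are what distinguish this
PCI card from other parts handled by the same driver. The following is a
minimal sketch, not part of the original report, of parsing one `lspci -n`
device line in C. The struct and function names are made up for illustration,
and the sketch only accepts lines that carry a "(rev xx)" suffix.

```c
#include <stdio.h>

/* Hypothetical container for the fields of one `lspci -n` device line. */
struct pci_ids {
    char slot[16];           /* bus:device.function, e.g. "05:01.0" */
    unsigned int class_code; /* 0x0200 = Ethernet controller */
    unsigned int vendor;     /* 0x10ec = Realtek */
    unsigned int device;     /* 0x8169 for the card described above */
    unsigned int rev;        /* revision; lspci prints it in hex */
};

/*
 * Parse a line such as "05:01.0 0200: 10ec:8169 (rev 10)".
 * Returns 0 on success, -1 if the line does not have that shape
 * (this includes device lines printed without a "(rev xx)" suffix).
 */
int parse_lspci_n(const char *line, struct pci_ids *ids)
{
    int n = sscanf(line, "%15s %x: %x:%x (rev %x)",
                   ids->slot, &ids->class_code,
                   &ids->vendor, &ids->device, &ids->rev);
    return (n == 5) ? 0 : -1;
}

/* Nonzero if the IDs match the PCI chip reported above (10ec:8169). */
int is_rtl8169_pci(const struct pci_ids *ids)
{
    return ids->vendor == 0x10ec && ids->device == 0x8169;
}
```

Passing the device line quoted above to parse_lspci_n() and then calling
is_rtl8169_pci() on the result should report a match; the "Subsystem:" line
is deliberately rejected because it has a different layout.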
> Both NICs are onboard PCIe
This is a significant difference between your test systems and ours: the
cards we are using are PCI and are not onboard.
> Nevertheless your reproducer runs forever and all I see is 6 bytes
> request, 14 bytes response, with no drops. Not one. I tried in both
> directions - no difference.
That's very interesting. On the system noted above with the GA311 the
packet sequence certainly works most of the time. However, within an hour
the 14 byte response will not be seen by the system which sent the 6 byte
request. The slave sees the 6 byte request and sends the 14 byte response:
the problem is in the master (the system sending the 6 byte request). The
NIC in the slave or kernel version running on the slave does not affect the
result.
> I realize this doesn't actually solve your immediate problem, but it is
> nevertheless an indicator that whatever you have been observing is caused
> by something else.
The inability to trigger the problem on your systems could be due to the
NICs in use. That is an obvious difference between our system (which
reliably experiences the problem) and yours (which doesn't). This may
indicate that only certain variants of the r8169 chip are affected, which
obviously complicates things.
In any case, this tester (and the production program with which the problem
was first noticed) work perfectly until commit
da78dbff2e05630921c551dbbc70a4b7981a8fff (identified with git bisect).
Furthermore, when the pre-da78dbff...981a8fff driver was ported to 4.3 as a
test the problem was resolved, verified over a week of continuous testing;
the standard 4.3 reliably triggered the problem within minutes. Of course
the ported driver isn't a viable long term solution since it's essentially
an out of tree driver.
It's hard to see how this problem is unrelated to da78dbff...981a8fff.
Before this commit, everything worked fine. While keeping everything else on
the system unchanged, applying this single commit to the r8169 driver causes
the problem.
Thank you again for running the tests.
Regards
jonathan
Hi again
This is a follow-up to my earlier message.
On Tue, Dec 19, 2017 at 09:02:25AM +1030, Jonathan Woithe wrote:
> On Mon, Dec 18, 2017 at 02:38:53PM +0100, Holger Hoffstätte wrote:
> > Since I've seen your postings several times now with no comment or resolution
> > I've decided to try your reproducer on my own systems. In short, I cannot
> > reproduce any packet loss, despite having 2 (cheap) 1Gb switches between the
> > two machines. Both are running 4.14.7.
>
> Thanks for trying the test program on your system. The result indicates
> that the problem might be specific to the behaviour of a particular
> variant of the r8169 chip.
I was able to temporarily acquire a PCIe card which uses the r8169 driver.
This allowed me to run the reproducer on the same machine with two different
r8169-based cards. The original NIC is this:
05:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
Gigabit Ethernet (rev 10) [10ec:8169]
Subsystem: Netgear GA311 [1385:311a]
The PCIe card is this:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B
PCI Express Gigabit Ethernet controller (rev 06) [10ec:8168]
Subsystem: Realtek Semiconductor Co., Ltd. Device 0123 [10ec:0123]
The test was conducted with kernel 4.3.0 since both the 4.3.0 driver (which
triggers the fault) and the forward ported driver (which predates commit
da78dbff2e05630921c551dbbc70a4b7981a8fff) were available. For the record,
the machine used as the slave in these tests (the one receiving the 6 byte
request and sending the 14 byte response) was using its onboard NIC:
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network
Connection (rev 05) [8086:1503]
Subsystem: Gigabyte Technology Co., Ltd 82579V Gigabit Network
Connection [1458:e000]
Test outcomes were as follows:
PCIe card, unpatched 4.3.0 r8169 driver: no error (tested for 1 hour)
PCIe card, forward ported r8169 driver: no error (tested for 1 hour)
GA311 card, unpatched 4.3.0 r8169 driver: test fail in under 4 minutes
GA311 card, forward ported r8169 driver: no error (tested for 1 hour)
For completeness, I then booted 4.14 and repeated the test with its r8169
driver. The PCIe card ran for an hour without triggering the error, while
the GA311 triggered it quickly (in under 3 minutes).
This clearly indicates that not every card using the r8169 driver is
vulnerable to the problem. It also explains why Holger was unable to
reproduce the result on his system: the PCIe cards do not appear to suffer
from the problem. Most likely the PCI RTL-8169 chip is affected, but newer
PCIe variations are not. However, obviously more testing will be required
with a wider variety of cards if this inference is to hold up.
The above result (and those from Holger) allow the problem description to be
refined a little: changes in commit da78dbff2e05630921c551dbbc70a4b7981a8fff
cause GA311 NICs (and possibly other PCI cards using an RTL-8169) to have
trouble with small UDP packets, while PCIe variants are seemingly
unaffected.
Does this help?
Regards
jonathan
On Tue, Dec 19, 2017 at 04:15:32PM +1030, Jonathan Woithe wrote:
> This clearly indicates that not every card using the r8169 driver is
> vulnerable to the problem. It also explains why Holger was unable to
> reproduce the result on his system: the PCIe cards do not appear to suffer
> from the problem. Most likely the PCI RTL-8169 chip is affected, but newer
> PCIe variations are not. However, obviously more testing will be required
> with a wider variety of cards if this inference is to hold up.
The r8169 driver supports many slightly different variants of the chip.
To identify your variant more precisely, look for a line like
r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90003135000, d4:3d:7e:2a:30:08, XID 0c900800 IRQ 38
in kernel log.
Michal Kubecek
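
When collecting reports from several machines it can help to pull the chip
name and XID out of that probe line mechanically. The following is a minimal
sketch in C, assuming the line has the layout shown above and has been joined
into a single line first; the struct and function names are hypothetical and
not part of the driver.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of interest in the probe line. */
struct r8169_probe_info {
    char chip[64];    /* variant name, e.g. "RTL8168evl/8111evl" */
    unsigned int xid; /* XID value, printed in hex by the driver */
    int irq;
};

/*
 * Parse a line of the form
 *   r8169 <pci address> <ifname>: <chip> at <addr>, <mac>, XID <hex> IRQ <n>
 * Any text before "r8169 " (such as a dmesg timestamp) is skipped.
 * Returns 0 on success, -1 if the line does not match this layout.
 */
int parse_r8169_probe(const char *line, struct r8169_probe_info *info)
{
    const char *p = strstr(line, "r8169 ");
    const char *x;

    if (p == NULL)
        return -1;
    /* Skip the PCI address and interface name, then read the chip name. */
    if (sscanf(p, "r8169 %*s %*[^:]: %63s", info->chip) != 1)
        return -1;
    x = strstr(p, "XID ");
    if (x == NULL)
        return -1;
    if (sscanf(x, "XID %x IRQ %d", &info->xid, &info->irq) != 2)
        return -1;
    return 0;
}
```

Feeding each machine's probe line through parse_r8169_probe() gives a chip
name and XID that can be compared directly between affected and unaffected
cards.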
On Tue, Dec 19, 2017 at 01:25:23PM +0100, Michal Kubecek wrote:
> On Tue, Dec 19, 2017 at 04:15:32PM +1030, Jonathan Woithe wrote:
> > This clearly indicates that not every card using the r8169 driver is
> > vulnerable to the problem. It also explains why Holger was unable to
> > reproduce the result on his system: the PCIe cards do not appear to suffer
> > from the problem. Most likely the PCI RTL-8169 chip is affected, but newer
> > PCIe variations are not. However, obviously more testing will be required
> > with a wider variety of cards if this inference is to hold up.
>
> The r8169 driver supports many slightly different variants of the chip.
> To identify your variant more precisely, look for a line like
>
> r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90003135000, d4:3d:7e:2a:30:08, XID 0c900800 IRQ 38
>
> in kernel log.
The PCIe card (the one which works correctly with the current driver) shows
this:
r8169 0000:02:00.0 eth0: RTL8168e/8111e at 0xf862e000, 80:1f:02:45:25:a4,
XID 0c200000 IRQ 30
r8169 0000:02:00.0 eth0: jumbo features [frames: 9200 bytes,
tx checksumming: ko]
The PCI card (Netgear GA311) which is affected by the problem shows this:
r8169 0000:05:01.0 eth1: RTL8110s at 0xf8706800, e0:91:f5:1b:5f:c6,
XID 04000000 IRQ 22
r8169 0000:05:01.0 eth1: jumbo features [frames: 7152 bytes,
tx checksumming: ok]
The system which has shown the regressed behaviour is running a 32-bit
kernel; for various reasons we can't move to a 64-bit kernel at present.
However, I was able to boot this system using Slackware 14.2 install discs,
and therefore test using both 32-bit and 64-bit 4.4.14 kernels. In both
cases the fault was observed within 30 minutes of starting the tests when
the GA311 card was in use. The fault is therefore not specific to 32-bit
environments.
Regards
jonathan
On Wed, Dec 20, 2017 at 03:50:11PM +1030, Jonathan Woithe wrote:
> On Tue, Dec 19, 2017 at 01:25:23PM +0100, Michal Kubecek wrote:
> > On Tue, Dec 19, 2017 at 04:15:32PM +1030, Jonathan Woithe wrote:
> > > This clearly indicates that not every card using the r8169 driver is
> > > vulnerable to the problem. It also explains why Holger was unable to
> > > reproduce the result on his system: the PCIe cards do not appear to suffer
> > > from the problem. Most likely the PCI RTL-8169 chip is affected, but newer
> > > PCIe variations are not. However, obviously more testing will be required
> > > with a wider variety of cards if this inference is to hold up.
> >
> > The r8169 driver supports many slightly different variants of the chip.
> > To identify your variant more precisely, look for a line like
> >
> > r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90003135000, d4:3d:7e:2a:30:08, XID 0c900800 IRQ 38
> >
> > in kernel log.
>
> The PCIe card (the one which works correctly with the current driver) shows
> this:
>
> r8169 0000:02:00.0 eth0: RTL8168e/8111e at 0xf862e000, 80:1f:02:45:25:a4,
> XID 0c200000 IRQ 30
> r8169 0000:02:00.0 eth0: jumbo features [frames: 9200 bytes,
> tx checksumming: ko]
>
> The PCI card (Netgear GA311) which is affected by the problem shows this:
>
> r8169 0000:05:01.0 eth1: RTL8110s at 0xf8706800, e0:91:f5:1b:5f:c6,
> XID 04000000 IRQ 22
> r8169 0000:05:01.0 eth1: jumbo features [frames: 7152 bytes,
> tx checksumming: ok]
>
> The system which has shown the regressed behaviour is running a 32-bit
> kernel; for various reasons we can't move to a 64-bit kernel at present.
> However, I was able to boot this system using Slackware 14.2 install discs,
> and therefore test using both 32-bit and 64-bit 4.4.14 kernels. In both
> cases the fault was observed within 30 minutes of starting the tests when
> the GA311 card was in use. The fault is therefore not specific to 32-bit
> environments.
Is there any more information that can be provided (or tests done) to assist
in tracking this problem down? Based on the tests done in December it seems
that the problem only affects specific RTL-8169 variants, with most being
ok. Is it the case that we simply need to accept that for the greater good
commit da78dbff2e05630921c551dbbc70a4b7981a8fff has permanently broken
Netgear GA311 [1] network cards with respect to these UDP packets and that
nothing can be done?
Regards
jonathan
[1] Or perhaps any using the RTL8110s variant.
On Mon, Jan 15, 2018 at 05:26:59PM +1030, Jonathan Woithe wrote:
> Is there any more information that can be provided (or tests done) to assist
> in tracking this problem down? Based on the tests done in December it seems
> that the problem only affects specific RTL-8169 variants, with most being
> ok. Is it the case that we simply need to accept that for the greater good
> commit da78dbff2e05630921c551dbbc70a4b7981a8fff has permanently broken
> Netgear GA311 [1] network cards with respect to these UDP packets and that
> nothing can be done?
For future reference, commit 6b839b6cf9eada30b086effb51e5d6076bafc761
("r8169: fix NAPI handling under high load") appears to have fixed the
regression documented by this thread. Thanks to Heiner Kallweit for the
work which led to this solution.
Regards
jonathan
> [1] Or perhaps any using the RTL8110s variant.