2019-05-23 11:02:23

by Paul Menzel

[permalink] [raw]
Subject: r8169: Link only up after 16 s (A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.)

Dear Linux folks,


I optimized the Linux kernel configuration on my ASRock E350M1, and it now
boots really fast.

Unfortunately, that seems to cause the network driver to hit some corner
case, so that the link is supposedly down, although it should be up. The
cable is plugged in the whole time.

```
[ 2.990757] libphy: r8169: probed
[ 2.992661] r8169 0000:03:00.0 eth0: RTL8168e/8111e, bc:5f:f4:c8:d3:98, XID 2c2, IRQ 28
[ 2.992669] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
[ 3.294484] usb 5-2: new low-speed USB device number 2 using ohci-pci
[ 3.458711] usb 5-2: New USB device found, idVendor=1241, idProduct=1122, bcdDevice= 1.00
[ 3.458718] usb 5-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 3.485065] input: HID 1241:1122 as /devices/pci0000:00/0000:00:12.0/usb5/5-2/5-2:1.0/0003:1241:1122.0001/input/input14
[ 3.485320] hid-generic 0003:1241:1122.0001: input,hidraw0: USB HID v1.00 Mouse [HID 1241:1122] on usb-0000:00:12.0-2/input0
[ 3.967622] random: crng init done
[ 3.967628] random: 7 urandom warning(s) missed due to ratelimiting
[ 4.323449] r8169 0000:03:00.0 enp3s0: renamed from eth0
[ 4.363774] RTL8211DN Gigabit Ethernet r8169-300:00: attached PHY driver [RTL8211DN Gigabit Ethernet] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
[ 4.576887] r8169 0000:03:00.0 enp3s0: Link is Down
[ 4.577167] A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.
[ 16.377520] r8169 0000:03:00.0 enp3s0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 16.377553] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
```

It happens with all Linux kernels I tried. Please find all Linux
messages attached.

Could you please tell me, how this can be debugged and solved?


Kind regards,

Paul


Attachments:
20190516--asrock-e350m1-dmesg-r8169.txt (48.47 kB)
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature
Download all attachments

2019-05-23 17:47:49

by Heiner Kallweit

[permalink] [raw]
Subject: Re: r8169: Link only up after 16 s (A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.)

On 23.05.2019 13:00, Paul Menzel wrote:
> Dear Linux folks,
>
>
> I optimized the Linux kernel configuration on my ASRock E350M1, and it now
> boots really fast.
>
> Unfortunately, that seems to cause the network driver to hit some corner
> case, so that the link is supposedly down, although it should be up. The
> cable is plugged in the whole time.
>
> ```
> [ 2.990757] libphy: r8169: probed
> [ 2.992661] r8169 0000:03:00.0 eth0: RTL8168e/8111e, bc:5f:f4:c8:d3:98, XID 2c2, IRQ 28
> [ 2.992669] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
> [ 3.294484] usb 5-2: new low-speed USB device number 2 using ohci-pci
> [ 3.458711] usb 5-2: New USB device found, idVendor=1241, idProduct=1122, bcdDevice= 1.00
> [ 3.458718] usb 5-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> [ 3.485065] input: HID 1241:1122 as /devices/pci0000:00/0000:00:12.0/usb5/5-2/5-2:1.0/0003:1241:1122.0001/input/input14
> [ 3.485320] hid-generic 0003:1241:1122.0001: input,hidraw0: USB HID v1.00 Mouse [HID 1241:1122] on usb-0000:00:12.0-2/input0
> [ 3.967622] random: crng init done
> [ 3.967628] random: 7 urandom warning(s) missed due to ratelimiting
> [ 4.323449] r8169 0000:03:00.0 enp3s0: renamed from eth0
> [ 4.363774] RTL8211DN Gigabit Ethernet r8169-300:00: attached PHY driver [RTL8211DN Gigabit Ethernet] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
> [ 4.576887] r8169 0000:03:00.0 enp3s0: Link is Down
> [ 4.577167] A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.
> [ 16.377520] r8169 0000:03:00.0 enp3s0: Link is Up - 100Mbps/Full - flow control rx/tx
> [ 16.377553] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
> ```
>
> It happens with all Linux kernels I tried. Please find all Linux
> messages attached.
>
> Could you please tell me, how this can be debugged and solved?
>
This warning is triggered by errors in do_setlink() in net/core/rtnetlink.c
I'd say:
1. Which kernel config options did you change as part of the optimization?
(If I understand you correctly the warning didn't pop up before.)
2. Try to find out which call in do_setlink() fails and which errno is returned.

>
> Kind regards,
>
> Paul
>
Heiner

2019-05-24 15:16:20

by Paul Menzel

[permalink] [raw]
Subject: Re: r8169: Link only up after 16 s (A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.)

Dear Heiner,


Thank you for the quick reply.

On 05/23/19 19:44, Heiner Kallweit wrote:
> On 23.05.2019 13:00, Paul Menzel wrote:

>> I optimized the Linux kernel configuration on my ASRock E350M1, and it now
>> boots really fast.
>>
>> Unfortunately, that seems to cause the network driver to hit some corner
>> case, so that the link is supposedly down, although it should be up. The
>> cable is plugged in the whole time.
>>
>> ```
>> [ 2.990757] libphy: r8169: probed
>> [ 2.992661] r8169 0000:03:00.0 eth0: RTL8168e/8111e, bc:5f:f4:c8:d3:98, XID 2c2, IRQ 28
>> [ 2.992669] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>> [ 3.294484] usb 5-2: new low-speed USB device number 2 using ohci-pci
>> [ 3.458711] usb 5-2: New USB device found, idVendor=1241, idProduct=1122, bcdDevice= 1.00
>> [ 3.458718] usb 5-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
>> [ 3.485065] input: HID 1241:1122 as /devices/pci0000:00/0000:00:12.0/usb5/5-2/5-2:1.0/0003:1241:1122.0001/input/input14
>> [ 3.485320] hid-generic 0003:1241:1122.0001: input,hidraw0: USB HID v1.00 Mouse [HID 1241:1122] on usb-0000:00:12.0-2/input0
>> [ 3.967622] random: crng init done
>> [ 3.967628] random: 7 urandom warning(s) missed due to ratelimiting
>> [ 4.323449] r8169 0000:03:00.0 enp3s0: renamed from eth0
>> [ 4.363774] RTL8211DN Gigabit Ethernet r8169-300:00: attached PHY driver [RTL8211DN Gigabit Ethernet] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
>> [ 4.576887] r8169 0000:03:00.0 enp3s0: Link is Down
>> [ 4.577167] A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.
>> [ 16.377520] r8169 0000:03:00.0 enp3s0: Link is Up - 100Mbps/Full - flow control rx/tx
>> [ 16.377553] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
>> ```
>>
>> It happens with all Linux kernels I tried. Please find all Linux
>> messages attached.
>>
>> Could you please tell me, how this can be debugged and solved?
>>
> This warning is triggered by errors in do_setlink() in net/core/rtnetlink.c
> I'd say:
> 1. Which kernel config options did you change as part of the optimization?
> (If I understand you correctly the warning didn't pop up before.)

Sorry for being unclear. The same problem happens with Debian
Sid/unstable’s default Linux kernel 4.19.0-5 (4.19.37) [1]. With the fast
boot (six seconds from pressing power button to Weston) I just noticed, that
the network was not set up when wanting to use it.

> 2. Try to find out which call in do_setlink() fails and which errno
> is returned.

Yeah, that’s where I need help. ;-)

I applied the simple change below to `net/core/rtnetlink.c`.

if (err < 0)
- net_warn_ratelimited("A link change request failed with some changes committed already. Interface %s may have been left with an inconsistent configuration, please check.\n",
- dev->name);
+ net_warn_ratelimited("A link change request failed with some changes committed already (err = %i). Interface %s may have been left with an inconsistent configuration, please check.\n",
+ dev->name, err);

I get different results each time.

-304123904
-332128256

Any idea, how that can happen?

Is there a better way to debug `do_setlink()` in `rtnetlink.c` than
printf debugging?


Kind regards,

Paul


Attachments:
20190524--asrock-e350m1-linux-4.19.0-5-dmesg-r8169.txt (115.72 kB)
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature
Download all attachments

2019-05-24 17:58:37

by Heiner Kallweit

[permalink] [raw]
Subject: Re: r8169: Link only up after 16 s (A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.)

On 24.05.2019 17:14, Paul Menzel wrote:
> Dear Heiner,
>
>
> Thank you for the quick reply.
>
> On 05/23/19 19:44, Heiner Kallweit wrote:
>> On 23.05.2019 13:00, Paul Menzel wrote:
>
>>> I optimized the Linux kernel configuration on my ASRock E350M1, and it now
>>> boots really fast.
>>>
>>> Unfortunately, that seems to cause the network driver to hit some corner
>>> case, so that the link is supposedly down, although it should be up. The
>>> cable is plugged in the whole time.
>>>
>>> ```
>>> [ 2.990757] libphy: r8169: probed
>>> [ 2.992661] r8169 0000:03:00.0 eth0: RTL8168e/8111e, bc:5f:f4:c8:d3:98, XID 2c2, IRQ 28
>>> [ 2.992669] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>> [ 3.294484] usb 5-2: new low-speed USB device number 2 using ohci-pci
>>> [ 3.458711] usb 5-2: New USB device found, idVendor=1241, idProduct=1122, bcdDevice= 1.00
>>> [ 3.458718] usb 5-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
>>> [ 3.485065] input: HID 1241:1122 as /devices/pci0000:00/0000:00:12.0/usb5/5-2/5-2:1.0/0003:1241:1122.0001/input/input14
>>> [ 3.485320] hid-generic 0003:1241:1122.0001: input,hidraw0: USB HID v1.00 Mouse [HID 1241:1122] on usb-0000:00:12.0-2/input0
>>> [ 3.967622] random: crng init done
>>> [ 3.967628] random: 7 urandom warning(s) missed due to ratelimiting
>>> [ 4.323449] r8169 0000:03:00.0 enp3s0: renamed from eth0
>>> [ 4.363774] RTL8211DN Gigabit Ethernet r8169-300:00: attached PHY driver [RTL8211DN Gigabit Ethernet] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
>>> [ 4.576887] r8169 0000:03:00.0 enp3s0: Link is Down
>>> [ 4.577167] A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.
>>> [ 16.377520] r8169 0000:03:00.0 enp3s0: Link is Up - 100Mbps/Full - flow control rx/tx
>>> [ 16.377553] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
>>> ```
>>>
>>> It happens with all Linux kernels I tried. Please find all Linux
>>> messages attached.
>>>
>>> Could you please tell me, how this can be debugged and solved?
>>>
>> This warning is triggered by errors in do_setlink() in net/core/rtnetlink.c
>> I'd say:
>> 1. Which kernel config options did you change as part of the optimization?
>> (If I understand you correctly the warning didn't pop up before.)
>
> Sorry for being unclear. The same problem happens with Debian
> Sid/unstable’s default Linux kernel 4.19.0-5 (4.19.37) [1]. With the fast
> boot (six seconds from pressing power button to Weston) I just noticed, that
> the network was not set up when wanting to use it.
>
>> 2. Try to find out which call in do_setlink() fails and which errno
>> is returned.
>
> Yeah, that’s where I need help. ;-)
>
> I applied the simple change below to `net/core/rtnetlink.c`.
>
> if (err < 0)
> - net_warn_ratelimited("A link change request failed with some changes committed already. Interface %s may have been left with an inconsistent configuration, please check.\n",
> - dev->name);
> + net_warn_ratelimited("A link change request failed with some changes committed already (err = %i). Interface %s may have been left with an inconsistent configuration, please check.\n",
> + dev->name, err);
>
> I get different results each time.
>
> -304123904
> -332128256
>
> Any idea, how that can happen?
>
Instead of %i you should use %d, and the order of arguments needs to be reversed.
But this won't help you to find out which call in do_setlink() failed.
There are several occurrences of code like this:

err = fct();
if (err < 0)
goto errout;

Best change each such occurrence to the following to find out which call failed.

err = fct();
if (err < 0) {
pr_err("do_setlink: <fct>: err %d\n", err)
goto errout;
}

> Is there a better way to debug `do_setlink()` in `rtnetlink.c` than
> printf debugging?
>
Not really
>
> Kind regards,
>
> Paul
>

2019-05-24 18:07:46

by Joe Perches

[permalink] [raw]
Subject: Re: r8169: Link only up after 16 s (A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.)

On Fri, 2019-05-24 at 19:55 +0200, Heiner Kallweit wrote:
> On 24.05.2019 17:14, Paul Menzel wrote:
> > I applied the simple change below to `net/core/rtnetlink.c`.
> >
> > if (err < 0)
> > - net_warn_ratelimited("A link change request failed with some changes committed already. Interface %s may have been left with an inconsistent configuration, please check.\n",
> > - dev->name);
> > + net_warn_ratelimited("A link change request failed with some changes committed already (err = %i). Interface %s may have been left with an inconsistent configuration, please check.\n",
> > + dev->name, err);
> >
> > I get different results each time.
> >
> > -304123904
> > -332128256
> >
> > Any idea, how that can happen?
> >
> Instead of %i you should use %d, and the order of arguments needs to be reversed.

Doesn't the patch generate a compilation warning?