2023-08-04 18:08:37

by Nick Bowler

Subject: PROBLEM: Broken or delayed ethernet on Xilinx ZCU104 since 5.18 (regression)

Hi,

With recent kernels (5.18 and newer), ethernet on my ZCU104 board is
either broken or badly delayed.

The behaviour is not consistent across the kernel versions I tested
during bisection, so maybe there is more than one issue with the ethernet?

6.5-rc4: after 10 seconds, the following message is printed:

[ 10.761808] platform ff0e0000.ethernet: deferred probe pending

but the network device seemingly never appears (I waited about a minute).
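
For comparing boots across kernel versions, the message above can be
picked out of a saved log mechanically. This is only a rough sketch,
assuming the log was captured as plain text (for example from dmesg or
the serial console); the function name find_deferred_probes is made up
for illustration and the pattern is based solely on the one line quoted
above.

```python
import re

# Matches kernel log lines of the form quoted in this report, e.g.
#   [ 10.761808] platform ff0e0000.ethernet: deferred probe pending
# Anything after "pending" is ignored.
DEFERRED_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+platform\s+(?P<dev>\S+):\s+deferred probe pending"
)


def find_deferred_probes(log_text):
    """Return a list of (timestamp_seconds, device_name) tuples.

    Hypothetical helper: scans each line of a captured kernel log and
    keeps those reporting a pending deferred probe.
    """
    hits = []
    for line in log_text.splitlines():
        match = DEFERRED_RE.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), match.group("dev")))
    return hits
```

Feeding it the log from each tested kernel shows at a glance which
versions report ff0e0000.ethernet as deferred and when.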

6.1 and 6.4: after 10 seconds, the device suddenly appears and starts
working (but this is way too late).

5.18: the device never appears and no unusual messages are printed
(I waited ten minutes).

With 5.17 and earlier versions, the eth0 device appears without any delay.

Unfortunately, as the bisection closed in on the problematic range, all
the kernels I built became untestable because they appear to crash during
early boot. Nevertheless, I manually selected a commit that sounded relevant:

commit e461bd6f43f4e568f7436a8b6bc21c4ce6914c36
Author: Robert Hancock <[email protected]>
Date:   Thu Jan 27 10:37:36 2022 -0600

    arm64: dts: zynqmp: Added GEM reset definitions

Reverting this fixes the problem on 5.18, 6.1 and 6.4. In all of
these versions, with this change reverted, the network device appears
without delay.

Unfortunately, it seems this is not sufficient to correct the problem on
6.5-rc4 -- there is no apparent change in behaviour, so maybe there is
a new, different problem?

I guess I can kick off another bisection to find out when this revert
stops fixing things...
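
For that second bisection, each step needs the revert applied on top of
the commit being tested before building. The following is only a sketch
of how that could be scripted, not something I have run on this tree:
apply_revert and discard_revert are hypothetical helper names, and the
"run" parameter exists only so that the git calls can be swapped out.

```python
import subprocess

# Commit from this report; reverting it fixes 5.18, 6.1 and 6.4.
GEM_RESET_COMMIT = "e461bd6f43f4e568f7436a8b6bc21c4ce6914c36"


def apply_revert(commit=GEM_RESET_COMMIT, run=subprocess.run):
    """Revert `commit` in the working tree without creating a commit.

    Returns True if git reports success. If the revert fails (for
    example because of conflicts), the tree is reset and False is
    returned so the caller can skip this bisection step.
    """
    result = run(["git", "revert", "--no-commit", commit])
    if result.returncode != 0:
        run(["git", "reset", "--hard"])
        return False
    return True


def discard_revert(run=subprocess.run):
    """Throw away the uncommitted revert before marking the step."""
    run(["git", "reset", "--hard"])
```

The idea would be to start with "git bisect start v6.5-rc4 v6.4"
(assuming those tags match the kernels tested above), then at each step
call apply_revert(), build and boot, call discard_revert(), and mark the
step with "git bisect good" or "git bisect bad" depending on whether the
network device appeared without delay. Steps where the revert does not
apply can be passed over with "git bisect skip".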

Let me know if you need any more info!

Thanks,
Nick