Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47548C6FD1C for ; Tue, 14 Mar 2023 12:42:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230450AbjCNMmd (ORCPT ); Tue, 14 Mar 2023 08:42:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55086 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230207AbjCNMmW (ORCPT ); Tue, 14 Mar 2023 08:42:22 -0400 Received: from mout.kundenserver.de (mout.kundenserver.de [212.227.126.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4FC89CBFF; Tue, 14 Mar 2023 05:41:51 -0700 (PDT) Received: from maxwell ([109.43.51.107]) by mrelayeu.kundenserver.de (mreue009 [213.165.67.97]) with ESMTPSA (Nemesis) id 1MQ5nE-1ppD5d3FTc-00M497; Tue, 14 Mar 2023 13:40:57 +0100 References: <87h6vd64xa.fsf@henneberg-systemdesign.com> <87cz6164ld.fsf@henneberg-systemdesign.com> User-agent: mu4e 1.8.14; emacs 28.2 From: Jochen Henneberg To: Giuseppe Cavallaro , Alexandre Torgue , Jose Abreu , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Maxime Coquelin , Ong Boon Leong , netdev@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: Issue: stmmac reset by netdev watchdog due to tx queue timeout Date: Tue, 14 Mar 2023 13:40:11 +0100 In-reply-to: <87cz6164ld.fsf@henneberg-systemdesign.com> Message-ID: <87cz5bh5ew.fsf@henneberg-systemdesign.com> MIME-Version: 1.0 Content-Type: text/plain X-Provags-ID: V03:K1:XAsVuPhV29HoiAzPmLttekv+iaGxoYm51el+jvzZKKToSJAFHgX 3zmDFHshyINcsOvD+7quB22wu6jqcMs8KkU9M9fnuf++vBrK1OebDGcz0ClpVceVklNJr8p HMOUULDBxbwpqwg96gLiDH+x1ibA5+gXDlr5SFJR89HkyYo5WOZ76ob0aaNorPZzXHBUbF5 RtmVnDHKnaSHJDqqnDuzQ== UI-OutboundReport: notjunk:1;M01:P0:awvgBpVUxmM=;3bXBDwrVzeIzgACWhUdf4lTfnY/ 0g0LKGYJsEddLRdRoNoFG6j1RwcVrGI/OsY5jKjfc7UlGaS+qFO89Pa8jpT3Z1dh+ZoCZMlZ3 z9BX3DXoax1lbe6fvBJ/a9/jDwKmH/Qzf/Z+TdwpOieaTMuqnG9yPzz9GAljedsKopfTXVWFx lUaHStIXSQ9cWoXKpUG4lsM7ChfOkJy9ymY63RJNt/7W1P3qTT13o5LwnICs2g6KQu74laBgj mVRq+N4858+YB4KsGFHAEGRie7oNgQLYxpgkDmYQT3wMin8ar2662CtcGWauyQeEy7rXfKDz/ vRWkyaVjEb90Br1HcAVfX1o66vv928o3fzWC90MafQv0mY0xpJ79blLhhM1xtEKBrh9isnMGk V8g1Xz24N4rrCz9SYgyUq3JjT84xn4lmV0NbaHrDSIhF2Fs88y4Pay5pQDdfCz7FwgC0DoDud CmMwJs+bpmWYnWRJZCqWpUB8vcO7IPxv504t3opyA8OMBEwP3B+HmUoPoZEpnb4LgUFC4cXsZ xsAYMzRzkrwXwaGR4StHAobcg9vOpuvbJTOnAuFX8x18fu+f/8pxoX3VDZK25HoVZH9SUSzIO Q0xrUWJzeqISVkJTQHtRsGvH7cLfava08JPjmcusyfZOTPiK5udeQmU0QBjHKPVWDajZ3Sdhz d+kjw/boAUffX0wUIBRA3X1nD59s8ZciUn5K00b2qQ== Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Jochen Henneberg writes: > Jochen Henneberg writes: > >> I have been debugging an issue with the stmmac network driver and the >> Intel Elkhart Lake SoC for quite some time now. The problem comes up >> when a port forwarding is configured with iptables NAT like this: >> >> iptables -t nat -A PREROUTING -p tcp --dport 222 \ >> -j DNAT --to-destination 192.168.178.134:22 >> iptables -t nat -A POSTROUTING -p tcp --dst 192.168.178.134 \ >> --dport 22 j SNAT --to-source 192.168.178.138 >> >> If I 'ssh -p 222 192.168.178.138' the sttmac is reset after some seconds >> with: >> >> [ 553.050018] NETDEV WATCHDOG: eno1 (intel-eth-pci): transmit queue 0 >> timed out >> [ 553.050048] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:525 >> dev_watchdog+0x23a/0x250 >> [ 553.050059] Modules linked in: nft_chain_nat xt_nat nf_nat >> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp nft_compat >> nf_tables libcrc32c nfnetlink snd_seq_dummy snd_hrtimer >> snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic >> ledtrig_audio snd_sof_pci_intel_tgl snd_sof_intel_hda_common >> soundwire_intel soundwire_generic_allocation soundwire_cadence >> snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils >> snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match binfmt_misc >> snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus >> snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi >> snd_hda_codec intel_rapl_msr snd_hda_core intel_rapl_common snd_hwdep >> x86_pkg_temp_thermal intel_powerclamp snd_pcm coretemp nls_iso8859_1 >> snd_seq_midi kvm_intel snd_seq_midi_event snd_rawmidi mei_hdcp i915 >> kvm mei_pxp snd_seq crct10dif_pclmul ghash_clmulni_intel >> snd_seq_device sha512_ssse3 drm_buddy ttm cmdlinepart aesni_intel >> snd_timer spi_nor crypto_simd >> [ 553.050187] drm_display_helper mtd cryptd snd intel_cstate cec >> rc_core joydev intel_wmi_thunderbolt drm_kms_helper soundcore >> i2c_algo_bit mei_me syscopyarea input_leds sysfillrect mei 8250_dw >> sysimgblt igen6_edac mac_hid intel_hid acpi_pad acpi_tad sparse_keymap >> msr parport_pc ppdev lp parport drm pstore_blk ramoops pstore_zone >> reed_solomon efi_pstore ip_tables x_tables autofs4 hid_logitech_hidpp >> hid_logitech_dj hid_generic usbhid uas hid usb_storage mxl_gpy >> polynomial mmc_block dwmac_intel spi_intel_pci i2c_i801 intel_ish_ipc >> gpio_kempld i2c_kempld crc32_pclmul spi_intel intel_ishtp i2c_smbus >> stmmac intel_lpss_pci ahci sdhci_pci xhci_pci intel_lpss >> xhci_pci_renesas pcs_xpcs cqhci libahci video phylink sdhci idma64 >> kempld_core i2c_scmi wmi pinctrl_elkhartlake(+) >> [ 553.050354] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D 6.2.0-rc8+ >> #22 >> [ 553.050360] Hardware name: Default string Default string/COMe-mEL10 >> E2, BIOS MEL1R904 11/02/2022 >> [ 553.050363] RIP: 0010:dev_watchdog+0x23a/0x250 >> [ 553.050370] Code: 00 e9 2b ff ff ff 48 89 df c6 05 de 01 83 01 01 e8 >> fb 25 f8 ff 44 89 f1 48 89 de 48 c7 c7 a8 cd c0 a6 48 89 c2 e8 24 53 >> 20 00 <0f> 0b e9 1c ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 >> 00 >> [ 553.050373] RSP: 0018:ffffbfc34018ce38 EFLAGS: 00010246 >> [ 553.050378] RAX: 0000000000000000 RBX: ffffa0d392b68000 RCX: >> 0000000000000000 >> [ 553.050381] RDX: 0000000000000000 RSI: 0000000000000000 RDI: >> 0000000000000000 >> [ 553.050384] RBP: ffffbfc34018ce68 R08: 0000000000000000 R09: >> 0000000000000000 >> [ 553.050386] R10: 0000000000000000 R11: 0000000000000000 R12: >> ffffa0d392b684c8 >> [ 553.050389] R13: ffffa0d392b6841c R14: 0000000000000000 R15: >> 0000000000000000 >> [ 553.050392] FS: 0000000000000000(0000) GS:ffffa0d4e4300000(0000) >> knlGS:0000000000000000 >> [ 553.050395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 553.050398] CR2: 000055da1a873000 CR3: 000000023fc10000 CR4: >> 0000000000350ee0 >> [ 553.050402] Call Trace: >> [ 553.050406] >> [ 553.050413] ? __pfx_dev_watchdog+0x10/0x10 >> [ 553.050419] call_timer_fn+0x29/0x160 >> [ 553.050425] ? __pfx_dev_watchdog+0x10/0x10 >> [ 553.050429] __run_timers+0x259/0x310 >> [ 553.050434] run_timer_softirq+0x1d/0x40 >> [ 553.050437] __do_softirq+0xd6/0x346 >> [ 553.050444] ? hrtimer_interrupt+0x11f/0x230 >> [ 553.050449] __irq_exit_rcu+0xa2/0xd0 >> [ 553.050455] irq_exit_rcu+0xe/0x20 >> [ 553.050459] sysvec_apic_timer_interrupt+0x92/0xd0 >> [ 553.050465] >> [ 553.050467] >> [ 553.050469] asm_sysvec_apic_timer_interrupt+0x1b/0x20 >> [ 553.050475] RIP: 0010:cpuidle_enter_state+0xde/0x6f0 >> [ 553.050481] Code: 7f 1b 5a e8 14 2f 4e ff 8b 53 04 49 89 c7 0f 1f 44 >> 00 00 31 ff e8 22 40 4d ff 80 7d d0 00 0f 85 eb 00 00 00 fb 0f 1f 44 >> 00 00 <45> 85 f6 0f 88 12 02 00 00 4d 63 ee 49 83 fd 09 0f 87 c7 04 00 >> 00 >> [ 553.050484] RSP: 0018:ffffbfc340113e38 EFLAGS: 00000246 >> [ 553.050489] RAX: 0000000000000000 RBX: ffffa0d4e433c930 RCX: >> 0000000000000000 >> [ 553.050492] RDX: 0000000000000002 RSI: 0000000000000000 RDI: >> 0000000000000000 >> [ 553.050494] RBP: ffffbfc340113e88 R08: 0000000000000000 R09: >> 0000000000000000 >> [ 553.050496] R10: 0000000000000000 R11: 0000000000000000 R12: >> ffffffffa76bb700 >> [ 553.050499] R13: 0000000000000003 R14: 0000000000000003 R15: >> 00000080c45910fe >> [ 553.050504] ? cpuidle_enter_state+0xce/0x6f0 >> [ 553.050508] cpuidle_enter+0x2e/0x50 >> [ 553.050512] do_idle+0x216/0x2a0 >> [ 553.050517] cpu_startup_entry+0x1d/0x20 >> [ 553.050521] start_secondary+0x122/0x160 >> [ 553.050527] secondary_startup_64_no_verify+0xe5/0xeb >> [ 553.050535] >> [ 553.050537] ---[ end trace 0000000000000000 ]--- >> [ 553.050584] intel-eth-pci 0000:00:1d.1 eno1: Reset adapter. >> [ 553.064639] intel-eth-pci 0000:00:1d.1 eno1: FPE workqueue stop >> [ 553.065416] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-0 >> [ 553.066003] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-1 >> [ 553.066469] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-2 >> [ 553.066933] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-3 >> [ 553.067428] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-4 >> [ 553.067895] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-5 >> [ 553.068466] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-6 >> [ 553.069025] intel-eth-pci 0000:00:1d.1 eno1: Register >> MEM_TYPE_PAGE_POOL RxQ-7 >> [ 553.082011] dwmac4: Master AXI performs any burst length >> [ 553.082094] intel-eth-pci 0000:00:1d.1 eno1: Enabling Safety Features >> [ 553.082138] intel-eth-pci 0000:00:1d.1 eno1: IEEE 1588-2008 Advanced >> Timestamp supported >> [ 553.082438] intel-eth-pci 0000:00:1d.1 eno1: registered PTP clock >> [ 553.082641] intel-eth-pci 0000:00:1d.1 eno1: FPE workqueue start >> [ 553.082649] intel-eth-pci 0000:00:1d.1 eno1: configuring for >> inband/sgmii link mode >> [ 553.083700] intel-eth-pci 0000:00:1d.1 eno1: Link is Up - 1Gbps/Full >> - flow control off >> >> This does not happen during normal ssh to the EHL board, this does not >> happen if I do port forwarding through userspace, e. g. with socat. And >> this does not happen if two independent network interfaces are used in >> the iptables rules. I have not observed the issue with other network >> chips with the same kernel version so I think the issue must be in the >> stmmac driver. >> >> What happens is that the OWNED bit of a DMA descriptor is not reset >> (detected in dwmac4_wrback_get_tx_status()) and the queue's txtimer is >> looping endlessly until the timeout resets everything. Sometimes the >> situation is solved before the timeout, most of the times it is not but >> i always takes long until the OWNED bit is reset. This can be reproduced >> with 100% success. >> >> I have tried to understand where the issue comes from and I think it may >> have something to do with the calls to dma_wmb() and wmb() (or more >> precisely with missing barriers) but so far I had no success to solve it >> and provide a patch. >> >> The kernel that I am running is a quite recent linux-net >> (b60417a9f2b8). I have also tried with linux-net-next without >> success. The issue can already be observed with v5.15.39. >> >> Any help or suggestion how I can debug this further would be >> appreciated. Or if somebody else can reproduce or not reproduce the >> issue on the given platform may help as well. >> >> Regards >> -Jochen > > Some more details about the devices involved: > > intel-eth-pci 0000:00:1d.1: enabling device (0000 -> 0002) > intel-eth-pci 0000:00:1d.1: stmmac_config_multi_msi: multi MSI enablement successful > intel-eth-pci 0000:00:1d.1: User ID: 0x51, Synopsys ID: 0x52 > intel-eth-pci 0000:00:1d.1: DWMAC4/5 > intel-eth-pci 0000:00:1d.1: DMA HW capability register supported > intel-eth-pci 0000:00:1d.1: RX Checksum Offload Engine supported > intel-eth-pci 0000:00:1d.1: TX Checksum insertion supported > intel-eth-pci 0000:00:1d.1: TSO supported > intel-eth-pci 0000:00:1d.1: Enable RX Mitigation via HW Watchdog Timer > intel-eth-pci 0000:00:1d.1: device MAC address 00:e0:4b:74:93:db > intel-eth-pci 0000:00:1d.1: Enabled L3L4 Flow TC (entries=2) > intel-eth-pci 0000:00:1d.1: Enabled RFS Flow TC (entries=10) > intel-eth-pci 0000:00:1d.1: Enabling HW TC (entries=256, max_off=256) > intel-eth-pci 0000:00:1d.1: TSO feature enabled > intel-eth-pci 0000:00:1d.1: Using 32 bits DMA width > Maxlinear Ethernet GPY115B stmmac-2:01: Firmware Version: 7.110 (0x876E) > Maxlinear Ethernet GPY115B stmmac-2:01: attached PHY driver (mii_bus:phy_addr=stmmac-2:01> > intel-eth-pci 0000:00:1d.1 eno1: renamed from eth0 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-0 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-1 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-2 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-3 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-4 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-5 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-6 > intel-eth-pci 0000:00:1d.1 eno1: Register MEM_TYPE_PAGE_POOL RxQ-7 > intel-eth-pci 0000:00:1d.1 eno1: Enabling Safety Features > intel-eth-pci 0000:00:1d.1 eno1: IEEE 1588-2008 Advanced Timestamp supported > intel-eth-pci 0000:00:1d.1 eno1: registered PTP clock > intel-eth-pci 0000:00:1d.1 eno1: FPE workqueue start > intel-eth-pci 0000:00:1d.1 eno1: configuring for inband/sgmii link mode > intel-eth-pci 0000:00:1d.1 eno1: Link is Up - 1Gbps/Full - flow control off > No further help needed. I have found the root cause and will provide a fix. -Jochen