Message-ID: <1e89db6c-f992-e748-0b97-461e23f3c25f@gmail.com>
Date: Fri, 11 Feb 2022 19:03:15 +0100
To: Corentin Labbe, swsd@realtek.com, davem@davemloft.net, kuba@kernel.org, thierry.reding@gmail.com, jonathanh@nvidia.com
Cc: linux-tegra@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
From: Heiner Kallweit
Subject: Re: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
X-Mailing-List: linux-kernel@vger.kernel.org

On 11.02.2022 11:32, Corentin Labbe wrote:
> Hello
>
> On my tegra124-jetson-tk1, I always got:
> [ 1311.064826] ------------[ cut here ]------------
> [ 1311.064880] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x2fc/0x300
> [ 1311.064976] NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
> [ 1311.065011] Modules linked in:
> [ 1311.065074] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.7-dirty #7
> [ 1311.065116] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [ 1311.065177] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [ 1311.065253] [] (show_stack) from [] (dump_stack_lvl+0x40/0x4c)
> [ 1311.065322] [] (dump_stack_lvl) from [] (__warn+0xd0/0x12c)
> [ 1311.065379] [] (__warn) from [] (warn_slowpath_fmt+0x90/0xb4)
> [ 1311.065434] [] (warn_slowpath_fmt) from [] (dev_watchdog+0x2fc/0x300)
> [ 1311.065493] [] (dev_watchdog) from [] (call_timer_fn+0x34/0x1a8)
> [ 1311.065554] [] (call_timer_fn) from [] (__run_timers.part.0+0x22c/0x328)
> [ 1311.065599] [] (__run_timers.part.0) from [] (run_timer_softirq+0x38/0x68)
> [ 1311.065648] [] (run_timer_softirq) from [] (__do_softirq+0x124/0x3cc)
> [ 1311.065732] [] (__do_softirq) from [] (irq_exit+0xa4/0xd4)
> [ 1311.065818] [] (irq_exit) from [] (__irq_svc+0x50/0x80)
> [ 1311.065860] Exception stack(0xc1101ed8 to 0xc1101f20)
> [ 1311.065884] 1ec0: 00000000 00000001
> [ 1311.065913] 1ee0: c110a800 00000060 00000001 eed889f8 c121eaa0 418a949d 00000001 00000131
> [ 1311.065940] 1f00: 00000001 00000131 00000000 c1101f28 c08bbe20 c08bbee8 60000113 ffffffff
> [ 1311.065962] [] (__irq_svc) from [] (cpuidle_enter_state+0x270/0x480)
> [ 1311.066031] [] (cpuidle_enter_state) from [] (cpuidle_enter+0x50/0x54)
> [ 1311.066078] [] (cpuidle_enter) from [] (do_idle+0x1e0/0x298)
> [ 1311.066133] [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c)
> [ 1311.066174] [] (cpu_startup_entry) from [] (start_kernel+0x678/0x6bc)
> [ 1311.066242] ---[ end trace 3df1a997f30c7eb8 ]---
> [ 1311.083269] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [ 2671.118597] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [27521.391461] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [47441.629280] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [49046.691475] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [53081.713430] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [55101.737951] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [59351.771382] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [60491.797371] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [61351.805499] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [69631.911327] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [71246.958267] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [86522.110241] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [88507.174307] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [104612.315286] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> [132797.695339] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
>
> This happen since at least 5.10.
> Any idea on how to debug this ?
>

For whatever reason the chip locked up, which results in the tx timeout
and the subsequent rtl_rxtx_empty_cond == 0 message. However, the chip
soft reset in the timeout handler seems to help.

Typically these timeouts are hard to debug because there are no public
datasheets or errata information.

A few questions:
- Is this a mainline or a downstream kernel?
- Full dmesg log would help (e.g. to identify the exact chip version).
- Does the issue correlate with specific activity or specific types of traffic?
- Is the interface operating in promiscuous mode (e.g. part of a bridge)?

As a first step you could try disabling all hw offloading / ASPM / EEE.

There's also a small chance that the issue is linked to a specific link
partner. So you could test whether the issue persists with another switch
in between, or with a different link partner.

Are you aware of any earlier kernel version where the issue did not happen?
Then you could bisect.

> Regards
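
The "disable all hw offloading / ASPM / EEE" step could be scripted roughly as below. This is only a sketch and not something from the thread. It assumes the interface name (enp1s0) and PCI address (0000:01:00.0) shown in the log, that ethtool is installed, and that the running kernel exposes the per-device ASPM attributes under /sys/bus/pci/devices/<addr>/link/ (if it does not, booting with pcie_aspm=off is an alternative). The list of offload keywords is a guess at a reasonable starting set, and the driver may refuse to change some of them. The helper names are made up for illustration. By default the script only prints the commands; actually running them needs root.

```python
import shlex
import subprocess

IFACE = "enp1s0"            # interface name taken from the log (assumption for other systems)
PCI_ADDR = "0000:01:00.0"   # PCI address taken from the log (assumption for other systems)

# Short ethtool -K keywords to switch off; an assumed starting set, adjust as needed.
OFFLOADS = ["rx", "tx", "sg", "tso", "gso", "gro"]


def build_commands(iface=IFACE, pci_addr=PCI_ADDR):
    """Return argv lists that disable offloads, EEE and ASPM for one NIC."""
    offload_args = []
    for feature in OFFLOADS:
        offload_args += [feature, "off"]
    # Per-device ASPM knobs; assumed to be present on the running kernel.
    link_dir = f"/sys/bus/pci/devices/{pci_addr}/link"
    return [
        ["ethtool", "-K", iface] + offload_args,
        ["ethtool", "--set-eee", iface, "eee", "off"],
        ["sh", "-c", f"echo 0 > {link_dir}/l0s_aspm"],
        ["sh", "-c", f"echo 0 > {link_dir}/l1_aspm"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=False: keep going even if one setting is rejected by the driver.
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    run(build_commands())
```

If the timeouts stop with everything disabled, re-enabling the settings one at a time should narrow down which of them is involved.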