Date: Wed, 19 Aug 2020 10:29:09 -0700
From: Jesse Brandeburg
To: Steven Rostedt
Cc: Naresh Kamboju, linux-stable, open list, Netdev, "Greg Kroah-Hartman", Sasha Levin, Masami Hiramatsu, Leo Yan, Jamal Hadi Salim, Cong Wang, Jiri Pirko, "David S.
Miller", "Jakub Kicinski", LTP List
Subject: Re: NETDEV WATCHDOG: WARNING: at net/sched/sch_generic.c:442 dev_watchdog
Message-ID: <20200819102909.000016ac@intel.com>
In-Reply-To: <20200819125732.1c296ce7@oasis.local.home>
References: <20200819125732.1c296ce7@oasis.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

Steven Rostedt wrote:

> On Wed, 19 Aug 2020 17:01:06 +0530
> Naresh Kamboju wrote:
>
> > kernel warning noticed on x86_64 while running LTP tracing ftrace-stress-test
> > case. started noticing on the stable-rc linux-5.8.y branch.
> >
> > This device booted with KASAN config and DYNAMIC tracing configs and more.
> > This reported issue is not easily reproducible.
> >
> > metadata:
> > git branch: linux-5.8.y
> > git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > git commit: ad8c735b1497520df959f675718f39dca8cb8019
> > git describe: v5.8.2
> > make_kernelversion: 5.8.2
> > kernel-config:
> > https://builds.tuxbuild.com/bOz0eAwkcraRiWALTW9D3Q/kernel.config
> >
> >
> > [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait,
> > stat_blocked and stat_runtime require the kernel parameter
> > schedstats=enable or kernel.sched_schedstats=1
> > [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait,
> > stat_blocked and stat_runtime require the kernel parameter
> > schedstats=enable or kernel.sched_schedstats=1
> > [ 107.507991] ------------[ cut here ]------------
> > [ 107.513103] NETDEV WATCHDOG: eth0 (igb): transmit queue 2 timed out
> > [ 107.519973] WARNING: CPU: 1 PID: 331 at net/sched/sch_generic.c:442
> > dev_watchdog+0x4c7/0x4d0
> > [ 107.528907] Modules linked in: x86_pkg_temp_thermal
> > [ 107.534541] CPU: 1 PID: 331 Comm: systemd-journal Not tainted 5.8.2 #1
> > [ 107.541480] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.2 05/23/2018
> > [ 107.549314] RIP: 0010:dev_watchdog+0x4c7/0x4d0
> > [ 107.554226] Code: ff ff 48 8b 5d c8 c6 05 6d f7 94 01 01 48 89 df
> > e8 9e b4 f8 ff 44 89 e9 48 89 de 48 c7 c7 20 49 51 9c 48 89 c2 e8 91
> > 7e e9 fe <0f> 0b e9 03 ff ff ff 66 90 e8 9b 23 db fe 55 48 89 e5 41 57
>
> I've triggered this myself in my testing, and I assumed that adding the
> overhead of tracing and here KASAN too, made some watchdog a bit
> unhappy. By commenting out the warning, I've seen no ill effects.
>
> Perhaps this is something we need to dig a bit deeper into.

I looked into it a little: igb uses a transmit timeout of 5 seconds, and
the stack prints the warning if we haven't completed the transmit in
that time.

What I don't understand in the stack trace is this:

> > [ 107.654661] Call Trace:
> > [ 107.657735]
> > [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0
> > [ 107.667929] call_timer_fn+0x3b/0x1b0
> > [ 107.672238] ? netif_carrier_off+0x70/0x70
> > [ 107.677771] ? netif_carrier_off+0x70/0x70
> > [ 107.682656] ? ftrace_graph_caller+0xc0/0xc0
> > [ 107.687379] run_timer_softirq+0x3e8/0xa10
> > [ 107.694653] ? call_timer_fn+0x1b0/0x1b0
> > [ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150
> > [ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210
> > [ 107.712213] ? call_timer_fn+0x1b0/0x1b0
> > [ 107.716625] ? __do_softirq+0x155/0x467

If something turned the carrier off, that could cause the stack to time
out, since it appears the driver didn't call this itself after finishing
all transmits, as it normally would have.

Is the trace above correct? Usually a ? indicates an unsure backtrace
entry due to missing symbols, right?
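
As an aid to reading the quoted call trace, here is a minimal Python sketch that separates the frames the kernel printed without a leading "?" from those it marked with one. It rests on two points about the dump format: each frame has the form symbol+offset/size, and a "?" marks an address that was found on the stack but that the unwinder could not confirm as part of the call chain. The names `FRAME_RE`, `parse_frames` and `TRACE` are made up for this illustration and are not part of any kernel tooling. The trace text is copied from the report above.

```python
import re

# One frame after the timestamp, e.g. "? call_timer_fn+0x1b0/0x1b0".
# Hypothetical helper pattern for this sketch only.
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<unsure>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

# Call trace lines as quoted in the report.
TRACE = """\
[ 107.654661] Call Trace:
[ 107.657735]
[ 107.663155] ? ftrace_graph_caller+0xc0/0xc0
[ 107.667929] call_timer_fn+0x3b/0x1b0
[ 107.672238] ? netif_carrier_off+0x70/0x70
[ 107.677771] ? netif_carrier_off+0x70/0x70
[ 107.682656] ? ftrace_graph_caller+0xc0/0xc0
[ 107.687379] run_timer_softirq+0x3e8/0xa10
[ 107.694653] ? call_timer_fn+0x1b0/0x1b0
[ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150
[ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210
[ 107.712213] ? call_timer_fn+0x1b0/0x1b0
[ 107.716625] ? __do_softirq+0x155/0x467
"""


def parse_frames(trace):
    """Return one dict per frame line; lines without a frame are skipped."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        off = int(m.group("off"), 16)
        size = int(m.group("size"), 16)
        frames.append({
            "symbol": m.group("sym"),
            "offset": off,
            "size": size,
            # No "?" prefix: the unwinder considered this frame reliable.
            "reliable": m.group("unsure") is None,
            # offset == size: the address lies just past the end of the
            # named symbol, so the name itself may be misleading.
            "at_end": off == size,
        })
    return frames


if __name__ == "__main__":
    for f in parse_frames(TRACE):
        mark = " " if f["reliable"] else "?"
        note = " (address just past end of symbol)" if f["at_end"] else ""
        print(f"{mark} {f['symbol']}+{f['offset']:#x}/{f['size']:#x}{note}")
```

Run against the quoted trace, only call_timer_fn and run_timer_softirq come out as frames without a "?". Both netif_carrier_off entries are "?" entries whose offset equals the symbol size, so on their own they may not show that netif_carrier_off was actually called.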