From: "Brown, Aaron F"
To: Bjørn Mork, Borislav Petkov
CC: Andy Shevchenko, "lkml@pengaru.com", linux-kernel, "vcaputo@pengaru.com", "linux-pci@vger.kernel.org", "intel-wired-lan@lists.osuosl.org", khalidm, "David Singleton", "Kirsher, Jeffrey T"
Subject: RE: [BUG] 4.11.0-rc1 panic on shutdown X61s
Date: Tue, 14 Mar 2017 01:20:27 +0000
Message-ID: <309B89C4C689E141A5FF6A0C5FB2118B8C5B3F9B@ORSMSX101.amr.corp.intel.com>
References: <20170312053723.GH802@shells.gnugeneration.com> <20170312115703.GA18197@nazgul.tnic> <20170312122621.GA2823@nazgul.tnic> <20170312222333.hm54nl2ul5iuixap@pd.tnic> <87shmhjfap.fsf@miraculix.mork.no>
In-Reply-To: <87shmhjfap.fsf@miraculix.mork.no>

> From: Bjørn Mork [mailto:bjorn@mork.no]
> Sent: Monday, March 13, 2017 9:46 AM
> To: Borislav Petkov
> Cc: Andy Shevchenko; lkml@pengaru.com; linux-kernel;
> vcaputo@pengaru.com; linux-pci@vger.kernel.org;
> intel-wired-lan@lists.osuosl.org; khalidm; David Singleton;
> Brown, Aaron F; Kirsher, Jeffrey T
> Subject: Re: [BUG] 4.11.0-rc1 panic on shutdown X61s
>
> Borislav Petkov writes:
> > On Sun, Mar 12, 2017 at 03:55:08PM +0200, Andy Shevchenko wrote:
> >
> >> The only change that IMHO matters happened between v4.10 and v4.11-rc1 is this:
> >>
> >> @@ -6276,8 +6274,8 @@ static int e1000e_pm_freeze(struct device *dev)
> >>  		/* Quiesce the device without resetting the hardware */
> >>  		e1000e_down(adapter, false);
> >>  		e1000_free_irq(adapter);
> >> +		e1000e_reset_interrupt_capability(adapter);
> >>  	}
> >> -	e1000e_reset_interrupt_capability(adapter);
> >>
> >> So, it apparently misses something for the other case, like
> >> pci_disable_msi() call or so.
> >
> > Well, lemme add the people from
> >
> > 7e54d9d063fa ("e1000e: driver trying to free already-free irq")
> >
> > to CC then. :-)
>
> Already did that a week ago:
> https://www.spinics.net/lists/netdev/msg423379.html
>
> Haven't heard anything back yet. Wondering if they are waiting for
> someone else to submit the pretty obvious revert? Don't understand why
> that should take more than a minute to figure out. It's not like they
> are testing these changes anyway...

Believe it or not we actually do test these changes. This one was tested
by me and I did not have the same results you and the other people
reporting this trace did. I made it back in the lab today and have spent
a good part of the day attempting to reproduce this bug without success.
Freeze / resume works for me on all the systems I have tried, which
includes a sampling of all the current parts and many older ones.

Given there are several other reports of this it is obviously an issue
and I would like to be able to reproduce it in case another patch to
resolve the issue this attempts to fix comes back in another form. So I
want to know what's different between the systems that hit this and my
bank of systems that don't. What exact part (or parts) are we looking at
(lspci|grep -i eth) that trigger this? Could it be a difference in
.config files?
The trace says it is falling back to legacy interrupts; does the system
continue to work, and does the network continue to function in that
mode? In case it's related to user space, what is the base distro? Any
other information you think can help me reproduce the issue would be
appreciated.

Thanks,
Aaron

>
>
> Bjørn