Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758012AbcLBRDF (ORCPT ); Fri, 2 Dec 2016 12:03:05 -0500 Received: from smtp.codeaurora.org ([198.145.29.96]:57256 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751576AbcLBRCu (ORCPT ); Fri, 2 Dec 2016 12:02:50 -0500 DMARC-Filter: OpenDMARC Filter v1.3.1 smtp.codeaurora.org 9D24F615A6 Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=pass smtp.mailfrom=tbaicar@codeaurora.org Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN To: "Neftin, Sasha" , jeffrey.t.kirsher@intel.com, intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, okaya@codeaurora.org, timur@codeaurora.org References: <1478727681-16505-1-git-send-email-tbaicar@codeaurora.org> <42e53bc2-0361-e7b7-1093-4e095aa56955@intel.com> <3914adae-91b2-7b74-02bc-579b3e32d143@intel.com> <20c191fe-e8fe-b2ca-0628-7d0188d1f28e@codeaurora.org> From: "Baicar, Tyler" Message-ID: Date: Fri, 2 Dec 2016 10:02:39 -0700 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.5.0 MIME-Version: 1.0 In-Reply-To: <20c191fe-e8fe-b2ca-0628-7d0188d1f28e@codeaurora.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9600 Lines: 224 Hello Sasha, Were you able to reproduce this issue? Do you have a patch fixing the close function inconsistencies that you mentioned which I could try out? Thanks, Tyler On 11/21/2016 1:40 PM, Baicar, Tyler wrote: > On 11/17/2016 6:31 AM, Neftin, Sasha wrote: >> On 11/13/2016 10:34 AM, Neftin, Sasha wrote: >>> On 11/11/2016 12:35 AM, Baicar, Tyler wrote: >>>> Hello Sasha, >>>> >>>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote: >>>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote: >>>>>> Move IRQ free code so that it will happen regardless of the >>>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ >>>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because >>>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ. >>>>>> In such a situation, we will hit a kernel bug later in e1000_remove >>>>>> because the IRQ still has action since it was never freed. A >>>>>> secondary bus reset can cause this case to happen. >>>>>> >>>>>> Signed-off-by: Tyler Baicar >>>>>> --- >>>>>> drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c >>>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c >>>>>> index 7017281..36cfcb0 100644 >>>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c >>>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c >>>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev) >>>>>> if (!test_bit(__E1000_DOWN, &adapter->state)) { >>>>>> e1000e_down(adapter, true); >>>>>> - e1000_free_irq(adapter); >>>>>> /* Link status message must follow this format */ >>>>>> pr_info("%s NIC Link is Down\n", adapter->netdev->name); >>>>>> } >>>>>> + e1000_free_irq(adapter); >>>>>> + >>>>>> napi_disable(&adapter->napi); >>>>>> e1000e_free_tx_resources(adapter->tx_ring); >>>>>> >>>>> I would like not recommend insert this change. This change related >>>>> driver state machine, we afraid from lot of synchronization >>>>> problem and >>>>> issues. >>>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready. >>>> What do you mean here? There is no loop. If __E1000_DOWN is set >>>> then we >>>> will never free the IRQ. >>>> >>>>> Another point, does before execute secondary bus reset your SW >>>>> back up >>>>> pcie configuration space as properly? >>>> After a secondary bus reset, the link needs to recover and go back >>>> to a >>>> working state after 1 second. >>>> >>>> From the callstack, the issue is happening while removing the >>>> endpoint >>>> from the system, before applying the secondary bus reset. >>>> >>>> The order of events is >>>> 1. remove the drivers >>>> 2. cause a secondary bus reset >>>> 3. wait 1 second >>> Actually, this is too much, usually link up in less than 100ms.You can >>> check Data Link Layer indication. >>>> 4. recover the link >>>> >>>> callstack: >>>> free_msi_irqs+0x6c/0x1a8 >>>> pci_disable_msi+0xb0/0x148 >>>> e1000e_reset_interrupt_capability+0x60/0x78 >>>> e1000_remove+0xc8/0x180 >>>> pci_device_remove+0x48/0x118 >>>> __device_release_driver+0x80/0x108 >>>> device_release_driver+0x2c/0x40 >>>> pci_stop_bus_device+0xa0/0xb0 >>>> pci_stop_bus_device+0x3c/0xb0 >>>> pci_stop_root_bus+0x54/0x80 >>>> acpi_pci_root_remove+0x28/0x64 >>>> acpi_bus_trim+0x6c/0xa4 >>>> acpi_device_hotplug+0x19c/0x3f4 >>>> acpi_hotplug_work_fn+0x28/0x3c >>>> process_one_work+0x150/0x460 >>>> worker_thread+0x50/0x4b8 >>>> kthread+0xd4/0xe8 >>>> ret_from_fork+0x10/0x50 >>>> >>>> Thanks, >>>> Tyler >>>> >>> Hello Tyler, >>> Okay, we need consult more about this suggestion. >>> May I ask what is setup you run? Is there NIC or on board LAN? I would >>> like try reproduce this issue in our lab's too. >>> Also, is same issue observed with same scenario and others NIC's too? >>> Sasha >>> _______________________________________________ >>> Intel-wired-lan mailing list >>> Intel-wired-lan@lists.osuosl.org >>> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan >>> >> Hello Tyler, >> I see some in consistent implementation of __*_close methods in our >> drivers. Do you have any igb NIC to check if same problem persist there? >> Thanks, >> Sasha > Hello Sasha, > > I couldn't find an igb NIC to test with, but I did find another e1000e > card that does not cause the same issue. That card is: > > 0004:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit > Network Connection > Subsystem: Intel Corporation Gigabit CT Desktop Adapter > Physical Slot: 5-1 > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- > ParErr- Stepping- SERR+ FastB2B- DisINTx+ > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- > SERR- Latency: 0, Cache Line Size: 64 bytes > Interrupt: pin A routed to IRQ 299 > Region 0: Memory at c01001c0000 (32-bit, non-prefetchable) > [size=128K] > Region 1: Memory at c0100100000 (32-bit, non-prefetchable) > [size=512K] > Region 2: I/O ports at 1000 [size=32] > Region 3: Memory at c01001e0000 (32-bit, non-prefetchable) > [size=16K] > Expansion ROM at c0100180000 [disabled] [size=256K] > Capabilities: [c8] Power Management version 2 > Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA > PME(D0+,D1-,D2-,D3hot+,D3cold+) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME- > Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+ > Address: 00000000397f0040 Data: 0000 > Capabilities: [e0] Express (v1) Endpoint, MSI 00 > DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s > <512ns, L1 <64us > ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- > DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ > Unsupported+ > RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ > MaxPayload 256 bytes, MaxReadReq 512 bytes > DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- > AuxPwr+ TransPend- > LnkCap: Port #8, Speed 2.5GT/s, Width x1, ASPM L0s L1, > Exit Latency L0s <128ns, L1 <64us > ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp- > LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+ > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- > SlotClk+ DLActive- BWMgmt- ABWMgmt- > Capabilities: [a0] MSI-X: Enable- Count=5 Masked- > Vector table: BAR=3 offset=00000000 > PBA: BAR=3 offset=00002000 > Capabilities: [100 v1] Advanced Error Reporting > UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- > UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- > UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- > UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- > CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- > NonFatalErr- > CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- > NonFatalErr+ > AERCap: First Error Pointer: 00, GenCap- CGenEn- > ChkCap- ChkEn- > Capabilities: [140 v1] Device Serial Number > 68-05-ca-ff-ff-29-47-34 > Kernel driver in use: e1000e > > Here are the kernel logs from first running the test on this card and > then running the test on the Intel 82571EB card that I originally > found the issue with (you can see the issue doesn't happen on this > card but does on the other): > > [ 44.146749] ACPI: \_SB_.PCI0: Device has suffered a power fault > [ 44.155238] pcieport 0000:00:00.0: PCIe Bus Error: > severity=Uncorrected (Non-Fatal), type=Transaction Layer, > id=0000(Requester ID) > [ 44.166111] pcieport 0000:00:00.0: device [17cb:0400] error > status/mask=00004000/00400000 > [ 44.174420] pcieport 0000:00:00.0: [14] Completion Timeout (First) > [ 44.401943] e1000e 0000:01:00.0 eth0: Timesync Tx Control register > not set as expected > [ 82.445586] pcieport 0002:00:00.0: PCIe Bus Error: > severity=Corrected, type=Physical Layer, id=0000(Receiver ID) > [ 82.454851] pcieport 0002:00:00.0: device [17cb:0400] error > status/mask=00000001/00006000 > [ 82.463209] pcieport 0002:00:00.0: [ 0] Receiver Error > [ 82.469355] pcieport 0002:00:00.0: PCIe Bus Error: > severity=Uncorrected (Non-Fatal), type=Transaction Layer, > id=0000(Requester ID) > [ 82.481026] pcieport 0002:00:00.0: device [17cb:0400] error > status/mask=00004000/00400000 > [ 82.489343] pcieport 0002:00:00.0: [14] Completion Timeout (First) > [ 82.504573] ACPI: \_SB_.PCI2: Device has suffered a power fault > [ 84.528036] kernel BUG at drivers/pci/msi.c:369! > > I'm not sure why it reproduces on the 82571EB card and not the 82574L > card. The only obvious difference is there is no Reciever Error on the > 82574L card. > > If you have a patch fixing the inconsistencies you mentioned with the > __*_close methods I would certainly be willing to test it out! > > Thanks, > Tyler > -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.