In-Reply-To: <877eovobxl.fsf@gmail.com>
References: <87h8o0ocul.fsf@gmail.com> <877eovobxl.fsf@gmail.com>
From: Alexander Duyck
Date: Wed, 25 Apr 2018 09:01:54 -0700
Subject: Re: [BUG] igb: reconnecting of cable not always detected
To: Holger Schurig
Cc: Jeff Kirsher, intel-wired-lan, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 25, 2018 at 2:47 AM, Holger Schurig wrote:
> Hi Alex,
>
> (Sent a 2nd time, this time with "Reply to all" and without HTML, so
> that it hits the kernel archives as well. Sorry for the noise.
>
>> Sounds like the link is failing to re-establish. You might double
>> check a few things. One is to verify if the link partner is
>> recognizing the link as coming up or not.
>
> It turns on differently. Before I remove the cable, the LED on the TP
> LINK "TL SG-108" was green. After removing the cable, the LED went off.
> After reinserting the cable, it became orange after some while.
>
> Green LED means 1000 MB/s, orange LED means 10/100 MB/s.

Was the orange LED on the igb NIC or on the TL SG-108? Based on the
comment below I am assuming it is the switch. Based on that I am
thinking we probably need to work on the PHY configuration.

> I have a different, even older switch: "Allnet ALL8039". Here the same:
> the switch detects a link, but igb not.
>
>> If you could also provide an "lspci -vvv"
>
> 02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
> Connection (rev 03)

Okay so we are working with an i210.

> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> SERR- Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 19
> Region 0: Memory at 90600000 (32-bit, non-prefetchable) [size=512K]
> Region 2: I/O ports at d000 [size=32]
> Region 3: Memory at 90680000 (32-bit, non-prefetchable) [size=16K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> Address: 0000000000000000 Data: 0000
> Masking: 00000000 Pending: 00000000
> Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
> Vector table: BAR=3 offset=00000000
> PBA: BAR=3 offset=00002000
> Capabilities: [a0] Express (v2) Endpoint, MSI 00
> DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s
> <512ns, L1 <64us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
> SlotPowerLimit 0.000W
> DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
> Unsupported+
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
> FLReset-
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+
> TransPend-
> LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
> Latency L0s <2us, L1 <16us
> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-,
> OBFF Not Supported
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+,
> LTR-, OBFF Disabled
> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance-
> SpeedDis-
> Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> LnkSta2: Current De-emphasis Level: -6dB,
> EqualizationComplete-, EqualizationPhase1-
> EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> Capabilities: [100 v2] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+
> ChkEn-
> Capabilities: [140 v1] Device Serial Number 00-13-95-ff-ff-1a-54-33
> Capabilities: [1a0 v1] Transaction Processing Hints
> Device specific mode supported
> Steering table in TPH capability structure
> Kernel driver in use: igb
> Kernel modules: igb
>
>> and "ethtool -i" for the
>
> driver: igb
> version: 5.4.0-k
> firmware-version: 3.20, 0x80000553
> expansion-rom-version:
> bus-info: 0000:02:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> One thing that is interesting is how igb reacts to ethtool inquiries
> once it goes into the failed state. You inquired for "ethtool -i eth0",
> but in the failed state I only get this:
>
> Cannot restart autonegotiation: No such device

I assume you mean "ethtool -r" since that is what is supposed to be
restarting negotiation. The "ethtool -i" is what you provided above.

The fact that the device disappears is a bit concerning. I'm wondering
if we are somehow triggering the surprise removal code.

> But eth0 is of course still there, "ip -d link show eth0" shows:
>
> 2: eth0: mtu 1500 qdisc mq state DOWN
> mode DEFAULT group default qlen 1000
> link/ether 00:13:95:1a:54:33 brd ff:ff:ff:ff:ff:ff promiscuity 0
> numtxqueues 8 numrxqueues 8 gso_max_size 65536 gso_max_segs 65535
>
> Other ethtool commands also don't report any information once the link
> went bogus. Here one output from "ethtool eth0":
>
> Settings for eth0:
> Supported ports: [ TP ]
> Supported link modes: 10baseT/Half 10baseT/Full
> 100baseT/Half 100baseT/Full
> 1000baseT/Full
> Supported pause frame use: Symmetric
> Supports auto-negotiation: Yes
> Advertised link modes: 10baseT/Half 10baseT/Full
> 100baseT/Half 100baseT/Full
> 1000baseT/Full
> Advertised pause frame use: Symmetric
> Advertised auto-negotiation: Yes
> Speed: 1000Mb/s
> Duplex: Full
> Port: Twisted Pair
> PHYAD: 1
> Transceiver: internal
> Auto-negotiation: on
> MDI-X: off (auto)
> Supports Wake-on: pumbg
> Wake-on: g
> Current message level: 0x00000007 (7)
> drv probe link
> Link detected: yes
>
> ... and here another:
>
> Settings for eth0:
> Cannot get device settings: No such device
> Cannot get wake-on-lan settings: No such device
> Cannot get message level: No such device
> Cannot get link status: No such device
> Settings for eth0:
> No data available
>
> I'm willing to pepper the source with printk, if this helps :-)
>
> Greetings,
> Holger

Thanks. I'm suspecting we may need to instrument igb_rd32 at this
point. In order to trigger what you are seeing I am assuming the device
has been detached due to a read failure of some sort.

Another thing you could look at doing is narrowing down the possible
factors involved. You could go through and limit phy settings and look
at possibly dropping features such as EEE if it is enabled on the
device.

Thanks.

- Alex
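
To make the "read failure leads to detach" theory a bit more concrete, here is a small user-space model of the idea. It is only a sketch and not the driver code: struct mock_nic, mock_bus_read(), mock_rd32() and MOCK_REG_STATUS are hypothetical names, and the rule it applies (a read of all ones, confirmed by a second all-ones read of register 0, is treated as "device gone") is an assumption about the kind of read failure meant above. The fprintf() marks the spot where a debug print would go when instrumenting the real register read helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MOCK_NUM_REGS   4u
#define MOCK_REG_STATUS 0u /* hypothetical: register 0 doubles as the liveness probe */

/* Hypothetical stand-in for the adapter; none of this is igb's real layout. */
struct mock_nic {
    uint32_t regs[MOCK_NUM_REGS];
    bool link_lost; /* simulates a bus that no longer answers reads */
    bool detached;  /* set once the model decides the device is gone */
};

/* A dead bus typically reads back as all ones; a live one returns the register. */
static uint32_t mock_bus_read(const struct mock_nic *nic, uint32_t reg)
{
    return nic->link_lost ? UINT32_MAX : nic->regs[reg];
}

/*
 * Register read with removal detection. An all-ones value alone is not
 * enough, because a register may legitimately hold 0xFFFFFFFF, so the
 * status register is read as a second opinion before detaching.
 */
static uint32_t mock_rd32(struct mock_nic *nic, uint32_t reg)
{
    uint32_t value;

    assert(reg < MOCK_NUM_REGS);

    if (nic->detached)
        return UINT32_MAX; /* once detached, every read fails */

    value = mock_bus_read(nic, reg);

    if (value == UINT32_MAX &&
        (reg == MOCK_REG_STATUS ||
         mock_bus_read(nic, MOCK_REG_STATUS) == UINT32_MAX)) {
        nic->detached = true;
        /* Instrumentation point: report which register read failed. */
        fprintf(stderr, "mock_rd32: reg 0x%x read all ones, detaching\n",
                (unsigned int)reg);
    }

    return value;
}
```

In this model, once detached is set every later read fails even if the bus recovers, which would be consistent with ethtool continuing to report "No such device" while the netdev itself is still listed. If the real device behaves the same way, logging the register offset at that point should show which access first went bad.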