In-Reply-To: <87wowumj21.fsf@gmail.com>
References: <87h8o0ocul.fsf@gmail.com> <877eovobxl.fsf@gmail.com> <87wowumj21.fsf@gmail.com>
From: Alexander Duyck
Date: Thu, 26 Apr 2018 09:02:26 -0700
Subject: Re: [BUG] igb: reconnecting of cable not always detected
To: Holger Schurig
Cc: Jeff Kirsher, intel-wired-lan, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 26, 2018 at 2:08 AM, Holger Schurig wrote:
> Hi,
>
>> Thanks. I'm suspecting we may need to instrument igb_rd32 at this
>> point. In order to trigger what you are seeing I am assuming the
>> device has been detached due to a read failure of some sort.
>
> Okay, I added a printk to igb_rd32. And because no one calls this
> function directly (all access goes via the rd32/rd32_array macro) I also
> added the output of the calling function. This should help greatly in
> identifying the read from the hardware to the consumer.
>
> Finally, I noticed that igb_update_stats() produced a lot of churn that
> most likely are unrelated. So I helper variable to make output from this
> function go away.
>
> I installed this modified driver, rebooted, and removed / inserted the
> LAN cable until the error was present.
>
> As before, "ethtool" and "mii-tool" now said that the device is not
> there, while "ip link" showed the device as present.
>
>
> The full output of "journalctl -fk | grep igb" is 600 kB. So put the
> whole file at Google Drive:
>
> https://drive.google.com/open?id=1p9cCT2d_EHnSHh29oS3AepUgFTKGFSeA
>
>
>
> I looked at the output to see patterns, e.g with
>
>   grep -n igb_get_cfg_done_i210 igb.error.txt
>   grep -n __igb_shutdown igb.error.txt
>   ...
>
> (and almost all other function names). I hoped to see patterns. But for
> my untrained eye, things looked not out of the order.

Thanks for the data. It is actually useful. A few things in it seem to
point to an obvious issue. The first is the following 2 lines from your
dump:

Apr 26 10:42:49 kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: RX
Apr 26 10:42:49 kernel: igb 0000:02:00.0: EEE Disabled: unsupported at half duplex. Re-enable using ethtool when at full duplex.

In case you aren't aware, 1000Mbps Half Duplex is not a valid
combination.

The other bit that catches my attention is:

Apr 26 10:42:51 kernel: igb 0000:02:00.0: exceed max 2 second

This appears to be a timeout error triggered in response to the error
above, which I believe comes down to the link not actually coming up at
1000Mbps.

As I get time I will try to look into this further. I will have to go
through the MDIC reads to figure out whether something in there is
giving us bad information from the PHY or whether we are misinterpreting
something.

Thanks.

- Alex
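
For anyone going through a similar dump, the two symptoms called out above can be searched for mechanically. What follows is only a minimal sketch, not part of the driver or of any existing tool: the `scan` function, the regular expression, and the `sample` list are hypothetical, and the log line shape is assumed from the three kernel messages quoted in this mail.

```python
import re

# Pattern assumed from the "NIC Link is Up" message quoted in this mail.
LINK_UP = re.compile(
    r"NIC Link is Up (?P<speed>\d+) Mbps (?P<duplex>Half|Full) Duplex"
)
# Substring assumed from the timeout message quoted in this mail.
TIMEOUT = "exceed max 2 second"


def scan(lines):
    """Yield (line_number, kind, text) for each suspicious log line."""
    for number, line in enumerate(lines, start=1):
        match = LINK_UP.search(line)
        if match and match["speed"] == "1000" and match["duplex"] == "Half":
            # The speed/duplex pairing described above as not valid.
            yield number, "invalid-link", line.rstrip()
        elif TIMEOUT in line:
            yield number, "timeout", line.rstrip()


# The three lines quoted from the dump, used here as sample input.
sample = [
    "Apr 26 10:42:49 kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up "
    "1000 Mbps Half Duplex, Flow Control: RX",
    "Apr 26 10:42:49 kernel: igb 0000:02:00.0: EEE Disabled: unsupported at "
    "half duplex. Re-enable using ethtool when at full duplex.",
    "Apr 26 10:42:51 kernel: igb 0000:02:00.0: exceed max 2 second",
]

for number, kind, text in scan(sample):
    print(number, kind, text)
```

To run it against the full dump, the same function could be fed an open file, e.g. `scan(open("igb.error.txt"))`, since it only iterates over lines.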