From: Alexander Duyck
Date: Mon, 29 Jan 2018 09:22:53 -0800
Subject: Re: [PATCH 3/3] Revert "e1000e: Do not read ICR in Other interrupt"
To: Benjamin Poirier
Cc: Jeff Kirsher, intel-wired-lan, Netdev, linux-kernel@vger.kernel.org
In-Reply-To: <20180129072805.7ifsjr3r6eziwp7a@f1.synalogic.ca>
References: <20180126091236.13044-1-bpoirier@suse.com> <20180126091236.13044-4-bpoirier@suse.com> <20180129072805.7ifsjr3r6eziwp7a@f1.synalogic.ca>

On Sun, Jan 28, 2018 at 11:28 PM, Benjamin Poirier wrote:
> On 2018/01/26 13:01, Alexander Duyck wrote:
>> On Fri, Jan 26, 2018 at 1:12 AM, Benjamin Poirier wrote:
>> > This reverts commit 16ecba59bc333d6282ee057fb02339f77a880beb.
>> >
>> > It was reported that emulated e1000e devices in vmware esxi 6.5 Build
>> > 7526125 do not link up after commit 4aea7a5c5e94 ("e1000e: Avoid receiver
>> > overrun interrupt bursts"). Some tracing shows that after
>> > e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
>> > on emulated e1000e devices.
>> > In comparison, on real e1000e 82574 hardware,
>> > icr=0x80000004 (_INT_ASSERTED | _LSC) in the same situation.
>> >
>> > Some experimentation showed that this flaw in vmware e1000e emulation can
>> > be worked around by not setting Other in EIAC. This is how it was before
>> > commit 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt").
>> >
>> > Since the ICR read in the Other interrupt handler has already been
>> > restored, this patch effectively reverts the remainder of commit
>> > 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt").
>> >
>> > Fixes: 4aea7a5c5e94 ("e1000e: Avoid receiver overrun interrupt bursts")
>> > Signed-off-by: Benjamin Poirier
>> > ---
>> >  drivers/net/ethernet/intel/e1000e/netdev.c | 10 ++++++++--
>> >  1 file changed, 8 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> > index ed103b9a8d3a..fffc1f0e3895 100644
>> > --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> > @@ -1916,6 +1916,13 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
>> >         struct e1000_hw *hw = &adapter->hw;
>> >         u32 icr = er32(ICR);
>> >
>> > +       /* Certain events (such as RXO) which trigger Other do not set
>> > +        * INT_ASSERTED. In that case, read to clear of icr does not take
>> > +        * place.
>> > +        */
>> > +       if (!(icr & E1000_ICR_INT_ASSERTED))
>> > +               ew32(ICR, E1000_ICR_OTHER);
>> > +
>>
>> This piece doesn't make sense to me. Why are we clearing OTHER if
>> ICR_INT_ASSERTED is not set?
>
> Datasheet §10.2.4.1 ("Interrupt Cause Read Register") says that ICR read
> to clear only occurs if INT_ASSERTED is set. This corresponds to what I
> observed.
>
> However, while working on these issues, I noticed that when there is an rxo
> event, INT_ASSERTED is not always set even though the interrupt is raised. I
> think this is a hardware flaw.

I agree.
I need to check with our silicon team to see what we can determine.

> For example, if doing
>         ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER);
> we enter e1000_msix_other() and two consecutive reads of ICR result in
> 0x81000004
> 0x00000000
>
> If doing
>         ew32(ICS, E1000_ICS_RXO | E1000_ICS_OTHER);
> we enter e1000_msix_other() and two consecutive reads of ICR result in
> 0x01000041
> 0x01000041

This is interesting. So the ICR is doing the clear on read, so that
answers the question I had about the earlier patch.

One thought on this: is there any reason why you are limiting this to
only the OTHER bit? It seems like RXO and the other causes that aren't
supposed to be included in the mask should probably be cleared as
well. Are they auto-cleared, ignored, or is there some advantage to
leaving them set?

> Consequently, we must clear OTHER manually from ICR, otherwise the
> interrupt is immediately re-raised after exiting the handler.
>
> These observations are the same whether the interrupt is triggered via a
> write to ICS or in hardware.
>
> Furthermore, I tested that this behavior is the same for other Other
> events (MDAC, SRPD, ACK, MNG). Those were tested via a write to ICS
> only, not in hardware.
>
> This is a version of the test patch that I used to trigger lsc and rxo in
> software and hardware. It applies over this patch series.

I plan to look into this some more over the next few days. Ideally, if
we could mask these "OTHER" interrupts besides the LSC, we could comply
with all the needed bits for MSI-X. My concern is that we are still
stuck reading the ICR at this point because of this, and it is going to
make dealing with MSI-X challenging on 82574, since it seems like the
intention was that you weren't supposed to be reading the ICR when
MSI-X is enabled, based on the list of current issues and HW errata.

At this point it seems like the interrupt is firing and the
INT_ASSERTED is all we really need to be checking for, if I understand
this all correctly.
Basically, if LSC is set it will trigger OTHER and INT_ASSERTED; if any
of the other causes are set, they only set OTHER.

> diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
> index 0641c0098738..f54e7ac9c934 100644
> --- a/drivers/net/ethernet/intel/e1000e/defines.h
> +++ b/drivers/net/ethernet/intel/e1000e/defines.h
> @@ -398,6 +398,7 @@
>  #define E1000_ICR_LSC           0x00000004 /* Link Status Change */
>  #define E1000_ICR_RXSEQ         0x00000008 /* Rx sequence error */
>  #define E1000_ICR_RXDMT0        0x00000010 /* Rx desc min. threshold (0) */
> +#define E1000_ICR_RXO           0x00000040 /* rx overrun */
>  #define E1000_ICR_RXT0          0x00000080 /* Rx timer intr (ring 0) */
>  #define E1000_ICR_ECCER         0x00400000 /* Uncorrectable ECC Error */
>  /* If this bit asserted, the driver should claim the interrupt */
> diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c b/drivers/net/ethernet/intel/e1000e/ethtool.c
> index 003cbd605799..4933c1beac74 100644
> --- a/drivers/net/ethernet/intel/e1000e/ethtool.c
> +++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
> @@ -1802,98 +1802,20 @@ static void e1000_diag_test(struct net_device *netdev,
>                             struct ethtool_test *eth_test, u64 *data)
>  {
>         struct e1000_adapter *adapter = netdev_priv(netdev);
> -       u16 autoneg_advertised;
> -       u8 forced_speed_duplex;
> -       u8 autoneg;
> -       bool if_running = netif_running(netdev);
> +       struct e1000_hw *hw = &adapter->hw;
>
>         pm_runtime_get_sync(netdev->dev.parent);
>
>         set_bit(__E1000_TESTING, &adapter->state);
>
> -       if (!if_running) {
> -               /* Get control of and reset hardware */
> -               if (adapter->flags & FLAG_HAS_AMT)
> -                       e1000e_get_hw_control(adapter);
> -
> -               e1000e_power_up_phy(adapter);
> -
> -               adapter->hw.phy.autoneg_wait_to_complete = 1;
> -               e1000e_reset(adapter);
> -               adapter->hw.phy.autoneg_wait_to_complete = 0;
> -       }
> -
>         if (eth_test->flags == ETH_TEST_FL_OFFLINE) {
> -               /* Offline tests */
> -
> -               /* save speed, duplex, autoneg settings */
> -               autoneg_advertised = adapter->hw.phy.autoneg_advertised;
> -               forced_speed_duplex = adapter->hw.mac.forced_speed_duplex;
> -               autoneg = adapter->hw.mac.autoneg;
> -
> -               e_info("offline testing starting\n");
> -
> -               if (if_running)
> -                       /* indicate we're in test mode */
> -                       e1000e_close(netdev);
> -
> -               if (e1000_reg_test(adapter, &data[0]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               e1000e_reset(adapter);
> -               if (e1000_eeprom_test(adapter, &data[1]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               e1000e_reset(adapter);
> -               if (e1000_intr_test(adapter, &data[2]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               e1000e_reset(adapter);
> -               if (e1000_loopback_test(adapter, &data[3]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               /* force this routine to wait until autoneg complete/timeout */
> -               adapter->hw.phy.autoneg_wait_to_complete = 1;
> -               e1000e_reset(adapter);
> -               adapter->hw.phy.autoneg_wait_to_complete = 0;
> -
> -               if (e1000_link_test(adapter, &data[4]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               /* restore speed, duplex, autoneg settings */
> -               adapter->hw.phy.autoneg_advertised = autoneg_advertised;
> -               adapter->hw.mac.forced_speed_duplex = forced_speed_duplex;
> -               adapter->hw.mac.autoneg = autoneg;
> -               e1000e_reset(adapter);
> -
> -               clear_bit(__E1000_TESTING, &adapter->state);
> -               if (if_running)
> -                       e1000e_open(netdev);
> +               // LSC, RXO, MDAC, SRPD, ACK, MNG
> +               ew32(ICS, E1000_ICR_RXO | E1000_ICR_OTHER);
>         } else {
> -               /* Online tests */
> -
> -               e_info("online testing starting\n");
> -
> -               /* register, eeprom, intr and loopback tests not run online */
> -               data[0] = 0;
> -               data[1] = 0;
> -               data[2] = 0;
> -               data[3] = 0;
> -
> -               if (e1000_link_test(adapter, &data[4]))
> -                       eth_test->flags |= ETH_TEST_FL_FAILED;
> -
> -               clear_bit(__E1000_TESTING, &adapter->state);
> -       }
> -
> -       if (!if_running) {
> -               e1000e_reset(adapter);
> -
> -               if (adapter->flags & FLAG_HAS_AMT)
> -                       e1000e_release_hw_control(adapter);
> +               ew32(ICS, E1000_ICR_LSC | E1000_ICR_OTHER);
>         }
>
> -       msleep_interruptible(4 * 1000);
> +       clear_bit(__E1000_TESTING, &adapter->state);
>
>         pm_runtime_put_sync(netdev->dev.parent);
>  }
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index fffc1f0e3895..5b3a0feaf052 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -46,6 +46,10 @@
>
>  #include "e1000.h"
>
> +DEFINE_RATELIMIT_STATE(rx_ratelimit_state, 2 * HZ, 1);
> +DEFINE_RATELIMIT_STATE(other_ratelimit_state, 2 * HZ, 1);
> +DEFINE_RATELIMIT_STATE(other_ratelimit_state2, 2 * HZ, 1);
> +
>  #define DRV_EXTRAVERSION "-k"
>
>  #define DRV_VERSION "3.2.6" DRV_EXTRAVERSION
> @@ -936,6 +940,9 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
>         int cleaned_count = 0;
>         bool cleaned = false;
>         unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> +       static unsigned int count;
> +
> +       mdelay(10);
>
>         i = rx_ring->next_to_clean;
>         rx_desc = E1000_RX_DESC_EXT(*rx_ring, i);
> @@ -1067,6 +1074,16 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
>
>         adapter->total_rx_bytes += total_rx_bytes;
>         adapter->total_rx_packets += total_rx_packets;
> +
> +       count++;
> +       if (__ratelimit(&rx_ratelimit_state)) {
> +               static unsigned int max;
> +               max = max(max, total_rx_packets);
> +               trace_printk("rx %u now, max %u, %u rounds\n",
> +                            total_rx_packets, max, count);
> +               count = 0;
> +       }
> +
>         return cleaned;
>  }
>
> @@ -1914,14 +1931,30 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
>         struct net_device *netdev = data;
>         struct e1000_adapter *adapter = netdev_priv(netdev);
>         struct e1000_hw *hw = &adapter->hw;
> -       u32 icr = er32(ICR);
> +       static unsigned int count;
> +       u32 icr2, icr = er32(ICR);
>
>         /* Certain events (such as RXO) which trigger Other do not set
>          * INT_ASSERTED. In that case, read to clear of icr does not take
>          * place.
>          */
> +       /*
>         if (!(icr & E1000_ICR_INT_ASSERTED))
>                 ew32(ICR, E1000_ICR_OTHER);
> +       */
> +
> +       icr2 = er32(ICR);
> +
> +       count++;
> +       if (__ratelimit(&other_ratelimit_state)) {
> +               trace_printk("icr 0x%08x icr2 0x%08x count %u\n", icr, icr2,
> +                            count);
> +               count = 0;
> +       }
> +       if (icr & E1000_ICR_RXO && icr & E1000_ICR_INT_ASSERTED &&
> +           __ratelimit(&other_ratelimit_state2)) {
> +               trace_printk("special icr 0x%08x icr2 0x%08x\n", icr, icr2);
> +       }
>
>         if (icr & adapter->eiac_mask)
>                 ew32(ICS, (icr & adapter->eiac_mask));
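
To make the ICR behaviour discussed in this thread easier to follow, here is
a minimal user-space sketch of it. It is a software model, not driver code:
struct fake_icr and the fake_* helpers are hypothetical names invented for
this illustration, and the "asserting" mask encodes only the working
assumption from this thread (LSC raises INT_ASSERTED, other causes such as
RXO do not), which has not been confirmed against the datasheet. The bit
values are the ones visible in the quoted defines.h hunk and ICR dumps.

```c
#include <stdint.h>

/* Bit values as they appear in the quoted diff and ICR dumps. */
#define ICR_LSC          0x00000004u
#define ICR_RXO          0x00000040u
#define ICR_OTHER        0x01000000u
#define ICR_INT_ASSERTED 0x80000000u

/* Hypothetical model of the ICR; an illustration only. */
struct fake_icr {
	uint32_t value;     /* cause bits currently latched */
	uint32_t asserting; /* ASSUMPTION: causes that also raise INT_ASSERTED */
};

/* Model of a write to ICS: latch the causes, and set INT_ASSERTED only
 * if at least one of them is in the assumed "asserting" set.
 */
static void fake_ics_write(struct fake_icr *r, uint32_t causes)
{
	r->value |= causes;
	if (causes & r->asserting)
		r->value |= ICR_INT_ASSERTED;
}

/* Model of an ICR read: read-to-clear happens only when INT_ASSERTED
 * is set, as described for datasheet section 10.2.4.1 above.
 */
static uint32_t fake_icr_read(struct fake_icr *r)
{
	uint32_t v = r->value;

	if (v & ICR_INT_ASSERTED)
		r->value = 0;
	return v;
}

/* Model of a write to ICR: writing 1 to a bit clears it. */
static void fake_icr_write(struct fake_icr *r, uint32_t bits)
{
	r->value &= ~bits;
}

/* The step added to e1000_msix_other() by the patch: if the read did
 * not clear the register, clear OTHER by hand.
 */
static uint32_t fake_msix_other(struct fake_icr *r)
{
	uint32_t icr = fake_icr_read(r);

	if (!(icr & ICR_INT_ASSERTED))
		fake_icr_write(r, ICR_OTHER);
	return icr;
}
```

With asserting set to ICR_LSC, writing LSC | OTHER to the model and reading
twice mirrors the first pair of values reported above, while RXO | OTHER
reads back unchanged on both reads (the model does not reproduce bit 0 of
the reported 0x01000041). After fake_msix_other() handles the RXO case, RXO
itself is still latched in the model, which is the situation the question
about clearing only the OTHER bit refers to.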