From: Alexander Duyck
Date: Mon, 26 Feb 2018 08:14:38 -0800
Subject: Re: [RFC PATCH] e1000e: Fix link check race condition.
To: Benjamin Poirier
Cc: Jeff Kirsher, intel-wired-lan, Netdev, linux-kernel@vger.kernel.org
In-Reply-To: <20180226023118.17439-1-bpoirier@suse.com>
References: <20180226023118.17439-1-bpoirier@suse.com>

On Sun, Feb 25, 2018 at 6:31 PM, Benjamin Poirier wrote:
> Alex reported the following race condition:
>
> /* link goes up... interrupt... schedule watchdog */
> \ e1000_watchdog_task
>     \ e1000e_has_link
>         \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
>             \ e1000e_phy_has_link_generic(..., &link)
>                 link = true
>
>                                  /* link goes down... interrupt */
>                                  \ e1000_msix_other
>                                      hw->mac.get_link_status = true
>
>             /* link is up */
>             mac->get_link_status = false
>
>     link_active = true
>     /* link_active is true, wrongly, and stays so because
>      * get_link_status is false */
>
> Avoid this problem by making sure that we don't set get_link_status = false
> after having checked the link.
>
> It seems this problem has been present since the introduction of e1000e.
>
> Link: https://lkml.org/lkml/2018/1/29/338
> Reported-by: Alexander Duyck
> Signed-off-by: Benjamin Poirier
> ---
>  drivers/net/ethernet/intel/e1000e/ich8lan.c | 41 ++++++++++++++++-------------
>  drivers/net/ethernet/intel/e1000e/mac.c     | 14 +++++++---
>  2 files changed, 33 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
> index ff308b05d68c..3c2c4f87e075 100644
> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
> @@ -1386,6 +1386,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  	 */
>  	if (!mac->get_link_status)
>  		return 1;
> +	mac->get_link_status = false;
>
>  	/* First we want to see if the MII Status Register reports
>  	 * link. If so, then we want to get the current speed/duplex
> @@ -1393,12 +1394,12 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  	 */
>  	ret_val = e1000e_phy_has_link_generic(hw, 1, 0, &link);
>  	if (ret_val)
> -		return ret_val;
> +		goto out;
>
>  	if (hw->mac.type == e1000_pchlan) {
>  		ret_val = e1000_k1_gig_workaround_hv(hw, link);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>  	}
>
>  	/* When connected at 10Mbps half-duplex, some parts are excessively
> @@ -1431,7 +1432,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>
>  		ret_val = hw->phy.ops.acquire(hw);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>
>  		if (hw->mac.type == e1000_pch2lan)
>  			emi_addr = I82579_RX_CONFIG;
> @@ -1453,7 +1454,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  		hw->phy.ops.release(hw);
>
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>
>  		if (hw->mac.type >= e1000_pch_spt) {
>  			u16 data;
> @@ -1462,14 +1463,14 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  			if (speed == SPEED_1000) {
>  				ret_val = hw->phy.ops.acquire(hw);
>  				if (ret_val)
> -					return ret_val;
> +					goto out;
>
>  				ret_val = e1e_rphy_locked(hw,
>  							  PHY_REG(776, 20),
>  							  &data);
>  				if (ret_val) {
>  					hw->phy.ops.release(hw);
> -					return ret_val;
> +					goto out;
>  				}
>
>  				ptr_gap = (data & (0x3FF << 2)) >> 2;
> @@ -1483,18 +1484,18 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  				}
>  				hw->phy.ops.release(hw);
>  				if (ret_val)
> -					return ret_val;
> +					goto out;
>  			} else {
>  				ret_val = hw->phy.ops.acquire(hw);
>  				if (ret_val)
> -					return ret_val;
> +					goto out;
>
>  				ret_val = e1e_wphy_locked(hw,
>  							  PHY_REG(776, 20),
>  							  0xC023);
>  				hw->phy.ops.release(hw);
>  				if (ret_val)
> -					return ret_val;
> +					goto out;
>
>  			}
>  		}
> @@ -1521,7 +1522,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  	    (hw->adapter->pdev->device == E1000_DEV_ID_PCH_I218_V3)) {
>  		ret_val = e1000_k1_workaround_lpt_lp(hw, link);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>  	}
>  	if (hw->mac.type >= e1000_pch_lpt) {
>  		/* Set platform power management values for
> @@ -1529,7 +1530,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  		 */
>  		ret_val = e1000_platform_pm_pch_lpt(hw, link);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>  	}
>
>  	/* Clear link partner's EEE ability */
> @@ -1551,22 +1552,22 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  		ew32(FEXTNVM6, fextnvm6);
>  	}
>
> -	if (!link)
> +	if (!link) {
> +		mac->get_link_status = true;
>  		return 0; /* No link detected */
> -
> -	mac->get_link_status = false;
> +	}

If I am not mistaken this could be just another jump to the "out"
label. We should have initialized "ret_val" to 0 near the start of
this function when we made the call to phy_has_link_generic.
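
To illustrate the suggestion above, here is a minimal user-space sketch, not
driver code. The names sketch_mac and sketch_check_link(), and the phy_err and
phy_link parameters standing in for the result of the PHY link query, are made
up for illustration. It assumes ret_val is still 0 when the "!link" check is
reached, which is the point that would need confirming in the real function.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver's MAC state. */
struct sketch_mac {
	bool get_link_status;
};

/*
 * Simplified model of the patched link check. phy_err and phy_link are
 * made-up parameters that replace the call to the PHY link query:
 * phy_err is its return value (0 on success), phy_link the reported link.
 */
static int sketch_check_link(struct sketch_mac *mac, int phy_err, bool phy_link)
{
	int ret_val;
	bool link;

	/* Nothing changed since the last check: report link up. */
	if (!mac->get_link_status)
		return 1;
	mac->get_link_status = false;

	ret_val = phy_err;
	link = phy_link;
	if (ret_val)
		goto out;

	/* ret_val is 0 here, so "out" returns 0 (no link detected). */
	if (!link)
		goto out;

	return 1;

out:
	/* Any early exit asks for the link to be checked again later. */
	mac->get_link_status = true;
	return ret_val;
}
```

With this shape every exit that does not confirm the link goes through one
place that re-arms get_link_status.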
>
>  	switch (hw->mac.type) {
>  	case e1000_pch2lan:
>  		ret_val = e1000_k1_workaround_lv(hw);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>  		/* fall-thru */
>  	case e1000_pchlan:
>  		if (hw->phy.type == e1000_phy_82578) {
>  			ret_val = e1000_link_stall_workaround_hv(hw);
>  			if (ret_val)
> -				return ret_val;
> +				goto out;
>  		}
>
>  		/* Workaround for PCHx parts in half-duplex:
> @@ -1595,7 +1596,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  	if (hw->phy.type > e1000_phy_82579) {
>  		ret_val = e1000_set_eee_pchlan(hw);
>  		if (ret_val)
> -			return ret_val;
> +			goto out;
>  	}
>
>  	/* If we are forcing speed/duplex, then we simply return since
> @@ -1618,10 +1619,14 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>  	ret_val = e1000e_config_fc_after_link_up(hw);
>  	if (ret_val) {
>  		e_dbg("Error configuring flow control\n");
> -		return ret_val;
> +		goto out;
>  	}

Technically these changes would be a change in behavior. For these we
may just want to leave them as-is since I am not certain they would
have any actual impact on the link state other than delaying the
link-up. For example do we really care if we fail to negotiate flow
control? We may not so we might report link up and just a debug
message indicating we didn't negotiate that part of the link.

>  	return 1;
> +
> +out:
> +	mac->get_link_status = true;
> +	return ret_val;
>  }
>
>  static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index db735644b312..60c8beaf5cb3 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -427,6 +427,7 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
>  	 */
>  	if (!mac->get_link_status)
>  		return 1;
> +	mac->get_link_status = false;
>
>  	/* First we want to see if the MII Status Register reports
>  	 * link. If so, then we want to get the current speed/duplex
> @@ -434,12 +435,13 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
>  	 */
>  	ret_val = e1000e_phy_has_link_generic(hw, 1, 0, &link);
>  	if (ret_val)
> -		return ret_val;
> +		goto out;
>
> -	if (!link)
> +	if (!link) {
> +		mac->get_link_status = true;
>  		return 0; /* No link detected */
> +	}

Same here. We just initialized it to 0 via the call to
e1000e_phy_has_link_generic so we could just jump to "out" if the link
is not set. You could probably even combine the two conditions into
one even with a check for "ret_val || !link".

> -	mac->get_link_status = false;
>
>  	/* Check if there was DownShift, must be checked
>  	 * immediately after link-up
> @@ -466,10 +468,14 @@ s32 e1000e_check_for_copper_link(struct e1000_hw *hw)
>  	ret_val = e1000e_config_fc_after_link_up(hw);
>  	if (ret_val) {
>  		e_dbg("Error configuring flow control\n");
> -		return ret_val;
> +		goto out;
>  	}
>
>  	return 1;
> +
> +out:
> +	mac->get_link_status = true;
> +	return ret_val;
>  }
>
>  /**
> --
> 2.16.1
>
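
For reference, the race from the commit message and the combined
"ret_val || !link" check can be modeled in plain user-space C. This is only a
sketch under stated assumptions: every model_* name is hypothetical, the
link-down interrupt is simulated by calling model_link_irq() from inside the
PHY stub, and none of this is the driver's actual code.

```c
#include <stdbool.h>

/* Hypothetical model of the MAC state; not the driver's own structure. */
struct model_mac {
	bool get_link_status;
};

/* Models the "other" interrupt: the link changed, so a re-check is needed. */
static void model_link_irq(struct model_mac *mac)
{
	mac->get_link_status = true;
}

/*
 * Models the PHY link query: it reports link up, and then a simulated
 * link-down interrupt fires before the caller resumes.
 */
static int model_phy_has_link(struct model_mac *mac, bool *link)
{
	*link = true;
	model_link_irq(mac);
	return 0;
}

/* Old ordering: the flag is cleared after the PHY read, losing the interrupt. */
static int model_check_old(struct model_mac *mac)
{
	bool link = false;
	int ret_val;

	if (!mac->get_link_status)
		return 1;

	ret_val = model_phy_has_link(mac, &link);
	if (ret_val)
		return ret_val;
	if (!link)
		return 0;

	mac->get_link_status = false;	/* overwrites the interrupt's update */
	return 1;
}

/* Patched ordering, using the combined "ret_val || !link" check. */
static int model_check_new(struct model_mac *mac)
{
	bool link = false;
	int ret_val;

	if (!mac->get_link_status)
		return 1;
	mac->get_link_status = false;	/* cleared before the PHY read */

	ret_val = model_phy_has_link(mac, &link);
	if (ret_val || !link)
		goto out;

	return 1;

out:
	mac->get_link_status = true;
	return ret_val;
}
```

In this model both variants return 1 for the current check, but only the
patched ordering leaves get_link_status set, so the next watchdog run would
query the PHY again instead of trusting the stale result.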