From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jacob Keller, Andrew Bowers, Jeff Kirsher, Sasha Levin
Subject: [PATCH 4.15 090/168] i40evf: dont rely on netif_running() outside rtnl_lock()
Date: Wed, 11 Apr 2018 00:23:52 +0200
Message-Id: <20180410212804.094828797@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180410212800.144079021@linuxfoundation.org>
References: <20180410212800.144079021@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jacob Keller

[ Upstream commit 44b034b406211fc103159f82b9e601e05675c739 ]

In i40evf_reset_task we use netif_running() to determine whether or not
the device is currently up. This allows us to properly free queue memory
and shut things down before we request the hardware reset.

It turns out that netif_running() can return true before the device is
fully up, as the kernel core code sets __LINK_STATE_START prior to
calling .ndo_open. Since we're not holding the rtnl_lock(), it's
possible that the driver's i40evf_open handler function is currently
being called while we're resetting.

We can't simply hold the rtnl_lock() while checking netif_running() as
this could cause a deadlock with the i40evf_open() function.
Additionally, we can't avoid the deadlock by holding the rtnl_lock()
over the whole reset path, as this essentially serializes all resets,
and can cause massive delays if we have multiple VFs on a system.

Instead, let's just check our own internal __I40EVF_RUNNING state field.
This allows us to ensure that the state is correct and is only set after
we've finished bringing the device up.

Without this change we might free data structures about device queues
and other memory before they've been fully allocated.

Signed-off-by: Jacob Keller
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c |   20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1796,7 +1796,11 @@ static void i40evf_disable_vf(struct i40
 
 	adapter->flags |= I40EVF_FLAG_PF_COMMS_FAILED;
 
-	if (netif_running(adapter->netdev)) {
+	/* We don't use netif_running() because it may be true prior to
+	 * ndo_open() returning, so we can't assume it means all our open
+	 * tasks have finished, since we're not holding the rtnl_lock here.
+	 */
+	if (adapter->state == __I40EVF_RUNNING) {
 		set_bit(__I40E_VSI_DOWN, adapter->vsi.state);
 		netif_carrier_off(adapter->netdev);
 		netif_tx_disable(adapter->netdev);
@@ -1854,6 +1858,7 @@ static void i40evf_reset_task(struct wor
 	struct i40evf_mac_filter *f;
 	u32 reg_val;
 	int i = 0, err;
+	bool running;
 
 	while (test_and_set_bit(__I40EVF_IN_CLIENT_TASK,
 				&adapter->crit_section))
@@ -1913,7 +1918,13 @@ static void i40evf_reset_task(struct wor
 	}
 
 continue_reset:
-	if (netif_running(netdev)) {
+	/* We don't use netif_running() because it may be true prior to
+	 * ndo_open() returning, so we can't assume it means all our open
+	 * tasks have finished, since we're not holding the rtnl_lock here.
+	 */
+	running = (adapter->state == __I40EVF_RUNNING);
+
+	if (running) {
 		netif_carrier_off(netdev);
 		netif_tx_stop_all_queues(netdev);
 		adapter->link_up = false;
@@ -1964,7 +1975,10 @@ continue_reset:
 
 	mod_timer(&adapter->watchdog_timer, jiffies + 2);
 
-	if (netif_running(adapter->netdev)) {
+	/* We were running when the reset started, so we need to restore some
+	 * state here.
+	 */
+	if (running) {
 		/* allocate transmit descriptors */
 		err = i40evf_setup_all_tx_resources(adapter);
 		if (err)
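
For readers who want to see the ordering problem in isolation, here is a
rough userspace sketch of the pattern the patch adopts. It is not kernel
code: every name prefixed with demo_ or DEMO_ is a hypothetical stand-in
invented for illustration, and the model is single-threaded, so it only
shows the window between the core setting its "started" flag and the
driver finishing its open work, not an actual race.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real kernel or i40evf types. */
enum demo_state { DEMO_DOWN, DEMO_RUNNING };

struct demo_adapter {
	bool link_state_start;	/* models __LINK_STATE_START (set by the core) */
	enum demo_state state;	/* models adapter->state (set by the driver) */
	bool queues_allocated;	/* models the queue memory set up during open */
};

/* Models the core flagging the device as started before .ndo_open runs. */
void demo_core_begin_open(struct demo_adapter *a)
{
	a->link_state_start = true;
}

/* Models the driver's open handler completing its work. */
void demo_driver_finish_open(struct demo_adapter *a)
{
	a->queues_allocated = true;
	a->state = DEMO_RUNNING;
}

/* Models netif_running(): it only reflects the core's flag. */
bool demo_netif_running(const struct demo_adapter *a)
{
	return a->link_state_start;
}

/*
 * Models the reset path: sample the driver's own state once, then use
 * that same snapshot for both the teardown and the restore decision.
 * Returns the snapshot so a caller can see which branch was taken.
 */
bool demo_reset(struct demo_adapter *a)
{
	bool running = (a->state == DEMO_RUNNING);

	if (running)
		a->queues_allocated = false;	/* teardown before reset */

	/* ... the hardware reset request would go here ... */

	if (running)
		a->queues_allocated = true;	/* restore after reset */

	return running;
}
```

In this model, calling demo_core_begin_open() alone makes
demo_netif_running() return true while the state is still DEMO_DOWN, so
demo_reset() skips the teardown. Only after demo_driver_finish_open()
does the reset path tear down and restore the queues.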