Received: by 2002:a17:90a:1609:0:0:0:0 with SMTP id n9csp2086244pja; Thu, 26 Mar 2020 09:28:50 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtdl+wBrp7y6S6rCjFo4YoIiAgh2by9iX3th6leuqCYgGMOwAUDGxtmjb5x5P8mE7iAdpEs X-Received: by 2002:aca:891:: with SMTP id 139mr659383oii.137.1585240130054; Thu, 26 Mar 2020 09:28:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585240130; cv=none; d=google.com; s=arc-20160816; b=pQZRYohq/jRty20YjMSu/7mca4l0yzz77No+Baq38BGPp7FsRMuwrFaPWpETBFF+DF gOkNcJNjcgR9zhwvaj0XAgvtEngny7AUYRfc4t0Yvi3ED+xE952XjX4cdiIPcklSKTEJ trIIRgKRCtjAkS3cvwrrhYgHFK8QCeRqsCOirUaiXKT2sMDH7DDBHvb9HEcQeCAS4ncO tJi5nnTrnnsLitI8o1hOv2Lhxyh8tG4CKiAuySie4HKplpyjACQLiD8nOb/DoZ7ReWou AJLoCQbvOcd/7EnmiyfanRYaiqzuxTgNnglHoB8QcUQ1v52iECjCekmQiCT4ZGCetLfg pSMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=2MvgnsSQhFJOcyx9F2ihbJOXdLnu/MGIqeeUoB2SLbM=; b=dBBhAxf2/FYajsoFGFPqoJz29HhZ5zdA7L9DO97slvZ2bdjRWW2hNb3ST6FkdbJSkM iIXMdxriqy22CMV2ZgZRInbCcWgMEAyePO+dqWJyy4iWRwW9ZRCA7xqr6tOQ6FJ7HoAP vNWRUFOVWTnHi/okzragYgbyHBar35YGV55BMS6W6V6TWHvGynNDCVFjiM9vzGU59UFX qGLA8jtZzmSiEWLvAXMEcFxWn/8mdZKFcLsA2aqnNYAauu3ctMShjA8if6fOdV4JjSam r+MIc6Y07+WXkrykCHQT4GOpUu2FR/o4gAMSOGt2T9uKu6IoXJRK1iJ7bb7aDgzmkGIo PMHQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=O+HVkvBE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e5si1260481otp.317.2020.03.26.09.28.33; Thu, 26 Mar 2020 09:28:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=O+HVkvBE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728515AbgCZQ14 (ORCPT + 99 others); Thu, 26 Mar 2020 12:27:56 -0400 Received: from mail-il1-f194.google.com ([209.85.166.194]:41059 "EHLO mail-il1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726175AbgCZQ14 (ORCPT ); Thu, 26 Mar 2020 12:27:56 -0400 Received: by mail-il1-f194.google.com with SMTP id t6so2322624ilj.8; Thu, 26 Mar 2020 09:27:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2MvgnsSQhFJOcyx9F2ihbJOXdLnu/MGIqeeUoB2SLbM=; b=O+HVkvBErHFx61kmKWqwDVr9mM9NwxkP5CgHZkz8KKBG51hrYlwuKP26qg84OHsxVB 0Csn9hEqw5zUtxAZ8hXbKJYonAGgMiW0oSwLVoWbnfKE7nmgDQTWRrmKkq0R3soBo1O6 YA2xpd7O25TSRoBEZeQ7E95N53qbjw3fE5HCjUUKmtAeNikCAx8lmf4WzXJgNMiGFSJx c4+Rvjrk3GX2GNNId3q5hbOuLfTBkYt3mx/jzzQjOjWhqiNfneAMWR968ENRiKCvdJgC Lt4zrBYpUJM4uD/0sBT0Ucf0edmhAvR21jjC3t8qlxejtaMgYo1HetQF0xoeZiEQ/Df3 Sgmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2MvgnsSQhFJOcyx9F2ihbJOXdLnu/MGIqeeUoB2SLbM=; b=MNkjxv5kWyr3Pe1Cl//9zeE5kC6LSuxM1nh5bRaOuSeBBVr/vqmMbCsj2KZUXtHdHF YsBnURRnnaoMhi4z++vAqr7EhE9pNcxcjmuEZYGp3X0PIbIThkxWgacIy8m28BpXqVlr DP8UHztCbTQy8iJ8x2u6maFqsLmj19r8izkXyt2PQ3KM+ABOQE42XoEXfoJQnwiFBzI+ TS2l/tfu2k3oXmwf4RjNb/xVKFe/ojQjfuy3VuIdS20ZNi5JA5St29peQ22y5BcL6oUi Cd2CXMy5b4fAWsyJRqiQtNSAj40GMkWuclxDy9yQyqtSZjaBVzs5xMOkaZPJ6yCNz2Ip carw== X-Gm-Message-State: ANhLgQ0zlqfs3WFGsjO5UaFGySfNHjyxCMqpCdNfy1/KZ3Qyz1H8GLDV 9b8FJ9wfXpc515nlRFL5olRAZ4L+DNGRK5P19gQj6u2sw90= X-Received: by 2002:a92:3955:: with SMTP id g82mr9646493ila.237.1585240073233; Thu, 26 Mar 2020 09:27:53 -0700 (PDT) MIME-Version: 1.0 References: <20200326103926.20888-1-kai.heng.feng@canonical.com> In-Reply-To: <20200326103926.20888-1-kai.heng.feng@canonical.com> From: Alexander Duyck Date: Thu, 26 Mar 2020 09:27:42 -0700 Message-ID: Subject: Re: [Intel-wired-lan] [PATCH] igb: Use a sperate mutex insead of rtnl_lock() To: Kai-Heng Feng Cc: Jeff Kirsher , "open list:NETWORKING DRIVERS" , open list , "moderated list:INTEL ETHERNET DRIVERS" , "David S. Miller" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 26, 2020 at 3:39 AM Kai-Heng Feng wrote: > > Commit 9474933caf21 ("igb: close/suspend race in netif_device_detach") > fixed race condition between close and power management ops by using > rtnl_lock(). > > This fix is a preparation for next patch, to prevent a dead lock under > rtnl_lock() when calling runtime resume routine. > > However, we can't use device_lock() in igb_close() because when module > is getting removed, the lock is already held for igb_remove(), and > igb_close() gets called during unregistering the netdev, hence causing a > deadlock. So let's introduce a new mutex so we don't cause a deadlock > with driver core or netdev core. > > Signed-off-by: Kai-Heng Feng So this description doesn't make much sense to me. You describe the use of the device_lock() in igb_close() but it isn't used there. In addition it seems like you are arbitrarily moving code that was wrapped in the rtnl_lock out of it. I'm not entirely sure that is safe since there are calls within many of these functions that assume the rtnl_lock is held and changing that is likely going to introduce more issues. > --- > drivers/net/ethernet/intel/igb/igb_main.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c > index b46bff8fe056..dc7ed5dd216b 100644 > --- a/drivers/net/ethernet/intel/igb/igb_main.c > +++ b/drivers/net/ethernet/intel/igb/igb_main.c > @@ -288,6 +288,8 @@ static const struct igb_reg_info igb_reg_info_tbl[] = { > {} > }; > > +static DEFINE_MUTEX(igb_mutex); > + > /* igb_regdump - register printout routine */ > static void igb_regdump(struct e1000_hw *hw, struct igb_reg_info *reginfo) > { > @@ -4026,9 +4028,14 @@ static int __igb_close(struct net_device *netdev, bool suspending) > > int igb_close(struct net_device *netdev) > { > + int err = 0; > + > + mutex_lock(&igb_mutex); > if (netif_device_present(netdev) || netdev->dismantle) > - return __igb_close(netdev, false); > - return 0; > + err = __igb_close(netdev, false); > + mutex_unlock(&igb_mutex); > + > + return err; > } > Okay, so I am guessing the problem has something to do with the addition of the netdev->dismantle test here and the fact that it is bypassing the present check for the hotplug remove case? So it looks like nobody ever really reviewed commit 888f22931478 ("igb: Free IRQs when device is hotplugged"). What I would recommend is reverting it and instead we fix the remaining pieces that need to be addressed in igb so it more closely matches what we have in e1000e after commit a7023819404a ("e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm"). From what I can tell the only pieces that are really missing is to update igb_io_error_detected so that in addition to igb_down it will call igb_free_irq, and then in addition we should be wrapping most of the code in that function with an rtnl_lock since it is detaching a device and making modifications to it.