Date: Fri, 29 Jun 2018 07:21:27 -0400
From: Neil Horman
To: Jason Gunthorpe
Cc: linux-rdma@vger.kernel.org, Adit Ranadive, VMware PV-Drivers, Doug Ledford, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] vmw_pvrdma: Release netdev when vmxnet3 module is removed
Message-ID: <20180629112127.GA16153@hmswarspite.think-freely.org>
References: <20180628135938.19625-1-nhorman@tuxdriver.com> <20180628185946.GC379@ziepe.ca> <20180628194526.GA14168@hmswarspite.think-freely.org> <20180628203709.GD379@ziepe.ca>
In-Reply-To: <20180628203709.GD379@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 28,
2018 at 02:37:09PM -0600, Jason Gunthorpe wrote:
> On Thu, Jun 28, 2018 at 03:45:26PM -0400, Neil Horman wrote:
> > On Thu, Jun 28, 2018 at 12:59:46PM -0600, Jason Gunthorpe wrote:
> > > On Thu, Jun 28, 2018 at 09:59:38AM -0400, Neil Horman wrote:
> > > > On repeated module load/unload cycles, it's possible for the pvrdma
> > > > driver to encounter this crash:
> > > >
> > > > ...
> > > > [ 297.032448] RIP: 0010:[] [] netdev_walk_all_upper_dev_rcu+0x50/0xb0
> > > > [ 297.034078] RSP: 0018:ffff95087780bd08 EFLAGS: 00010286
> > > > [ 297.034986] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff95087a0c0000
> > > > [ 297.036196] RDX: ffff95087a0c0000 RSI: ffffffff839e44e0 RDI: ffff950835d0c000
> > > > [ 297.037421] RBP: ffff95087780bd40 R08: ffff95087a0e0ea0 R09: abddacd03f8e0ea0
> > > > [ 297.038636] R10: abddacd03f8e0ea0 R11: ffffef5901e9dbc0 R12: ffff95087a0c0000
> > > > [ 297.039854] R13: ffffffff839e44e0 R14: ffff95087a0c0000 R15: ffff950835d0c828
> > > > [ 297.041071] FS: 0000000000000000(0000) GS:ffff95087fc00000(0000) knlGS:0000000000000000
> > > > [ 297.042443] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 297.043429] CR2: ffffffffffffffe8 CR3: 000000007a652000 CR4: 00000000003607f0
> > > > [ 297.044674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [ 297.045893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > [ 297.047109] Call Trace:
> > > > [ 297.047545] [] netdev_has_upper_dev_all_rcu+0x18/0x20
> > > > [ 297.048691] [] is_eth_port_of_netdev+0x2f/0xa0 [ib_core]
> > > > [ 297.049886] [] ? is_eth_active_slave_of_bonding_rcu+0x70/0x70 [ib_core]
> > > > ...
> > > >
> > > > This occurs because vmw_pvrdma on probe stores a pointer to the netdev
> > > > that exists on function 0 of the same bus/device/slot (which represents
> > > > the vmxnet3 ethernet driver).  However, it never removes this pointer if
> > > > the vmxnet3 module is removed, leading to crashes resulting from use
> > > > after free dereferencing incidents like the one above.
> > > >
> > > > The fix is pretty straightforward.  vmw_pvrdma should listen for
> > > > NETDEV_REGISTER and NETDEV_UNREGISTER events in its event listener code
> > > > block, and update the stored netdev pointer accordingly.  This solution
> > > > has been tested by myself and the reporter with successful results.
> > > > This fix also allows the pvrdma driver to find its underlying ethernet
> > > > device in the event that vmxnet3 is loaded after pvrdma, which it was
> > > > not able to do before.
> > > >
> > > > Signed-off-by: Neil Horman
> > > > Reported-by: ruquin@redhat.com
> > > > CC: Adit Ranadive
> > > > CC: VMware PV-Drivers
> > > > CC: Doug Ledford
> > > > CC: Jason Gunthorpe
> > > > CC: linux-kernel@vger.kernel.org
> > > > ---
> > > >  .../infiniband/hw/vmw_pvrdma/pvrdma_main.c | 25 +++++++++++++++++--
> > > >  1 file changed, 23 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > > > index 0be33a81bbe6..5b4782078a74 100644
> > > > --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > > > @@ -699,8 +699,12 @@ static int pvrdma_del_gid(const struct ib_gid_attr *attr, void **context)
> > > >  }
> > > >
> > > >  static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
> > > > +					  struct net_device *ndev,
> > > >  					  unsigned long event)
> > > >  {
> > > > +	struct pci_dev *pdev_net;
> > > > +
> > > > +
> > > >  	switch (event) {
> > > >  	case NETDEV_REBOOT:
> > > >  	case NETDEV_DOWN:
> > > > @@ -718,6 +722,21 @@ static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
> > > >  		else
> > > >  			pvrdma_dispatch_event(dev, 1, IB_EVENT_PORT_ACTIVE);
> > > >  		break;
> > > > +	case NETDEV_UNREGISTER:
> > > > +		dev_put(dev->netdev);
> > > > +		dev->netdev = NULL;
> > > > +		break;
> > > > +	case NETDEV_REGISTER:
> > > > +		/* Paired vmxnet3 will have same bus, slot. But func will be 0 */
> > > > +		pdev_net = pci_get_slot(dev->pdev->bus, PCI_DEVFN(PCI_SLOT(dev->pdev->devfn), 0));
> > > > +		if ((dev->netdev == NULL) && (pci_get_drvdata(pdev_net) == ndev)) {
> > > > +			/* this is our netdev */
> > > > +			dev->netdev = ndev;
> > > > +			dev_hold(ndev);
> > > > +		}
> > > > +		pci_dev_put(pdev_net);
> > > > +		break;
> > > > +
> > > >  	default:
> > > >  		dev_dbg(&dev->pdev->dev, "ignore netdevice event %ld on %s\n",
> > > >  			event, dev->ib_dev.name);
> > > > @@ -734,8 +753,9 @@ static void pvrdma_netdevice_event_work(struct work_struct *work)
> > > >
> > > >  	mutex_lock(&pvrdma_device_list_lock);
> > > >  	list_for_each_entry(dev, &pvrdma_device_list, device_link) {
> > > > -		if (dev->netdev == netdev_work->event_netdev) {
> > > > -			pvrdma_netdevice_event_handle(dev, netdev_work->event);
> > > > +		if ((netdev_work->event == NETDEV_REGISTER) ||
> > > > +		    (dev->netdev == netdev_work->event_netdev)) {
> > > > +			pvrdma_netdevice_event_handle(dev, netdev_work->event_netdev, netdev_work->event);
> > > >  			break;
> > > >  		}
> > > >  	}
> > > > @@ -962,6 +982,7 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
> > > >  	}
> > > >
> > > >  	dev->netdev = pci_get_drvdata(pdev_net);
> > > > +	dev_hold(dev->netdev);
> > > >  	pci_dev_put(pdev_net);
> > > >  	if (!dev->netdev) {
> > > >  		dev_err(&pdev->dev, "failed to get vmxnet3 device\n");
> > >
> > > I see a lot of new dev_hold's here, where are the matching
> > > dev_puts()?
> > >
> > I'm not sure I'd call 2 a lot, but sure, there is a new dev_hold in the
> > pvrdma_pci_probe routine, to hold a reference to the netdev that is looked up
> > there.  It is balanced by the NETDEV_UNREGISTER case in
> > pvrdma_netdevice_event_handle.  The UNREGISTER clause is also balancing the
> > NETDEV_REGISTER case of the handler that looks up the matching netdev should a
> > new device be registered.
> > Note that we will only hold a single device at a
> > time, because a given pvrdma device only recognizes a single vmxnet3 device
> > (the one on function 0 of its own bus/device tuple).
>
> I don't see how the dev_hold in pvrdma_pci_probe is undone during
> error unwind (e.g. goto err_free_cq_ring)
>
Ah, I missed that, thank you, I'll update the patch.

> And I don't see how it is put when pvrdma_pci_remove() is called.
>
Yup, you're right, it should be dropped there too, immediately after the
unregistration of the netdevice notifier.

> Jason
>
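
For illustration only, the two gaps raised in the review (no put on the probe
error unwind, and no put in the remove path) can be modelled in a small
userspace C sketch. Nothing here is kernel code or the updated patch: every
model_* name is hypothetical, the refcount is a plain int rather than the
kernel's netdev reference tracking, and taking the hold only after the NULL
check is an assumption of this sketch, not something discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only a refcount is modelled. */
struct model_netdev {
	int refcnt;
};

/* Hypothetical stand-in for struct pvrdma_dev. */
struct model_pvrdma_dev {
	struct model_netdev *netdev;
	int notifier_registered;
};

/* Stand-ins for dev_hold()/dev_put(). */
static void model_dev_hold(struct model_netdev *nd) { nd->refcnt++; }
static void model_dev_put(struct model_netdev *nd) { nd->refcnt--; }

/*
 * Model of probe. fail_late simulates a failure in a setup step that runs
 * after the reference was taken, so the unwind label must drop it again.
 */
static int model_probe(struct model_pvrdma_dev *dev,
		       struct model_netdev *nd, int fail_late)
{
	dev->netdev = nd;
	if (!dev->netdev)
		return -1;	/* nothing held yet, nothing to undo */
	model_dev_hold(dev->netdev);

	if (fail_late)
		goto err_put_netdev;

	dev->notifier_registered = 1;
	return 0;

err_put_netdev:
	model_dev_put(dev->netdev);
	dev->netdev = NULL;
	return -1;
}

/* Model of the NETDEV_UNREGISTER case: drop the reference and forget it. */
static void model_netdev_unregister(struct model_pvrdma_dev *dev,
				    struct model_netdev *nd)
{
	if (dev->netdev == nd) {
		model_dev_put(dev->netdev);
		dev->netdev = NULL;
	}
}

/*
 * Model of remove: stop listening for events first, then drop the reference
 * if the UNREGISTER event has not already dropped it.
 */
static void model_remove(struct model_pvrdma_dev *dev)
{
	dev->notifier_registered = 0;
	if (dev->netdev) {
		model_dev_put(dev->netdev);
		dev->netdev = NULL;
	}
}

/* Runs the scenarios and returns the final refcount of the fake netdev. */
static int model_run_scenarios(void)
{
	struct model_netdev nd = { .refcnt = 1 };	/* baseline owner */
	struct model_pvrdma_dev dev = { NULL, 0 };

	assert(model_probe(&dev, &nd, 1) == -1);	/* late failure */
	assert(nd.refcnt == 1);				/* hold was undone */

	assert(model_probe(&dev, &nd, 0) == 0);
	model_remove(&dev);				/* remove drops it */
	assert(nd.refcnt == 1);

	assert(model_probe(&dev, &nd, 0) == 0);
	model_netdev_unregister(&dev, &nd);		/* event drops it */
	model_remove(&dev);				/* no double put */
	return nd.refcnt;
}
```

In this model the fake netdev's count returns to its baseline of 1 on every
path: a failed probe, probe followed by remove, and probe followed by an
UNREGISTER event and then remove. Clearing the stored pointer after each put
is what keeps remove from dropping the reference a second time.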