Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp4610910img; Tue, 26 Mar 2019 12:55:36 -0700 (PDT) X-Google-Smtp-Source: APXvYqwu/JkR41k5nCtbJyMwN8RmM73rjsjRGX2tkKk1f449GunWBNSzCQ70RNmVVy3qeOVj6Xzj X-Received: by 2002:a65:63c2:: with SMTP id n2mr30837215pgv.439.1553630136373; Tue, 26 Mar 2019 12:55:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553630136; cv=none; d=google.com; s=arc-20160816; b=q/BjUDD4O32m0Rh6oTrEOBDpboavEKx+RzDCOJgsC5InRzYv17UyDkBH9d4EJQ74H9 3nUKm4eIoOhkCxmHW41etA0uk4+0CJon64FUeDKouXoPuMbsBjAF8gWbj7x2dz8jzZJz jQniKm3jI6Zu01+xWIpco3ucMlCk8kmn0lEeEHEzPr5W/rObBFeVYor3+rYlBIxS9HmC q+9XOMkeyQulrom/gOQ1evSIg/z1tz0C/rBvF+anOCVSxBfFR9h8jX2VhAuI2JiR0+t5 kdNPw3KDjs/MpN2504AUokLaqMlayLsyJOFe6JfcEuozEHXDrYVyhMmP7JZgY6PNKoYk 4X0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=D0EXkVHshud2kCXvyVHVC86oz707MKy8AhJvYee4NNk=; b=OCo3PhBAjOA/FwBcpxkUUGAKvEwu1NM3R93Z07pSWwmDKopj0gGai/c2Y3mvMXkkVF d4Pno8V4Kh+7EOLMjSmfHAtrwmsD0MpETsIiHo3YZgWl864kgPmhHJnwhhewXmNiqmlb YEowRaZd895QDK5zNTXlxrDiodRBJZARUTUoCZd2VmDto3/x2fWodJoFv+UcUf5M2kwx 4iZphTUWD4vHTOFnprhajGK7EIy8hA3qjk7Ia+kTCK7+kLpveR8nvoq0AWYPI5zvFuYK SJG0PXmJRKaBXVge0IUfSZEQGt6B8Eph8ge7xBdFySAJojXrEjlSRiPIPUPD/dor0y54 VB7g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m20si16086377pgv.136.2019.03.26.12.55.20; Tue, 26 Mar 2019 12:55:36 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732369AbfCZTyj (ORCPT + 99 others); Tue, 26 Mar 2019 15:54:39 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:43294 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726207AbfCZTyj (ORCPT ); Tue, 26 Mar 2019 15:54:39 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 432B180D; Tue, 26 Mar 2019 12:54:38 -0700 (PDT) Received: from e107981-ln.cambridge.arm.com (e107981-ln.cambridge.arm.com [10.1.197.40]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4D4E03F575; Tue, 26 Mar 2019 12:54:35 -0700 (PDT) Date: Tue, 26 Mar 2019 19:54:30 +0000 From: Lorenzo Pieralisi To: Dexuan Cui Cc: "bhelgaas@google.com" , "linux-pci@vger.kernel.org" , KY Srinivasan , Stephen Hemminger , Michael Kelley , Sasha Levin , "linux-hyperv@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "driverdev-devel@linuxdriverproject.org" , Haiyang Zhang , "olaf@aepfle.de" , "apw@canonical.com" , "jasowang@redhat.com" , vkuznets , "marcelo.cerri@canonical.com" , "jackm@mellanox.com" , "stable@vger.kernel.org" Subject: Re: [PATCH 3/3] PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary Message-ID: <20190326195429.GA15410@e107981-ln.cambridge.arm.com> References: <20190304213357.16652-1-decui@microsoft.com> <20190304213357.16652-4-decui@microsoft.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190304213357.16652-4-decui@microsoft.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 04, 2019 at 09:34:49PM +0000, Dexuan Cui wrote: > When we hot-remove a device, usually the host sends us a PCI_EJECT message, > and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0. But when > we do the quick hot-add/hot-remove test, the host may not send us the > PCI_EJECT message, if the guest has not fully finished the initialization > by sending the PCI_RESOURCES_ASSIGNED* message to the host, so it's > potentially unsafe to only depend on the pci_destroy_slot() in > hv_eject_device_work(), though create_root_hv_pci_bus() -> > hv_pci_assign_slots() is not called in this case. Note: in this case, the > host still sends the guest a PCI_BUS_RELATIONS message with > bus_rel->device_count == 0. > > And, in the quick hot-add/hot-remove test, we can have such a race: before > pci_devices_present_work() -> new_pcichild_device() adds the new device > into hbus->children, we may have already received the PCI_EJECT message, > and hence the taklet handler hv_pci_onchannelcallback() may fail to find > the "hpdev" by get_pcichild_wslot(hbus, dev_message->wslot.slot), so > hv_pci_eject_device() is NOT called; later create_root_hv_pci_bus() -> > hv_pci_assign_slots() creates the slot, and the PCI_BUS_RELATIONS message > with bus_rel->device_count == 0 removes the device from hbus->children, and > we end up being unable to remove the slot in hv_pci_remove() -> > hv_pci_remove_slots(). > > The patch removes the slot in pci_devices_present_work() when the device > is removed. This can address the above race. Note 1: > pci_devices_present_work() and hv_eject_device_work() run in the > singled-threaded hbus->wq, so there is not a double-remove issue for the > slot. Note 2: we can't offload hv_pci_eject_device() from > hv_pci_onchannelcallback() to the workqueue, because we need > hv_pci_onchannelcallback() synchronously call hv_pci_eject_device() to > poll the channel's ringbuffer to work around the > "hangs in hv_compose_msi_msg()" issue: see > commit de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") This commit log is unreadable, sorry. Indentation, punctuation and formatting are just a mess, try to read it, you will notice by yourself. I basically reformatted it completely and pushed the series to pci/controller-fixes but that's the last time I do it since I am not an editor, next time I won't merge it. More importantly, these patches are marked for stable, given the series of fixes that triggered this series please ensure it was tested thoroughly because it is honestly complicate to understand and I do not want to backport further fixes to stable kernels on top of this. Please have a look and report back. Thanks, Lorenzo > Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information") > Signed-off-by: Dexuan Cui > Cc: stable@vger.kernel.org > --- > drivers/pci/controller/pci-hyperv.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c > index b489412e3502..82acd6155adf 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -1776,6 +1776,10 @@ static void pci_devices_present_work(struct work_struct *work) > hpdev = list_first_entry(&removed, struct hv_pci_dev, > list_entry); > list_del(&hpdev->list_entry); > + > + if (hpdev->pci_slot) > + pci_destroy_slot(hpdev->pci_slot); > + > put_pcichild(hpdev); > } > > -- > 2.19.1 >