From: Dexuan Cui
To: "Michael Kelley (EOSG)", "bhelgaas@google.com", "linux-pci@vger.kernel.org", KY Srinivasan, Stephen Hemminger
CC: "linux-kernel@vger.kernel.org", "driverdev-devel@linuxdriverproject.org", Haiyang Zhang, "olaf@aepfle.de", "apw@canonical.com", "jasowang@redhat.com", "vkuznets@redhat.com", "marcelo.cerri@canonical.com", Jack Morgenstein, "stable@vger.kernel.org"
Subject: RE: [PATCH 2/3] PCI: hv: serialize the present/eject work items
Date: Mon, 5 Mar 2018 05:11:35 +0000
References: <20180303001947.20564-1-decui@microsoft.com> <20180303001947.20564-2-decui@microsoft.com>

> From: Michael Kelley (EOSG)
> Sent: Saturday, March 3, 2018 08:10
> > From: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> > Sent: Friday, March 2, 2018 4:21 PM
> > When we hot-remove the device, we first receive a PCI_EJECT message and
> > then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
> >
> > The first message is offloaded to hv_eject_device_work(), and the second
> > is offloaded to pci_devices_present_work(). Both the paths can be running
> > list_del(&hpdev->list_entry), causing general protection fault, because
> > system_wq can run them concurrently.
> >
> > The patch eliminates the race condition.
>
> With this patch, the enum_sem field in struct hv_pcibus_device
> is no longer needed.  The semaphore serializes execution in
> hv_pci_devices_present_work(), and that serialization is now done
> with the ordered workqueue.
> Also, the last paragraph of the top level
> comment for hv_pci_devices_present_work() should be updated to
> reflect the new ordering assumptions.

Thanks! I'll make an extra patch for this.

> Separately, an unrelated bug:  At the top of hv_eject_device_work(),
> the first test may do a put_pcichild() and return.  This exit path also
> needs to do put_hvpcibus() to balance the ref counts, or do a goto
> the last two lines at the bottom of the function.

When we're in hv_eject_device_work(), IMO hpdev->state must be
hv_pcichild_ejecting, so I'm going to make a patch like this:

@@ -1867,10 +1867,7 @@ static void hv_eject_device_work(struct work_struct *work)

 	hpdev = container_of(work, struct hv_pci_dev, wrk);

-	if (hpdev->state != hv_pcichild_ejecting) {
-		put_pcichild(hpdev, hv_pcidev_ref_pnp);
-		return;
-	}
+	WARN_ON(hpdev->state != hv_pcichild_ejecting);

 	/*
 	 * Ejection can come before or after the PCI bus has been set up, so

> > @@ -1770,7 +1772,7 @@ static void hv_pci_devices_present(struct hv_pcibus_device *hbus,
> >  	spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> >
> >  	get_hvpcibus(hbus);
> > -	schedule_work(&dr_wrk->wrk);
> > +	queue_work(hbus->wq, &dr_wrk->wrk);
>
> This invocation of get_hvpcibus() and queue_work() could be made
> conditional on whether the preceding list_add_tail() transitioned
> the list from empty to non-empty.   If the list was already non-empty,
> a previously queued invocation of hv_pci_devices_present_work()
> will find the new entry and process it.   But this is an
> optimization in a non-perf sensitive code path, so may not be
> worth it.

Exactly. I'll add the optimization.
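
For illustration only, the empty-to-non-empty check could be modeled in
userspace roughly as below. This is a sketch, not the driver change:
struct bus_model, struct dr_node and all helper names are hypothetical,
the refs/queued_works counters merely stand in for get_hvpcibus() and
queue_work(), and the real code would update the list under
hbus->device_list_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the driver. */
struct dr_node {
	int device_count;
	struct dr_node *next;
};

struct bus_model {
	struct dr_node *head;	/* pending relations, oldest first */
	struct dr_node *tail;
	int queued_works;	/* stands in for queue_work() calls */
	int refs;		/* stands in for get/put_hvpcibus() */
};

/*
 * Append an entry and report whether the list went from empty to
 * non-empty. The driver would do this with the list lock held.
 */
static bool dr_list_add(struct bus_model *bus, struct dr_node *dr)
{
	bool was_empty = (bus->head == NULL);

	dr->next = NULL;
	if (was_empty)
		bus->head = dr;
	else
		bus->tail->next = dr;
	bus->tail = dr;
	return was_empty;
}

/* Take a reference and queue work only on the empty -> non-empty edge. */
static void devices_present(struct bus_model *bus, struct dr_node *dr)
{
	if (dr_list_add(bus, dr)) {
		bus->refs++;
		bus->queued_works++;
	}
}

/*
 * Worker: drains every pending entry, so entries added while a work
 * item was already queued are still seen. Returns the device_count of
 * the newest entry (-1 if the list was empty) and drops the reference.
 */
static int devices_present_work(struct bus_model *bus)
{
	int last = -1;

	assert(bus->refs > 0);
	while (bus->head) {
		struct dr_node *dr = bus->head;

		bus->head = dr->next;
		last = dr->device_count;
	}
	bus->tail = NULL;
	bus->refs--;
	return last;
}
```

In this model, two back-to-back devices_present() calls queue one work
item and take one reference, and the single worker run consumes both
entries.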
> > @@ -1848,7 +1850,7 @@ static void hv_pci_eject_device(struct hv_pci_dev *hpdev)
> >  	get_pcichild(hpdev, hv_pcidev_ref_pnp);
> >  	INIT_WORK(&hpdev->wrk, hv_eject_device_work);
> >  	get_hvpcibus(hpdev->hbus);
> > -	schedule_work(&hpdev->wrk);
> > +	queue_work(hpdev->hbus->wq, &hpdev->wrk);
> >  }
> >
> >  /**
> > @@ -2463,11 +2465,17 @@ static int hv_pci_probe(struct hv_device *hdev,
> >  	spin_lock_init(&hbus->retarget_msi_interrupt_lock);
> >  	sema_init(&hbus->enum_sem, 1);
> >  	init_completion(&hbus->remove_event);
> > +	hbus->wq = alloc_ordered_workqueue("hv_pci_%x", 0,
> > +					   hbus->sysdata.domain);
> > +	if (!hbus->wq) {
> > +		ret = -ENOMEM;
> > +		goto free_bus;
> > +	}
> >
> >  	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
> >  			 hv_pci_onchannelcallback, hbus);
> >  	if (ret)
> > -		goto free_bus;
> > +		goto destroy_wq;
> >
> >  	hv_set_drvdata(hdev, hbus);
> >
> > @@ -2536,6 +2544,9 @@ static int hv_pci_probe(struct hv_device *hdev,
> >  	hv_free_config_window(hbus);
> >  close:
> >  	vmbus_close(hdev->channel);
> > +destroy_wq:
> > +	drain_workqueue(hbus->wq);
>
> The drain_workqueue() call isn't necessary.  destroy_workqueue() calls
> drain_workqueue() and there better not be anything in the workqueue
> anyway since all the ref counts are zero.

OK. Will remove it.

> > +	destroy_workqueue(hbus->wq);
> >  free_bus:
> >  	free_page((unsigned long)hbus);
> >  	return ret;
> > @@ -2615,6 +2626,8 @@ static int hv_pci_remove(struct hv_device *hdev)
> >  	irq_domain_free_fwnode(hbus->sysdata.fwnode);
> >  	put_hvpcibus(hbus);
> >  	wait_for_completion(&hbus->remove_event);
> > +	drain_workqueue(hbus->wq);
>
> Same here -- drain_workqueue() isn't needed.  The workqueue
> must be empty anyway since the remove_event has completed
> and the ref counts will all be zero.

Will remove it.

> > +	destroy_workqueue(hbus->wq);

I'm going to post a v2 patchset tomorrow.

Thanks,
-- Dexuan