Received: by 2002:a25:b794:0:0:0:0:0 with SMTP id n20csp3735817ybh; Tue, 6 Aug 2019 00:25:46 -0700 (PDT) X-Google-Smtp-Source: APXvYqx8C3U9hrLQUFup/kF0QC7NxEWc8Sx9/kAVejE/HOMYEg0W/0k9f7847AayzP/VK6YNbeHC X-Received: by 2002:a17:90a:b387:: with SMTP id e7mr1776525pjr.113.1565076346801; Tue, 06 Aug 2019 00:25:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1565076346; cv=none; d=google.com; s=arc-20160816; b=0JK2s9/iRI/C8ctfpg6TFpaD/pCgytBvgDFLV92hyVMhfKf8tS/ogW9X3Ilmxak7Z1 qMRT/VuPverj5H7NsdAyNIfdZ0hEES4oMLgmzgsrteshOD30DqtL8S0VKzxPrzMWDSya 9uademtj53UfOVOtaBzwjbBG16RXohXVDhBRe5Ah779QroG/LRpFcLhNbE/z9qa7PDSx F/jHOfte54ztPurEgCZC0DI8eLHjVNgl/J8VNyRFJk/lOJSs3ffB9OFMvmlnkqLYXfaV 0+tMgWFpGl3SDSNBnRIIbMtkWtIghHAecrr/hfWI1Wj/7NttY5SR6rW/9gWkf2LxLrU0 mL9A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=86XRFRXYRoFMWXiY7aaS0dl1nk3Ncgc8OVatb+ff0BA=; b=pDwPQxxyui8OWl+Kq8FLPi0w9VSbE+xTRjzuQbgZc8MHZ3S91giKwcUW/6VKUfE81E 9wgLKlkyD6trdvDLePYZ9WXju9IClJexfdWy2m8zDYXwd89A4c/odFb9FwctQSgQZHYg EHpUkYWzwQSz2YfVs6QU8V9zog9IDtBYELiWk670+bg9QfUiwZrPb1CMawk8V8WMtIOe Kq4tiQMPNMJgoFrFTVmdF6YpdpCQ+wrOo8G8qg48bCibepFgULXHvQI8t7dP1GO8wasO L9tgLNn71X638r92QRnmqM01QbqId/DAYyXpzsLXghm4lp5X5trZM7MHvAfFh9Xr6T/i Ql2g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d12si44069220pla.121.2019.08.06.00.25.31; Tue, 06 Aug 2019 00:25:46 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731898AbfHFHYc (ORCPT + 99 others); Tue, 6 Aug 2019 03:24:32 -0400 Received: from bmailout3.hostsharing.net ([176.9.242.62]:58399 "EHLO bmailout3.hostsharing.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731735AbfHFHYb (ORCPT ); Tue, 6 Aug 2019 03:24:31 -0400 Received: from h08.hostsharing.net (h08.hostsharing.net [IPv6:2a01:37:1000::53df:5f1c:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.hostsharing.net", Issuer "COMODO RSA Domain Validation Secure Server CA" (not verified)) by bmailout3.hostsharing.net (Postfix) with ESMTPS id A9E27101C06FB; Tue, 6 Aug 2019 09:24:28 +0200 (CEST) Received: by h08.hostsharing.net (Postfix, from userid 100393) id 6ED1FCCB7; Tue, 6 Aug 2019 09:24:28 +0200 (CEST) Date: Tue, 6 Aug 2019 09:24:28 +0200 From: Lukas Wunner To: Xiongfeng Wang Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, yaohongbo@huawei.com, guohanjun@huawei.com, huawei.libin@huawei.com Subject: Re: [RFC PATCH] pciehp: use completion to wait irq_thread 'pciehp_ist' Message-ID: <20190806072428.2v7k775tvvgkbloh@wunner.de> References: <1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 04, 2019 at 03:50:38PM +0800, Xiongfeng Wang wrote: > When I use the following command to power on a slot which has been > powered off already. > echo 1 > /sys/bus/pci/slots/22/power > It prints the following error: > -bash: echo: write error: No such device > But the slot is actually powered on and the devices is probed. > > In function 'pciehp_sysfs_enable_slot()', we use 'wait_event()' to wait > until 'ctrl->pending_events' is cleared in 'pciehp_ist()'. But in some > situation, when 'pciehp_ist()' is woken up on a nearby CPU after > 'pciehp_request' is called, 'ctrl->pending_events' is cleared before we > go into sleep state. 'wait_event()' will check the condition before > going into sleep. So we return immediately and '-ENODEV' is return. > > This patch use struct completion to wait until irq_thread 'pciehp_ist' > is completed. Thank you, good catch. Unfortunately your patch still allows the following race AFAICS: * pciehp_ist() is running (e.g. due to a hotplug operation) * a request to disable or enable the slot is submitted via sysfs, the completion is reinitialized * pciehp_ist() finishes, signals completion * the sysfs request returns to user space prematurely * pciehp_ist() is run, handles the sysfs request, signals completion again I'd suggest something like the below instead, could you give it a whirl and see if it reliably fixes the issue for you? -- >8 -- Subject: [PATCH] PCI: pciehp: Avoid returning prematurely from sysfs requests A sysfs request to enable or disable a PCIe hotplug slot should not return before it has been carried out. That is sought to be achieved by waiting until the controller's "pending_events" have been cleared. However the IRQ thread pciehp_ist() clears the "pending_events" before it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen to check the "pending_events" after they have been cleared but while pciehp_ist() is still running, the functions may return prematurely with an incorrect return value. Fix by introducing an "ist_running" flag which must be false before a sysfs request is allowed to return. Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread") Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com Reported-by: Xiongfeng Wang Signed-off-by: Lukas Wunner Cc: stable@vger.kernel.org # v4.19+ --- drivers/pci/hotplug/pciehp.h | 2 ++ drivers/pci/hotplug/pciehp_ctrl.c | 6 ++++-- drivers/pci/hotplug/pciehp_hpc.c | 2 ++ 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h index 8c51a04b8083..e316bde45c7b 100644 --- a/drivers/pci/hotplug/pciehp.h +++ b/drivers/pci/hotplug/pciehp.h @@ -72,6 +72,7 @@ extern int pciehp_poll_time; * @reset_lock: prevents access to the Data Link Layer Link Active bit in the * Link Status register and to the Presence Detect State bit in the Slot * Status register during a slot reset which may cause them to flap + * @ist_running: flag to keep user request waiting while IRQ thread is running * @request_result: result of last user request submitted to the IRQ thread * @requester: wait queue to wake up on completion of user request, * used for synchronous slot enable/disable request via sysfs @@ -101,6 +102,7 @@ struct controller { struct hotplug_slot hotplug_slot; /* hotplug core interface */ struct rw_semaphore reset_lock; + unsigned int ist_running; int request_result; wait_queue_head_t requester; }; diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index 631ced0ab28a..1ce9ce335291 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -368,7 +368,8 @@ int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot) ctrl->request_result = -ENODEV; pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC); wait_event(ctrl->requester, - !atomic_read(&ctrl->pending_events)); + !atomic_read(&ctrl->pending_events) && + !ctrl->ist_running); return ctrl->request_result; case POWERON_STATE: ctrl_info(ctrl, "Slot(%s): Already in powering on state\n", @@ -401,7 +402,8 @@ int pciehp_sysfs_disable_slot(struct hotplug_slot *hotplug_slot) mutex_unlock(&ctrl->state_lock); pciehp_request(ctrl, DISABLE_SLOT); wait_event(ctrl->requester, - !atomic_read(&ctrl->pending_events)); + !atomic_read(&ctrl->pending_events) && + !ctrl->ist_running); return ctrl->request_result; case POWEROFF_STATE: ctrl_info(ctrl, "Slot(%s): Already in powering off state\n", diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c index bd990e3371e3..9e2d7688e8cc 100644 --- a/drivers/pci/hotplug/pciehp_hpc.c +++ b/drivers/pci/hotplug/pciehp_hpc.c @@ -608,6 +608,7 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id) irqreturn_t ret; u32 events; + ctrl->ist_running = true; pci_config_pm_runtime_get(pdev); /* rerun pciehp_isr() if the port was inaccessible on interrupt */ @@ -654,6 +655,7 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id) up_read(&ctrl->reset_lock); pci_config_pm_runtime_put(pdev); + ctrl->ist_running = false; wake_up(&ctrl->requester); return IRQ_HANDLED; } -- 2.20.1