Message-Id: <8f770886632640321592873e4c902218d42c436b.1527194314.git.lukas@wunner.de>
From: Lukas Wunner
Date: Thu, 24 May 2018 22:45:30 +0200
Subject: [PATCH] genirq: Synchronize only with single thread on free_irq()
To: Thomas Gleixner
Cc: Bjorn Helgaas, Mika Westerberg, "Sebastian Andrzej Siewior",
 linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

When pciehp is converted to threaded IRQ handling, removal of unplugged
devices below a PCIe hotplug port happens synchronously in the IRQ thread.
Removal of devices typically entails a call to free_irq() by their drivers.

If those devices share their IRQ with the hotplug port, free_irq()
deadlocks because it calls synchronize_irq() to wait for all hard IRQ
handlers as well as all threads sharing the IRQ to finish.

Actually it's sufficient to wait only for the IRQ thread of the removed
device, so call synchronize_hardirq() to wait for all hard IRQ handlers
to finish, but no longer for any threads.  Compensate by rearranging the
control flow in irq_wait_for_interrupt() such that the device's thread is
allowed to run one last time after kthread_stop() has been called.

Stack trace for posterity:
    INFO: task irq/17-pciehp:94 blocked for more than 120 seconds.
    schedule+0x28/0x80
    synchronize_irq+0x6e/0xa0
    __free_irq+0x15a/0x2b0
    free_irq+0x33/0x70
    pciehp_release_ctrl+0x98/0xb0
    pcie_port_remove_service+0x2f/0x40
    device_release_driver_internal+0x157/0x220
    bus_remove_device+0xe2/0x150
    device_del+0x124/0x340
    device_unregister+0x16/0x60
    remove_iter+0x1a/0x20
    device_for_each_child+0x4b/0x90
    pcie_port_device_remove+0x1e/0x30
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x157/0x220
    pci_stop_bus_device+0x7d/0xa0
    pci_stop_bus_device+0x3d/0xa0
    pci_stop_and_remove_bus_device+0xe/0x20
    pciehp_unconfigure_device+0xb8/0x160
    pciehp_disable_slot+0x84/0x130
    pciehp_ist+0x158/0x190
    irq_thread_fn+0x1b/0x50
    irq_thread+0x143/0x1a0
    kthread+0x111/0x130

Cc: Bjorn Helgaas
Cc: Mika Westerberg
Cc: Sebastian Andrzej Siewior
Cc: Thomas Gleixner
Signed-off-by: Lukas Wunner
---
 kernel/irq/manage.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e3336d904f64..603d2672f942 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -756,9 +756,19 @@ static irqreturn_t irq_forced_secondary_handler(int irq, void *dev_id)
 
 static int irq_wait_for_interrupt(struct irqaction *action)
 {
-	set_current_state(TASK_INTERRUPTIBLE);
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
 
-	while (!kthread_should_stop()) {
+		if (kthread_should_stop()) {
+			/* may need to run one last time. */
+			if (test_and_clear_bit(IRQTF_RUNTHREAD,
+					       &action->thread_flags)) {
+				__set_current_state(TASK_RUNNING);
+				return 0;
+			}
+			__set_current_state(TASK_RUNNING);
+			return -1;
+		}
 
 		if (test_and_clear_bit(IRQTF_RUNTHREAD,
 				       &action->thread_flags)) {
@@ -766,10 +776,7 @@ static int irq_wait_for_interrupt(struct irqaction *action)
 			return 0;
 		}
 		schedule();
-		set_current_state(TASK_INTERRUPTIBLE);
 	}
-	__set_current_state(TASK_RUNNING);
-	return -1;
 }
 
 /*
@@ -990,7 +997,7 @@ static int irq_thread(void *data)
 	/*
 	 * This is the regular exit path. __free_irq() is stopping the
 	 * thread via kthread_stop() after calling
-	 * synchronize_irq(). So neither IRQTF_RUNTHREAD nor the
+	 * synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
 	 * oneshot mask bit can be set. We cannot verify that as we
 	 * cannot touch the oneshot mask at this point anymore as
 	 * __setup_irq() might have given out currents thread_mask
@@ -1595,7 +1602,7 @@ static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)
 	unregister_handler_proc(irq, action);
 
 	/* Make sure it's not being used on another CPU: */
-	synchronize_irq(irq);
+	synchronize_hardirq(irq);
 
 #ifdef CONFIG_DEBUG_SHIRQ
 	/*
-- 
2.17.0
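
Since __free_irq() no longer waits for the IRQ thread, a wakeup may still be pending (IRQTF_RUNTHREAD set) when kthread_stop() is called. The rearranged control flow in irq_wait_for_interrupt() can be illustrated with a minimal userspace sketch. This is not kernel code: struct model_action, enum model_result and the model_*() functions are hypothetical names invented for illustration, the flag test is not atomic, and the point where the kernel would call schedule() is modelled as a return value instead of blocking.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the kernel function consults. */
struct model_action {
	bool should_stop;	/* models kthread_should_stop() */
	bool run_thread;	/* models the IRQTF_RUNTHREAD bit */
};

enum model_result {
	MODEL_EXIT  = -1,	/* thread should terminate */
	MODEL_RUN   = 0,	/* run the threaded handler */
	MODEL_SLEEP = 1,	/* kernel code would call schedule() here */
};

/* Models test_and_clear_bit(IRQTF_RUNTHREAD, ...), without atomicity. */
static bool model_test_and_clear(struct model_action *a)
{
	bool was_set = a->run_thread;

	a->run_thread = false;
	return was_set;
}

/* One pass through the loop before the patch: a stop request wins
 * even if a handler run is still pending. */
enum model_result model_wait_original(struct model_action *a)
{
	if (a->should_stop)
		return MODEL_EXIT;
	if (model_test_and_clear(a))
		return MODEL_RUN;
	return MODEL_SLEEP;
}

/* One pass through the loop after the patch: a pending handler run
 * is honoured once before the stop request takes effect. */
enum model_result model_wait_patched(struct model_action *a)
{
	if (a->should_stop) {
		/* may need to run one last time */
		if (model_test_and_clear(a))
			return MODEL_RUN;
		return MODEL_EXIT;
	}
	if (model_test_and_clear(a))
		return MODEL_RUN;
	return MODEL_SLEEP;
}
```

In this model, with both should_stop and run_thread set, model_wait_original() reports MODEL_EXIT and leaves the pending run unhandled, whereas model_wait_patched() reports MODEL_RUN once and MODEL_EXIT on the following call.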