Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp115797imm; Thu, 12 Jul 2018 15:22:06 -0700 (PDT) X-Google-Smtp-Source: AAOMgpe9ssVMZ2fQ59NJePxaF2Kva1SL03E6lB6rlEj04wpqyODJCeT7lzl2Z9EPUT+Z98RyC6uE X-Received: by 2002:a62:8d16:: with SMTP id z22-v6mr4271128pfd.181.1531434126058; Thu, 12 Jul 2018 15:22:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1531434126; cv=none; d=google.com; s=arc-20160816; b=pOLduzsoQfqpS2Ug9ZOXExbiyjXPWrpMTh9Am4LNdMBwNE6ZyHRB9dd2tIp6sTILZA AJqd0YMz8Js9aGFJfxT+WGPu5eZNZhdQ8Br7y8VAvVgnSNkcSagvv1xCeS2uXRFv3zAi gAqJ41BCWnXper2Cluk3G57dwAna9AL8c8GwHDaKORUTuRM9zQ8WWD2GhiTAfyDKueP3 GI4IYKiHoXFSSb3Uc3Lw53puw3m+jy48wEIv5+L42iqaBxXvLNfQc6kjdIuA5Ra5Y+0D dtt5jtKV/o+mW3snDAaPtTyKu0L6d7hCgt3rhc+LtZq41tOkwUwaShQ0h6UdA+ZsEnif j22g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature:arc-authentication-results; bh=Sc6CG8f9/0IqS/CdT0QmQ01gDGWhgHgfUa0vosHQhEg=; b=Gudo+XV5tQn8HeoVB4tlkprzg+BimBoZI+KDb6K4iNIYLiiBb2T90g4oqZlTs6c+Ra vDcKps8nKHCVnC9GrMNkRPUgDVN7pRxbaE3Q3ctPWYAd6e6EqfycKUywYrKt1KUrGy0D jWwC1m8rr8HWS6U5i8j3tDD7/GLF4RCJIYrIoa9kmk6W8T/lWHTbEEwK8SbEL0i+9Q/s CUtwKgLr2glCg9pNpCeymc3bOzjTfWEUBdffFa1lGno7zpeqkUNMLIizzq9VlQn4YeDR +U0wNeLf5p+/gkykX7yOJdv4n163TZ7hdqHnY0x3Vj4lqT4RX6oX890RfLR+fwl3lk5Y EfAg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=h6+4CBCS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f185-v6si21348286pgc.625.2018.07.12.15.21.50; Thu, 12 Jul 2018 15:22:06 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=h6+4CBCS; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733178AbeGLWcv (ORCPT + 99 others); Thu, 12 Jul 2018 18:32:51 -0400 Received: from mail.kernel.org ([198.145.29.99]:51174 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733114AbeGLWcv (ORCPT ); Thu, 12 Jul 2018 18:32:51 -0400 Received: from localhost (unknown [146.7.4.79]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 30804208E3; Thu, 12 Jul 2018 22:21:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1531434075; bh=JKmeffv9jsK3LWixfL+X5Ak//fMHrUaploYiuNTO1nQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=h6+4CBCSJzxvtuLsZfo21MmfE5bQa7Kzt3tMK8T/m8kQsFyMyMO1jEXvKPgNPKqTy LsXRyLnrPmoB/Q9MuiRFdyw96oBUbQ6HbvuZsq4+A5ILGj9IbGGLCoGqQfulAclzuB U+31vEwC7NmgU6BIOWjhhmXmLTu1ORF9YYL5soPM= Date: Thu, 12 Jul 2018 17:21:09 -0500 From: Bjorn Helgaas To: Lukas Wunner , Thomas Gleixner Cc: Bjorn Helgaas , Mika Westerberg , "Rafael J. Wysocki" , Ashok Raj , Keith Busch , Yinghai Lu , Sinan Kaya , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 08/32] genirq: Synchronize only with single thread on free_irq() Message-ID: <20180712222109.GE28466@bhelgaas-glaptop.roam.corp.google.com> References: <83cfe0c826f1d793e2ead9032ef2109b5efa403a.1529173804.git.lukas@wunner.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <83cfe0c826f1d793e2ead9032ef2109b5efa403a.1529173804.git.lukas@wunner.de> User-Agent: Mutt/1.9.2 (2017-12-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [+cc linux-kernel] I assume this would need to be merged along with the rest of the series, which should probably go through the PCI tree, but I'm definitely not qualified to review this IRQ patch. And it would need an ack from Thomas in any case. On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote: > When pciehp is converted to threaded IRQ handling, removal of unplugged > devices below a PCIe hotplug port happens synchronously in the IRQ > thread. Removal of devices typically entails a call to free_irq() by > their drivers. > > If those devices share their IRQ with the hotplug port, free_irq() > deadlocks because it calls synchronize_irq() to wait for all hard IRQ > handlers as well as all threads sharing the IRQ to finish. > > Actually it's sufficient to wait only for the IRQ thread of the removed > device, so call synchronize_hardirq() to wait for all hard IRQ handlers > to finish, but no longer for any threads. Compensate by rearranging the > control flow in irq_wait_for_interrupt() such that the device's thread > is allowed to run one last time after kthread_stop() has been called. > > Stack trace for posterity: > INFO: task irq/17-pciehp:94 blocked for more than 120 seconds. > schedule+0x28/0x80 > synchronize_irq+0x6e/0xa0 > __free_irq+0x15a/0x2b0 > free_irq+0x33/0x70 > pciehp_release_ctrl+0x98/0xb0 > pcie_port_remove_service+0x2f/0x40 > device_release_driver_internal+0x157/0x220 > bus_remove_device+0xe2/0x150 > device_del+0x124/0x340 > device_unregister+0x16/0x60 > remove_iter+0x1a/0x20 > device_for_each_child+0x4b/0x90 > pcie_port_device_remove+0x1e/0x30 > pci_device_remove+0x36/0xb0 > device_release_driver_internal+0x157/0x220 > pci_stop_bus_device+0x7d/0xa0 > pci_stop_bus_device+0x3d/0xa0 > pci_stop_and_remove_bus_device+0xe/0x20 > pciehp_unconfigure_device+0xb8/0x160 > pciehp_disable_slot+0x84/0x130 > pciehp_ist+0x158/0x190 > irq_thread_fn+0x1b/0x50 > irq_thread+0x143/0x1a0 > kthread+0x111/0x130 > > Cc: Bjorn Helgaas > Cc: Mika Westerberg > Cc: Sebastian Andrzej Siewior > Cc: Thomas Gleixner > Signed-off-by: Lukas Wunner > --- > kernel/irq/manage.c | 21 ++++++++++++++------- > 1 file changed, 14 insertions(+), 7 deletions(-) > > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c > index daeabd791d58..0531f645bfea 100644 > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -790,9 +790,19 @@ static irqreturn_t irq_forced_secondary_handler(int irq, void *dev_id) > > static int irq_wait_for_interrupt(struct irqaction *action) > { > - set_current_state(TASK_INTERRUPTIBLE); > + for (;;) { > + set_current_state(TASK_INTERRUPTIBLE); > > - while (!kthread_should_stop()) { > + if (kthread_should_stop()) { > + /* may need to run one last time. */ > + if (test_and_clear_bit(IRQTF_RUNTHREAD, > + &action->thread_flags)) { > + __set_current_state(TASK_RUNNING); > + return 0; > + } > + __set_current_state(TASK_RUNNING); > + return -1; > + } > > if (test_and_clear_bit(IRQTF_RUNTHREAD, > &action->thread_flags)) { > @@ -800,10 +810,7 @@ static int irq_wait_for_interrupt(struct irqaction *action) > return 0; > } > schedule(); > - set_current_state(TASK_INTERRUPTIBLE); > } > - __set_current_state(TASK_RUNNING); > - return -1; > } > > /* > @@ -1024,7 +1031,7 @@ static int irq_thread(void *data) > /* > * This is the regular exit path. __free_irq() is stopping the > * thread via kthread_stop() after calling > - * synchronize_irq(). So neither IRQTF_RUNTHREAD nor the > + * synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the > * oneshot mask bit can be set. We cannot verify that as we > * cannot touch the oneshot mask at this point anymore as > * __setup_irq() might have given out currents thread_mask > @@ -1629,7 +1636,7 @@ static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id) > unregister_handler_proc(irq, action); > > /* Make sure it's not being used on another CPU: */ > - synchronize_irq(irq); > + synchronize_hardirq(irq); > > #ifdef CONFIG_DEBUG_SHIRQ > /* > -- > 2.17.1 >