From: David Brownell
To: Alan Stern
Cc: Andre Tomt, Kernel development list, USB list
Subject: Re: USB OOPS 2.6.25-rc2-git1
Date: Wed, 20 Feb 2008 13:56:28 -0800
Message-Id: <200802201356.28723.david-b@pacbell.net>

On Wednesday 20 February 2008, Alan Stern wrote:
>
>	ehci_hcd 0000:00:1d.7: IAA watchdog, lost IAA: status 8029 cmd 10021
>
> lines in the log brings up some ideas that have been percolating in my
> mind for a while.  They have to do with the possibility of a race
> between the watchdog routine and assertion of IAA.

The curious bit IMO being STS_INT (0001), which should also have
triggered an IRQ.
Suggesting to me that the race might be lower level than that ...
at the level of a conflict between the various mechanisms to ack
irqs.  See the appended patch (Andre, this is the additional one
I meant) for a tweak at that level.


> In fact, if the timing comes out just wrong then it's possible (on SMP
> systems) for an IAA interrupt to arrive when the watchdog
> routine has already started running.  Then end_unlink_async() might get
> called right at the start of a new IAA cycle, or when the reclaim list
> is empty.

The driver's spinlock should prevent that particular problem
from appearing.

- Dave


========= CUT HERE
Modify EHCI irq handling on the theory that at least some of the
"lost" IRQs are caused by goofage between multiple lowlevel IRQ
acking mechanisms:  try rescanning before we exit the handler, in
case the EHCI-internal ack (by clearing the irq status) doesn't
always suffice for IRQs triggered nearly back-to-back.

---
 drivers/usb/host/ehci-hcd.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- g26.orig/drivers/usb/host/ehci-hcd.c	2008-02-20 13:26:00.000000000 -0800
+++ g26/drivers/usb/host/ehci-hcd.c	2008-02-20 13:54:37.000000000 -0800
@@ -638,6 +638,8 @@ static irqreturn_t ehci_irq (struct usb_
 		return IRQ_NONE;
 	}
 
+retrigger:
+
 	/* clear (just) interrupts */
 	ehci_writel(ehci, status, &ehci->regs->status);
 	cmd = ehci_readl(ehci, &ehci->regs->command);
@@ -725,6 +727,12 @@ dead:
 
 	if (bh)
 		ehci_work (ehci);
+
+	status = ehci_readl(ehci, &ehci->regs->status);
+	status &= INTR_MASK;
+	if (status)
+		goto retrigger;
+
 	spin_unlock (&ehci->lock);
 	if (pcd_status & STS_PCD)
 		usb_hcd_poll_rh_status(hcd);
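
For readers who want to see the rescan idea outside the driver, here is
a minimal user-space sketch of the same control flow.  It is not the
driver code: struct fake_ehci, fake_readl(), fake_ack() and fake_irq()
are hypothetical names invented for illustration, the "late_event" field
is an assumed stand-in for an interrupt that fires nearly back-to-back
with the first one, and the status bit values follow the EHCI USBSTS
layout as I understand it.

```c
#include <assert.h>
#include <stdint.h>

/* USBSTS interrupt bits (EHCI layout; STS_INT is the 0001 bit above). */
#define STS_INT   (1u << 0)   /* transfer completion */
#define STS_ERR   (1u << 1)   /* transaction error */
#define STS_PCD   (1u << 2)   /* port change detect */
#define STS_FLR   (1u << 3)   /* frame list rollover */
#define STS_FATAL (1u << 4)   /* host system error */
#define STS_IAA   (1u << 5)   /* interrupt on async advance */
#define INTR_MASK (STS_IAA | STS_FATAL | STS_PCD | STS_ERR | STS_INT)

/* Hypothetical simulated controller, not a kernel structure. */
struct fake_ehci {
	uint32_t status;      /* simulated status register */
	uint32_t late_event;  /* bits that show up right after the next ack */
};

static uint32_t fake_readl(const struct fake_ehci *hc)
{
	return hc->status;
}

/* Write-1-to-clear ack; the pending late event lands just afterwards. */
static void fake_ack(struct fake_ehci *hc, uint32_t bits)
{
	hc->status &= ~bits;
	hc->status |= hc->late_event;
	hc->late_event = 0;
}

/*
 * Returns the number of ack/handle passes made, or 0 when no enabled
 * interrupt bit was set (the IRQ_NONE case).
 */
static unsigned fake_irq(struct fake_ehci *hc)
{
	uint32_t status = fake_readl(hc) & INTR_MASK;
	unsigned passes = 0;

	if (!status)
		return 0;

	do {
		fake_ack(hc, status);   /* clear (just) the bits we saw */
		/* ... per-bit handling would go here ... */
		passes++;
		/* rescan before leaving, as the patch does with "goto retrigger" */
		status = fake_readl(hc) & INTR_MASK;
	} while (status);

	return passes;
}
```

Starting the simulated controller with STS_INT set and STS_IAA queued as
the late event makes fake_irq() take two passes instead of leaving IAA
pending; without the rescan it would exit after one pass with IAA still
set, which is the "lost IAA" pattern under discussion.  Whether real
hardware misbehaves this way is exactly the open question in the thread.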