Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755120Ab3FGBR4 (ORCPT ); Thu, 6 Jun 2013 21:17:56 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:55055 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752414Ab3FGBRz (ORCPT ); Thu, 6 Jun 2013 21:17:55 -0400 Date: Thu, 6 Jun 2013 18:02:25 -0700 From: Greg KH To: Yanmin Zhang Cc: Thomas Gleixner , Liu ShuoX , linux-kernel@vger.kernel.org, fweisbec@gmail.com, sedat.dilek@gmail.com, srivatsa.bhat@linux.vnet.ibm.com, Zhang Yanmin Subject: Re: [PATCH] irq: add a new function irq_in_progress to check pending IRQ handlers Message-ID: <20130607010225.GA28256@kroah.com> References: <1370504296-24389-1-git-send-email-shuox.liu@intel.com> <1370566409.4432.41.camel@ymzhang.sh.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1370566409.4432.41.camel@ymzhang.sh.intel.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2721 Lines: 66 On Fri, Jun 07, 2013 at 08:53:29AM +0800, Yanmin Zhang wrote: > On Thu, 2013-06-06 at 15:18 +0200, Thomas Gleixner wrote: > > On Thu, 6 Jun 2013, shuox.liu@intel.com wrote: > > > From: Zhang Yanmin > > > > > > synchronize_irq waits pending IRQ handlers to be finished. If using this > > > function while holding a resource, the IRQ handler may cause deadlock. > > > > > > Here we add a new function irq_in_progress which doesn't wait for the handlers > > > to be finished. > > > > > > A typical use case at suspend-to-ram: > > > > > > device driver's irq handler is complicated and might hold a mutex at rare cases. > > > Its suspend function is called and a suspended flag is set. > > > In case its IRQ handler is running, suspend function calls irq_in_progress. if > > > handler is running, abort suspend. > > > The irq handler checks the suspended flag. If the device is suspended, irq handler > > > either ignores the interrupt, or wakes up the whole system, and the driver's > > > resume function could deal with the delayed interrupt handling. > > > > This is as wrong as it can be. Fix the driver instead of hacking racy > > functions into the core code. > > > > So your problem looks like this: > > > > CPU 0 CPU 1 > > irq_handler_thread() suspend() > > ..... mutex_lock(&m); > > mutex_lock(&m); synchronize_irq(); > > > > So why needs the mutex to be taken before synchronize_irq()? Why not > > doing the obvious? > > > > suspend() > > disable_irq(); (Implies synchronize_irq) > > mutex_lock(&m); > > .... > > mutex_unlock(&m); > > enable_irq(); > Thanks for the kind comment. > > We do consider your solution before and it works well indeed with some specific > simple drivers. For example, some drives use GPIO pin as interrupt source. > > On one specific platform, disable_irq would really disable the irq at RTE entry, > which means we lose the wakeup capability of this device. > synchronize_irq can be another solution. But we did hit 'DPM device timeout' issue > reported by dpm_wd_handler at suspend-to-ram. > > The driver is complicated. We couldn't change too many functions to optimize it. > In addition, we have to use the driver instead of throwing it away. What is preventing you from rewriting it to work properly? > With irq_in_progress, we can resolve this issue and it does work, although it > looks like ugly. Don't paper over driver bugs in the core kernel, fix the driver. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/