Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753293AbbBSLIb (ORCPT ); Thu, 19 Feb 2015 06:08:31 -0500 Received: from mga11.intel.com ([192.55.52.93]:60997 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752611AbbBSLIa (ORCPT ); Thu, 19 Feb 2015 06:08:30 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.09,512,1418112000"; d="scan'208";a="668553489" Message-ID: <1424344107.3496.10.camel@ideak-mobl> Subject: Re: [Intel-gfx] [KERNEL] Regression bug in drm/i915, Wrong assumption in commit e11aa36 breaks suspend on at least lenovo x61 From: Imre Deak Reply-To: imre.deak@intel.com To: Dave Gordon Cc: Jani Nikula , Klaus Ethgen , "intel-gfx@lists.freedesktop.org" , linux-kernel@vger.kernel.org, dri-devel Date: Thu, 19 Feb 2015 13:08:27 +0200 In-Reply-To: <54E5BF59.4020208@intel.com> References: <20150211113919.GA5672@ikki.ethgen.ch> <20150216221142.GB14602@ikki.ethgen.ch> <20150217082937.GA27873@ikki.ethgen.ch> <87mw4bw7o7.fsf@intel.com> <1424276692.27236.1.camel@intel.com> <54E5BF59.4020208@intel.com> Organization: Intel Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3875 Lines: 93 On Thu, 2015-02-19 at 10:47 +0000, Dave Gordon wrote: > On 18/02/15 16:24, Imre Deak wrote: > > On ke, 2015-02-18 at 17:39 +0200, Jani Nikula wrote: > >> On Tue, 17 Feb 2015, Klaus Ethgen wrote: > >>> After solving the conflicts, I applied the revert (see attachment) to > >>> v3.18.7. I think it should also apply to the current head. With that > >>> patch, suspend is working again on that version. > >>> > >>> However, I have not to deep knowledge of that subsystem, so please, > >>> someone who have, have a deeper look into it. I especially do not know > >>> if the lines in .../intel_pm.c are correct or better leaving them as > >>> they are in v3.18.7. > >>> > >>> I want to have it working on a version that I know is stable before > >>> asking to pull it to head. > >> > >> Hi Klaus, we fear this patch may hide the actual cause. Would be useful > >> to get a better description of what happens, along with a dmesg with > >> drm.debug=14 module parameter set. This might clutter the mailing list, > >> would you mind filing a bug at [1] and attach the info there? > >> > >> Thanks, > >> Jani. > >> > >> [1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel > > > > In addition to the above could you also try the following patch and > > provide a dmesg log on the bugzilla ticket - at this point only to try > > to narrow down the issue?: > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > > index d358ce8..02c65f4 100644 > > --- a/drivers/gpu/drm/i915/i915_irq.c > > +++ b/drivers/gpu/drm/i915/i915_irq.c > > @@ -4466,6 +4466,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg) > > I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT | > > I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT; > > > > + if (!intel_irqs_enabled(dev_priv)) { > > + if (printk_ratelimit()) > > + DRM_ERROR("spurious/shared interrupt iir %08x imr %08x ier %08x\n", > > + I915_READ(IIR), I915_READ(IMR), I915_READ(IER)); > > + > > + return IRQ_NONE; > > + } > > + > > iir = I915_READ(IIR); > > > > for (;;) { > > @@ -4766,7 +4774,10 @@ void intel_runtime_pm_disable_interrupts(struct drm_device *dev) > > struct drm_i915_private *dev_priv = dev->dev_private; > > > > dev->driver->irq_uninstall(dev); > > + > > + spin_lock_irq(&dev_priv->irq_lock); > > dev_priv->pm._irqs_disabled = true; > > + spin_unlock_irq(&dev_priv->irq_lock); > > } > > > > /* Restore interrupts so we can recover from runtime PM. */ > > @@ -4774,7 +4785,10 @@ void intel_runtime_pm_restore_interrupts(struct drm_device *dev) > > { > > struct drm_i915_private *dev_priv = dev->dev_private; > > > > + spin_lock_irq(&dev_priv->irq_lock); > > dev_priv->pm._irqs_disabled = false; > > + spin_unlock_irq(&dev_priv->irq_lock); > > + > > dev->driver->irq_preinstall(dev); > > dev->driver->irq_postinstall(dev); > > } > > Surely surrounding (what ought to be) an atomic assignment to a single > variable cannot make a difference? Unless it's the memory barrier > semantics that have some effect? It seems unlikely that the compiler has > deferred the write to the variable past the pre/postinstall calls that > actually enable the h/w interrupts, but maybe we should add "volatile" > just in case? spinlocks also serve as a memory barrier. volatile would guarantee only that the compiler doesn't reorder the write, so it wouldn't be enough alone. Otoh, we may also need to add spinlocking to the interrupt handler if the issue turns out to be that we can't access some register past/before intel_runtime_pm_{disable,enable}_interrupts. --Imre -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/