Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752847AbbBSKsE (ORCPT ); Thu, 19 Feb 2015 05:48:04 -0500 Received: from mga01.intel.com ([192.55.52.88]:11387 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752220AbbBSKsA (ORCPT ); Thu, 19 Feb 2015 05:48:00 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.09,512,1418112000"; d="scan'208";a="456715687" Message-ID: <54E5BF59.4020208@intel.com> Date: Thu, 19 Feb 2015 10:47:53 +0000 From: Dave Gordon Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: imre.deak@intel.com, Jani Nikula CC: Klaus Ethgen , "intel-gfx@lists.freedesktop.org" , linux-kernel@vger.kernel.org, dri-devel Subject: Re: [Intel-gfx] [KERNEL] Regression bug in drm/i915, Wrong assumption in commit e11aa36 breaks suspend on at least lenovo x61 References: <20150211113919.GA5672@ikki.ethgen.ch> <20150216221142.GB14602@ikki.ethgen.ch> <20150217082937.GA27873@ikki.ethgen.ch> <87mw4bw7o7.fsf@intel.com> <1424276692.27236.1.camel@intel.com> In-Reply-To: <1424276692.27236.1.camel@intel.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3326 Lines: 84 On 18/02/15 16:24, Imre Deak wrote: > On ke, 2015-02-18 at 17:39 +0200, Jani Nikula wrote: >> On Tue, 17 Feb 2015, Klaus Ethgen wrote: >>> After solving the conflicts, I applied the revert (see attachment) to >>> v3.18.7. I think it should also apply to the current head. With that >>> patch, suspend is working again on that version. >>> >>> However, I have not to deep knowledge of that subsystem, so please, >>> someone who have, have a deeper look into it. I especially do not know >>> if the lines in .../intel_pm.c are correct or better leaving them as >>> they are in v3.18.7. >>> >>> I want to have it working on a version that I know is stable before >>> asking to pull it to head. >> >> Hi Klaus, we fear this patch may hide the actual cause. Would be useful >> to get a better description of what happens, along with a dmesg with >> drm.debug=14 module parameter set. This might clutter the mailing list, >> would you mind filing a bug at [1] and attach the info there? >> >> Thanks, >> Jani. >> >> [1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel > > In addition to the above could you also try the following patch and > provide a dmesg log on the bugzilla ticket - at this point only to try > to narrow down the issue?: > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index d358ce8..02c65f4 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -4466,6 +4466,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg) > I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT | > I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT; > > + if (!intel_irqs_enabled(dev_priv)) { > + if (printk_ratelimit()) > + DRM_ERROR("spurious/shared interrupt iir %08x imr %08x ier %08x\n", > + I915_READ(IIR), I915_READ(IMR), I915_READ(IER)); > + > + return IRQ_NONE; > + } > + > iir = I915_READ(IIR); > > for (;;) { > @@ -4766,7 +4774,10 @@ void intel_runtime_pm_disable_interrupts(struct drm_device *dev) > struct drm_i915_private *dev_priv = dev->dev_private; > > dev->driver->irq_uninstall(dev); > + > + spin_lock_irq(&dev_priv->irq_lock); > dev_priv->pm._irqs_disabled = true; > + spin_unlock_irq(&dev_priv->irq_lock); > } > > /* Restore interrupts so we can recover from runtime PM. */ > @@ -4774,7 +4785,10 @@ void intel_runtime_pm_restore_interrupts(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > > + spin_lock_irq(&dev_priv->irq_lock); > dev_priv->pm._irqs_disabled = false; > + spin_unlock_irq(&dev_priv->irq_lock); > + > dev->driver->irq_preinstall(dev); > dev->driver->irq_postinstall(dev); > } Surely surrounding (what ought to be) an atomic assignment to a single variable cannot make a difference? Unless it's the memory barrier semantics that have some effect? It seems unlikely that the compiler has deferred the write to the variable past the pre/postinstall calls that actually enable the h/w interrupts, but maybe we should add "volatile" just in case? .Dave. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/