2011-03-24 11:34:54

by Dave Airlie

Subject: [git pull] drm fixes for -rc1


Hi Linus,

this replaces the pull I sent yesterday that I don't see in your tree yet.

It should have the fix for your i915 in the intel patches, along with
a couple of radeon fixes, and the vblank change + fix.

Dave.

The following changes since commit c87a8d8dcd2587c203f3dd8a3c5c15d1e128ec0d:

drm/radeon: fixup refcounts in radeon dumb create ioctl. (2011-03-17 13:58:34 +1000)

are available in the git repository at:
ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-core-next

Alex Deucher (2):
drm/radeon/kms: prefer legacy pll algo for tv-out
drm/radeon/kms: fix hardcoded EDID handling

Chris Wilson (10):
drm: Fix use-after-free in drm_gem_vm_close()
drm/i915: Remove surplus POSTING_READs before wait_for_vblank
drm/i915: skip redundant operations whilst enabling pipes and planes
drm/i915: Fix tiling corruption from pipelined fencing
drm/i915: Fix computation of pitch for dumb bo creator
drm/i915: Disable pagefaults along execbuffer relocation fast path
drm/i915: Restore missing command flush before interrupt on BLT ring
drm/i915: Fix use after free within tracepoint
drm/i915: Avoid unmapping pages from a NULL address space
Revert "drm/i915: Don't save/restore hardware status page address register"

Dave Airlie (3):
drm: check for modesetting on modeset ioctls
Merge remote branch 'intel/drm-intel-fixes' of ../drm-next into drm-core-next
drm/vblank: update recently added vbl interface to be more future proof.

Herton Ronaldo Krzesinski (1):
drm/i915: Prevent racy removal of request from client list

Ilija Hadzic (1):
drm/kernel: vblank wait on crtc > 1

Jesse Barnes (1):
drm/i915: report correct render clock frequencies on SNB

Takashi Iwai (1):
drm/i915/dp: Correct the order of deletion for ghost eDP devices

Thomas Renninger (1):
drm radeon: Return -EINVAL on wrong pm sysfs access

Yuanhan Liu (1):
drm/i915: Re-enable self-refresh

drivers/gpu/drm/drm_crtc.c | 51 +++++++++++++
drivers/gpu/drm/drm_gem.c | 5 +-
drivers/gpu/drm/drm_ioctl.c | 3 +
drivers/gpu/drm/drm_irq.c | 15 +++--
drivers/gpu/drm/i915/i915_debugfs.c | 8 +-
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem.c | 70 +++++++++---------
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 +++++-
drivers/gpu/drm/i915/i915_suspend.c | 6 ++
drivers/gpu/drm/i915/intel_display.c | 39 +++++------
drivers/gpu/drm/i915/intel_dp.c | 4 +-
drivers/gpu/drm/i915/intel_ringbuffer.c | 109 +++++++++++++---------------
drivers/gpu/drm/radeon/atombios_crtc.c | 6 ++-
drivers/gpu/drm/radeon/radeon_combios.c | 21 ++++--
drivers/gpu/drm/radeon/radeon_connectors.c | 30 +++++++-
drivers/gpu/drm/radeon/radeon_mode.h | 1 +
drivers/gpu/drm/radeon/radeon_pm.c | 8 ++-
include/drm/drm.h | 4 +
18 files changed, 257 insertions(+), 145 deletions(-)


2011-03-28 18:43:26

by Pekka Enberg

Subject: Re: [git pull] drm fixes for -rc1

On Thu, Mar 24, 2011 at 1:34 PM, Dave Airlie <[email protected]> wrote:
> It should have the fix for your i915 in the intel patches, along with
> a couple of radeon fixes, and the vblank change + fix.

I'm seeing some laptop screen flicker during boot and a while after I
log in (it then seems to go away). It's my trusty old Macbook with
i915 and Ubuntu 10.04. I see this in dmesg:

[ 1.782046] [drm] initialized overlay support
[ 1.782075] [drm] capturing error event; look for more information
in /debug/dri/0/i915_error_state
[ 1.782889] render error detected, EIR: 0x00000010
[ 1.782933] page table error
[ 1.782970] PGTBL_ER: 0x00000102
[ 1.783009] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck:
0x00000010, masking
[ 1.783063] render error detected, EIR: 0x00000010
[ 1.783106] page table error
[ 1.783143] PGTBL_ER: 0x00000102

I'm attaching the full dmesg, i915_error_state, and .config.

Pekka


Attachments:
i915_error_state.gz (90.03 kB)
config.gz (19.78 kB)
dmesg.gz (13.65 kB)

2011-03-28 18:53:35

by Pekka Enberg

Subject: Re: [git pull] drm fixes for -rc1

On Mon, Mar 28, 2011 at 9:43 PM, Pekka Enberg <[email protected]> wrote:
> On Thu, Mar 24, 2011 at 1:34 PM, Dave Airlie <[email protected]> wrote:
>> It should have the fix for your i915 in the intel patches, along with
>> a couple of radeon fixes, and the vblank change + fix.
>
> I'm seeing some laptop screen flicker during boot and a while after I
> log in (it then seems to go away). It's my trusty old Macbook with
> i915 and Ubuntu 10.04. I see this in dmesg:
>
> [    1.782046] [drm] initialized overlay support
> [    1.782075] [drm] capturing error event; look for more information
> in /debug/dri/0/i915_error_state
> [    1.782889] render error detected, EIR: 0x00000010
> [    1.782933] page table error
> [    1.782970]   PGTBL_ER: 0x00000102
> [    1.783009] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck:
> 0x00000010, masking
> [    1.783063] render error detected, EIR: 0x00000010
> [    1.783106] page table error
> [    1.783143]   PGTBL_ER: 0x00000102
>
> I'm attaching the full dmesg, i915_error_state, and .config.

I'm also seeing these errors now which seem to be new from 2.6.38-final:

[ 437.566022] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
purgeable buffer
[ 437.566187] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
purgeable buffer
[ 437.566232] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
purgeable buffer
[ 437.566275] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
purgeable buffer
[ 437.566318] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
purgeable buffer

2011-03-28 19:09:27

by Chris Wilson

Subject: Re: [git pull] drm fixes for -rc1

On Mon, 28 Mar 2011 21:53:32 +0300, Pekka Enberg <[email protected]> wrote:
> On Mon, Mar 28, 2011 at 9:43 PM, Pekka Enberg <[email protected]> wrote:
> > On Thu, Mar 24, 2011 at 1:34 PM, Dave Airlie <[email protected]> wrote:
> >> It should have the fix for your i915 in the intel patches, along with
> >> a couple of radeon fixes, and the vblank change + fix.
> >
> > I'm seeing some laptop screen flicker during boot and a while after I
> > log in (it then seems to go away). It's my trusty old Macbook with
> > i915 and Ubuntu 10.04. I see this in dmesg:
> >
> > [    1.782046] [drm] initialized overlay support
> > [    1.782075] [drm] capturing error event; look for more information
> > in /debug/dri/0/i915_error_state
> > [    1.782889] render error detected, EIR: 0x00000010
> > [    1.782933] page table error
> > [    1.782970]   PGTBL_ER: 0x00000102
> > [    1.783009] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck:
> > 0x00000010, masking
> > [    1.783063] render error detected, EIR: 0x00000010
> > [    1.783106] page table error
> > [    1.783143]   PGTBL_ER: 0x00000102
> >
> > I'm attaching the full dmesg, i915_error_state, and .config.

Right, looks like we have an issue with setting up the hardware for
KMS/GEM whilst it is still active. As we disable the outputs anyway for
the KMS takeover, we can arrange to do so first and so prevent this bug.
The side-effect will be that initial screen blanking will last a little
bit longer.

> I'm also seeing these errors now which seem to be new from 2.6.38-final:
>
> [ 437.566022] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
> purgeable buffer
> [ 437.566187] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
> purgeable buffer
> [ 437.566232] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
> purgeable buffer
> [ 437.566275] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
> purgeable buffer
> [ 437.566318] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
> purgeable buffer

That's an old userspace bug, which so far no one has been able to
reproduce on the upstream ddx.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-03-29 07:11:23

by Pekka Enberg

Subject: Re: [git pull] drm fixes for -rc1

On Mon, Mar 28, 2011 at 10:09 PM, Chris Wilson <[email protected]> wrote:
> On Mon, 28 Mar 2011 21:53:32 +0300, Pekka Enberg <[email protected]> wrote:
>> On Mon, Mar 28, 2011 at 9:43 PM, Pekka Enberg <[email protected]> wrote:
>> > On Thu, Mar 24, 2011 at 1:34 PM, Dave Airlie <[email protected]> wrote:
>> >> It should have the fix for your i915 in the intel patches, along with
>> >> a couple of radeon fixes, and the vblank change + fix.
>> >
>> > I'm seeing some laptop screen flicker during boot and a while after I
>> > log in (it then seems to go away). It's my trusty old Macbook with
>> > i915 and Ubuntu 10.04. I see this in dmesg:
>> >
>> > [    1.782046] [drm] initialized overlay support
>> > [    1.782075] [drm] capturing error event; look for more information
>> > in /debug/dri/0/i915_error_state
>> > [    1.782889] render error detected, EIR: 0x00000010
>> > [    1.782933] page table error
>> > [    1.782970]   PGTBL_ER: 0x00000102
>> > [    1.783009] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck:
>> > 0x00000010, masking
>> > [    1.783063] render error detected, EIR: 0x00000010
>> > [    1.783106] page table error
>> > [    1.783143]   PGTBL_ER: 0x00000102
>> >
>> > I'm attaching the full dmesg, i915_error_state, and .config.
>
> Right, looks like we have an issue with setting up the hardware for
> KMS/GEM whilst it is still active. As we disable the outputs anyway for
> the KMS takeover, we can arrange to do so first and so prevent this bug.
> The side-effect will be that initial screen blanking will last a little
> bit longer.

Let me know if there's a patch/git tree to test. The flicker is
extremely annoying and I boot the machine often because it's my main
kernel development laptop.

>> I'm also seeing these errors now which seem to be new from 2.6.38-final:
>>
>> [  437.566022] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
>> purgeable buffer
>> [  437.566187] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
>> purgeable buffer
>> [  437.566232] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
>> purgeable buffer
>> [  437.566275] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
>> purgeable buffer
>> [  437.566318] [drm:i915_gem_mmap_gtt] *ERROR* Attempting to mmap a
>> purgeable buffer
>
> That's an old userspace bug, which so far no one has been able to
> reproduce on the upstream ddx.

Is it harmless? Why is the kernel complaining about it?

2011-03-29 07:49:57

by Chris Wilson

Subject: Re: [git pull] drm fixes for -rc1

On Tue, 29 Mar 2011 10:11:21 +0300, Pekka Enberg <[email protected]> wrote:
> On Mon, Mar 28, 2011 at 10:09 PM, Chris Wilson <[email protected]> wrote:
> Let me know if there's a patch/git tree to test. The flicker is
> extremely annoying and I boot the machine often because it's my main
> kernel development laptop.

I will let you know as soon as I have something ready for testing.

> > That's an old userspace bug, which so far no one has been able to
> > reproduce on the upstream ddx.
>
> Is it harmless? Why is the kernel complaining about it?

Being pragmatic, so that I can tell one EINVAL apart from another. And
it's only mostly harmless. Userspace is attempting to write to/read from
a buffer it has marked as being no longer required, so some rendering is
going amiss. And it does not rule out the possibility that at some point
it will catch the error later and result in a SIGBUS being sent to the
application (probably X).

However since it is not a kernel error nor is it fatal, that and a lot of
similar messages can be demoted to debug.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-03-29 10:46:41

by Chris Wilson

Subject: [PATCH] drm/i915: Disable all outputs early, before KMS takeover

If the outputs are active and continuing to access the GATT when we
teardown the PTEs, then there is a potential for us to hang the GPU.
The hang tends to be a PGTBL_ER with either an invalid host access or
an invalid display plane fetch.

Reported-by: Pekka Enberg <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
---
drivers/gpu/drm/i915/i915_dma.c | 31 ++++++++++++++++++++++---------
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/intel_display.c | 17 +++++++++++------
3 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..65d5adf 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1176,11 +1176,11 @@ static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
return can_switch;
}

-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_load_gem_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
unsigned long prealloc_size, gtt_size, mappable_size;
- int ret = 0;
+ int ret;

prealloc_size = dev_priv->mm.gtt->stolen_size;
gtt_size = dev_priv->mm.gtt->gtt_total_entries << PAGE_SHIFT;
@@ -1204,7 +1204,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
ret = i915_gem_init_ringbuffer(dev);
mutex_unlock(&dev->struct_mutex);
if (ret)
- goto out;
+ return ret;

/* Try to set up FBC with a reasonable compressed buffer size */
if (I915_HAS_FBC(dev) && i915_powersave) {
@@ -1222,6 +1222,13 @@ static int i915_load_modeset_init(struct drm_device *dev)

/* Allow hardware batchbuffers unless told otherwise. */
dev_priv->allow_batchbuffer = 1;
+ return 0;
+}
+
+static int i915_load_modeset_init(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ int ret;

ret = intel_parse_bios(dev);
if (ret)
@@ -1236,7 +1243,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
*/
ret = vga_client_register(dev->pdev, dev, NULL, i915_vga_set_decode);
if (ret && ret != -ENODEV)
- goto cleanup_ringbuffer;
+ goto out;

intel_register_dsm_handler();

@@ -1257,13 +1264,19 @@ static int i915_load_modeset_init(struct drm_device *dev)
if (ret)
goto cleanup_vga_switcheroo;

+ ret = i915_load_gem_init(dev);
+ if (ret)
+ goto cleanup_irq;
+
+ intel_modeset_gem_init(dev);
+
/* Always safe in the mode setting case. */
/* FIXME: do pre/post-mode set stuff in core KMS code */
dev->vblank_disable_allowed = 1;

ret = intel_fbdev_init(dev);
if (ret)
- goto cleanup_irq;
+ goto cleanup_gem;

drm_kms_helper_poll_init(dev);

@@ -1272,16 +1285,16 @@ static int i915_load_modeset_init(struct drm_device *dev)

return 0;

+cleanup_gem:
+ mutex_lock(&dev->struct_mutex);
+ i915_gem_cleanup_ringbuffer(dev);
+ mutex_unlock(&dev->struct_mutex);
cleanup_irq:
drm_irq_uninstall(dev);
cleanup_vga_switcheroo:
vga_switcheroo_unregister_client(dev->pdev);
cleanup_vga_client:
vga_client_register(dev->pdev, NULL, NULL, NULL);
-cleanup_ringbuffer:
- mutex_lock(&dev->struct_mutex);
- i915_gem_cleanup_ringbuffer(dev);
- mutex_unlock(&dev->struct_mutex);
out:
return ret;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 359ddce..60ebd79 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,7 @@ static inline void intel_unregister_dsm_handler(void) { return; }

/* modesetting */
extern void intel_modeset_init(struct drm_device *dev);
+extern void intel_modeset_gem_init(struct drm_device *dev);
extern void intel_modeset_cleanup(struct drm_device *dev);
extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state);
extern void i8xx_disable_fbc(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..5c7385b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6497,6 +6497,9 @@ static void intel_setup_outputs(struct drm_device *dev)
}

intel_panel_setup_backlight(dev);
+
+ /* disable all the possible outputs/crtcs before entering KMS mode */
+ drm_helper_disable_unused_functions(dev);
}

static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb)
@@ -7432,13 +7435,12 @@ void intel_modeset_init(struct drm_device *dev)
intel_crtc_init(dev, i);
}

+ /* Just disable it once at startup */
+ i915_disable_vga(dev);
intel_setup_outputs(dev);

intel_enable_clock_gating(dev);

- /* Just disable it once at startup */
- i915_disable_vga(dev);
-
if (IS_IRONLAKE_M(dev)) {
ironlake_enable_drps(dev);
intel_init_emon(dev);
@@ -7447,12 +7449,15 @@ void intel_modeset_init(struct drm_device *dev)
if (IS_GEN6(dev))
gen6_enable_rps(dev_priv);

- if (IS_IRONLAKE_M(dev))
- ironlake_enable_rc6(dev);
-
INIT_WORK(&dev_priv->idle_work, intel_idle_update);
setup_timer(&dev_priv->idle_timer, intel_gpu_idle_timer,
(unsigned long)dev);
+}
+
+void intel_modeset_gem_init(struct drm_device *dev)
+{
+ if (IS_IRONLAKE_M(dev))
+ ironlake_enable_rc6(dev);

intel_setup_overlay(dev);
}
--
1.7.4.1
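
The label shuffle in the patch above follows the usual rule that error labels unwind in the reverse order of setup: once GEM init moves after IRQ installation, its cleanup label has to move in front of `cleanup_irq`. A minimal userspace sketch of that pattern, assuming three made-up stages (`vga`, `irq`, `gem`) and a hypothetical `load_init()` helper; none of these are i915 functions:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace of setup (+) and teardown (-) steps. */
static char trace[64];

static void note(const char *step)
{
	/* Guard against overflowing the illustrative trace buffer. */
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strcat(trace, step);
}

/*
 * Bring up three stages in order: vga, irq, gem.  fail_at selects the
 * step that fails (0 = none, 4 = a step after gem).  Each label undoes
 * only the stages that completed, newest first.
 */
static int load_init(int fail_at)
{
	int ret = -1;

	trace[0] = '\0';

	if (fail_at == 1)
		goto out;
	note("+vga ");

	if (fail_at == 2)
		goto cleanup_vga;
	note("+irq ");

	if (fail_at == 3)
		goto cleanup_irq;
	note("+gem ");

	if (fail_at == 4)
		goto cleanup_gem;

	return 0;

cleanup_gem:
	note("-gem ");
cleanup_irq:
	note("-irq ");
cleanup_vga:
	note("-vga ");
out:
	return ret;
}
```

Calling `load_init(4)` in this sketch tears down gem, then irq, then vga, which mirrors where the patch places `cleanup_gem` relative to `cleanup_irq`.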

2011-03-29 12:23:16

by Chris Wilson

Subject: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

Required so that we don't obliterate the queue if initialising the
rings after the global IRQ handler is installed.

Signed-off-by: Chris Wilson <[email protected]>
---

This patch is required in conjunction with the first to prevent an oops
the first time we try to use i915_wait_request (i.e. when starting X).
-Chris

---
drivers/gpu/drm/i915/i915_irq.c | 6 ------
drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
2 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 188b497..46ccfc8 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1688,12 +1688,6 @@ int i915_driver_irq_postinstall(struct drm_device *dev)
u32 enable_mask = I915_INTERRUPT_ENABLE_FIX | I915_INTERRUPT_ENABLE_VAR;
u32 error_mask;

- DRM_INIT_WAITQUEUE(&dev_priv->ring[RCS].irq_queue);
- if (HAS_BSD(dev))
- DRM_INIT_WAITQUEUE(&dev_priv->ring[VCS].irq_queue);
- if (HAS_BLT(dev))
- DRM_INIT_WAITQUEUE(&dev_priv->ring[BCS].irq_queue);
-
dev_priv->vblank_pipe = DRM_I915_VBLANK_PIPE_A | DRM_I915_VBLANK_PIPE_B;

if (HAS_PCH_SPLIT(dev))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e9e6f71..884556d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -800,6 +800,7 @@ int intel_init_ring_buffer(struct drm_device *dev,
INIT_LIST_HEAD(&ring->request_list);
INIT_LIST_HEAD(&ring->gpu_write_list);

+ init_waitqueue_head(&ring->irq_queue);
spin_lock_init(&ring->irq_lock);
ring->irq_mask = ~0;

--
1.7.4.1
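
The thread does not spell out how the queue gets "obliterated". One plausible mechanism, assumed here purely for illustration, is that ring initialisation overwrites the whole ring structure after the IRQ hook has already set up the queue head inside it. A userspace sketch with a hand-rolled list standing in for the wait queue (all names hypothetical, no locking):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal list head, loosely modelled on a wait queue head. */
struct list {
	struct list *next, *prev;
};

struct ring {
	const char *name;
	struct list irq_queue;
};

/* Template with a zeroed queue head, copied in during ring init. */
static const struct ring ring_template = { .name = "render" };

static void queue_init(struct list *head)
{
	head->next = head;
	head->prev = head;
}

static bool queue_usable(const struct list *head)
{
	return head->next != NULL && head->prev != NULL;
}

/* Stand-in for the first wait on the ring: it needs a valid queue head. */
static void wait_on(const struct ring *ring)
{
	assert(queue_usable(&ring->irq_queue));
}

/* Queue set up by the IRQ hook, ring initialised afterwards. */
static bool queue_survives_old_order(void)
{
	struct ring ring;

	queue_init(&ring.irq_queue);	/* IRQ postinstall */
	ring = ring_template;		/* later ring init wipes the head */
	return queue_usable(&ring.irq_queue);
}

/* Queue set up as part of ring init, as the patch does. */
static bool queue_survives_new_order(void)
{
	struct ring ring;

	ring = ring_template;
	queue_init(&ring.irq_queue);
	wait_on(&ring);			/* safe: head is valid */
	return queue_usable(&ring.irq_queue);
}
```

In the sketch the first ordering leaves a zeroed head behind, which is the kind of state that would oops on the first wait; initialising the head inside ring init removes the ordering dependency.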

2011-03-29 13:05:41

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, Mar 29, 2011 at 3:23 PM, Chris Wilson <[email protected]> wrote:
> Required so that we don't obliterate the queue if initialising the
> rings after the global IRQ handler is installed.
>
> Signed-off-by: Chris Wilson <[email protected]>

I applied both of the patches on top of yesterday's git HEAD and I just
get a blank screen after GRUB. No serial or net console here. Do you
want me to try just one of the patches or turn on some debugging
options?

Pekka

2011-03-29 13:22:41

by Chris Wilson

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, 29 Mar 2011 16:05:35 +0300, Pekka Enberg <[email protected]> wrote:
> On Tue, Mar 29, 2011 at 3:23 PM, Chris Wilson <[email protected]> wrote:
> > Required so that we don't obliterate the queue if initialising the
> > rings after the global IRQ handler is installed.
> >
> > Signed-off-by: Chris Wilson <[email protected]>
>
> I applied both of the patches on top of yesterday's git HEAD and I just
> get a blank screen after GRUB. No serial or net console here. Do you
> want me to try just one of the patches or turn on some debugging
> options?

That was the unspoken side-effect: if we fail to load the i915 module after
disabling the outputs, then we would be left with a blank screen.

If you can ssh in and retrieve the dmesg, then it should at least give a
reason...
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-03-29 13:39:55

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, Mar 29, 2011 at 4:22 PM, Chris Wilson <[email protected]> wrote:
> On Tue, 29 Mar 2011 16:05:35 +0300, Pekka Enberg <[email protected]> wrote:
>> On Tue, Mar 29, 2011 at 3:23 PM, Chris Wilson <[email protected]> wrote:
>> > Required so that we don't obliterate the queue if initialising the
>> > rings after the global IRQ handler is installed.
>> >
>> > Signed-off-by: Chris Wilson <[email protected]>
>>
>> I applied both of the patches on top of yesterday's git HEAD and I just
>> get a blank screen after GRUB. No serial or net console here. Do you
>> want me to try just one of the patches or turn on some debugging
>> options?
>
> That was the unspoken side-effect: if we fail to load the i915 module after
> disabling the outputs, then we would be left with a blank screen.
>
> If you can ssh in and retrieve the dmesg, then it should at least give a
> reason...

I have

CONFIG_DRM_I915=y

so there are no modules involved. I'll see if I can ssh to the box.

2011-03-29 14:22:16

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, Mar 29, 2011 at 4:39 PM, Pekka Enberg <[email protected]> wrote:
> On Tue, Mar 29, 2011 at 4:22 PM, Chris Wilson <[email protected]> wrote:
>> On Tue, 29 Mar 2011 16:05:35 +0300, Pekka Enberg <[email protected]> wrote:
>>> On Tue, Mar 29, 2011 at 3:23 PM, Chris Wilson <[email protected]> wrote:
>>> > Required so that we don't obliterate the queue if initialising the
>>> > rings after the global IRQ handler is installed.
>>> >
>>> > Signed-off-by: Chris Wilson <[email protected]>
>>>
>>> I applied both of the patches on top of yesterday's git HEAD and I just
>>> get a blank screen after GRUB. No serial or net console here. Do you
>>> want me to try just one of the patches or turn on some debugging
>>> options?
>>
>> That was the unspoken side-effect: if we fail to load the i915 module after
>> disabling the outputs, then we would be left with a blank screen.
>>
>> If you can ssh in and retrieve the dmesg, then it should at least give a
>> reason...
>
> I have
>
> CONFIG_DRM_I915=y
>
> so there are no modules involved. I'll see if I can ssh to the box.

No ssh - the box seems to be dead.

2011-03-29 14:32:07

by Chris Wilson

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, 29 Mar 2011 17:22:13 +0300, Pekka Enberg <[email protected]> wrote:
> No ssh - the box seems to be dead.

But now you have a machine with which to listen out for the netconsole
scream of anguish...
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-03-29 15:21:52

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Move the irq wait queue initialisation into the ring init

On Tue, Mar 29, 2011 at 5:32 PM, Chris Wilson <[email protected]> wrote:
> On Tue, 29 Mar 2011 17:22:13 +0300, Pekka Enberg <[email protected]> wrote:
>> No ssh - the box seems to be dead.
>
> But now you have a machine with which to listen out for the netconsole
> scream of anguish...

OK, this gets interesting. With

netconsole=... loglevel=7

I am not able to reproduce the bug. With

netconsole=... loglevel=6

I am able to reproduce but nothing is printed to netconsole. I guess
the kernel dies before it's able to set it up.

Pekka
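
For context on the two boots above: the kernel writes a message to the console only when the message's level is numerically lower than the console loglevel, so `loglevel=7` lets informational (level 6) messages through and `loglevel=6` does not. The extra console traffic changes boot timing, which is consistent with a timing-sensitive bug but does not prove one. A small sketch of the filter, with illustrative enum names rather than the kernel's own macros:

```c
#include <assert.h>
#include <stdbool.h>

/* Message severities, 0 (most severe) to 7 (debug). Names are illustrative. */
enum {
	LVL_EMERG, LVL_ALERT, LVL_CRIT, LVL_ERR,
	LVL_WARNING, LVL_NOTICE, LVL_INFO, LVL_DEBUG
};

/* A message reaches the console only if its level is below the loglevel. */
static bool reaches_console(int msg_level, int console_loglevel)
{
	assert(msg_level >= LVL_EMERG && msg_level <= LVL_DEBUG);
	return msg_level < console_loglevel;
}
```

With these definitions `reaches_console(LVL_INFO, 7)` is true and `reaches_console(LVL_INFO, 6)` is false, which is the only difference between the two command lines.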

2011-04-01 11:44:50

by Daniel Vetter

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Mar 29, 2011 at 11:46:29AM +0100, Chris Wilson wrote:
> If the outputs are active and continuing to access the GATT when we
> teardown the PTEs, then there is a potential for us to hang the GPU.
> The hang tends to be a PGTBL_ER with either an invalid host access or
> an invalid display plane fetch.

This patch seems to fix resume flakiness (that recently developed
complete reliability in hanging the gpu) on my i855gm. Captured
error_states look as described here. Latest -staging merged into latest
-linus is now again fully reliable at s/r.

Tested-by: Daniel Vetter <[email protected]>
--
Daniel Vetter
Mail: [email protected]
Mobile: +41 (0)79 365 57 48

2011-04-01 11:51:44

by Pekka Enberg

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Fri, Apr 1, 2011 at 2:44 PM, Daniel Vetter <[email protected]> wrote:
> On Tue, Mar 29, 2011 at 11:46:29AM +0100, Chris Wilson wrote:
>> If the outputs are active and continuing to access the GATT when we
>> teardown the PTEs, then there is a potential for us to hang the GPU.
>> The hang tends to be a PGTBL_ER with either an invalid host access or
>> an invalid display plane fetch.
>
> This patch seems to fix resume flakiness (that recently developed
> complete reliability in hanging the gpu) on my i855gm. Captured
> error_states look as described here. Latest -staging merged into latest
> -linus is now again fully reliable at s/r.
>
> Tested-by: Daniel Vetter <[email protected]>

Unfortunately I get a blank screen after boot:

Nacked-by: Pekka Enberg <[email protected]>

2011-04-05 10:21:12

by Tomas Winkler

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
> On Fri, Apr 1, 2011 at 2:44 PM, Daniel Vetter <[email protected]> wrote:
>> On Tue, Mar 29, 2011 at 11:46:29AM +0100, Chris Wilson wrote:
>>> If the outputs are active and continuing to access the GATT when we
>>> teardown the PTEs, then there is a potential for us to hang the GPU.
>>> The hang tends to be a PGTBL_ER with either an invalid host access or
>>> an invalid display plane fetch.
>>
>> This patch seems to fix resume flakiness (that recently developed
>> complete reliability in hanging the gpu) on my i855gm. Captured
>> error_states look as described here. Latest -staging merged into latest
>> -linus is now again fully reliable at s/r.
>>
>> Tested-by: Daniel Vetter <[email protected]>
>
> Unfortunately I get a blank screen after boot:
> Nacked-by: Pekka Enberg <[email protected]>

Not sure this is related, but when I enable DRM_I915_KMS=y I get
stuck after boot too. When KMS is disabled I can at least get to the
console (no graphics).
This is with kernel 2.6.39-rc1. It worked fine with 2.6.38. I don't
have much time to bisect and reboot. Shall I try to pull drm-fixes for
rc2 or try this patch?


lspci -vv
00:02.0 VGA compatible controller: Intel Corporation Core Processor
Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
Subsystem: Intel Corporation Device 0036
Flags: bus master, fast devsel, latency 0, IRQ 42
Memory at fe000000 (64-bit, non-prefetchable) [size=4M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
I/O ports at f160 [size=8]
Expansion ROM at <unassigned> [disabled]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915

Thanks
Tomas

2011-04-05 10:30:53

by Chris Wilson

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <[email protected]> wrote:
> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
> > Unfortunately I get a blank screen after boot:
> > Nacked-by: Pekka Enberg <[email protected]>

But until you can tell me where it explodes on your system, we fix
issues on several other machines...

> Not sure this is related, but when I enable DRM_I915_KMS=y I get
> stuck after boot too. When KMS is disabled I can at least get to the
> console (no graphics).
> This is with kernel 2.6.39-rc1. It worked fine with 2.6.38. I don't
> have much time to bisect and reboot. Shall I try to pull drm-fixes for
> rc2 or try this patch?

Add drm.debug=0xe to your grub kernel parameters and attach the dmesg for
the failing boot. From that I should be able to recommend a course of
action.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
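
For readers unfamiliar with the parameter: `drm.debug` is a bitmask of message categories, so `0xe` sets bits 1 to 3 and leaves bit 0 (the noisy core category) off. The sketch below decodes such a mask; the macro names are placeholders, and the meaning attached to bit 3 in particular varies between kernel versions, so treat the comments as assumptions:

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder names for the category bits; not the kernel's own macros. */
#define DBG_CORE   0x01u	/* DRM core messages */
#define DBG_DRIVER 0x02u	/* driver messages */
#define DBG_KMS    0x04u	/* modesetting messages */
#define DBG_BIT3   0x08u	/* meaning depends on kernel version */

/* True if the single-bit category is enabled in the debug mask. */
static bool category_enabled(unsigned int mask, unsigned int category)
{
	/* category must be exactly one bit */
	assert(category != 0 && (category & (category - 1)) == 0);
	return (mask & category) != 0;
}
```

Under these assumptions `category_enabled(0xe, DBG_KMS)` is true while `category_enabled(0xe, DBG_CORE)` is false, i.e. driver and modesetting output without the core chatter.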

2011-04-05 10:37:21

by Pekka Enberg

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

Hi Chris!

On Tue, Apr 5, 2011 at 1:30 PM, Chris Wilson <[email protected]> wrote:
> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <[email protected]> wrote:
>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
>> > Unfortunately I get a blank screen after boot:
>> > Nacked-by: Pekka Enberg <[email protected]>
>
> But until you can tell me where it explodes on your system, we fix
> issues on several other machines...

Oh, that's nice, you first made the damn thing flicker in 2.6.39-rc1
and now you're fixing it for others by giving me a blank screen after
boot?

I guess I don't need to tell you that I am not at all happy especially
since you keep breaking i915 in almost every damn release!

>> Not sure this is related, but when I enable DRM_I915_KMS=y I get
>> stuck after boot too. When KMS is disabled I can at least get to the
>> console (no graphics).
>> This is with kernel 2.6.39-rc1. It worked fine with 2.6.38. I don't
>> have much time to bisect and reboot. Shall I try to pull drm-fixes for
>> rc2 or try this patch?
>
> Add drm.debug=0xe to your grub kernel parameters and attach the dmesg for
> the failing boot. From that I should be able to recommend a course of
> action.

OK, I'll try that tonight.

Pekka

2011-04-05 11:55:14

by Tomas Winkler

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Apr 5, 2011 at 1:37 PM, Pekka Enberg <[email protected]> wrote:
> Hi Chris!
>
> On Tue, Apr 5, 2011 at 1:30 PM, Chris Wilson <[email protected]> wrote:
>> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <[email protected]> wrote:
>>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
>>> > Unfortunately I get a blank screen after boot:
>>> > Nacked-by: Pekka Enberg <[email protected]>
>>
>> But until you can tell me where it explodes on your system, we fix
>> issues on several other machines...
>
> Oh, that's nice, you first made the damn thing flicker in 2.6.39-rc1
> and now you're fixing it for others by giving me a blank screen after
> boot?
>
> I guess I don't need to tell you that I am not at all happy especially
> since you keep breaking i915 in almost every damn release!
>
>>> Not sure this is related, but when I enable DRM_I915_KMS=y I get
>>> stuck after boot too. When KMS is disabled I can at least get to the
>>> console (no graphics).
>>> This is with kernel 2.6.39-rc1. It worked fine with 2.6.38. I don't
>>> have much time to bisect and reboot. Shall I try to pull drm-fixes for
>>> rc2 or try this patch?

merging drm-fixes for rc2 definitely helped booting.

>> Add drm.debug=0xe to your grub kernel parameters and attach the dmesg for
>> the failing boot. From that I should be able to recommend a course of
>> action.

There are some error messages

[ 4.311738] [drm:intel_crt_init], pch crt adpa set to 0xf40000
[ 4.311792] [drm:intel_dp_i2c_init], i2c_init DPDDC-C
[ 4.312300] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
[ 4.312302] [drm:intel_dp_i2c_aux_ch], aux_ch failed -110
[ 4.312814] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
[ 4.312815] [drm:intel_dp_i2c_aux_ch], aux_ch failed -110
[ 4.312848] [drm:intel_dp_i2c_init], i2c_init DPDDC-D
[ 4.313357] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
[ 4.313358] [drm:intel_dp_i2c_aux_ch], aux_ch failed -110
[ 4.313866] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x5145003e
[ 4.313868] [drm:intel_dp_i2c_aux_ch], aux_ch failed -110
[ 4.313886] [drm:intel_panel_get_backlight], get backlight PWM = 0
[ 4.313891] vgaarb: device changed decodes:
PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 4.314378] [drm:ironlake_crtc_dpms], crtc 0/0 dpms off
[ 4.314381] [drm:i915_get_vblank_timestamp], crtc 0 is disabled
[ 4.343908] [drm:intel_update_fbc],
[ 4.343910] [drm:ironlake_crtc_dpms], crtc 1/1 dpms off
[ 4.343912] [drm:gm45_get_vblank_counter], trying to get vblank
count for disabled pipe B
[ 4.343914] [drm:i915_get_vblank_timestamp], crtc 1 is disabled
[ 4.343916] [drm:gm45_get_vblank_counter], trying to get vblank
count for disabled pipe B
[ 4.344555] [drm:intel_update_fbc],
[ 4.344559] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:5:VGA-1]
[ 4.344562] [drm:intel_ironlake_crt_detect_hotplug], trigger
hotplug detect cycle: adpa=0xf40000
[ 4.354477] [drm:intel_ironlake_crt_detect_hotplug], ironlake
hotplug adpa=0xf40000, result 0
[ 4.354483] [drm:intel_crt_detect], CRT not detected via hotplug
[ 4.354488] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:5:VGA-1] disconnected
[ 4.354493] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:8:HDMI-A-1]
[ 4.364484] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:8:HDMI-A-1] disconnected
[ 4.364491] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:12:HDMI-A-2]
[ 4.416299] [drm] GMBUS timed out, falling back to bit banging on
pin 7 [i915 gmbus dpd]
[ 4.537460] [drm:drm_helper_probe_single_connector_modes],
[CONNECTOR:12:HDMI-A-2] probed modes :
[ 4.537462] [drm:drm_mode_debug_printmodeline], Modeline
21:"1280x1024" 60 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48
0x5

Tomas

2011-04-05 14:12:00

by Pekka Enberg

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Apr 5, 2011 at 1:37 PM, Pekka Enberg <[email protected]> wrote:
> Hi Chris!
>
> On Tue, Apr 5, 2011 at 1:30 PM, Chris Wilson <[email protected]> wrote:
>> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <[email protected]> wrote:
>>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
>>> > Unfortunately I get a blank screen after boot:
>>> > Nacked-by: Pekka Enberg <[email protected]>
>>
>> But until you can tell me where it explodes on your system, we fix
>> issues on several other machines...
>
> Oh, that's nice, you first made the damn thing flicker in 2.6.39-rc1
> and now you're fixing it for others by giving me a blank screen after
> boot?

I compiled the i915 drm driver as a module and I now see this with netconsole:

[ 4.861272] i8042: No controller found
[ 5.260688] Unable to load isight firmware
[ 7.120150] usbhid 5-1:1.0: couldn't find an input interrupt endpoint

[ 9.310010]
[ 9.310010] Pid: 3757, comm: sh Not tainted 2.6.38+ #18 Apple Inc.
MacBook2,1/Mac-F4208CAA
[ 9.310010] RIP: 0010:[<0000000000000000>] [< (null)>]
(null)
[ 9.310010] RSP: 0018:ffff88003de03d80 EFLAGS: 00010096
[ 9.310010] RAX: 000000000000006d RBX: ffff88002e6be000 RCX: 000000000003ffff
[ 9.310010] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff88002e6be030
[ 9.310010] RBP: ffff88003de03de8 R08: 0000000000000000 R09: 0000000000000000
[ 9.310010] R10: 0000000000000006 R11: 0000000000000003 R12: ffff88002e5db800
[ 9.310010] R13: ffff88002e6be82c R14: ffff88003ad24200 R15: 0000000000000000
[ 9.310010] FS: 00007f60f21aa700(0000) GS:ffff88003de00000(0000)
knlGS:0000000000000000
[ 9.310010] [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
[ 9.310010] [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
[ 9.310010] [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
[ 9.310010] [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]
[ 9.310010] [<ffffffff810e150d>] handle_irq_event_percpu+0x5d/0x210
[ 9.310010] [<ffffffff8108c56c>] ? __do_softirq+0x11c/0x200
[ 9.310010] [<ffffffff8108c56c>] ? __do_softirq+0x11c/0x200
[ 9.310010] [<ffffffff810e173a>] handle_irq_event+0x4a/0x80
[ 9.310010] [<ffffffff810e42c1>] handle_fasteoi_irq+0x51/0xc0
[ 9.310010] [<ffffffff8103e3a2>] handle_irq+0x22/0x30
[ 9.310010] [<ffffffff81698e0d>] do_IRQ+0x5d/0xe0
[ 9.310010] [<ffffffff8168fad3>] common_interrupt+0x13/0x13
[ 9.310010] <EOI> [ 9.310010] Call Trace:
[ 9.310010] <IRQ> [<ffffffff8168cd0c>] panic+0x91/0x19e
[ 9.310010] [<ffffffff816909ea>] oops_end+0xea/0xf0
[ 9.310010] [<ffffffff8106afbb>] no_context+0xfb/0x260
[ 9.310010] [<ffffffff8106b245>] __bad_area_nosemaphore+0x125/0x1e0
[ 9.310010] [<ffffffff8106b313>] bad_area_nosemaphore+0x13/0x20
[ 9.310010] [<ffffffff816930c0>] do_page_fault+0x310/0x4c0
[ 9.310010] [<ffffffff810ac06f>] ? up+0x2f/0x50
[ 9.310010] [<ffffffff8108652f>] ? console_unlock+0x17f/0x1d0
[ 9.310010] [<ffffffff8168fd25>] page_fault+0x25/0x30
[ 9.310010] [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
[ 9.310010] [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
[ 9.310010] [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
[ 9.310010] [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]

This is the same pre-2.6.39-rc1 kernel with the two patches applied.
I'll try latest Linus master next to see if the same problem triggers.

Pekka

2011-04-05 14:28:00

by Chris Wilson

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, 5 Apr 2011 17:11:56 +0300, Pekka Enberg <[email protected]> wrote:
> [ 9.310010] <IRQ> [<ffffffff8168cd0c>] panic+0x91/0x19e
> [ 9.310010] [<ffffffff816909ea>] oops_end+0xea/0xf0
> [ 9.310010] [<ffffffff8106afbb>] no_context+0xfb/0x260
> [ 9.310010] [<ffffffff8106b245>] __bad_area_nosemaphore+0x125/0x1e0
> [ 9.310010] [<ffffffff8106b313>] bad_area_nosemaphore+0x13/0x20
> [ 9.310010] [<ffffffff816930c0>] do_page_fault+0x310/0x4c0
> [ 9.310010] [<ffffffff810ac06f>] ? up+0x2f/0x50
> [ 9.310010] [<ffffffff8108652f>] ? console_unlock+0x17f/0x1d0
> [ 9.310010] [<ffffffff8168fd25>] page_fault+0x25/0x30
> [ 9.310010] [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
> [ 9.310010] [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
> [ 9.310010] [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
> [ 9.310010] [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]
>
> This is the same pre-2.6.39-rc1 kernel with the two patches applied.
> I'll try latest Linus master next to see if the same problem triggers.

Hmm. Looks like we don't prevent the PGTBL_ER with those patches (or we
provoke another), and trigger the error before we can handle it.

Double ungood. Thanks,
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-04-05 14:31:12

by Pekka Enberg

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Apr 5, 2011 at 5:27 PM, Chris Wilson <[email protected]> wrote:
> On Tue, 5 Apr 2011 17:11:56 +0300, Pekka Enberg <[email protected]> wrote:
>> [ 9.310010] <IRQ> [<ffffffff8168cd0c>] panic+0x91/0x19e
>> [ 9.310010] [<ffffffff816909ea>] oops_end+0xea/0xf0
>> [ 9.310010] [<ffffffff8106afbb>] no_context+0xfb/0x260
>> [ 9.310010] [<ffffffff8106b245>] __bad_area_nosemaphore+0x125/0x1e0
>> [ 9.310010] [<ffffffff8106b313>] bad_area_nosemaphore+0x13/0x20
>> [ 9.310010] [<ffffffff816930c0>] do_page_fault+0x310/0x4c0
>> [ 9.310010] [<ffffffff810ac06f>] ? up+0x2f/0x50
>> [ 9.310010] [<ffffffff8108652f>] ? console_unlock+0x17f/0x1d0
>> [ 9.310010] [<ffffffff8168fd25>] page_fault+0x25/0x30
>> [ 9.310010] [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
>> [ 9.310010] [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
>> [ 9.310010] [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
>> [ 9.310010] [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]
>>
>> This is the same pre-2.6.39-rc1 kernel with the two patches applied.
>> I'll try latest Linus master next to see if the same problem triggers.
>
> Hmm. Looks like we don't prevent the PGTBL_ER with those patches (or we
> provoke another), and trigger the error before we can handle it.

I'm guessing it's the same PGTBL_ER I've seen for the past two-three
kernel releases during boot. It seems to be harmless otherwise.

2011-04-05 14:34:28

by Chris Wilson

Subject: [PATCH] drm/i915: Disable all outputs early, before KMS takeover

If the outputs are active and continuing to access the GATT when we
teardown the PTEs, then there is a potential for us to hang the GPU.
The hang tends to be a PGTBL_ER with either an invalid host access or
an invalid display plane fetch.

v2: Reorder IRQ initialisation to defer until after GEM is setup.

Reported-by: Pekka Enberg <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Tested-by: Daniel Vetter <[email protected]> (855GM)
---
drivers/gpu/drm/i915/i915_dma.c | 31 ++++++++++++++++++++++---------
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/intel_display.c | 17 +++++++++++------
drivers/gpu/drm/i915/intel_dp.c | 9 +++++++++
4 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..b28e023 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1176,11 +1176,11 @@ static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
return can_switch;
}

-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_load_gem_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
unsigned long prealloc_size, gtt_size, mappable_size;
- int ret = 0;
+ int ret;

prealloc_size = dev_priv->mm.gtt->stolen_size;
gtt_size = dev_priv->mm.gtt->gtt_total_entries << PAGE_SHIFT;
@@ -1204,7 +1204,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
ret = i915_gem_init_ringbuffer(dev);
mutex_unlock(&dev->struct_mutex);
if (ret)
- goto out;
+ return ret;

/* Try to set up FBC with a reasonable compressed buffer size */
if (I915_HAS_FBC(dev) && i915_powersave) {
@@ -1222,6 +1222,13 @@ static int i915_load_modeset_init(struct drm_device *dev)

/* Allow hardware batchbuffers unless told otherwise. */
dev_priv->allow_batchbuffer = 1;
+ return 0;
+}
+
+static int i915_load_modeset_init(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ int ret;

ret = intel_parse_bios(dev);
if (ret)
@@ -1236,7 +1243,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
*/
ret = vga_client_register(dev->pdev, dev, NULL, i915_vga_set_decode);
if (ret && ret != -ENODEV)
- goto cleanup_ringbuffer;
+ goto out;

intel_register_dsm_handler();

@@ -1253,10 +1260,16 @@ static int i915_load_modeset_init(struct drm_device *dev)

intel_modeset_init(dev);

- ret = drm_irq_install(dev);
+ ret = i915_load_gem_init(dev);
if (ret)
goto cleanup_vga_switcheroo;

+ intel_modeset_gem_init(dev);
+
+ ret = drm_irq_install(dev);
+ if (ret)
+ goto cleanup_gem;
+
/* Always safe in the mode setting case. */
/* FIXME: do pre/post-mode set stuff in core KMS code */
dev->vblank_disable_allowed = 1;
@@ -1274,14 +1287,14 @@ static int i915_load_modeset_init(struct drm_device *dev)

cleanup_irq:
drm_irq_uninstall(dev);
+cleanup_gem:
+ mutex_lock(&dev->struct_mutex);
+ i915_gem_cleanup_ringbuffer(dev);
+ mutex_unlock(&dev->struct_mutex);
cleanup_vga_switcheroo:
vga_switcheroo_unregister_client(dev->pdev);
cleanup_vga_client:
vga_client_register(dev->pdev, NULL, NULL, NULL);
-cleanup_ringbuffer:
- mutex_lock(&dev->struct_mutex);
- i915_gem_cleanup_ringbuffer(dev);
- mutex_unlock(&dev->struct_mutex);
out:
return ret;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 359ddce..60ebd79 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,7 @@ static inline void intel_unregister_dsm_handler(void) { return; }

/* modesetting */
extern void intel_modeset_init(struct drm_device *dev);
+extern void intel_modeset_gem_init(struct drm_device *dev);
extern void intel_modeset_cleanup(struct drm_device *dev);
extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state);
extern void i8xx_disable_fbc(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..5c7385b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6497,6 +6497,9 @@ static void intel_setup_outputs(struct drm_device *dev)
}

intel_panel_setup_backlight(dev);
+
+ /* disable all the possible outputs/crtcs before entering KMS mode */
+ drm_helper_disable_unused_functions(dev);
}

static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb)
@@ -7432,13 +7435,12 @@ void intel_modeset_init(struct drm_device *dev)
intel_crtc_init(dev, i);
}

+ /* Just disable it once at startup */
+ i915_disable_vga(dev);
intel_setup_outputs(dev);

intel_enable_clock_gating(dev);

- /* Just disable it once at startup */
- i915_disable_vga(dev);
-
if (IS_IRONLAKE_M(dev)) {
ironlake_enable_drps(dev);
intel_init_emon(dev);
@@ -7447,12 +7449,15 @@ void intel_modeset_init(struct drm_device *dev)
if (IS_GEN6(dev))
gen6_enable_rps(dev_priv);

- if (IS_IRONLAKE_M(dev))
- ironlake_enable_rc6(dev);
-
INIT_WORK(&dev_priv->idle_work, intel_idle_update);
setup_timer(&dev_priv->idle_timer, intel_gpu_idle_timer,
(unsigned long)dev);
+}
+
+void intel_modeset_gem_init(struct drm_device *dev)
+{
+ if (IS_IRONLAKE_M(dev))
+ ironlake_enable_rc6(dev);

intel_setup_overlay(dev);
}
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 0daefca..6caeabb 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -218,7 +218,16 @@ intel_dp_mode_valid(struct drm_connector *connector,
if (!is_edp(intel_dp) &&
(intel_dp_link_required(connector->dev, intel_dp, mode->clock)
> intel_dp_max_data_rate(max_link_clock, max_lanes)))
+ {
+ DRM_DEBUG_KMS("mode exceeds DP bandwidth: required=%d, max=%d [clock=%d, lanes=%d]\n",
+ intel_dp_link_required(connector->dev,
+ intel_dp,
+ mode->clock),
+ intel_dp_max_data_rate(max_link_clock,
+ max_lanes),
+ max_link_clock, max_lanes);
return MODE_CLOCK_HIGH;
+ }

if (mode->clock < 10000)
return MODE_CLOCK_LOW;
--
1.7.4.1
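
The label shuffle in i915_load_modeset_init() above follows the usual kernel convention that error labels unwind in the reverse order of setup: once GEM is initialised after the VGA/switcheroo registration and before drm_irq_install(), its cleanup label has to sit between the IRQ and VGA labels. The following is only a minimal userspace sketch of that pattern. The step names, the step()/note() helpers and the fail_at parameter are hypothetical stand-ins, not the driver's functions.

```c
#include <assert.h>
#include <string.h>

/* Records the order in which the (hypothetical) steps ran. */
static char trace[128];

static void note(const char *name)
{
    assert(strlen(trace) + strlen(name) + 2 < sizeof(trace));
    if (trace[0])
        strcat(trace, " ");
    strcat(trace, name);
}

/* Stand-in for one init step; returns -1 when asked to fail. */
static int step(const char *name, int fail)
{
    note(name);
    return fail ? -1 : 0;
}

/* fail_at: 0 = nothing fails, 1 = vga, 2 = gem, 3 = irq */
static int load_modeset_init(int fail_at)
{
    int ret;

    trace[0] = '\0';

    ret = step("vga", fail_at == 1);
    if (ret)
        goto out;               /* nothing to undo yet */

    ret = step("gem", fail_at == 2);
    if (ret)
        goto cleanup_vga;       /* undo only what came before */

    ret = step("irq", fail_at == 3);
    if (ret)
        goto cleanup_gem;

    return 0;

    /* Labels appear in reverse order of setup and fall through. */
cleanup_gem:
    note("undo-gem");
cleanup_vga:
    note("undo-vga");
out:
    return ret;
}
```

Calling load_modeset_init(3) in this sketch leaves "vga gem irq undo-gem undo-vga" in trace: a failing IRQ step tears down GEM first and the VGA registration last, which mirrors the cleanup_gem placement in the patch.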

2011-04-05 14:42:37

by Linus Torvalds

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Apr 5, 2011 at 3:30 AM, Chris Wilson <[email protected]> wrote:
> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <[email protected]> wrote:
>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <[email protected]> wrote:
>> > Unfortunately I get a blank screen after boot:
>> > Nacked-by: Pekka Enberg <[email protected]>
>
> But until you can tell me where it explodes on your system, we fix
> issues on several other machines...

NO.

Chris, you need to understand the issue of "NO REGRESSIONS".

It's a very simple rule: it DOES NOT MATTER ONE WHIT how many machines
you fix. You never ever regress. Patches that cause regressions are
reverted.

There are multiple reasons for that rule, but the basic one ends up
being very simple: you only _think_ you fix more machines than you
break. Why? Because the people who test out your patches are the
"active" people - and often predominantly the active people who have
problems. In contrast, the people for whom things already work aren't
even testing your patches in the first place. Then, six months later,
when they update to a new Fedora version, things suddenly don't work
for them, and it turns out that yes, you fixed ten active testers, but
you broke a thousand random people.

So even _one_ person saying "this is a regression" is a total blocker.
Really. It's that simple.

YOU NEVER EVER BREAK WORKING MACHINES.

Seriously. We had this for years in ACPI-land and with suspend/resume
with "one step forward, two steps back", and nobody ever knew if we
were doing any real progress at all, because machines that had working
suspend/resume one kernel version would be broken again the next.
There was no real pattern of improvement, there was just a random
pattern of "things get fixed on one machine, and break on another".

We introduced the "no regressions" rule, and things got seriously
better. Suddenly things started getting _reliably_ better.

The whole situation with i915 has been pretty damn random lately, and
you really really need to understand that this is simply not how it's
done. Your cavalier attitude ("but it fixes things for others") is
absolutely not acceptable.

Keith Cc'd, because that patch had better not show up in my tree.

Linus

2011-04-05 15:11:39

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Disable all outputs early, before KMS takeover

Hi Chris,

On Tue, Apr 5, 2011 at 5:34 PM, Chris Wilson <[email protected]> wrote:
> If the outputs are active and continuing to access the GATT when we
> teardown the PTEs, then there is a potential for us to hang the GPU.
> The hang tends to be a PGTBL_ER with either an invalid host access or
> an invalid display plane fetch.
>
> v2: Reorder IRQ initialisation to defer until after GEM is setup.
>
> Reported-by: Pekka Enberg <[email protected]>
> Signed-off-by: Chris Wilson <[email protected]>
> Tested-by: Daniel Vetter <[email protected]> (855GM)

I no longer get a blank screen after boot but flicker got more
aggressive during boot (it calms down after I've logged in). I see
tons of these in dmesg that don't appear with 2.6.39-rc1:

[ 10.175843] [drm:intel_update_fbc],
[ 10.183100] [drm:i915_driver_irq_handler], pipe A underrun
[ 10.185085] [drm:i915_driver_irq_handler], pipe A underrun
[ 10.186082] [drm:i915_driver_irq_handler], pipe A underrun
[ 10.187087] [drm:i915_driver_irq_handler], pipe A underrun
[ 10.189082] [drm:i915_driver_irq_handler], pipe A underrun
[ 10.190085] [drm:i915_driver_irq_handler], pipe A underrun

I've attached the full dmesg.

Pekka


Attachments:
dmesg.gz (29.97 kB)

2011-04-05 15:12:08

by Chris Wilson

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds <[email protected]> wrote:
> NO.

And you seem to have missed that the patch has sat around waiting for Pekka
to give me some information on the failure on his machine.

I have been poking as many people as I could to get it reviewed and tested
on more machines so that someone else could either spot the problem or
capture the oops.

I was being facetious in order to get a response. Thanks for playing,
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-04-05 15:33:20

by Chris Wilson

Subject: Re: [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, 5 Apr 2011 18:11:37 +0300, Pekka Enberg <[email protected]> wrote:
> Hi Chris,
>
> On Tue, Apr 5, 2011 at 5:34 PM, Chris Wilson <[email protected]> wrote:
> > If the outputs are active and continuing to access the GATT when we
> > teardown the PTEs, then there is a potential for us to hang the GPU.
> > The hang tends to be a PGTBL_ER with either an invalid host access or
> > an invalid display plane fetch.
> >
> > v2: Reorder IRQ initialisation to defer until after GEM is setup.
> >
> > Reported-by: Pekka Enberg <[email protected]>
> > Signed-off-by: Chris Wilson <[email protected]>
> > Tested-by: Daniel Vetter <[email protected]> (855GM)
>
> I no longer get a blank screen after boot but flicker got more
> aggressive during boot (it calms down after I've logged in). I see
> tons of these in dmesg that don't appear with 2.6.39-rc1:

Well, the PGTBL_ER is still there. I'm thinking it might be worth a check to
see if that is asserted even before we start...

> [ 10.175843] [drm:intel_update_fbc],
> [ 10.183100] [drm:i915_driver_irq_handler], pipe A underrun
> [ 10.185085] [drm:i915_driver_irq_handler], pipe A underrun
> [ 10.186082] [drm:i915_driver_irq_handler], pipe A underrun
> [ 10.187087] [drm:i915_driver_irq_handler], pipe A underrun
> [ 10.189082] [drm:i915_driver_irq_handler], pipe A underrun
> [ 10.190085] [drm:i915_driver_irq_handler], pipe A underrun

If I'm understanding the dmesg correctly, then these start even before we
setup the first crtc.

Whether that means we're not completely disabling all outputs or that we
set a register incorrectly I don't know. Comparing an intel_reg_dumper
with and without the patch applied might give a clue if it is a register
that is set differently due to the reordering.
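
A rough sketch of that comparison, for anyone who wants to automate it: the helper below reports registers whose value differs between two dumps. It assumes each dump line looks like "NAME: 0xVALUE (description)", which may not match every version of intel_reg_dumper. The names struct reg, parse_reg() and diff_dumps() are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct reg {
    char name[32];
    unsigned long value;
};

/*
 * Parse one dump line of the assumed form "  NAME: 0xVALUE (...)".
 * Returns 1 when both the name and the hex value were found.
 */
static int parse_reg(const char *line, struct reg *out)
{
    return sscanf(line, " %31[^:]: %lx", out->name, &out->value) == 2;
}

/*
 * Print every register present in both dumps whose value changed.
 * Returns the number of changed registers.
 */
static int diff_dumps(const char *const before[], size_t nb,
                      const char *const after[], size_t na)
{
    int changed = 0;

    for (size_t i = 0; i < nb; i++) {
        struct reg a, b;

        if (!parse_reg(before[i], &a))
            continue;           /* skip headers and blank lines */

        for (size_t j = 0; j < na; j++) {
            if (!parse_reg(after[j], &b) || strcmp(a.name, b.name))
                continue;
            if (a.value != b.value) {
                printf("%s: 0x%08lx -> 0x%08lx\n",
                       a.name, a.value, b.value);
                changed++;
            }
            break;              /* first match by name is enough */
        }
    }
    return changed;
}
```

Feeding it the lines of one dump taken without the patch and one taken with it would narrow the search to the registers that the reordering actually touches. Running plain diff on the two files does the same job when the dumps list registers in the same order.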

The other question is of course whether you see those in 2.6.39-rc1 as
well... Probably not since they will correspond with the increased
flicker.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2011-04-05 15:35:23

by Pekka Enberg

Subject: Re: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds
<[email protected]> wrote:
>> NO.

On Tue, Apr 5, 2011 at 6:12 PM, Chris Wilson <[email protected]> wrote:
> And you seem to have missed that the patch has sat around waiting for Pekka
> to give me some information on the failure on his machine.

No, it wasn't. I told you I wasn't able to capture anything useful and
that 2.6.38 didn't have the problem.

On Tue, Apr 5, 2011 at 6:12 PM, Chris Wilson <[email protected]> wrote:
> I was being facetious in order to get a response. Thanks for playing,

I am happy to test patches but I don't appreciate being blackmailed
into debugging your shit.

Pekka

2011-04-05 15:44:30

by Pekka Enberg

Subject: Re: [PATCH] drm/i915: Disable all outputs early, before KMS takeover

On Tue, Apr 5, 2011 at 6:32 PM, Chris Wilson <[email protected]> wrote:
> On Tue, 5 Apr 2011 18:11:37 +0300, Pekka Enberg <[email protected]> wrote:
>> Hi Chris,
>>
>> On Tue, Apr 5, 2011 at 5:34 PM, Chris Wilson <[email protected]> wrote:
>> > If the outputs are active and continuing to access the GATT when we
>> > teardown the PTEs, then there is a potential for us to hang the GPU.
>> > The hang tends to be a PGTBL_ER with either an invalid host access or
>> > an invalid display plane fetch.
>> >
>> > v2: Reorder IRQ initialisation to defer until after GEM is setup.
>> >
>> > Reported-by: Pekka Enberg <[email protected]>
>> > Signed-off-by: Chris Wilson <[email protected]>
>> > Tested-by: Daniel Vetter <[email protected]> (855GM)
>>
>> I no longer get a blank screen after boot but flicker got more
>> aggressive during boot (it calms down after I've logged in). I see
>> tons of these in dmesg that don't appear with 2.6.39-rc1:
>
> Well, the PGTBL_ER is still there. I'm thinking it might be worth a check to
> see if that is asserted even before we start...
>
>> [ 10.175843] [drm:intel_update_fbc],
>> [ 10.183100] [drm:i915_driver_irq_handler], pipe A underrun
>> [ 10.185085] [drm:i915_driver_irq_handler], pipe A underrun
>> [ 10.186082] [drm:i915_driver_irq_handler], pipe A underrun
>> [ 10.187087] [drm:i915_driver_irq_handler], pipe A underrun
>> [ 10.189082] [drm:i915_driver_irq_handler], pipe A underrun
>> [ 10.190085] [drm:i915_driver_irq_handler], pipe A underrun
>
> If I'm understanding the dmesg correctly, then these start even before we
> setup the first crtc.
>
> Whether that means we're not completely disabling all outputs or that we
> set a register incorrectly I don't know. Comparing an intel_reg_dumper
> with and without the patch applied might give a clue if it is a register
> that is set differently due to the reordering.
>
> The other question is of course whether you see those in 2.6.39-rc1 as
> well... Probably not since they will correspond with the increased
> flicker.

Actually, I do. I guess that logging is enabled by 'drm.debug=0xe'?
I've included dmesg from latest Linus master.

As for the v2 of the patch:

Tested-by: Pekka Enberg <[email protected]> (i915)

Pekka


Attachments:
dmesg.gz (28.54 kB)