2012-10-15 08:10:34

by Daniel Vetter

Subject: [PATCH] drm/i915: disable cpu relocs on ilk and earlier

Hi Greg&stable-team,

The below patch papers over a graphics corruption issue in 3.5/3.6. The
regression happened due to pwrite tunings in 3.5, which made cpu relocations
much more likely.

The issue seems to have disappeared in 3.7-rc1, but it takes a few days to test
a patch, so we haven't figured out what exactly fixed things. Now users are
taking out their pitchforks already, so instead of wasting more days (maybe
weeks?) to fully understand the bug before backporting the fix, we've opted for
the below disable patch, which should have minimal impact (at most it undoes the
tuning improvements in 3.5).

Patch is tested by reporters & acked by all relevant ppl, please apply to
3.5/3.6 series kernels.

Thanks, Daniel

---

They seem to be implicated in render corruptions. And up to now no one
really seems to understand the issue, so let's just disable them for
now. Most of the machines exhibiting this issue have only a 128 gtt
mmio window, so increased pressure on the mappable part (and so higher
chance for cpu relocs) seems to be the key.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=852210
Tested-by: Dave Airlie <[email protected]>
Cc: [email protected]
Cc: Greg Kroah-Hartman <[email protected]>
Acked-by: Chris Wilson <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ff2819e..682156a 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -268,6 +268,12 @@ eb_destroy(struct eb_objects *eb)

 static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 {
+	/* cpu relocs are implicated in some not-yet-understood render
+	 * corruptions on at least ilk, but probably also gm45. Until we know
+	 * what's going on, just disable them. */
+	if (INTEL_INFO(obj->base.dev)->gen < 6)
+		return false;
+
 	return (obj->base.write_domain == I915_GEM_DOMAIN_CPU ||
 		obj->cache_level != I915_CACHE_NONE);
 }
--
1.7.10.4
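
For readers without a kernel tree at hand, the gating logic of the patch above can be tried out in user space. What follows is only a minimal sketch: the `sketch_*` types and names are hypothetical stand-ins, not the real i915 definitions, and the generation numbers assume ilk is gen 5 and the first unaffected hardware is gen 6.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's enums; not the real i915 values. */
enum sketch_domain { SKETCH_DOMAIN_GTT, SKETCH_DOMAIN_CPU };
enum sketch_cache_level { SKETCH_CACHE_NONE, SKETCH_CACHE_LLC };

/* Hypothetical flattened view of the fields the predicate looks at. */
struct sketch_obj {
	int gen;                             /* hardware generation of the device */
	enum sketch_domain write_domain;     /* domain of the last write */
	enum sketch_cache_level cache_level; /* caching mode of the object */
};

/* Mirrors the patched predicate: never pick cpu relocs before gen 6. */
static bool sketch_use_cpu_reloc(const struct sketch_obj *obj)
{
	if (obj->gen < 6)
		return false;

	return obj->write_domain == SKETCH_DOMAIN_CPU ||
	       obj->cache_level != SKETCH_CACHE_NONE;
}

/* Small self-check: the same object state gives different answers per gen. */
static void sketch_selfcheck(void)
{
	struct sketch_obj ilk = { 5, SKETCH_DOMAIN_CPU, SKETCH_CACHE_LLC };
	struct sketch_obj snb = { 6, SKETCH_DOMAIN_CPU, SKETCH_CACHE_LLC };

	assert(!sketch_use_cpu_reloc(&ilk)); /* disabled by the early return */
	assert(sketch_use_cpu_reloc(&snb));  /* unchanged behaviour on gen 6 */
}
```

The early return is the whole change: on gen 6 and later the original condition is evaluated exactly as before, which is why the patch can at most undo the 3.5 tuning on older hardware.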


2012-10-15 15:11:25

by Greg Kroah-Hartman

Subject: Re: [PATCH] drm/i915: disable cpu relocs on ilk and earlier

On Mon, Oct 15, 2012 at 10:11:22AM +0200, Daniel Vetter wrote:
> Hi Greg&stable-team,
>
> The below patch papers over a graphics corruption issue in 3.5/3.6. The
> regression happened due to pwrite tunings in 3.5, which made cpu relocations
> much more likely.
>
> The issue seems to have disappeared in 3.7-rc1, but it takes a few days to test
> a patch, so we haven't figured out what exactly fixed things. Now users are
> taking out their pitchforks already, so instead of wasting more days (maybe
> weeks?) to fully understand the bug before backporting the fix, we've opted for
> the below disable patch, which should have minimal impact (at most it undoes the
> tuning improvements in 3.5).
>
> Patch is tested by reporters & acked by all relevant ppl, please apply to
> 3.5/3.6 series kernels.

No, I'd really like to wait until you figure out what is happening in
3.7-rc1 right now before applying the patch. We have the rule, "it must
be in Linus's tree first" for a very good reason :)

So, I'll hold onto this until you say what's up with 3.7-rc1, ok?

thanks,

greg k-h

2012-10-15 17:16:29

by Daniel Vetter

Subject: Re: [PATCH] drm/i915: disable cpu relocs on ilk and earlier

On Mon, Oct 15, 2012 at 5:11 PM, Greg KH <[email protected]> wrote:
> On Mon, Oct 15, 2012 at 10:11:22AM +0200, Daniel Vetter wrote:
>> Hi Greg&stable-team,
>>
>> The below patch papers over a graphics corruption issue in 3.5/3.6. The
>> regression happened due to pwrite tunings in 3.5, which made cpu relocations
>> much more likely.
>>
>> The issue seems to have disappeared in 3.7-rc1, but it takes a few days to test
>> a patch, so we haven't figured out what exactly fixed things. Now users are
>> taking out their pitchforks already, so instead of wasting more days (maybe
>> weeks?) to fully understand the bug before backporting the fix, we've opted for
>> the below disable patch, which should have minimal impact (at most it undoes the
>> tuning improvements in 3.5).
>>
>> Patch is tested by reporters & acked by all relevant ppl, please apply to
>> 3.5/3.6 series kernels.
>
> No, I'd really like to wait until you figure out what is happening in
> 3.7-rc1 right now before applying the patch. We have the rule, "it must
> be in Linus's tree first" for a very good reason :)
>
> So, I'll hold onto this until you say what's up with 3.7-rc1, ok?

Can do, might send a few pitchforks I collect your way though ;-)

While I have your attention (and now that -rc1 is out), can you please
pick up my two console_lock patches into your tty tree for 3.8?

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2012-10-15 17:34:44

by Greg Kroah-Hartman

Subject: Re: [PATCH] drm/i915: disable cpu relocs on ilk and earlier

On Mon, Oct 15, 2012 at 07:16:26PM +0200, Daniel Vetter wrote:
> On Mon, Oct 15, 2012 at 5:11 PM, Greg KH <[email protected]> wrote:
> > On Mon, Oct 15, 2012 at 10:11:22AM +0200, Daniel Vetter wrote:
> >> Hi Greg&stable-team,
> >>
> >> The below patch papers over a graphics corruption issue in 3.5/3.6. The
> >> regression happened due to pwrite tunings in 3.5, which made cpu relocations
> >> much more likely.
> >>
> >> The issue seems to have disappeared in 3.7-rc1, but it takes a few days to test
> >> a patch, so we haven't figured out what exactly fixed things. Now users are
> >> taking out their pitchforks already, so instead of wasting more days (maybe
> >> weeks?) to fully understand the bug before backporting the fix, we've opted for
> >> the below disable patch, which should have minimal impact (at most it undoes the
> >> tuning improvements in 3.5).
> >>
> >> Patch is tested by reporters & acked by all relevant ppl, please apply to
> >> 3.5/3.6 series kernels.
> >
> > No, I'd really like to wait until you figure out what is happening in
> > 3.7-rc1 right now before applying the patch. We have the rule, "it must
> > be in Linus's tree first" for a very good reason :)
> >
> > So, I'll hold onto this until you say what's up with 3.7-rc1, ok?
>
> Can do, might send a few pitchforks I collect your way though ;-)

No problem at all, I can handle them :)

> While I have your attention (and now that -rc1 is out), can you please
> pick up my two console_lock patches into your tty tree for 3.8?

Let me catch up on my 3.7 patches first please, I'm still on the road
traveling to conferences on different continents, and am in a conference
this week as well. The fact that I'm waking up at the right time is
amazing...

greg k-h

2012-10-18 07:34:45

by Daniel Vetter

Subject: Re: [PATCH] drm/i915: disable cpu relocs on ilk and earlier

On Mon, Oct 15, 2012 at 5:11 PM, Greg KH <[email protected]> wrote:
> On Mon, Oct 15, 2012 at 10:11:22AM +0200, Daniel Vetter wrote:
>> Hi Greg&stable-team,
>>
>> The below patch papers over a graphics corruption issue in 3.5/3.6. The
>> regression happened due to pwrite tunings in 3.5, which made cpu relocations
>> much more likely.
>>
>> The issue seems to have disappeared in 3.7-rc1, but it takes a few days to test
>> a patch, so we haven't figured out what exactly fixed things. Now users are
>> taking out their pitchforks already, so instead of wasting more days (maybe
>> weeks?) to fully understand the bug before backporting the fix, we've opted for
>> the below disable patch, which should have minimal impact (at most it undoes the
>> tuning improvements in 3.5).
>>
>> Patch is tested by reporters & acked by all relevant ppl, please apply to
>> 3.5/3.6 series kernels.
>
> No, I'd really like to wait until you figure out what is happening in
> 3.7-rc1 right now before applying the patch. We have the rule, "it must
> be in Linus's tree first" for a very good reason :)

Ok, the verdict is in (thanks a lot Dave for testing all these
different patches) and it seems like

commit 504c7267a1e84b157cbd7e9c1b805e1bc0c2c846
Author: Chris Wilson <[email protected]>
Date: Thu Aug 23 13:12:52 2012 +0100

drm/i915: Use cpu relocations if the object is in the GTT but not mappable

from upstream nicely papers over the issues. Please apply to 3.5/3.6
stable series (earlier kernels don't exhibit the problem).
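
The rule named in that commit subject can be illustrated in user space. This is a sketch built from the subject line alone, not the upstream diff: the struct, its field names, and the exact shape of the condition are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an object's state; not the real i915 struct. */
struct reloc_obj_sketch {
	bool cpu_write_domain; /* last written through the CPU domain */
	bool cached;           /* cache level other than "none" */
	bool bound_in_gtt;     /* object currently has a GTT binding */
	bool mappable;         /* binding lies in the CPU-visible aperture */
};

static bool use_cpu_reloc_sketch(const struct reloc_obj_sketch *obj)
{
	/* Assumed from the commit subject: an object that sits in the GTT
	 * but outside the mappable window gets cpu relocations. */
	if (obj->bound_in_gtt && !obj->mappable)
		return true;

	/* Otherwise fall back to the conditions visible in the patch context. */
	return obj->cpu_write_domain || obj->cached;
}

/* Small self-check covering the case the commit subject names. */
static void reloc_sketch_selfcheck(void)
{
	struct reloc_obj_sketch unmappable = { false, false, true, false };
	struct reloc_obj_sketch mappable = { false, false, true, true };

	assert(use_cpu_reloc_sketch(&unmappable));
	assert(!use_cpu_reloc_sketch(&mappable));
}
```

Under this reading, an object bound outside the mappable window no longer has to be moved into it just to be relocated, which would ease the pressure on small apertures described in the original patch.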

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=852210
Tested-by: Dave Airlie <[email protected]>

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch