2016-12-23 05:47:55

by Du, Changbin

Subject: [PATCH] drm/i915: check if execlist_port is empty before using its content

From: "Du, Changbin" <[email protected]>

This patch fixes a crash in reset_common_ring(). In this case,
port[0].request is NULL when the render ring is reset, so a NULL pointer
dereference is raised. We need to check the execlist_port status
first.

[ 35.748034] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[ 35.749567] IP: [<ffffffff81521bfe>] reset_common_ring+0xbe/0x150
[ 35.749567] Call Trace:
[ 35.749567] [<ffffffff8150ded0>] i915_gem_reset+0x150/0x270
[ 35.749567] [<ffffffff814d3c0a>] i915_reset+0x8a/0xe0
[ 35.749567] [<ffffffff814d8c21>] i915_reset_and_wakeup+0x131/0x160
[ 35.749567] [<ffffffff815298f0>] ? gen5_read8+0x110/0x110
[ 35.749567] [<ffffffff814dc97a>] i915_handle_error+0xca/0x5a0
[ 35.749567] [<ffffffff813bac9d>] ? scnprintf+0x3d/0x70
[ 35.749567] [<ffffffff814dd063>] i915_hangcheck_elapsed+0x213/0x510
[ 35.749567] [<ffffffff810c4c4b>] process_one_work+0x15b/0x470
[ 35.749567] [<ffffffff810c4fa3>] worker_thread+0x43/0x4d0
[ 35.749567] [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
[ 35.749567] [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
[ 35.749567] [<ffffffff810c103e>] ? call_usermodehelper_exec_async+0x12e/0x130
[ 35.749567] [<ffffffff810ca1a5>] kthread+0xc5/0xe0
[ 35.749567] [<ffffffff810ca0e0>] ? kthread_park+0x60/0x60
[ 35.749567] [<ffffffff810c0f10>] ? umh_complete+0x40/0x40
[ 35.749567] [<ffffffff81a35392>] ret_from_fork+0x22/0x30

Signed-off-by: Changbin Du <[email protected]>
---
drivers/gpu/drm/i915/intel_lrc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0a09024..81a9b0b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,

 	/* Catch up with any missed context-switch interrupts */
 	I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0));
-	if (request->ctx != port[0].request->ctx) {
+	if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) {
 		i915_gem_request_put(port[0].request);
 		port[0] = port[1];
 		memset(&port[1], 0, sizeof(port[1]));
--
2.7.4
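
The guard proposed in the diff above can be illustrated outside the kernel. The following is only a minimal userspace sketch: the `ctx`, `request` and `port` structs and the `ports_idle()` and `unwind_ports()` helpers are hypothetical stand-ins, not the i915 types or `execlists_elsp_idle()` itself, and "idle" is assumed here to mean that port[0] holds no request. Reference counting (the `i915_gem_request_put()` call) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures; not the real i915 types. */
struct ctx { int id; };
struct request { struct ctx *ctx; };
struct port { struct request *request; };

/* Assumption for this sketch: "idle" means nothing is queued in port[0]. */
static bool ports_idle(const struct port *port)
{
	return port[0].request == NULL;
}

/*
 * Same shape as the patched condition: the contexts are compared only when
 * port[0] actually holds a request, so an empty port is never dereferenced.
 * Returns true if port[0] was dropped and port[1] promoted in its place.
 */
static bool unwind_ports(struct port port[2], const struct request *request)
{
	if (!ports_idle(port) && request->ctx != port[0].request->ctx) {
		port[0] = port[1];
		memset(&port[1], 0, sizeof(port[1]));
		return true;
	}
	return false;
}
```

With both ports empty, `unwind_ports()` returns false without touching port[0].request, which is the case that crashed in the trace above. With a different context in port[0], it promotes port[1] and clears the second slot.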


2016-12-23 07:05:03

by Jani Nikula

Subject: Re: [PATCH] drm/i915: check if execlist_port is empty before using its content

On Fri, 23 Dec 2016, [email protected] wrote:
> From: "Du, Changbin" <[email protected]>
>
> This patch fixes a crash in reset_common_ring(). In this case,
> port[0].request is NULL when the render ring is reset, so a NULL pointer
> dereference is raised. We need to check the execlist_port status
> first.
>
> [ 35.748034] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> [ 35.749567] IP: [<ffffffff81521bfe>] reset_common_ring+0xbe/0x150
> [ 35.749567] Call Trace:
> [ 35.749567] [<ffffffff8150ded0>] i915_gem_reset+0x150/0x270
> [ 35.749567] [<ffffffff814d3c0a>] i915_reset+0x8a/0xe0
> [ 35.749567] [<ffffffff814d8c21>] i915_reset_and_wakeup+0x131/0x160
> [ 35.749567] [<ffffffff815298f0>] ? gen5_read8+0x110/0x110
> [ 35.749567] [<ffffffff814dc97a>] i915_handle_error+0xca/0x5a0
> [ 35.749567] [<ffffffff813bac9d>] ? scnprintf+0x3d/0x70
> [ 35.749567] [<ffffffff814dd063>] i915_hangcheck_elapsed+0x213/0x510
> [ 35.749567] [<ffffffff810c4c4b>] process_one_work+0x15b/0x470
> [ 35.749567] [<ffffffff810c4fa3>] worker_thread+0x43/0x4d0
> [ 35.749567] [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
> [ 35.749567] [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
> [ 35.749567] [<ffffffff810c103e>] ? call_usermodehelper_exec_async+0x12e/0x130
> [ 35.749567] [<ffffffff810ca1a5>] kthread+0xc5/0xe0
> [ 35.749567] [<ffffffff810ca0e0>] ? kthread_park+0x60/0x60
> [ 35.749567] [<ffffffff810c0f10>] ? umh_complete+0x40/0x40
> [ 35.749567] [<ffffffff81a35392>] ret_from_fork+0x22/0x30
>

Fixes: ?

i.e. which commit broke things?

BR,
Jani.


> Signed-off-by: Changbin Du <[email protected]>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 0a09024..81a9b0b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>
>  	/* Catch up with any missed context-switch interrupts */
>  	I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0));
> -	if (request->ctx != port[0].request->ctx) {
> +	if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) {
>  		i915_gem_request_put(port[0].request);
>  		port[0] = port[1];
>  		memset(&port[1], 0, sizeof(port[1]));

--
Jani Nikula, Intel Open Source Technology Center

2016-12-23 07:51:31

by Chris Wilson

Subject: Re: [Intel-gfx] [PATCH] drm/i915: check if execlist_port is empty before using its content

On Fri, Dec 23, 2016 at 01:46:36PM +0800, [email protected] wrote:
> From: "Du, Changbin" <[email protected]>
>
> This patch fixes a crash in reset_common_ring(). In this case,
> port[0].request is NULL when the render ring is reset, so a NULL pointer
> dereference is raised. We need to check the execlist_port status
> first.

No. The root cause is whatever got you into the illegal condition in the
first place.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
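
One possible reading of this objection, offered only as a sketch: instead of silently skipping the comparison, the reset path could report that it was entered with a hung request but an empty port[0], so that whatever produced that state can be traced. The structs and the `check_reset_precondition()` helper below are hypothetical, and the invariant itself (a hung request implies a non-empty port[0]) is an assumption drawn from the discussion, not something the thread states about the driver. In the kernel a `WARN_ON`-style macro would be the usual reporting tool; `fprintf` stands in for it here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the real i915 types. */
struct request { int seqno; };
struct port { struct request *request; };

/*
 * Assumed invariant for this sketch: if the engine has a hung request,
 * port[0] must hold a request. Returns true when the invariant holds;
 * otherwise reports the violation and returns false so the caller can
 * investigate instead of carrying on as if nothing were wrong.
 */
static bool check_reset_precondition(const struct port *port,
				     const struct request *hung)
{
	if (hung != NULL && port[0].request == NULL) {
		fprintf(stderr,
			"reset: hung request %d but port[0] is empty\n",
			hung->seqno);
		return false;
	}
	return true;
}
```

The difference from the guard in the patch is one of intent: the guard makes the crash go away, while a check like this keeps the unexpected state visible.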

2016-12-26 07:41:43

by Du, Changbin

Subject: RE: [Intel-gfx] [PATCH] drm/i915: check if execlist_port is empty before using its content

> On Fri, Dec 23, 2016 at 01:46:36PM +0800, [email protected] wrote:
> > From: "Du, Changbin" <[email protected]>
> >
> > This patch fixes a crash in reset_common_ring(). In this case,
> > port[0].request is NULL when the render ring is reset, so a NULL pointer
> > dereference is raised. We need to check the execlist_port status
> > first.
>
> No. The root cause is whatever got you into the illegal condition in the
> first place.
> -Chris
>
Thanks, I will restudy the code once I have finished my current work. Since
this happens on a gvt guest, it may be related to gvt emulation.

> --
> Chris Wilson, Intel Open Source Technology Centre