2012-11-06 14:03:51

by Daniel J Blueman

Subject: [PATCH] nouveau: Fix crash after D3

In 3.7-rc4, starting X on the integrated GPU with the discrete GPU suspended,
running one or more 32-bit applications (e.g. Skype), and then stopping X
triggers a kernel panic.

Prevent this by checking that the object's fini function pointer is non-NULL
before calling it.

Full panic bootlog is at: http://quora.org/2012/nouveau/dmesg-crash.txt
Xorg.log is at: http://quora.org/2012/nouveau/Xorg.0.log-crash.txt
Kernel log after fix is at: http://quora.org/2012/nouveau/dmesg-fix.txt

Signed-off-by: Daniel J Blueman <[email protected]>
---
drivers/gpu/drm/nouveau/core/core/object.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/core/object.c b/drivers/gpu/drm/nouveau/core/core/object.c
index 0daab62..3da3525 100644
--- a/drivers/gpu/drm/nouveau/core/core/object.c
+++ b/drivers/gpu/drm/nouveau/core/core/object.c
@@ -354,12 +354,16 @@ static int
 nouveau_object_decf(struct nouveau_object *object)
 {
 	int ret;
+	struct nouveau_ofuncs *pfuncs;
 
 	nv_trace(object, "stopping...\n");
 
-	ret = nv_ofuncs(object)->fini(object, false);
-	if (ret)
-		nv_warn(object, "failed fini, %d\n", ret);
+	pfuncs = nv_ofuncs(object);
+	if (pfuncs->fini) {
+		ret = nv_ofuncs(object)->fini(object, false);
+		if (ret)
+			nv_warn(object, "failed fini, %d\n", ret);
+	}
 
 	if (object->engine) {
 		mutex_lock(&nv_subdev(object->engine)->mutex);
--
1.7.10.4
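
For readers outside the kernel tree, the guard added by the patch can be illustrated with a small standalone sketch. Everything named demo_* below is hypothetical and stands in for the driver's types; it is not the nouveau code, only an example of skipping an optional hook in a function-pointer table when that hook is unset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's types; not the nouveau definitions. */
struct demo_object;

struct demo_ofuncs {
	/* Optional teardown hook; a table may leave this NULL. */
	int (*fini)(struct demo_object *obj, bool suspend);
};

struct demo_object {
	const struct demo_ofuncs *ofuncs;
	int fini_calls;	/* counts how often the hook actually ran */
};

/* Example hook that only records that it was called. */
static int demo_fini(struct demo_object *obj, bool suspend)
{
	(void)suspend;
	obj->fini_calls++;
	return 0;
}

/* Stop an object, skipping the fini hook when the table leaves it unset. */
static int demo_object_stop(struct demo_object *obj)
{
	const struct demo_ofuncs *funcs = obj->ofuncs;
	int ret = 0;

	if (funcs && funcs->fini) {
		ret = funcs->fini(obj, false);
		if (ret)
			fprintf(stderr, "failed fini, %d\n", ret);
	}

	return ret;
}
```

A guard of this kind only avoids calling through a NULL pointer; it does not by itself explain why the hook is unset for a given object.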


2012-11-07 18:47:33

by Marcin Slusarz

Subject: Re: [PATCH] nouveau: Fix crash after D3

On Tue, Nov 06, 2012 at 10:03:40PM +0800, Daniel J Blueman wrote:
> In 3.7-rc4, starting X on the integrated GPU with the discrete GPU suspended,
> running one or more 32-bit applications (e.g. Skype), and then stopping X
> triggers a kernel panic.
>
> Prevent this by checking that the object's fini function pointer is non-NULL
> before calling it.

This is a bit odd. Can you explain in more detail what is going on?
Why do we try to destroy this object (with a NULL fini) only when the GPU is
suspended? Maybe it means we are leaking this object on a normal close/destroy?
Did you test what happens when you resume the nv GPU after stopping X?

> Full panic bootlog is at: http://quora.org/2012/nouveau/dmesg-crash.txt
> Xorg.log is at: http://quora.org/2012/nouveau/Xorg.0.log-crash.txt
> Kernel log after fix is at: http://quora.org/2012/nouveau/dmesg-fix.txt
>
> Signed-off-by: Daniel J Blueman <[email protected]>
> ---
> drivers/gpu/drm/nouveau/core/core/object.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/core/core/object.c b/drivers/gpu/drm/nouveau/core/core/object.c
> index 0daab62..3da3525 100644
> --- a/drivers/gpu/drm/nouveau/core/core/object.c
> +++ b/drivers/gpu/drm/nouveau/core/core/object.c
> @@ -354,12 +354,16 @@ static int
>  nouveau_object_decf(struct nouveau_object *object)
>  {
>  	int ret;
> +	struct nouveau_ofuncs *pfuncs;
>
>  	nv_trace(object, "stopping...\n");
>
> -	ret = nv_ofuncs(object)->fini(object, false);
> -	if (ret)
> -		nv_warn(object, "failed fini, %d\n", ret);
> +	pfuncs = nv_ofuncs(object);
> +	if (pfuncs->fini) {
> +		ret = nv_ofuncs(object)->fini(object, false);
> +		if (ret)
> +			nv_warn(object, "failed fini, %d\n", ret);
> +	}
>
>  	if (object->engine) {
>  		mutex_lock(&nv_subdev(object->engine)->mutex);
> --
> 1.7.10.4
>