2015-08-12 04:17:52

by Alexandre Courbot

Subject: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

This reverts commit 1addc1264852

This commit seems to cause crashes in gk104_fifo_intr_runlist() by
returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
intended for GM20B which is not completely supported yet, let's revert
it for the time being.

Reported-by: Eric Biggers <[email protected]>
Signed-off-by: Alexandre Courbot <[email protected]>
---
David, it would be great if this could be merged for 4.2 since lots of
users could potentially experience this issue. Thanks!

drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c | 29 +++++++-----------------
1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
index 52c22b026005..e10f9644140f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
@@ -166,30 +166,14 @@ gk104_fifo_context_attach(struct nvkm_object *parent,
 }

 static int
-gk104_fifo_chan_kick(struct gk104_fifo_chan *chan)
-{
-	struct nvkm_object *obj = (void *)chan;
-	struct gk104_fifo_priv *priv = (void *)obj->engine;
-
-	nv_wr32(priv, 0x002634, chan->base.chid);
-	if (!nv_wait(priv, 0x002634, 0x100000, 0x000000)) {
-		nv_error(priv, "channel %d [%s] kick timeout\n",
-			 chan->base.chid, nvkm_client_name(chan));
-		return -EBUSY;
-	}
-
-	return 0;
-}
-
-static int
 gk104_fifo_context_detach(struct nvkm_object *parent, bool suspend,
 			  struct nvkm_object *object)
 {
 	struct nvkm_bar *bar = nvkm_bar(parent);
+	struct gk104_fifo_priv *priv = (void *)parent->engine;
 	struct gk104_fifo_base *base = (void *)parent->parent;
 	struct gk104_fifo_chan *chan = (void *)parent;
 	u32 addr;
-	int ret;

 	switch (nv_engidx(object->engine)) {
 	case NVDEV_ENGINE_SW : return 0;
@@ -204,9 +188,13 @@ gk104_fifo_context_detach(struct nvkm_object *parent, bool suspend,
 		return -EINVAL;
 	}

-	ret = gk104_fifo_chan_kick(chan);
-	if (ret && suspend)
-		return ret;
+	nv_wr32(priv, 0x002634, chan->base.chid);
+	if (!nv_wait(priv, 0x002634, 0xffffffff, chan->base.chid)) {
+		nv_error(priv, "channel %d [%s] kick timeout\n",
+			 chan->base.chid, nvkm_client_name(chan));
+		if (suspend)
+			return -EBUSY;
+	}

 	if (addr) {
 		nv_wo32(base, addr + 0x00, 0x00000000);
@@ -331,7 +319,6 @@ gk104_fifo_chan_fini(struct nvkm_object *object, bool suspend)
 		gk104_fifo_runlist_update(priv, chan->engine);
 	}

-	gk104_fifo_chan_kick(chan);
 	nv_wr32(priv, 0x800000 + (chid * 8), 0x00000000);
 	return nvkm_fifo_channel_fini(&chan->base, suspend);
 }
--
2.5.0


2015-08-12 06:01:05

by afzal mohammed

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

Hi,

On Wed, Aug 12, 2015 at 01:17:38PM +0900, Alexandre Courbot wrote:
> This reverts commit 1addc1264852
>
> This commit seems to cause crashes in gk104_fifo_intr_runlist() by
> returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
> intended for GM20B which is not completely supported yet, let's revert
> it for the time being.
>
> Reported-by: Eric Biggers <[email protected]>
> Signed-off-by: Alexandre Courbot <[email protected]>
> ---
> David, it would be great if this could be merged for 4.2 since lots of
> users could potentially experience this issue. Thanks!

Tested-by: Afzal Mohammed <[email protected]>

Please help $subject reach mainline for 4.2; w/o this revert, the
system here hangs at boot most (>90%) of the time.

As an aside, yesterday after a marathon git bisect, came to the same
solution (though I don't understand what that change means). Was
about to report it and saw this one. Thanks Alexandre.

W/o the revert, in the rare case where it boots, the following
additional message is observed compared to w/ the revert:

[ 9.826010] nouveau E[ PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x122130 [ !ENGINE ]

Regards
Afzal

2015-08-12 07:12:38

by Alexandre Courbot

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

On Wed, Aug 12, 2015 at 3:00 PM, Afzal Mohammed <[email protected]> wrote:
> Hi,
>
> On Wed, Aug 12, 2015 at 01:17:38PM +0900, Alexandre Courbot wrote:
>> This reverts commit 1addc1264852
>>
>> This commit seems to cause crashes in gk104_fifo_intr_runlist() by
>> returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
>> intended for GM20B which is not completely supported yet, let's revert
>> it for the time being.
>>
>> Reported-by: Eric Biggers <[email protected]>
>> Signed-off-by: Alexandre Courbot <[email protected]>
>> ---
>> David, it would be great if this could be merged for 4.2 since lots of
>> users could potentially experience this issue. Thanks!
>
> Tested-by: Afzal Mohammed <[email protected]>

Thanks!

>
> Please help $subject reach mainline for 4.2; w/o this revert, the
> system here hangs at boot most (>90%) of the time.
>
> As an aside, yesterday after a marathon git bisect, came to the same
> solution (though I don't understand what that change means). Was
> about to report it and saw this one. Thanks Alexandre.

All credit goes to Eric for bisecting and reporting this issue.

>
> W/o the revert, in the rare case where it boots, the following
> additional message is observed compared to w/ the revert:
>
> [ 9.826010] nouveau E[ PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x122130 [ !ENGINE ]

Could you let me know what your card is? It may be useful to know the
range of affected cards when trying to fix this.

Thanks,
Alex.

2015-08-12 07:37:39

by afzal mohammed

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

Hi,

On Wed, Aug 12, 2015 at 04:12:15PM +0900, Alexandre Courbot wrote:

> Could you let me know what your card is? It may be useful to know the
> range of affected cards when trying to fix this.

The output of grepping dmesg for nouveau follows. If that information
is not sufficient, let me know where the details you are asking for
can be found.

Regards
Afzal

nouveau 0000:01:00.0: enabling device (0004 -> 0007)
nouveau [ DEVICE][0000:01:00.0] BOOT0 : 0x108120a1
nouveau [ DEVICE][0000:01:00.0] Chipset: GK208 (NV108)
nouveau [ DEVICE][0000:01:00.0] Family : NVE0
nouveau [ VBIOS][0000:01:00.0] using image from ACPI
nouveau [ VBIOS][0000:01:00.0] BIT signature found
nouveau [ VBIOS][0000:01:00.0] version 80.28.28.00.05
nouveau [ DEVINIT][0000:01:00.0] adaptor not initialised
nouveau [ VBIOS][0000:01:00.0] running init tables
nouveau [ PMC][0000:01:00.0] MSI interrupts enabled
nouveau E[ PIBUS][0000:01:00.0] HUB0: 0x6013d4 0xffff5703 (0x1d708200)
nouveau [ PFB][0000:01:00.0] RAM type: DDR3
nouveau [ PFB][0000:01:00.0] RAM size: 2048 MiB
nouveau [ PFB][0000:01:00.0] ZCOMP: 0 tags
nouveau E[ PIBUS][0000:01:00.0] GPC0: 0x4188ac 0x00000001 (0x1a70822e)
nouveau [ VOLT][0000:01:00.0] GPU voltage: 600000uv
nouveau [ PTHERM][0000:01:00.0] FAN control: none / external
nouveau [ PTHERM][0000:01:00.0] fan management: automatic
nouveau [ PTHERM][0000:01:00.0] internal sensor: yes
nouveau [ CLK][0000:01:00.0] 07: core 405 MHz memory 810 MHz
nouveau [ CLK][0000:01:00.0] 0a: core 405-1058 MHz memory 1620 MHz
nouveau [ CLK][0000:01:00.0] 0f: core 405-1058 MHz memory 2002 MHz
nouveau [ CLK][0000:01:00.0] --: core 405 MHz memory 810 MHz
nouveau [ DRM] VRAM: 2048 MiB
nouveau [ DRM] GART: 1048576 MiB
nouveau E[ DRM] Pointer to TMDS table invalid
nouveau [ DRM] DCB version 4.0
nouveau E[ DRM] Pointer to flat panel table invalid
nouveau [ DRM] MM: using COPY for buffer copies
[drm] Initialized nouveau 1.2.2 20120801 for 0000:01:00.0 on minor 1
nouveau E[ PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x122130 [ !ENGINE ]

2015-08-12 07:41:18

by Alexandre Courbot

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

2015-08-12 16:37 GMT+09:00 Afzal Mohammed <[email protected]>:
> Hi,
>
> On Wed, Aug 12, 2015 at 04:12:15PM +0900, Alexandre Courbot wrote:
>
>> Could you let me know what your card is? It may be useful to know the
>> range of affected cards when trying to fix this.
>
> The output of grepping dmesg for nouveau follows. If that information
> is not sufficient, let me know where the details you are asking for
> can be found.

Great, thanks. Are you also on an optimus configuration with the
NVIDIA card being the secondary GPU?

2015-08-12 09:59:16

by afzal mohammed

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

Hi,

On Wed, Aug 12, 2015 at 04:40:57PM +0900, Alexandre Courbot wrote:

> Great, thanks. Are you also on an optimus configuration with the
> NVIDIA card being the secondary GPU?

The spec says the graphics processor is NVIDIA GeForce NV14P-GV2 GT40M;
the system is a Lenovo E431 laptop.

I am a stranger here: started the kernel journey towards north and
reached south since the system wasn't booting :). I don't know how to
find out whether it is an optimus configuration; if the above details
aren't enough, let me know how to find out.

Regards
Afzal

2015-08-14 03:49:38

by Alexandre Courbot

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

On Wed, Aug 12, 2015 at 6:59 PM, Afzal Mohammed <[email protected]> wrote:
> Hi,
>
> On Wed, Aug 12, 2015 at 04:40:57PM +0900, Alexandre Courbot wrote:
>
>> Great, thanks. Are you also on an optimus configuration with the
>> NVIDIA card being the secondary GPU?
>
> The spec says the graphics processor is NVIDIA GeForce NV14P-GV2 GT40M;
> the system is a Lenovo E431 laptop.
>
> I am a stranger here: started the kernel journey towards north and
> reached south since the system wasn't booting :). I don't know how to
> find out whether it is an optimus configuration; if the above details
> aren't enough, let me know how to find out.

Thanks for the details!

An optimus configuration means that display and basic acceleration are
provided by integrated Intel graphics, and the NVIDIA GPU can be
switched on/off dynamically to provide more power when needed.

According to your laptop reference, this seems to be the kind of
configuration you have. It is relevant because this issue seems to
happen when the NVIDIA GPU is switched off during boot.

2015-08-17 03:06:11

by Alexandre Courbot

Subject: Re: [PATCH] Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

Patch has landed in -rc7, thanks David!

On Fri, Aug 14, 2015 at 12:49 PM, Alexandre Courbot <[email protected]> wrote:
> On Wed, Aug 12, 2015 at 6:59 PM, Afzal Mohammed <[email protected]> wrote:
>> Hi,
>>
>> On Wed, Aug 12, 2015 at 04:40:57PM +0900, Alexandre Courbot wrote:
>>
>>> Great, thanks. Are you also on an optimus configuration with the
>>> NVIDIA card being the secondary GPU?
>>
>> The spec says the graphics processor is NVIDIA GeForce NV14P-GV2 GT40M;
>> the system is a Lenovo E431 laptop.
>>
>> I am a stranger here: started the kernel journey towards north and
>> reached south since the system wasn't booting :). I don't know how to
>> find out whether it is an optimus configuration; if the above details
>> aren't enough, let me know how to find out.
>
> Thanks for the details!
>
> An optimus configuration means that display and basic acceleration are
> provided by integrated Intel graphics, and the NVIDIA GPU can be
> switched on/off dynamically to provide more power when needed.
>
> According to your laptop reference, this seems to be the kind of
> configuration you have. It is relevant because this issue seems to
> happen when the NVIDIA GPU is switched off during boot.