2013-03-16 18:30:04

by Lijo Antony

Subject: [3.9.0-rc2] nouveau deadlock when HDMI TV is connected

Hi,

This is a Dell Inspiron N5110 laptop with the Optimus mess.
Up to 3.9.0-rc1, a TV connected through HDMI was not detected at all.
With 3.9.0-rc2, I get the following when I connect the HDMI cable:


=============================================
[ INFO: possible recursive locking detected ]
3.9.0-rc2 #22 Not tainted
---------------------------------------------
kworker/0:1/54 is trying to acquire lock:
(&dmac->lock){+.+...}, at: [<ffffffffa05fffb3>] evo_wait+0x43/0xf0 [nouveau]

but task is already holding lock:
(&dmac->lock){+.+...}, at: [<ffffffffa05fffb3>] evo_wait+0x43/0xf0 [nouveau]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&dmac->lock);
lock(&dmac->lock);

*** DEADLOCK ***

May be due to missing lock nesting notation

5 locks held by kworker/0:1/54:
#0: (events){.+.+.+}, at: [<ffffffff8106e311>] process_one_work+0x171/0x4c0
#1: ((&nv_connector->hpd_work)){+.+.+.}, at: [<ffffffff8106e311>] process_one_work+0x171/0x4c0
#2: (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa022ee2a>] drm_modeset_lock_all+0x2a/0x70 [drm]
#3: (&crtc->mutex){+.+.+.}, at: [<ffffffffa022ee54>] drm_modeset_lock_all+0x54/0x70 [drm]
#4: (&dmac->lock){+.+...}, at: [<ffffffffa05fffb3>] evo_wait+0x43/0xf0 [nouveau]

stack backtrace:
Pid: 54, comm: kworker/0:1 Not tainted 3.9.0-rc2 #22
Call Trace:
[<ffffffff810b71e5>] __lock_acquire+0x715/0x1be0
[<ffffffffa056361c>] ? dcb_table+0x1ac/0x2a0 [nouveau]
[<ffffffff810b8c31>] lock_acquire+0xa1/0x130
[<ffffffffa05fffb3>] ? evo_wait+0x43/0xf0 [nouveau]
[<ffffffff816aac59>] ? mutex_lock_nested+0x299/0x340
[<ffffffff816aaa09>] mutex_lock_nested+0x49/0x340
[<ffffffffa05fffb3>] ? evo_wait+0x43/0xf0 [nouveau]
[<ffffffff810b954f>] ? mark_held_locks+0xaf/0x110
[<ffffffffa05fffb3>] evo_wait+0x43/0xf0 [nouveau]
[<ffffffffa0602a63>] nv50_display_flip_next+0x713/0x7a0 [nouveau]
[<ffffffff816ab95e>] ? mutex_unlock+0xe/0x10
[<ffffffffa0600097>] ? evo_kick+0x37/0x40 [nouveau]
[<ffffffffa0602cee>] nv50_crtc_commit+0x10e/0x230 [nouveau]
[<ffffffffa0158125>] drm_crtc_helper_set_mode+0x365/0x510 [drm_kms_helper]
[<ffffffffa015953e>] drm_crtc_helper_set_config+0xa4e/0xb70 [drm_kms_helper]
[<ffffffffa022fe71>] drm_mode_set_config_internal+0x31/0x70 [drm]
[<ffffffffa0157621>] drm_fb_helper_set_par+0x71/0xf0 [drm_kms_helper]
[<ffffffffa022eaa2>] ? drm_modeset_unlock_all+0x52/0x60 [drm]
[<ffffffffa0157581>] drm_fb_helper_hotplug_event+0x81/0xb0 [drm_kms_helper]
[<ffffffffa05e964c>] nouveau_fbcon_output_poll_changed+0x1c/0x20 [nouveau]
[<ffffffffa0157bbb>] drm_kms_helper_hotplug_event+0x2b/0x40 [drm_kms_helper]
[<ffffffffa0158ada>] drm_helper_hpd_irq_event+0x12a/0x140 [drm_kms_helper]
[<ffffffffa05ec323>] nouveau_connector_hotplug_work+0x93/0x100 [nouveau]
[<ffffffff8106e371>] process_one_work+0x1d1/0x4c0
[<ffffffff8106e311>] ? process_one_work+0x171/0x4c0
[<ffffffff8106fd0f>] worker_thread+0x10f/0x380
[<ffffffff8106fc00>] ? busy_worker_rebind_fn+0xb0/0xb0
[<ffffffff8107aaca>] kthread+0xea/0xf0
[<ffffffff8107a9e0>] ? kthread_create_on_node+0x160/0x160
[<ffffffff816b80ac>] ret_from_fork+0x7c/0xb0
[<ffffffff8107a9e0>] ? kthread_create_on_node+0x160/0x160




This is reproducible every time, and git bisect pointed at:

65b5f42e2a9eb9c8383fb67698bf8c27657f8c14 is the first bad commit
commit 65b5f42e2a9eb9c8383fb67698bf8c27657f8c14
Author: Ben Skeggs <[email protected]>
Date: Wed Feb 20 16:47:44 2013 +1000

drm/nve0/graph: some random reg moved on kepler

Signed-off-by: Ben Skeggs <[email protected]>

:040000 040000 9658d1fd413b8797fe06fb2ca8ce681d4dbbedb0
c5b38586625718fc78c0eb062af3baa201fe2e7f M drivers



65b5f42e2a9eb9c8383fb67698bf8c27657f8c14
commit 65b5f42e2a9eb9c8383fb67698bf8c27657f8c14
Author: Ben Skeggs <[email protected]>
Date: Wed Feb 20 16:47:44 2013 +1000

drm/nve0/graph: some random reg moved on kepler

Signed-off-by: Ben Skeggs <[email protected]>

diff --git a/drivers/gpu/drm/nouveau/core/engine/graph/nve0.c b/drivers/gpu/drm/nouveau/core/engine/graph/nve0.c
index 61cec0f..4857f91 100644
--- a/drivers/gpu/drm/nouveau/core/engine/graph/nve0.c
+++ b/drivers/gpu/drm/nouveau/core/engine/graph/nve0.c
@@ -350,7 +350,7 @@ nve0_graph_init_gpc_0(struct nvc0_graph_priv *priv)
 		nv_wr32(priv, GPC_UNIT(gpc, 0x0918), magicgpc918);
 	}

-	nv_wr32(priv, GPC_BCAST(0x1bd4), magicgpc918);
+	nv_wr32(priv, GPC_BCAST(0x3fd4), magicgpc918);
 	nv_wr32(priv, GPC_BCAST(0x08ac), nv_rd32(priv, 0x100800));
 }


But looking at 65b5f42e2a9eb9c8383fb67698bf8c27657f8c14, it looks like I
might have made a mistake in bisecting. I will try again later.
Please let me know if you need more info.

-lijo

git bisect log
--------------
# bad: [f6161aa153581da4a3867a2d1a7caf4be19b6ec9] Linux 3.9-rc2
git bisect bad f6161aa153581da4a3867a2d1a7caf4be19b6ec9
# good: [6dbe51c251a327e012439c4772097a13df43c5b8] Linux 3.9-rc1
git bisect good 6dbe51c251a327e012439c4772097a13df43c5b8
# good: [f40ebd6bcbbd0d30591f42dc16be52b5086a366b] drm/i915: Turn off
hsync and vsync on ADPA when disabling crt
git bisect good f40ebd6bcbbd0d30591f42dc16be52b5086a366b
# bad: [2cc79544bd0aabb4b3cf467ead5df526d9134c64] Merge branch
'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into
drm-next
git bisect bad 2cc79544bd0aabb4b3cf467ead5df526d9134c64
# bad: [f6853faa85793bf23b46787e4039824d275453c2] drm/nouveau: Fix typo
in init_idx_addr_latched().
git bisect bad f6853faa85793bf23b46787e4039824d275453c2
# bad: [650e1203c11354ba84d69ba445abc0efcfe3890a] drm/nouveau: Disable
AGP on PowerPC again.
git bisect bad 650e1203c11354ba84d69ba445abc0efcfe3890a
# bad: [65b5f42e2a9eb9c8383fb67698bf8c27657f8c14] drm/nve0/graph: some
random reg moved on kepler
git bisect bad 65b5f42e2a9eb9c8383fb67698bf8c27657f8c14

lijo@pluto:~/linux/linux$ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor
Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core
Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core
Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series
Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset
Family High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset
Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset
Family PCI Express Root Port 2 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset
Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset
Family PCI Express Root Port 5 (rev b5)
00:1c.7 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset
Family PCI Express Root Port 8 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation HM67 Express Chipset Family LPC
Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset
Family 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family
SMBus Controller (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [GeForce GT
540M] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GF108 High Definition Audio
Controller (rev a1)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 05)
09:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless
Network Adapter (PCI-Express) (rev 01)
0b:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04)



Attachments:
dmesg.log (70.12 kB)

2013-03-17 16:38:32

by Lijo Antony

Subject: Re: [3.9.0-rc2] nouveau deadlock when HDMI TV is connected

On 03/16/2013 10:29 PM, Lijo Antony wrote:
>
> But looking at 65b5f42e2a9eb9c8383fb67698bf8c27657f8c14, it looks I
> might have done some mistake in bisecting.

My bisect was indeed incorrect. The issue started even before 3.9.0-rc1.
I am attempting a proper bisect this time...

-lijo