2006-08-02 21:59:44

by Dave Jones

Subject: orinoco driver causes *lots* of lockdep spew

Wow. Nearly 400 lines of debug spew, from a simple 'ifup eth1'.

Dave


ADDRCONF(NETDEV_UP): eth1: link is not ready
eth1: New link status: Disconnected (0002)

======================================================
[ INFO: hard-safe -> hard-unsafe lock order detected ]
------------------------------------------------------
events/0/5 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(af_callback_keys + sk->sk_family){-.--}, at: [<ffffffff802136b1>] sock_def_readable+0x19/0x6f

and this task is already holding:
(&priv->lock){++..}, at: [<ffffffff8824f70e>] orinoco_send_wevents+0x28/0x8b [orinoco]
which would create a new lock dependency:
(&priv->lock){++..} -> (af_callback_keys + sk->sk_family){-.--}

but this new dependency connects a hard-irq-safe lock:
(&priv->lock){++..}
... which became hard-irq-safe at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8824f7be>] orinoco_interrupt+0x4d/0xf49 [orinoco]
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65

to a hard-irq-unsafe lock:
(af_callback_keys + sk->sk_family){-.--}
... which became hard-irq-unsafe at:
... [<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267867>] _write_lock_bh+0x29/0x36
[<ffffffff80433960>] netlink_release+0x139/0x2ca
[<ffffffff80257903>] sock_release+0x19/0x9b
[<ffffffff80257b13>] sock_close+0x33/0x3a
[<ffffffff802130ee>] __fput+0xc6/0x1a8
[<ffffffff8022effe>] fput+0x13/0x16
[<ffffffff80225383>] filp_close+0x64/0x70
[<ffffffff8021eecc>] sys_close+0x93/0xb0
[<ffffffff8026048d>] system_call+0x7d/0x83

other info that might help us debug this:

1 lock held by events/0/5:
#0: (&priv->lock){++..}, at: [<ffffffff8824f70e>] orinoco_send_wevents+0x28/0x8b [orinoco]

the hard-irq-safe lock's dependencies:
-> (&priv->lock){++..} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267a3e>] _spin_lock_irq+0x2a/0x38
[<ffffffff8824f102>] orinoco_init+0x934/0x966 [orinoco]
[<ffffffff8041e762>] register_netdevice+0xe6/0x375
[<ffffffff8041ea4b>] register_netdev+0x5a/0x69
[<ffffffff8826155f>] orinoco_cs_probe+0x3d7/0x475 [orinoco_cs]
[<ffffffff803daa02>] pcmcia_device_probe+0x7f/0x124
[<ffffffff803b5e74>] driver_probe_device+0x5b/0xb1
[<ffffffff803b5fde>] __driver_attach+0x88/0xdb
[<ffffffff803b5826>] bus_for_each_dev+0x48/0x7a
[<ffffffff803b5d9e>] driver_attach+0x1b/0x1e
[<ffffffff803b543e>] bus_add_driver+0x88/0x138
[<ffffffff803b6289>] driver_register+0x8e/0x93
[<ffffffff803da89b>] pcmcia_register_driver+0xd0/0xda
[<ffffffff880a9024>] 0xffffffff880a9024
[<ffffffff802af420>] sys_init_module+0x16f2/0x18b7
[<ffffffff8026048d>] system_call+0x7d/0x83
in-hardirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8824f7be>] orinoco_interrupt+0x4d/0xf49 [orinoco]
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65
in-softirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8824f7be>] orinoco_interrupt+0x4d/0xf49 [orinoco]
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65
[<ffffffff8028ebce>] scheduler_tick+0xc1/0x362
[<ffffffff80261739>] call_softirq+0x1d/0x28
[<ffffffff80295edb>] irq_exit+0x56/0x59
[<ffffffff8027a67f>] smp_apic_timer_interrupt+0x5c/0x62
[<ffffffff802610ad>] apic_timer_interrupt+0x69/0x70
}
... key at: [<ffffffff8825fd80>] __key.22351+0x0/0xffffffffffff27fa [orinoco]
-> (&cwq->lock){++..} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff802a0314>] __queue_work+0x17/0x5e
[<ffffffff802a03de>] queue_work+0x4d/0x57
[<ffffffff8029fdda>] call_usermodehelper_keys+0x119/0x137
[<ffffffff8025af79>] kobject_uevent+0x3e5/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff8097f2ed>] vtconsole_class_init+0x74/0xbb
[<ffffffff8026d7fc>] init+0x1fc/0x3cd
[<ffffffff802613dd>] child_rip+0x7/0x12
in-hardirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff802a0314>] __queue_work+0x17/0x5e
[<ffffffff802a03de>] queue_work+0x4d/0x57
[<ffffffff8033c786>] kblockd_schedule_work+0x15/0x18
[<ffffffff8034493b>] __cfq_slice_expired+0x63/0xe6
[<ffffffff80253352>] cfq_completed_request+0x116/0x154
[<ffffffff8033bb82>] elv_completed_request+0x38/0x85
[<ffffffff8033cca7>] __blk_put_request+0x35/0x9f
[<ffffffff8033cdfb>] end_that_request_last+0xea/0xf4
[<ffffffff8020b10a>] ide_end_request+0xf2/0x111
[<ffffffff8023f4a7>] ide_dma_intr+0x70/0xb5
[<ffffffff8020dcd6>] ide_intr+0x169/0x1df
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65
in-softirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff802a0314>] __queue_work+0x17/0x5e
[<ffffffff802a03de>] queue_work+0x4d/0x57
[<ffffffff802a03fd>] schedule_work+0x15/0x18
[<ffffffff803639bb>] cursor_timer_handler+0x1b/0x38
[<ffffffff8029a391>] run_timer_softirq+0x14b/0x1d5
[<ffffffff80212a1f>] __do_softirq+0x67/0xf5
[<ffffffff80261739>] call_softirq+0x1d/0x28
[<ffffffff80295edb>] irq_exit+0x56/0x59
[<ffffffff8027a67f>] smp_apic_timer_interrupt+0x5c/0x62
[<ffffffff802610ad>] apic_timer_interrupt+0x69/0x70
}
... key at: [<ffffffff806c47a0>] __key.10352+0x0/0x8
-> (&q->lock){++..} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267a3e>] _spin_lock_irq+0x2a/0x38
[<ffffffff80265123>] wait_for_completion+0x2f/0xb3
[<ffffffff802a34d4>] keventd_create_kthread+0x35/0x6a
[<ffffffff802a35d3>] kthread_create+0xca/0x153
[<ffffffff8028e085>] migration_call+0x60/0x44f
[<ffffffff80975115>] migration_init+0x27/0x4f
[<ffffffff8026d669>] init+0x69/0x3cd
[<ffffffff802613dd>] child_rip+0x7/0x12
in-hardirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff80230689>] __wake_up+0x21/0x50
[<ffffffff8038719b>] acpi_ec_gpe_handler+0x96/0xdb
[<ffffffff803734f2>] acpi_ev_gpe_dispatch+0x6e/0x160
[<ffffffff80373876>] acpi_ev_gpe_detect+0xae/0xff
[<ffffffff80371cf0>] acpi_ev_sci_xrupt_handler+0x19/0x22
[<ffffffff8036c543>] acpi_irq+0x10/0x1b
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65
in-softirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8028d434>] complete+0x1b/0x4c
[<ffffffff802a12dc>] wakeme_after_rcu+0xc/0xf
[<ffffffff802a1531>] __rcu_process_callbacks+0x154/0x1d9
[<ffffffff802a15d8>] rcu_process_callbacks+0x22/0x44
[<ffffffff80296014>] tasklet_action+0x6c/0xc5
[<ffffffff80212a1f>] __do_softirq+0x67/0xf5
[<ffffffff80261739>] call_softirq+0x1d/0x28
[<ffffffff80295edb>] irq_exit+0x56/0x59
[<ffffffff8027a67f>] smp_apic_timer_interrupt+0x5c/0x62
[<ffffffff802610ad>] apic_timer_interrupt+0x69/0x70
}
... key at: [<ffffffff806c4dd8>] __key.13972+0x0/0x8
-> (&rq->rq_lock_key){++..} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8028dd54>] init_idle+0x98/0xc7
[<ffffffff8097531d>] sched_init+0x1b8/0x1be
[<ffffffff809646e8>] start_kernel+0x7a/0x24c
[<ffffffff8096428a>] _sinittext+0x28a/0x292
[<ffffffffffffffff>] 0xffffffffffffffff
in-hardirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff802677ca>] _spin_lock+0x24/0x31
[<ffffffff8028eb81>] scheduler_tick+0x74/0x362
[<ffffffff8029abe0>] update_process_times+0x67/0x79
[<ffffffff80279f02>] smp_local_timer_interrupt+0x2a/0x50
[<ffffffff80271526>] main_timer_handler+0x202/0x3a5
[<ffffffff802716dd>] timer_interrupt+0x14/0x2a
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65
in-softirq-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff802677ca>] _spin_lock+0x24/0x31
[<ffffffff8028de0c>] task_rq_lock+0x41/0x74
[<ffffffff802489a3>] try_to_wake_up+0x26/0x418
[<ffffffff8028e022>] wake_up_process+0xf/0x12
[<ffffffff8029a593>] process_timeout+0x8/0xb
[<ffffffff8029a391>] run_timer_softirq+0x14b/0x1d5
[<ffffffff80212a1f>] __do_softirq+0x67/0xf5
[<ffffffff80261739>] call_softirq+0x1d/0x28
[<ffffffff80295edb>] irq_exit+0x56/0x59
[<ffffffff8027a67f>] smp_apic_timer_interrupt+0x5c/0x62
[<ffffffff802610ad>] apic_timer_interrupt+0x69/0x70
}
... key at: [<ffff810002618700>] 0xffff810002618700
... acquired at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff802677ca>] _spin_lock+0x24/0x31
[<ffffffff8028de0c>] task_rq_lock+0x41/0x74
[<ffffffff802489a3>] try_to_wake_up+0x26/0x418
[<ffffffff8028e010>] default_wake_function+0xc/0xf
[<ffffffff8028c310>] __wake_up_common+0x3d/0x68
[<ffffffff8028d450>] complete+0x37/0x4c
[<ffffffff80235411>] kthread+0xda/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12

... acquired at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff80230689>] __wake_up+0x21/0x50
[<ffffffff802a0347>] __queue_work+0x4a/0x5e
[<ffffffff802a03de>] queue_work+0x4d/0x57
[<ffffffff8029fdda>] call_usermodehelper_keys+0x119/0x137
[<ffffffff8025af79>] kobject_uevent+0x3e5/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff8097f2ed>] vtconsole_class_init+0x74/0xbb
[<ffffffff8026d7fc>] init+0x1fc/0x3cd
[<ffffffff802613dd>] child_rip+0x7/0x12

... acquired at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff802a0314>] __queue_work+0x17/0x5e
[<ffffffff802a03de>] queue_work+0x4d/0x57
[<ffffffff802a03fd>] schedule_work+0x15/0x18
[<ffffffff8824fc31>] orinoco_interrupt+0x4c0/0xf49 [orinoco]
[<ffffffff8021151f>] handle_IRQ_event+0x2b/0x64
[<ffffffff802c0987>] __do_IRQ+0xae/0x114
[<ffffffff8026fca8>] do_IRQ+0xf7/0x107
[<ffffffff802609c4>] common_interrupt+0x64/0x65

-> (&list->lock){....} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff80258024>] skb_queue_tail+0x1e/0x49
[<ffffffff80259ac6>] netlink_broadcast+0x211/0x2e2
[<ffffffff8025af3f>] kobject_uevent+0x3ab/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff803ff248>] evdev_connect+0xfc/0x121
[<ffffffff803fd73a>] input_register_device+0x1e8/0x26d
[<ffffffff80400ac5>] atkbd_connect+0x23d/0x26d
[<ffffffff803f8861>] serio_connect_driver+0x2c/0x41
[<ffffffff803f8890>] serio_driver_probe+0x1a/0x1d
[<ffffffff803b5e74>] driver_probe_device+0x5b/0xb1
[<ffffffff803b5fde>] __driver_attach+0x88/0xdb
[<ffffffff803b5826>] bus_for_each_dev+0x48/0x7a
[<ffffffff803b5d9e>] driver_attach+0x1b/0x1e
[<ffffffff803b543e>] bus_add_driver+0x88/0x138
[<ffffffff803b6289>] driver_register+0x8e/0x93
[<ffffffff803f942c>] serio_thread+0x14c/0x2a9
[<ffffffff80235436>] kthread+0xff/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12
}
... key at: [<ffffffff80919fb0>] __key.17572+0x0/0x8
... acquired at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267ba2>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff80258024>] skb_queue_tail+0x1e/0x49
[<ffffffff80259ac6>] netlink_broadcast+0x211/0x2e2
[<ffffffff804287ea>] wireless_send_event+0x2ff/0x317
[<ffffffff8824f731>] orinoco_send_wevents+0x4b/0x8b [orinoco]
[<ffffffff8024f99b>] run_workqueue+0xa7/0xfb
[<ffffffff8024c17f>] worker_thread+0xee/0x122
[<ffffffff80235436>] kthread+0xff/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12


the hard-irq-unsafe lock's dependencies:
-> (af_callback_keys + sk->sk_family){-.--} ops: 0 {
initial-use at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267947>] _read_lock+0x27/0x34
[<ffffffff802136b0>] sock_def_readable+0x18/0x6f
[<ffffffff80259ad6>] netlink_broadcast+0x221/0x2e2
[<ffffffff8025af3f>] kobject_uevent+0x3ab/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff803ff248>] evdev_connect+0xfc/0x121
[<ffffffff803fd73a>] input_register_device+0x1e8/0x26d
[<ffffffff80400ac5>] atkbd_connect+0x23d/0x26d
[<ffffffff803f8861>] serio_connect_driver+0x2c/0x41
[<ffffffff803f8890>] serio_driver_probe+0x1a/0x1d
[<ffffffff803b5e74>] driver_probe_device+0x5b/0xb1
[<ffffffff803b5fde>] __driver_attach+0x88/0xdb
[<ffffffff803b5826>] bus_for_each_dev+0x48/0x7a
[<ffffffff803b5d9e>] driver_attach+0x1b/0x1e
[<ffffffff803b543e>] bus_add_driver+0x88/0x138
[<ffffffff803b6289>] driver_register+0x8e/0x93
[<ffffffff803f942c>] serio_thread+0x14c/0x2a9
[<ffffffff80235436>] kthread+0xff/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12
hardirq-on-W at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267867>] _write_lock_bh+0x29/0x36
[<ffffffff80433960>] netlink_release+0x139/0x2ca
[<ffffffff80257903>] sock_release+0x19/0x9b
[<ffffffff80257b13>] sock_close+0x33/0x3a
[<ffffffff802130ee>] __fput+0xc6/0x1a8
[<ffffffff8022effe>] fput+0x13/0x16
[<ffffffff80225383>] filp_close+0x64/0x70
[<ffffffff8021eecc>] sys_close+0x93/0xb0
[<ffffffff8026048d>] system_call+0x7d/0x83
softirq-on-R at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267947>] _read_lock+0x27/0x34
[<ffffffff802136b0>] sock_def_readable+0x18/0x6f
[<ffffffff80259ad6>] netlink_broadcast+0x221/0x2e2
[<ffffffff8025af3f>] kobject_uevent+0x3ab/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff803ff248>] evdev_connect+0xfc/0x121
[<ffffffff803fd73a>] input_register_device+0x1e8/0x26d
[<ffffffff80400ac5>] atkbd_connect+0x23d/0x26d
[<ffffffff803f8861>] serio_connect_driver+0x2c/0x41
[<ffffffff803f8890>] serio_driver_probe+0x1a/0x1d
[<ffffffff803b5e74>] driver_probe_device+0x5b/0xb1
[<ffffffff803b5fde>] __driver_attach+0x88/0xdb
[<ffffffff803b5826>] bus_for_each_dev+0x48/0x7a
[<ffffffff803b5d9e>] driver_attach+0x1b/0x1e
[<ffffffff803b543e>] bus_add_driver+0x88/0x138
[<ffffffff803b6289>] driver_register+0x8e/0x93
[<ffffffff803f942c>] serio_thread+0x14c/0x2a9
[<ffffffff80235436>] kthread+0xff/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12
hardirq-on-R at:
[<ffffffff802a8e62>] lock_acquire+0x4a/0x69
[<ffffffff80267947>] _read_lock+0x27/0x34
[<ffffffff802136b0>] sock_def_readable+0x18/0x6f
[<ffffffff80259ad6>] netlink_broadcast+0x221/0x2e2
[<ffffffff8025af3f>] kobject_uevent+0x3ab/0x42e
[<ffffffff803b6ebf>] class_device_add+0x314/0x471
[<ffffffff803b7034>] class_device_register+0x18/0x1d
[<ffffffff803b7130>] class_device_create+0xf7/0x129
[<ffffffff803ff248>] evdev_connect+0xfc/0x121
[<ffffffff803fd73a>] input_register_device+0x1e8/0x26d
[<ffffffff80400ac5>] atkbd_connect+0x23d/0x26d
[<ffffffff803f8861>] serio_connect_driver+0x2c/0x41
[<ffffffff803f8890>] serio_driver_probe+0x1a/0x1d
[<ffffffff803b5e74>] driver_probe_device+0x5b/0xb1
[<ffffffff803b5fde>] __driver_attach+0x88/0xdb
[<ffffffff803b5826>] bus_for_each_dev+0x48/0x7a
[<ffffffff803b5d9e>] driver_attach+0x1b/0x1e
[<ffffffff803b543e>] bus_add_driver+0x88/0x138
[<ffffffff803b6289>] driver_register+0x8e/0x93
[<ffffffff803f942c>] serio_thread+0x14c/0x2a9
[<ffffffff80235436>] kthread+0xff/0x136
[<ffffffff802613dd>] child_rip+0x7/0x12
}
... key at: [<ffffffff8091a280>] af_callback_keys+0x80/0x100

stack backtrace:

Call Trace:
[<ffffffff8026e7fd>] show_trace+0xae/0x30e
[<ffffffff8026ea72>] dump_stack+0x15/0x17
[<ffffffff802a7dc1>] check_usage+0x27d/0x28e
[<ffffffff802a86e6>] __lock_acquire+0x878/0xa54
[<ffffffff802a8e63>] lock_acquire+0x4b/0x69
[<ffffffff80267948>] _read_lock+0x28/0x34
[<ffffffff802136b1>] sock_def_readable+0x19/0x6f
[<ffffffff80259ad7>] netlink_broadcast+0x222/0x2e2
[<ffffffff804287eb>] wireless_send_event+0x300/0x317
[<ffffffff8824f732>] :orinoco:orinoco_send_wevents+0x4c/0x8b
[<ffffffff8024f99c>] run_workqueue+0xa8/0xfb
[<ffffffff8024c180>] worker_thread+0xef/0x122
[<ffffffff80235437>] kthread+0x100/0x136
[<ffffffff802613de>] child_rip+0x8/0x12
DWARF2 unwinder stuck at child_rip+0x8/0x12
Leftover inexact backtrace:
[<ffffffff80267ab2>] _spin_unlock_irq+0x2b/0x31
[<ffffffff80260a1b>] restore_args+0x0/0x30
[<ffffffff80235337>] kthread+0x0/0x136
[<ffffffff802613d6>] child_rip+0x0/0x12

eth1: New link status: Connected (0001)
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

--
http://www.codemonkey.org.uk


2006-08-03 12:16:44

by Arjan van de Ven

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Wed, 2006-08-02 at 17:59 -0400, Dave Jones wrote:
> Wow. Nearly 400 lines of debug spew, from a simple 'ifup eth1'.
>
> Dave
>
>
> ADDRCONF(NETDEV_UP): eth1: link is not ready
> eth1: New link status: Disconnected (0002)
>
> ======================================================
> [ INFO: hard-safe -> hard-unsafe lock order detected ]
> ------------------------------------------------------
> events/0/5 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> (af_callback_keys + sk->sk_family){-.--}, at: [<ffffffff802136b1>] sock_def_readable+0x19/0x6f
>
> and this task is already holding:
> (&priv->lock){++..}, at: [<ffffffff8824f70e>] orinoco_send_wevents+0x28/0x8b [orinoco]
> which would create a new lock dependency:
> (&priv->lock){++..} -> (af_callback_keys + sk->sk_family){-.--}

> [<ffffffff80267948>] _read_lock+0x28/0x34
> [<ffffffff802136b1>] sock_def_readable+0x19/0x6f
> [<ffffffff80259ad7>] netlink_broadcast+0x222/0x2e2
> [<ffffffff804287eb>] wireless_send_event+0x300/0x317
> [<ffffffff8824f732>] :orinoco:orinoco_send_wevents+0x4c/0x8b
> [<ffffffff8024f99c>] run_workqueue+0xa8/0xfb
> [<ffffffff8024c180>] worker_thread+0xef/0x122
> [<ffffffff80235437>] kthread+0x100/0x136
> [<ffffffff802613de>] child_rip+0x8/0x12



this is another one of those nasty buggers;

Lock A = the sk->sk_callback_lock
Lock B = priv->lock in the driver

Lock A is only BH safe
Lock B is hardirq safe and used in the hardirq


Cpu 0                                        cpu 1
user closes the netlink socket
                                             takes lock B in orinoco_send_wevents
takes lock A in user context in netlink_release() (for write)
interrupt happens
takes lock B in hardirq handler (spins)

                                             calls netlink_broadcast
                                             which takes lock A for read (spins)

and you have a nice classical AB-BA deadlock
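
A minimal userspace sketch of the ordering problem described above, and not lockdep's actual implementation: it records which lock was taken while another was held and reports a direct inversion. The names `record_order`, `order` and `MAX_LOCKS` are hypothetical, and the interrupt on cpu 0 is modelled simply as "B taken while A is held".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8	/* hypothetical upper bound on tracked locks */

/* order[a][b] is true once lock b has been taken while lock a was held. */
static bool order[MAX_LOCKS][MAX_LOCKS];

/*
 * Record the dependency "held -> next".
 * Returns false if the opposite order (next -> held) was already seen,
 * i.e. the two code paths can deadlock AB-BA style. It also returns
 * false when held == next (taking the same lock twice).
 */
static bool record_order(int held, int next)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(next >= 0 && next < MAX_LOCKS);

	order[held][next] = true;
	return !order[next][held];
}
```

With A as 0 and B as 1, `record_order(1, 0)` (the workqueue path taking A under B) succeeds, and a later `record_order(0, 1)` (the interrupt arriving while A is held) returns false. Only direct two-lock inversions are caught here; longer chains would need a walk over the dependency graph.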

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com

2006-08-03 13:55:17

by Herbert Xu

Subject: Re: orinoco driver causes *lots* of lockdep spew

Arjan van de Ven <[email protected]> wrote:
>
> this is another one of those nasty buggers;

Good catch. It's really time that we fix this properly rather than
adding more kludges to the core code.

Dave, once this goes in you can revert the previous netlink workaround
that added the _bh suffix.

[WIRELESS]: Send wireless netlink events with a clean slate

Drivers expect to be able to call wireless_send_event in arbitrary
contexts. On the other hand, netlink really doesn't like being
invoked in an IRQ context. So we need to postpone the sending of
netlink skb's to a tasklet.

Signed-off-by: Herbert Xu <[email protected]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/wireless.c b/net/core/wireless.c
index d2bc72d..de0bde4 100644
--- a/net/core/wireless.c
+++ b/net/core/wireless.c
@@ -82,6 +82,7 @@ #include <linux/seq_file.h>
 #include <linux/init.h>		/* for __init */
 #include <linux/if_arp.h>	/* ARPHRD_ETHER */
 #include <linux/etherdevice.h>	/* compare_ether_addr */
+#include <linux/interrupt.h>

 #include <linux/wireless.h>	/* Pretty obvious */
 #include <net/iw_handler.h>	/* New driver API */
@@ -1842,6 +1843,18 @@ #endif /* CONFIG_NET_WIRELESS_RTNETLINK
  */

 #ifdef WE_EVENT_RTNETLINK
+static struct sk_buff_head wireless_nlevent_queue;
+
+static void wireless_nlevent_process(unsigned long data)
+{
+	struct sk_buff *skb;
+
+	while ((skb = skb_dequeue(&wireless_nlevent_queue)))
+		netlink_broadcast(rtnl, skb, 0, RTNLGRP_LINK, GFP_ATOMIC);
+}
+
+static DECLARE_TASKLET(wireless_nlevent_tasklet, wireless_nlevent_process, 0);
+
 /* ---------------------------------------------------------------- */
 /*
  * Fill a rtnetlink message with our event data.
@@ -1904,8 +1917,17 @@ static inline void rtmsg_iwinfo(struct n
 		return;
 	}
 	NETLINK_CB(skb).dst_group = RTNLGRP_LINK;
-	netlink_broadcast(rtnl, skb, 0, RTNLGRP_LINK, GFP_ATOMIC);
+	skb_queue_tail(&wireless_nlevent_queue, skb);
+	tasklet_schedule(&wireless_nlevent_tasklet);
+}
+
+static int __init wireless_nlevent_init(void)
+{
+	skb_queue_head_init(&wireless_nlevent_queue);
+	return 0;
 }
+
+subsys_initcall(wireless_nlevent_init);
 #endif /* WE_EVENT_RTNETLINK */

 /* ---------------------------------------------------------------- */

2006-08-03 14:12:05

by Christoph Hellwig

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> Arjan van de Ven <[email protected]> wrote:
> >
> > this is another one of those nasty buggers;
>
> Good catch. It's really time that we fix this properly rather than
> adding more kludges to the core code.
>
> Dave, once this goes in you can revert the previous netlink workaround
> that added the _bh suffix.
>
> [WIRELESS]: Send wireless netlink events with a clean slate

Could we please just get rid of the wireless extensions over netlink code
again? It doesn't help to solve anything and just creates a bigger mess
to untangle when switching to a fully fledged wireless stack.

2006-08-03 15:24:11

by Arjan van de Ven

Subject: Re: orinoco driver causes *lots* of lockdep spew

Herbert Xu wrote:

Hi,

> Arjan van de Ven <[email protected]> wrote:
>> this is another one of those nasty buggers;
>
> Good catch. It's really time that we fix this properly rather than
> adding more kludges to the core code.

however I'm not quite yet convinced that this patch is going to solve
this particular deadlock.
(I agree with the principle of it and I think it's really needed,
I just don't yet see how it's going to solve this specific deadlock. But
then again it's early and I've not had sufficient coffee yet so I could
well be wrong)

> [WIRELESS]: Send wireless netlink events with a clean slate
>
> Drivers expect to be able to call wireless_send_event in arbitrary
> contexts. On the other hand, netlink really doesn't like being
> invoked in an IRQ context. So we need to postpone the sending of
> netlink skb's to a tasklet.

it's not just about irq context, it's about being called with any lock that's
used in IRQ context; that is what makes this double nasty...
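
The rule can be sketched as a check on per-lock usage state. This is a much-simplified, hypothetical model (the struct, its fields and `unsafe_dependency` are made up for illustration and are not lockdep's data structures): a lock that has ever been taken from a hardirq handler must not be held while acquiring a lock that has ever been taken with hardirqs enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified usage state of one lock class. */
struct lock_class {
	const char *name;
	bool used_in_hardirq;	/* ever taken from a hardirq handler ("hard-irq-safe") */
	bool taken_hardirqs_on;	/* ever taken with hardirqs enabled ("hard-irq-unsafe") */
};

/*
 * Returns true if acquiring `next` while holding `held` would create a
 * hard-irq-safe -> hard-irq-unsafe dependency, regardless of the context
 * the caller itself is running in.
 */
static bool unsafe_dependency(const struct lock_class *held,
			      const struct lock_class *next)
{
	assert(held != NULL && next != NULL);
	return held->used_in_hardirq && next->taken_hardirqs_on;
}
```

In terms of the report, `priv->lock` would have `used_in_hardirq` set (from `orinoco_interrupt`) and the socket callback lock would have `taken_hardirqs_on` set (from `netlink_release`), so holding the former while taking the latter is flagged even though the caller is an ordinary workqueue thread.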

Greetings,
Arjan van de Ven

2006-08-03 18:55:50

by Jean Tourrilhes

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> Arjan van de Ven <[email protected]> wrote:
> >
> > this is another one of those nasty buggers;
>
> Good catch. It's really time that we fix this properly rather than
> adding more kludges to the core code.
>
> Dave, once this goes in you can revert the previous netlink workaround
> that added the _bh suffix.
>
> [WIRELESS]: Send wireless netlink events with a clean slate
>
> Drivers expect to be able to call wireless_send_event in arbitrary
> contexts. On the other hand, netlink really doesn't like being
> invoked in an IRQ context. So we need to postpone the sending of
> netlink skb's to a tasklet.

Yes, this was needed. I really like the way you implemented
it, simple and efficient. Go for it !

> Signed-off-by: Herbert Xu <[email protected]>

For what it's worth :
Signed-off-by: Jean Tourrilhes <[email protected]>

> Cheers,

Thanks !

Jean

2006-08-03 18:58:06

by Jean Tourrilhes

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 03:11:53PM +0100, Christoph Hellwig wrote:
> On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> > Arjan van de Ven <[email protected]> wrote:
> > >
> > > this is another one of those nasty buggers;
> >
> > Good catch. It's really time that we fix this properly rather than
> > adding more kludges to the core code.
> >
> > Dave, once this goes in you can revert the previous netlink workaround
> > that added the _bh suffix.
> >
> > [WIRELESS]: Send wireless netlink events with a clean slate
>
> Could we please just get rid of the wireless extensions over netlink code
> again? It doesn't help to solve anything and just creates a bigger mess
> to untangle when switching to a fully fledged wireless stack.

That's not going to happen any time soon, NetworkManager
depends on Wireless Events, as well as many other apps. And there is
not many mechanisms you can use in the kernel to generate events from
driver to userspace.
Have fun...

Jean

2006-08-03 19:02:54

by Dave Jones

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 11:58:00AM -0700, Jean Tourrilhes wrote:
> On Thu, Aug 03, 2006 at 03:11:53PM +0100, Christoph Hellwig wrote:
> > On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> > > Arjan van de Ven <[email protected]> wrote:
> > > >
> > > > this is another one of those nasty buggers;
> > >
> > > Good catch. It's really time that we fix this properly rather than
> > > adding more kludges to the core code.
> > >
> > > Dave, once this goes in you can revert the previous netlink workaround
> > > that added the _bh suffix.
> > >
> > > [WIRELESS]: Send wireless netlink events with a clean slate
> >
> > Could we please just get rid of the wireless extensions over netlink code
> > again? It doesn't help to solve anything and just creates a bigger mess
> > to untangle when switching to a fully fledged wireless stack.
>
> That's not going to happen any time soon, NetworkManager
> depends on Wireless Events, as well as many other apps. And there is
> not many mechanisms you can use in the kernel to generate events from
> driver to userspace.

It seemed to cope pretty well before we had this ?

Dave
--
http://www.codemonkey.org.uk

2006-08-03 19:02:59

by Dave Jones

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 03:11:53PM +0100, Christoph Hellwig wrote:

> Could we please just get rid of the wireless extensions over netlink code
> again? It doesn't help to solve anything and just creates a bigger mess
> to untangle when switching to a fully fledged wireless stack.

If we're going to do that, now is probably the best time to do it,
before any distro userland starts using it.

Dave

--
http://www.codemonkey.org.uk

2006-08-03 19:40:33

by Jean Tourrilhes

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 02:59:58PM -0400, Dave Jones wrote:
> On Thu, Aug 03, 2006 at 11:58:00AM -0700, Jean Tourrilhes wrote:
> > On Thu, Aug 03, 2006 at 03:11:53PM +0100, Christoph Hellwig wrote:
> > > On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> > > > Arjan van de Ven <[email protected]> wrote:
> > > > >
> > > > > this is another one of those nasty buggers;
> > > >
> > > > Good catch. It's really time that we fix this properly rather than
> > > > adding more kludges to the core code.
> > > >
> > > > Dave, once this goes in you can revert the previous netlink workaround
> > > > that added the _bh suffix.
> > > >
> > > > [WIRELESS]: Send wireless netlink events with a clean slate
> > >
> > > Could we please just get rid of the wireless extensions over netlink code
> > > again? It doesn't help to solve anything and just creates a bigger mess
> > > to untangle when switching to a fully fledged wireless stack.
> >
> > That's not going to happen any time soon, NetworkManager
> > depends on Wireless Events, as well as many other apps. And there is
> > not many mechanisms you can use in the kernel to generate events from
> > driver to userspace.
>
> It seemed to cope pretty well before we had this ?

Wireless Events were introduced in kernel 2.4.20 and 2.5.7,
which means 2002. NetworkManager and WPA Supplicant were based from
the very start on the availability of Wireless Events.
You are confusing different things...

> Dave

Have fun...

Jean

P.S. : By the way, don't ask me why it took four years for this bug to
get discovered...

2006-08-03 19:46:34

by Arjan van de Ven

Subject: Re: orinoco driver causes *lots* of lockdep spew

Jean Tourrilhes wrote:
> Jean
>
> P.S. : By the way, don't ask me why it took four years for this bug to
> get discovered...

that I could answer: Only from 2.6.18-rc1 onwards does the kernel have a built in deadlock finder :)

2006-08-03 20:01:06

by John W. Linville

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> Arjan van de Ven <[email protected]> wrote:
> >
> > this is another one of those nasty buggers;
>
> Good catch. It's really time that we fix this properly rather than
> adding more kludges to the core code.
>
> Dave, once this goes in you can revert the previous netlink workaround
> that added the _bh suffix.
>
> [WIRELESS]: Send wireless netlink events with a clean slate
>
> Drivers expect to be able to call wireless_send_event in arbitrary
> contexts. On the other hand, netlink really doesn't like being
> invoked in an IRQ context. So we need to postpone the sending of
> netlink skb's to a tasklet.
>
> Signed-off-by: Herbert Xu <[email protected]>

Does anyone have any objection to Herbert's patch? It seems
appropriate to me.

Arjan, did you convince yourself whether or not this patch actually
resolves the problem at hand? Applying it makes sense to me either
way, but it would be nice to believe it fixed a known issue. :-)

John
--
John W. Linville
[email protected]

2006-08-04 01:06:43

by Herbert Xu

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 03:53:13PM -0400, John W. Linville wrote:
>
> Does anyone have any objection to Herbert's patch? It seems
> appropriate to me.

I have no objections!

:)
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2006-08-04 01:06:25

by Herbert Xu

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, Aug 03, 2006 at 08:22:58AM -0700, Arjan van de Ven wrote:
>
> however I'm not quite yet convinced that this patch is going to solve
> this particular deadlock.
> (I agree with the principle of it and I think it's really needed,
> I just don't yet see how it's going to solve this specific deadlock. But
> then again it's early and I've not had sufficient coffee yet so I could
> well be wrong)

Well it solves the deadlock by breaking the chain that links the
netlink system with the jungle of wireless locking :)

The spin lock in sk_buff_head acts as a mediator. We only feed the
skb to the netlink system once that spin lock has been dropped.
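
A userspace sketch of that mediator pattern, under loud assumptions: a fixed-size ring stands in for the `sk_buff_head`, a plain flag stands in for the driver's `priv->lock`, and `queue_event`, `drain_events` and `deliver` are hypothetical names rather than kernel APIs. The point it illustrates is only the ordering: callers enqueue while holding whatever they like, and delivery happens later from a context that holds none of the driver's locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 16	/* hypothetical capacity of the event ring */

struct event_queue {
	int events[QUEUE_MAX];
	size_t head;	/* index of the next event to deliver */
	size_t tail;	/* index of the next free slot */
};

static bool driver_lock_held;	/* stands in for priv->lock */
static int delivered;		/* events handed to the "netlink" side */

/* Cheap enough to call from any context: only touches the queue itself. */
static bool queue_event(struct event_queue *q, int ev)
{
	if (q->tail - q->head == QUEUE_MAX)
		return false;	/* queue full, event dropped */
	q->events[q->tail % QUEUE_MAX] = ev;
	q->tail++;
	return true;
}

/* Stand-in for netlink_broadcast(): must never run under the driver lock. */
static void deliver(int ev)
{
	(void)ev;
	assert(!driver_lock_held);
	delivered++;
}

/* Stand-in for the tasklet body: pop each event first, then deliver it. */
static void drain_events(struct event_queue *q)
{
	while (q->head != q->tail) {
		int ev = q->events[q->head % QUEUE_MAX];

		q->head++;
		deliver(ev);
	}
}
```

A caller would set `driver_lock_held`, call `queue_event()` as often as needed, clear the flag, and only then run `drain_events()`. The real patch gets the same separation from `skb_queue_tail()` plus `tasklet_schedule()`.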

> it's not just about irq context, it's about being called with any lock that's
> used in IRQ context; that is what makes this double nasty...

Yes it is nasty. However, so far wireless seems to be the only offender.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2006-08-04 02:12:12

by Arjan van de Ven

Subject: Re: orinoco driver causes *lots* of lockdep spew

On Thu, 2006-08-03 at 15:53 -0400, John W. Linville wrote:
> On Thu, Aug 03, 2006 at 11:54:41PM +1000, Herbert Xu wrote:
> Arjan, did you convince yourself whether or not this patch actually
> resolves the problem at hand? Applying it makes sense to me either
> way, but it would be nice to believe it fixed a known issue. :-)

it'll fix a whole bunch of issues for sure, and this one as well afaics
(now with coffee ;-).. it probably won't fix all of them, but that's ok,
with this in place we actually CAN fix any others that pop up, right now
without this patch we probably can't.

> John
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com