Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755671Ab0HPSms (ORCPT ); Mon, 16 Aug 2010 14:42:48 -0400 Received: from einhorn.in-berlin.de ([192.109.42.8]:53235 "EHLO einhorn.in-berlin.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752586Ab0HPSmq (ORCPT ); Mon, 16 Aug 2010 14:42:46 -0400 X-Envelope-From: stefanr@s5r6.in-berlin.de Date: Mon, 16 Aug 2010 20:42:34 +0200 (CEST) From: Stefan Richter Subject: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock To: linux1394-devel@lists.sourceforge.net cc: linux-kernel@vger.kernel.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; CHARSET=us-ascii Content-Disposition: INLINE Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11711 Lines: 240 Hi, I just saw the following (twice, with a reboot in between) with kernel 2.6.35 plus firewire driver patched to their state in 2.6.36-rc1, i.e. basically with a kernel according to commit e78483c5. I suspect it might also happen without firewire patches, as 2.6.35 already contains the change where firewire-core's per-transaction timer was introduced which shows up in the following log. It happened while I was testing hotplugging of FireWire disks with several ones on a bus together with several FireWire 800 repeaters, which sometimes act up so that reconnect does not succeed immediately. Perhaps such reconnection troubles are necessary to provoke this lockdep report. [I only accidentally discovered this on a test PC which do not use very frequently. Unfortunately, I had lockdep disabled on my main PC on which I test most of my FireWire hardware for quite a while now. I guess I should enable lockdep there again, but I have a suspicion that I won't leave it on for too long.] Note how the IN-HARDIRQ-W talks about a SCSI host lock of a libata host, whereas the locking in the firewire stack surely does only interact (hopefully correctly) with SCSI host locks of firewire-sbp2 host instances. What does this all mean? How to act on it? ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.35 #1 --------------------------------------------------------- swapper/0 just changed the state of lock: (&t->split_timeout_timer){+.-...}, at: [] run_timer_softirq+0x112/0x21c but this lock was taken by another, HARDIRQ-safe lock in the past: (&(shost->host_lock)->rlock){-.-...} and interrupts could create inverse lock ordering between them. other info that might help us debug this: no locks held by swapper/0. the shortest dependencies between 2nd lock and 1st lock: -> (&(shost->host_lock)->rlock){-.-...} ops: 123308 { IN-HARDIRQ-W at: [] __lock_acquire+0x612/0x153e [] lock_acquire+0x5c/0x73 [] _raw_spin_lock_irqsave+0x2e/0x3e [] scsi_eh_scmd_add+0x25/0x8a [] scsi_times_out+0x4a/0x64 [] blk_rq_timed_out+0xf/0x4b [] blk_abort_request+0x3e/0x41 [] ata_qc_schedule_eh+0x55/0x67 [] ata_qc_complete+0x59/0x1b6 [] ata_hsm_qc_complete+0xb5/0xcf [] ata_sff_hsm_move+0x6ad/0x6fa [] __ata_sff_port_intr+0x9f/0xad [] ata_bmdma_port_intr+0xb9/0x127 [] ata_bmdma_interrupt+0x8c/0x170 [] handle_IRQ_event+0x1d/0xa2 [] handle_fasteoi_irq+0x79/0xb2 [] handle_irq+0x3b/0x48 [] do_IRQ+0x45/0x9f [] common_interrupt+0x2e/0x34 [] cpuidle_idle_call+0x6d/0xa1 [] cpu_idle+0x49/0x77 [] rest_init+0xbf/0xc4 [] start_kernel+0x306/0x30b [] i386_start_kernel+0xb7/0xbf IN-SOFTIRQ-W at: [] __lock_acquire+0x633/0x153e [] lock_acquire+0x5c/0x73 [] _raw_spin_lock_irqsave+0x2e/0x3e [] scsi_device_unbusy+0x1e/0x8d [] scsi_finish_command+0x22/0xdf [] scsi_softirq_done+0xf4/0xfc [] blk_done_softirq+0x5c/0x69 [] __do_softirq+0x8b/0x10a [] do_softirq+0x2b/0x43 [] run_ksoftirqd+0x74/0x15a [] kthread+0x61/0x66 [] kernel_thread_helper+0x6/0x1a INITIAL USE at: [] __lock_acquire+0x70d/0x153e [] lock_acquire+0x5c/0x73 [] _raw_spin_lock_irqsave+0x2e/0x3e [] scsi_schedule_eh+0xf/0x4f [] ata_port_schedule_eh+0x37/0x3a [] async_port_probe+0x65/0x9c [] async_thread+0xe7/0x1c8 [] kthread+0x61/0x66 [] kernel_thread_helper+0x6/0x1a } ... key at: [] __key.30680+0x0/0x8 ... acquired at: [] lock_acquire+0x5c/0x73 [] del_timer_sync+0x29/0x6c [] close_transaction+0x93/0xb4 [firewire_core] [] transmit_complete_callback+0x3f/0x43 [firewire_core] [] at_context_transmit+0x59b/0x5a3 [firewire_ohci] [] ohci_send_request+0xd/0xf [firewire_ohci] [] fw_send_request+0x21c/0x224 [firewire_core] [] 0xf8dab307 [] 0xf8dac075 [] scsi_dispatch_cmd+0x182/0x1f2 [] scsi_request_fn+0x359/0x492 [] __generic_unplug_device+0x26/0x29 [] blk_execute_rq_nowait+0x56/0x75 [] blk_execute_rq+0xa8/0xc7 [] scsi_execute+0xc0/0x105 [] scsi_execute_req+0x55/0x7d [] sd_start_stop_device+0x7b/0x117 [] sd_shutdown+0x108/0x117 [] sd_remove+0x42/0x6f [] __device_release_driver+0x5e/0x96 [] device_release_driver+0x1a/0x25 [] bus_remove_device+0x7b/0x92 [] device_del+0xfa/0x154 [] __scsi_remove_device+0x3d/0x7b [] scsi_remove_device+0x1c/0x27 [] 0xf8dab83e [] kref_put+0x39/0x44 [] 0xf8dab0ff [] __device_release_driver+0x5e/0x96 [] device_release_driver+0x1a/0x25 [] bus_remove_device+0x7b/0x92 [] device_del+0xfa/0x154 [] device_unregister+0xb/0x15 [] shutdown_unit+0x8/0xc [firewire_core] [] device_for_each_child+0x2a/0x54 [] fw_device_shutdown+0x72/0xa4 [firewire_core] [] worker_thread+0x16a/0x242 [] kthread+0x61/0x66 [] kernel_thread_helper+0x6/0x1a -> (&t->split_timeout_timer){+.-...} ops: 14281 { HARDIRQ-ON-W at: [] __lock_acquire+0x692/0x153e [] lock_acquire+0x5c/0x73 [] run_timer_softirq+0x17b/0x21c [] __do_softirq+0x8b/0x10a [] do_softirq+0x2b/0x43 [] irq_exit+0x38/0x74 [] do_IRQ+0x89/0x9f [] common_interrupt+0x2e/0x34 [] cpuidle_idle_call+0x6d/0xa1 [] cpu_idle+0x49/0x77 [] rest_init+0xbf/0xc4 [] start_kernel+0x306/0x30b [] i386_start_kernel+0xb7/0xbf IN-SOFTIRQ-W at: [] __lock_acquire+0x633/0x153e [] lock_acquire+0x5c/0x73 [] del_timer_sync+0x29/0x6c [] fw_core_handle_response+0x108/0x139 [firewire_core] [] handle_ar_packet+0x108/0x114 [firewire_ohci] [] ar_context_tasklet+0x111/0x11f [firewire_ohci] [] tasklet_action+0x8d/0xe0 [] __do_softirq+0x8b/0x10a [] do_softirq+0x2b/0x43 [] irq_exit+0x38/0x74 [] do_IRQ+0x89/0x9f [] common_interrupt+0x2e/0x34 [] cpuidle_idle_call+0x6d/0xa1 [] cpu_idle+0x49/0x77 [] rest_init+0xbf/0xc4 [] start_kernel+0x306/0x30b [] i386_start_kernel+0xb7/0xbf INITIAL USE at: [] __lock_acquire+0x70d/0x153e [] lock_acquire+0x5c/0x73 [] del_timer_sync+0x29/0x6c [] fw_core_handle_response+0x108/0x139 [firewire_core] [] at_context_transmit+0x27e/0x5a3 [firewire_ohci] [] ohci_send_request+0xd/0xf [firewire_ohci] [] fw_send_request+0x21c/0x224 [firewire_core] [] fw_run_transaction+0x85/0xc5 [firewire_core] [] bm_work+0x16c/0x34b [firewire_core] [] worker_thread+0x16a/0x242 [] kthread+0x61/0x66 [] kernel_thread_helper+0x6/0x1a } ... key at: [] __key.22471+0x0/0xffffe611 [firewire_core] ... acquired at: [] check_usage_backwards+0x94/0x9f [] mark_lock+0x2c0/0x4cd [] __lock_acquire+0x692/0x153e [] lock_acquire+0x5c/0x73 [] run_timer_softirq+0x17b/0x21c [] __do_softirq+0x8b/0x10a [] do_softirq+0x2b/0x43 [] irq_exit+0x38/0x74 [] do_IRQ+0x89/0x9f [] common_interrupt+0x2e/0x34 [] cpuidle_idle_call+0x6d/0xa1 [] cpu_idle+0x49/0x77 [] rest_init+0xbf/0xc4 [] start_kernel+0x306/0x30b [] i386_start_kernel+0xb7/0xbf stack backtrace: Pid: 0, comm: swapper Not tainted 2.6.35 #1 Call Trace: [] ? printk+0xf/0x16 [] print_irq_inversion_bug+0xeb/0xf6 [] check_usage_backwards+0x94/0x9f [] mark_lock+0x2c0/0x4cd [] ? check_usage_backwards+0x0/0x9f [] __lock_acquire+0x692/0x153e [] ? _raw_spin_unlock_irqrestore+0x54/0x58 [] ? debug_object_deactivate+0xa1/0xcc [] lock_acquire+0x5c/0x73 [] ? run_timer_softirq+0x112/0x21c [] run_timer_softirq+0x17b/0x21c [] ? run_timer_softirq+0x112/0x21c [] ? split_transaction_timeout_callback+0x0/0x8b [firewire_core] [] __do_softirq+0x8b/0x10a [] do_softirq+0x2b/0x43 [] irq_exit+0x38/0x74 [] do_IRQ+0x89/0x9f [] common_interrupt+0x2e/0x34 [] ? acpi_idle_enter_simple+0x122/0x14e [] cpuidle_idle_call+0x6d/0xa1 [] cpu_idle+0x49/0x77 [] rest_init+0xbf/0xc4 [] start_kernel+0x306/0x30b [] i386_start_kernel+0xb7/0xbf -- Stefan Richter -=====-==-=- =--- =---- http://arcgraph.de/sr/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/