Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757716Ab0HQOfZ (ORCPT ); Tue, 17 Aug 2010 10:35:25 -0400 Received: from mail-pv0-f174.google.com ([74.125.83.174]:48372 "EHLO mail-pv0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758419Ab0HQOfT (ORCPT ); Tue, 17 Aug 2010 10:35:19 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=g1Ry+W+NnrAGJHo/1vn8OhbX90i9CSlMuVYVMWHZiACUqk3PxdtgTEKTtXKCm8bnpg m1rnHx7ClJtEQ4C3+xLd+HbWXHzJqnlVtA6/LnoDXICZU3CyDMa/sgmfy0cnmRMXUEP0 rL4DM5MW2eM8EiviX5tFVmfV5ac9qA8hf8PIQ= Date: Tue, 17 Aug 2010 22:35:09 +0800 From: Yong Zhang To: Stefan Richter Cc: linux1394-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org, Thomas Gleixner Subject: Re: lockdep false positive? -- firewire-core transaction timer vs. scsi-core host lock Message-ID: <20100817143509.GC2838@zhy> Reply-To: Yong Zhang References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 12731 Lines: 256 On Mon, Aug 16, 2010 at 08:42:34PM +0200, Stefan Richter wrote: > Hi, I just saw the following (twice, with a reboot in between) with > kernel 2.6.35 plus firewire driver patched to their state in 2.6.36-rc1, > i.e. basically with a kernel according to commit e78483c5. > > I suspect it might also happen without firewire patches, as 2.6.35 > already contains the change where firewire-core's per-transaction timer > was introduced which shows up in the following log. Cc'ing Thomas. I suspect it's introduced by commit 5c40cbfefa828208c671e2f58789e4dd04f79563 which call del_timer_sync() in softirq. comments on del_timer_sync() say "It must not be called from interrupt contexts." Thanks, Yong > > It happened while I was testing hotplugging of FireWire disks with > several ones on a bus together with several FireWire 800 repeaters, > which sometimes act up so that reconnect does not succeed immediately. > Perhaps such reconnection troubles are necessary to provoke this lockdep > report. > > [I only accidentally discovered this on a test PC which do not use very > frequently. Unfortunately, I had lockdep disabled on my main PC on > which I test most of my FireWire hardware for quite a while now. I > guess I should enable lockdep there again, but I have a suspicion that I > won't leave it on for too long.] > > Note how the IN-HARDIRQ-W talks about a SCSI host lock of a libata host, > whereas the locking in the firewire stack surely does only interact > (hopefully correctly) with SCSI host locks of firewire-sbp2 host > instances. > > What does this all mean? How to act on it? > > ========================================================= > [ INFO: possible irq lock inversion dependency detected ] > 2.6.35 #1 > --------------------------------------------------------- > swapper/0 just changed the state of lock: > (&t->split_timeout_timer){+.-...}, at: [] run_timer_softirq+0x112/0x21c > but this lock was taken by another, HARDIRQ-safe lock in the past: > (&(shost->host_lock)->rlock){-.-...} > > and interrupts could create inverse lock ordering between them. > > > other info that might help us debug this: > no locks held by swapper/0. > > the shortest dependencies between 2nd lock and 1st lock: > -> (&(shost->host_lock)->rlock){-.-...} ops: 123308 { > IN-HARDIRQ-W at: > [] __lock_acquire+0x612/0x153e > [] lock_acquire+0x5c/0x73 > [] _raw_spin_lock_irqsave+0x2e/0x3e > [] scsi_eh_scmd_add+0x25/0x8a > [] scsi_times_out+0x4a/0x64 > [] blk_rq_timed_out+0xf/0x4b > [] blk_abort_request+0x3e/0x41 > [] ata_qc_schedule_eh+0x55/0x67 > [] ata_qc_complete+0x59/0x1b6 > [] ata_hsm_qc_complete+0xb5/0xcf > [] ata_sff_hsm_move+0x6ad/0x6fa > [] __ata_sff_port_intr+0x9f/0xad > [] ata_bmdma_port_intr+0xb9/0x127 > [] ata_bmdma_interrupt+0x8c/0x170 > [] handle_IRQ_event+0x1d/0xa2 > [] handle_fasteoi_irq+0x79/0xb2 > [] handle_irq+0x3b/0x48 > [] do_IRQ+0x45/0x9f > [] common_interrupt+0x2e/0x34 > [] cpuidle_idle_call+0x6d/0xa1 > [] cpu_idle+0x49/0x77 > [] rest_init+0xbf/0xc4 > [] start_kernel+0x306/0x30b > [] i386_start_kernel+0xb7/0xbf > IN-SOFTIRQ-W at: > [] __lock_acquire+0x633/0x153e > [] lock_acquire+0x5c/0x73 > [] _raw_spin_lock_irqsave+0x2e/0x3e > [] scsi_device_unbusy+0x1e/0x8d > [] scsi_finish_command+0x22/0xdf > [] scsi_softirq_done+0xf4/0xfc > [] blk_done_softirq+0x5c/0x69 > [] __do_softirq+0x8b/0x10a > [] do_softirq+0x2b/0x43 > [] run_ksoftirqd+0x74/0x15a > [] kthread+0x61/0x66 > [] kernel_thread_helper+0x6/0x1a > INITIAL USE at: > [] __lock_acquire+0x70d/0x153e > [] lock_acquire+0x5c/0x73 > [] _raw_spin_lock_irqsave+0x2e/0x3e > [] scsi_schedule_eh+0xf/0x4f > [] ata_port_schedule_eh+0x37/0x3a > [] async_port_probe+0x65/0x9c > [] async_thread+0xe7/0x1c8 > [] kthread+0x61/0x66 > [] kernel_thread_helper+0x6/0x1a > } > ... key at: [] __key.30680+0x0/0x8 > ... acquired at: > [] lock_acquire+0x5c/0x73 > [] del_timer_sync+0x29/0x6c > [] close_transaction+0x93/0xb4 [firewire_core] > [] transmit_complete_callback+0x3f/0x43 [firewire_core] > [] at_context_transmit+0x59b/0x5a3 [firewire_ohci] > [] ohci_send_request+0xd/0xf [firewire_ohci] > [] fw_send_request+0x21c/0x224 [firewire_core] > [] 0xf8dab307 > [] 0xf8dac075 > [] scsi_dispatch_cmd+0x182/0x1f2 > [] scsi_request_fn+0x359/0x492 > [] __generic_unplug_device+0x26/0x29 > [] blk_execute_rq_nowait+0x56/0x75 > [] blk_execute_rq+0xa8/0xc7 > [] scsi_execute+0xc0/0x105 > [] scsi_execute_req+0x55/0x7d > [] sd_start_stop_device+0x7b/0x117 > [] sd_shutdown+0x108/0x117 > [] sd_remove+0x42/0x6f > [] __device_release_driver+0x5e/0x96 > [] device_release_driver+0x1a/0x25 > [] bus_remove_device+0x7b/0x92 > [] device_del+0xfa/0x154 > [] __scsi_remove_device+0x3d/0x7b > [] scsi_remove_device+0x1c/0x27 > [] 0xf8dab83e > [] kref_put+0x39/0x44 > [] 0xf8dab0ff > [] __device_release_driver+0x5e/0x96 > [] device_release_driver+0x1a/0x25 > [] bus_remove_device+0x7b/0x92 > [] device_del+0xfa/0x154 > [] device_unregister+0xb/0x15 > [] shutdown_unit+0x8/0xc [firewire_core] > [] device_for_each_child+0x2a/0x54 > [] fw_device_shutdown+0x72/0xa4 [firewire_core] > [] worker_thread+0x16a/0x242 > [] kthread+0x61/0x66 > [] kernel_thread_helper+0x6/0x1a > > -> (&t->split_timeout_timer){+.-...} ops: 14281 { > HARDIRQ-ON-W at: > [] __lock_acquire+0x692/0x153e > [] lock_acquire+0x5c/0x73 > [] run_timer_softirq+0x17b/0x21c > [] __do_softirq+0x8b/0x10a > [] do_softirq+0x2b/0x43 > [] irq_exit+0x38/0x74 > [] do_IRQ+0x89/0x9f > [] common_interrupt+0x2e/0x34 > [] cpuidle_idle_call+0x6d/0xa1 > [] cpu_idle+0x49/0x77 > [] rest_init+0xbf/0xc4 > [] start_kernel+0x306/0x30b > [] i386_start_kernel+0xb7/0xbf > IN-SOFTIRQ-W at: > [] __lock_acquire+0x633/0x153e > [] lock_acquire+0x5c/0x73 > [] del_timer_sync+0x29/0x6c > [] fw_core_handle_response+0x108/0x139 [firewire_core] > [] handle_ar_packet+0x108/0x114 [firewire_ohci] > [] ar_context_tasklet+0x111/0x11f [firewire_ohci] > [] tasklet_action+0x8d/0xe0 > [] __do_softirq+0x8b/0x10a > [] do_softirq+0x2b/0x43 > [] irq_exit+0x38/0x74 > [] do_IRQ+0x89/0x9f > [] common_interrupt+0x2e/0x34 > [] cpuidle_idle_call+0x6d/0xa1 > [] cpu_idle+0x49/0x77 > [] rest_init+0xbf/0xc4 > [] start_kernel+0x306/0x30b > [] i386_start_kernel+0xb7/0xbf > INITIAL USE at: > [] __lock_acquire+0x70d/0x153e > [] lock_acquire+0x5c/0x73 > [] del_timer_sync+0x29/0x6c > [] fw_core_handle_response+0x108/0x139 [firewire_core] > [] at_context_transmit+0x27e/0x5a3 [firewire_ohci] > [] ohci_send_request+0xd/0xf [firewire_ohci] > [] fw_send_request+0x21c/0x224 [firewire_core] > [] fw_run_transaction+0x85/0xc5 [firewire_core] > [] bm_work+0x16c/0x34b [firewire_core] > [] worker_thread+0x16a/0x242 > [] kthread+0x61/0x66 > [] kernel_thread_helper+0x6/0x1a > } > ... key at: [] __key.22471+0x0/0xffffe611 [firewire_core] > ... acquired at: > [] check_usage_backwards+0x94/0x9f > [] mark_lock+0x2c0/0x4cd > [] __lock_acquire+0x692/0x153e > [] lock_acquire+0x5c/0x73 > [] run_timer_softirq+0x17b/0x21c > [] __do_softirq+0x8b/0x10a > [] do_softirq+0x2b/0x43 > [] irq_exit+0x38/0x74 > [] do_IRQ+0x89/0x9f > [] common_interrupt+0x2e/0x34 > [] cpuidle_idle_call+0x6d/0xa1 > [] cpu_idle+0x49/0x77 > [] rest_init+0xbf/0xc4 > [] start_kernel+0x306/0x30b > [] i386_start_kernel+0xb7/0xbf > > > stack backtrace: > Pid: 0, comm: swapper Not tainted 2.6.35 #1 > Call Trace: > [] ? printk+0xf/0x16 > [] print_irq_inversion_bug+0xeb/0xf6 > [] check_usage_backwards+0x94/0x9f > [] mark_lock+0x2c0/0x4cd > [] ? check_usage_backwards+0x0/0x9f > [] __lock_acquire+0x692/0x153e > [] ? _raw_spin_unlock_irqrestore+0x54/0x58 > [] ? debug_object_deactivate+0xa1/0xcc > [] lock_acquire+0x5c/0x73 > [] ? run_timer_softirq+0x112/0x21c > [] run_timer_softirq+0x17b/0x21c > [] ? run_timer_softirq+0x112/0x21c > [] ? split_transaction_timeout_callback+0x0/0x8b [firewire_core] > [] __do_softirq+0x8b/0x10a > [] do_softirq+0x2b/0x43 > [] irq_exit+0x38/0x74 > [] do_IRQ+0x89/0x9f > [] common_interrupt+0x2e/0x34 > [] ? acpi_idle_enter_simple+0x122/0x14e > [] cpuidle_idle_call+0x6d/0xa1 > [] cpu_idle+0x49/0x77 > [] rest_init+0xbf/0xc4 > [] start_kernel+0x306/0x30b > [] i386_start_kernel+0xb7/0xbf > > > -- > Stefan Richter > -=====-==-=- =--- =---- > http://arcgraph.de/sr/ > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/