Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753537AbYJQK65 (ORCPT ); Fri, 17 Oct 2008 06:58:57 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751783AbYJQK6q (ORCPT ); Fri, 17 Oct 2008 06:58:46 -0400 Received: from pasmtpa.tele.dk ([80.160.77.114]:45809 "EHLO pasmtpA.tele.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751208AbYJQK6o (ORCPT ); Fri, 17 Oct 2008 06:58:44 -0400 Date: Fri, 17 Oct 2008 12:57:55 +0200 From: Jens Axboe To: Kamalesh Babulal Cc: Stephen Rothwell , linux-next@vger.kernel.org, LKML , linux-scsi@vger.kernel.org Subject: Re: [BUG] linux-next: Tree for October 16 - softlockup detected at blk_invoke_request_fn Message-ID: <20081017105753.GG19428@kernel.dk> References: <20081016211811.a523c8f2.sfr@canb.auug.org.au> <20081016204212.GC7534@linux.vnet.ibm.com> <20081017065558.GD19428@kernel.dk> <20081017102456.GA8115@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20081017102456.GA8115@linux.vnet.ibm.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 10979 Lines: 190 On Fri, Oct 17 2008, Kamalesh Babulal wrote: > * Jens Axboe [2008-10-17 08:55:59]: > > > On Fri, Oct 17 2008, Kamalesh Babulal wrote: > > > Hi Stephen, > > > > > > with the next-20081016 kernel, softlockup is detected while creating ext2 > > > filesystem. > > > > > > BUG: soft lockup - CPU#3 stuck for 61s! [mkfs.ext2:3399] > > > Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_log dm_mod video output rfkill input_polldev sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy ide_cd_mod tg3 cdrom libphy serio_raw button i2c_i801 e752x_edac edac_core shpchp i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode] > > > Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_log dm_mod video output rfkill input_polldev sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy ide_cd_mod tg3 cdrom libphy serio_raw button i2c_i801 e752x_edac edac_core shpchp i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode] > > > Call Trace: > > > [] blk_invoke_request_fn+0x69/0x109 > > > [] scsi_run_queue+0x2b5/0x2ba [scsi_mod] > > > [] scsi_next_command+0x2d/0x39 [scsi_mod] > > > [] scsi_end_request+0x86/0x97 [scsi_mod] > > > [] scsi_io_completion+0x1b6/0x3cd [scsi_mod] > > > [] blk_done_softirq+0x5c/0x6b > > > [] __do_softirq+0x7a/0x13d > > > [] ack_apic_level+0x58/0xdc > > > [] call_softirq+0x1c/0x28 > > > [] do_softirq+0x2c/0x68 > > > [] do_IRQ+0x145/0x168 > > > [] ret_from_intr+0x0/0xa > > > [] audit_syscall_entry+0xb6/0x160 > > > [] auditsys+0x19/0x4a > > > .. > > > .. > > > > > > .. > > > INFO: task kjournald:578 blocked for more than 480 seconds. > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > ffff8801e50e1d30 0000000000000046 ffff8801e50e1cf8 ffffffff8031833d > > > ffff8801e5058c58 ffff8801e7360100 ffff8801e70f3470 ffff8801e7360458 > > > 00000001e596c500 00000000fffd66c8 ffffffffffffffff ffffffffffffffff > > > Call Trace: > > > [] elv_next_request+0x155/0x220 > > > [] sync_buffer+0x0/0x3f > > > [] io_schedule+0x5d/0x9f > > > [] sync_buffer+0x3b/0x3f > > > [] __wait_on_bit+0x40/0x6f > > > [] sync_buffer+0x0/0x3f > > > [] out_of_line_wait_on_bit+0x6c/0x78 > > > [] wake_bit_function+0x0/0x23 > > > [] journal_commit_transaction+0x7e4/0xc64 [jbd] > > > [] lock_timer_base+0x26/0x4c > > > [] kjournald+0xc1/0x1fb [jbd] > > > [] autoremove_wake_function+0x0/0x2e > > > [] kjournald+0x0/0x1fb [jbd] > > > [] kthread+0x47/0x73 > > > [] schedule_tail+0x27/0x6e > > > [] child_rip+0xa/0x11 > > > [] kthread+0x0/0x73 > > > [] child_rip+0x0/0x11 > > > INFO: task syslogd:2650 blocked for more than 480 seconds. > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > ffff8801e1587da8 0000000000000082 0000000000000000 0000000000000000 > > > 00000000003fbfc4 ffff8801e3926fe0 ffff8801e38c6f60 ffff8801e3927338 > > > 0000000100000000 ffff8801e59444c0 0000000000000000 0000000000000001 > > > Call Trace: > > > [] log_wait_commit+0x9f/0xed [jbd] > > > [] autoremove_wake_function+0x0/0x2e > > > [] journal_stop+0x165/0x18d [jbd] > > > [] __writeback_single_inode+0x17f/0x282 > > > [] do_readv_writev+0x176/0x18b > > > [] sync_inode+0x24/0x31 > > > [] ext3_sync_file+0x9e/0xb0 [ext3] > > > [] do_fsync+0x52/0xa4 > > > [] __do_fsync+0x23/0x36 > > > [] system_call_fastpath+0x16/0x1b > > > INFO: task lamb-payload:3400 blocked for more than 480 seconds. > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > ffff8801b186daa8 0000000000000082 ffff8801b186da70 0000000000000000 > > > 0000000000000000 ffff8801e3834ee0 ffffffff805ad360 ffff8801e3835238 > > > 0000000000000000 00000000fffd50dd ffffffffffffffff ffffffffffffffff > > > Call Trace: > > > [] do_get_write_access+0x1dc/0x3c3 [jbd] > > > [] wake_bit_function+0x0/0x23 > > > [] __getblk+0x25/0x211 > > > [] journal_get_write_access+0x22/0x33 [jbd] > > > [] __ext3_journal_get_write_access+0x1e/0x47 [ext3] > > > [] ext3_reserve_inode_write+0x3d/0x76 [ext3] > > > [] ext3_mark_inode_dirty+0x21/0x3c [ext3] > > > [] ext3_dirty_inode+0x63/0x7b [ext3] > > > [] __mark_inode_dirty+0x29/0x16b > > > [] touch_atime+0x109/0x112 > > > [] generic_file_aio_read+0x4b8/0x515 > > > [] do_sync_read+0xc9/0x10c > > > [] autoremove_wake_function+0x0/0x2e > > > [] __vma_link+0x42/0x4b > > > [] __up_write+0x27/0xf3 > > > [] vfs_read+0xaa/0x133 > > > [] kernel_read+0x38/0x4e > > > [] do_execve+0x121/0x256 > > > [] sys_execve+0x35/0x4c > > > [] stub_execve+0x6a/0xc0 > > > INFO: task crond:3055 blocked for more than 480 seconds. > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > ffff8801d8197d08 0000000000000086 ffff8801d8197cd0 ffff8801da800700 > > > ffff8801da810a40 ffff8801e68481c0 ffff8801e70f3470 ffff8801e6848518 > > > 00000001da8aeaf0 00000000fffdf3b6 ffffffffffffffff ffffffffffffffff > > > Call Trace: > > > [] do_get_write_access+0x1dc/0x3c3 [jbd] > > > [] wake_bit_function+0x0/0x23 > > > [] __getblk+0x25/0x211 > > > [] filldir+0x0/0xb1 > > > [] journal_get_write_access+0x22/0x33 [jbd] > > > [] __ext3_journal_get_write_access+0x1e/0x47 [ext3] > > > [] ext3_reserve_inode_write+0x3d/0x76 [ext3] > > > [] ext3_mark_inode_dirty+0x21/0x3c [ext3] > > > [] ext3_dirty_inode+0x63/0x7b [ext3] > > > [] __mark_inode_dirty+0x29/0x16b > > > [] filldir+0x0/0xb1 > > > [] touch_atime+0x109/0x112 > > > [] vfs_readdir+0x8a/0xa7 > > > [] sys_getdents+0x75/0xc4 > > > [] system_call_fastpath+0x16/0x1b > > > BUG: soft lockup - CPU#3 stuck for 61s! [mkfs.ext2:3399] > > > Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_log dm_mod video output rfkill input_polldev sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy ide_cd_mod tg3 cdrom libphy serio_raw button i2c_i801 e752x_edac edac_core shpchp i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode] > > > Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_log dm_mod video output rfkill input_polldev sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy ide_cd_mod tg3 cdrom libphy serio_raw button i2c_i801 e752x_edac edac_core shpchp i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode] > > > Call Trace: > > > [] blk_invoke_request_fn+0x69/0x109 > > > [] scsi_run_queue+0x2b5/0x2ba [scsi_mod] > > > [] scsi_next_command+0x2d/0x39 [scsi_mod] > > > [] scsi_end_request+0x86/0x97 [scsi_mod] > > > [] scsi_io_completion+0x1b6/0x3cd [scsi_mod] > > > [] blk_done_softirq+0x5c/0x6b > > > [] __do_softirq+0x7a/0x13d > > > [] ack_apic_level+0x58/0xdc > > > [] call_softirq+0x1c/0x28 > > > [] do_softirq+0x2c/0x68 > > > [] do_IRQ+0x145/0x168 > > > [] ret_from_intr+0x0/0xa > > > [] audit_syscall_entry+0xb6/0x160 > > > [] auditsys+0x19/0x4a > > > .. > > > .. > > > > > > > > > 0xffffffff80319770 is in blk_invoke_request_fn (include/linux/blkdev.h:456). > > > 451 #define QUEUE_FLAG_NONROT 14 /* non-rotational device (SSD) */ > > > 452 > > > 453 static inline int queue_is_locked(struct request_queue *q) > > > 454 { > > > 455 #ifdef CONFIG_SMP > > > 456 spinlock_t *lock = q->queue_lock; > > > 457 return lock && spin_is_locked(lock); > > > 458 #else > > > 459 return 1; > > > 460 #endif > > > > > > 0x4fe3 is in scsi_run_queue (drivers/scsi/scsi_lib.c:604). > > > 599 flagset = test_bit(QUEUE_FLAG_REENTER, &q->queue_flags) && > > > 600 !test_bit(QUEUE_FLAG_REENTER, > > > 601 &sdev->request_queue->queue_flags); > > > 602 if (flagset) > > > 603 queue_flag_set(QUEUE_FLAG_REENTER, sdev->request_queue); > > > 604 __blk_run_queue(sdev->request_queue); > > > 605 if (flagset) > > > 606 queue_flag_clear(QUEUE_FLAG_REENTER, sdev->request_queue); > > > 607 spin_unlock(sdev->request_queue->queue_lock); > > > 608 > > > > Huh, this makes very little sense to me... Can I talk you into trying > > the for-linus branch of the block tree? > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > just to see if it reproduces with that. I'm assuming 2.6-git is fine? > > > > -- > > Jens Axboe > > > > I applied the for-linus branch on top of 2.6.27-git7 kernel and I was not able to > reproduce with it. OK, I'm guessing it's a merge mess in linux-next. I tried to look at the diff, but it was pretty much impossible to read and even included lots of things in -git already. -- Jens Axboe -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/