Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755211AbZJRURl (ORCPT ); Sun, 18 Oct 2009 16:17:41 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754276AbZJRURk (ORCPT ); Sun, 18 Oct 2009 16:17:40 -0400
Received: from lucidpixels.com ([75.144.35.66]:59541 "EHLO lucidpixels.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751077AbZJRURi (ORCPT ); Sun, 18 Oct 2009 16:17:38 -0400
Date: Sun, 18 Oct 2009 16:17:42 -0400 (EDT)
From: Justin Piszcz
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com
cc: Alan Piszcz
Subject: Re: 2.6.31+2.6.31.4: XFS - All I/O locks up to D-state after 24-48 hours (sysrq-t+w available)
In-Reply-To:
Message-ID:
References:
User-Agent: Alpine 2.00 (DEB 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 16247
Lines: 307

On Sat, 17 Oct 2009, Justin Piszcz wrote:

> Hello,

It has happened again, all sysrq-X output was saved this time.

wget http://home.comcast.net/~jpiszcz/20091018/crash.txt
wget http://home.comcast.net/~jpiszcz/20091018/dmesg.txt
wget http://home.comcast.net/~jpiszcz/20091018/interrupts.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-l.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-m.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-p.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-q.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-t.txt
wget http://home.comcast.net/~jpiszcz/20091018/sysrq-w.txt

Kernel configuration:

wget http://home.comcast.net/~jpiszcz/20091018/config-2.6.30.9.txt
wget http://home.comcast.net/~jpiszcz/20091018/config-2.6.31.4.txt

Diff of the two configs:

$ diff config-2.6.30.9.txt config-2.6.31.4.txt |grep -v "#"|grep "_"
> CONFIG_OUTPUT_FORMAT="elf64-x86-64"
> CONFIG_CONSTRUCTORS=y
> CONFIG_HAVE_PERF_COUNTERS=y
> CONFIG_HAVE_DMA_ATTRS=y
> CONFIG_BLK_DEV_BSG=y
> CONFIG_X86_NEW_MCE=y
> CONFIG_X86_THERMAL_VECTOR=y
< CONFIG_UNEVICTABLE_LRU=y
< CONFIG_PHYSICAL_START=0x200000
> CONFIG_PHYSICAL_START=0x1000000
< CONFIG_PHYSICAL_ALIGN=0x200000
> CONFIG_PHYSICAL_ALIGN=0x1000000
< CONFIG_COMPAT_NET_DEV_OPS=y
< CONFIG_SND_JACK=y
> CONFIG_HID_DRAGONRISE=y
> CONFIG_HID_GREENASIA=y
> CONFIG_HID_SMARTJOYPLUS=y
> CONFIG_HID_THRUSTMASTER=y
> CONFIG_HID_ZEROPLUS=y
> CONFIG_FSNOTIFY=y
> CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
> CONFIG_HAVE_ARCH_KMEMCHECK=y

I have reverted back to 2.6.30.9 to see if the problem recurs with this
kernel version.

I do not recall seeing this on the older 2.6.30.x kernels:

[ 9.276427] md3: detected capacity change from 0 to 5251073572864
[ 9.277411] md2: detected capacity change from 0 to 132706598912
[ 9.278305] md1: detected capacity change from 0 to 139722752
[ 9.278921] md0: detected capacity change from 0 to 17190682624

Again, some more D-state processes:

[76325.608073] pdflush D 0000000000000001 0 362 2 0x00000000
[76325.608087] Call Trace:
[76325.608095] [] ? xfs_trans_brelse+0x30/0x130
[76325.608099] [] ? xlog_state_sync+0x26c/0x2a0
[76325.608103] [] ? default_wake_function+0x0/0x10
[76325.608106] [] ? _xfs_log_force+0x51/0x80
[76325.608108] [] ? xfs_log_force+0xb/0x40

[76325.608202] xfssyncd D 0000000000000000 0 831 2 0x00000000
[76325.608214] Call Trace:
[76325.608216] [] ? xlog_state_sync+0x49/0x2a0
[76325.608220] [] ? __xfs_iunpin_wait+0x95/0xe0
[76325.608222] [] ? autoremove_wake_function+0x0/0x30
[76325.608225] [] ? xfs_iflush+0xdd/0x2f0
[76325.608228] [] ? xfs_reclaim_inode+0x148/0x190
[76325.608231] [] ? xfs_reclaim_inode_now+0x0/0xa0
[76325.608233] [] ? xfs_inode_ag_walk+0x6c/0xc0
[76325.608236] [] ? xfs_reclaim_inode_now+0x0/0xa0

All of the D-state processes:

$ cat sysrq-w.txt |grep ' D'
[76307.285125] alpine D 0000000000000000 0 7659 29120 0x00000000
[76325.608073] pdflush D 0000000000000001 0 362 2 0x00000000
[76325.608202] xfssyncd D 0000000000000000 0 831 2 0x00000000
[76325.608257] syslogd D 0000000000000002 0 2438 1 0x00000000
[76325.608318] freshclam D 0000000000000000 0 2877 1 0x00000000
[76325.608428] asterisk D 0000000000000001 0 3278 1 0x00000000
[76325.608492] console-kit-d D 0000000000000000 0 3299 1 0x00000000
[76325.608562] dhcpd3 D 0000000000000000 0 3554 1 0x00000000
[76325.608621] plasma-deskto D 0000000000000002 0 32482 1 0x00000000
[76325.608713] kaccess D 0000000000000001 0 32488 1 0x00000000
[76325.608752] mail D 0000000000000000 0 7397 7386 0x00000000
[76325.608830] hal-acl-tool D 0000000000000000 0 7430 3399 0x00000004
[76325.608888] mrtg D 0000000000000000 0 7444 7433 0x00000000
[76325.608981] cron D 0000000000000000 0 7500 3630 0x00000000
[76325.609000] alpine D 0000000000000000 0 7659 29120 0x00000000

List of functions underneath the D-state processes (sorted/uniqued)--

121 [] ? autoremove_wake_function+0x0/0x30
77 [] ? system_call_fastpath+0x16/0x1b
62 [] ? schedule_timeout+0x165/0x1a0
60 [] ? __alloc_skb+0x66/0x170
60 [] ? sys_sendto+0x119/0x180
59 [] ? unix_dgram_sendmsg+0x467/0x5c0
59 [] ? unix_wait_for_peer+0x86/0xd0
59 [] ? memcpy_fromiovec+0x57/0x80
59 [] ? sock_alloc_send_pskb+0x1d9/0x2f0
59 [] ? sock_sendmsg+0xcb/0x100
59 [] ? sockfd_lookup_light+0x22/0x80
58 [] ? unix_dgram_connect+0xad/0x270
58 [] ? sys_connect+0x86/0xe0
57 [] ? unix_find_other+0x1a5/0x200
57 [] ? mntput_no_expire+0x23/0xf0
57 [] ? page_add_new_anon_rmap+0x54/0x90
57 [] ? current_fs_time+0x1e/0x30
55 [] ? filemap_fault+0x95/0x3e0
8 [] ? default_wake_function+0x0/0x10
7 [] ? xfs_trans_reserve+0xa8/0x220
7 [] ? do_sys_open+0x97/0x150
6 [] ? _xfs_log_force+0x51/0x80
5 [] ? xlog_grant_push_ail+0x30/0xf0
4 [] ? xfs_file_fsync+0x54/0x70
4 [] ? xfs_buf_iorequest+0x42/0x90
4 [] ? kmem_zone_zalloc+0x32/0x50
4 [] ? kmem_zone_alloc+0x83/0xc0
4 [] ? xlog_state_sync+0x26c/0x2a0
4 [] ? sys_fsync+0xb/0x20
4 [] ? do_fsync+0x36/0x60
4 [] ? vfs_fsync+0x9e/0x110
4 [] ? __link_path_walk+0x7e/0x1000
3 [] ? __mutex_lock_slowpath+0xd6/0x160
3 [] ? mutex_lock+0x1a/0x40
3 [] ? xfs_vn_mknod+0x82/0x130
3 [] ? xfs_fsync+0x141/0x190
3 [] ? _xfs_trans_commit+0x38b/0x3a0
3 [] ? xlog_grant_log_space+0x28c/0x3c0
3 [] ? xlog_bdstrat_cb+0x3d/0x50
3 [] ? xfs_log_force+0xb/0x40
3 [] ? xfs_log_release_iclog+0x10/0x40
3 [] ? xlog_sync+0x20b/0x4e0
3 [] ? xfs_bmapi+0x9e2/0x11a0
3 [] ? xfs_bmap_btalloc+0x598/0xa40
3 [] ? xfs_alloc_vextent+0x368/0x4b0
3 [] ? xfs_alloc_ag_vextent+0x123/0x130
3 [] ? alloc_fd+0x4a/0x140
3 [] ? pollwake+0x0/0x60
3 [] ? poll_freewait+0x48/0xb0
3 [] ? do_filp_open+0x9ee/0xac0
3 [] ? do_filp_open+0x234/0xac0
3 [] ? vfs_create+0xa6/0xf0
3 [] ? vfs_fstatat+0x37/0x80
3 [] ? kmem_cache_alloc+0x6d/0xa0
3 [] ? __wake_up+0x43/0x70
2 [] ? __down_write_nested+0x17/0xb0
2 [] ? __down+0x61/0xa0
2 [] ? do_nanosleep+0x95/0xd0
2 [] ? schedule_hrtimeout_range+0x11d/0x140
2 [] ? schedule_timeout+0x119/0x1a0
2 [] ? xfs_reclaim_inode_now+0x0/0xa0
2 [] ? xfs_buf_read_flags+0x12/0xa0
2 [] ? xfs_buf_get_flags+0x6e/0x190
2 [] ? _xfs_buf_find+0x134/0x220
2 [] ? xfs_vm_writepage+0x77/0x130
2 [] ? xfs_page_state_convert+0x414/0x6c0
2 [] ? xfs_map_blocks+0x25/0x30
2 [] ? xfs_create+0x312/0x530
2 [] ? xfs_dir_ialloc+0xa8/0x340
2 [] ? xfs_trans_read_buf+0x1e6/0x360
2 [] ? xlog_state_sync+0x157/0x2a0
2 [] ? xfs_iomap+0x2c0/0x300
2 [] ? xfs_iomap_write_allocate+0x23e/0x3b0
2 [] ? dput+0xac/0x160
2 [] ? d_kill+0x53/0x70
2 [] ? generic_permission+0x78/0x130
2 [] ? handle_mm_fault+0x1b5/0x780
2 [] ? __do_fault+0x3ca/0x4b0
2 [] ? pdflush+0x0/0x220
2 [] ? do_writepages+0x20/0x40
2 [] ? write_cache_pages+0x1df/0x3c0
2 [] ? __writepage+0xa/0x40
2 [] ? __writepage+0x0/0x40
2 [] ? __alloc_pages_nodemask+0x108/0x5f0
2 [] ? find_get_page+0x1b/0xb0
2 [] ? down+0x46/0x50
2 [] ? sys_nanosleep+0x70/0x80
2 [] ? hrtimer_nanosleep+0xa2/0x130
2 [] ? __hrtimer_start_range_ns+0x12b/0x2a0
2 [] ? hrtimer_wakeup+0x0/0x30
2 [] ? __wake_up_bit+0x28/0x30
2 [] ? kthread+0xa6/0xb0
2 [] ? kthread+0x0/0xb0
2 [] ? process_timeout+0x0/0x10
2 [] ? try_to_del_timer_sync+0x54/0x60
2 [] ? lock_timer_base+0x34/0x70
2 [] ? child_rip+0xa/0x20
2 [] ? child_rip+0x0/0x20
1 [] ? _spin_lock_bh+0x9/0x20
1 [] ? __down_read+0x17/0xae
1 [] ? __wait_on_bit+0x50/0x80
1 [] ? io_schedule+0x34/0x50
1 [] ? wait_for_common+0x151/0x180
1 [] ? tcp_write_xmit+0x206/0xa30
1 [] ? tcp_sendmsg+0x859/0xb10
1 [] ? sk_reset_timer+0xf/0x20
1 [] ? release_sock+0x13/0xa0
1 [] ? sock_aio_write+0x13a/0x150
1 [] ? tty_ldisc_try+0x48/0x60
1 [] ? tty_write+0x221/0x270
1 [] ? swiotlb_map_page+0x0/0x100
1 [] ? __up_read+0x21/0xc0
1 [] ? xfs_sync_worker+0x49/0x80
1 [] ? xfs_inode_ag_iterator+0x63/0xa0
1 [] ? xfs_inode_ag_walk+0x6c/0xc0
1 [] ? xfssyncd+0x13c/0x1c0
1 [] ? xfssyncd+0x0/0x1c0
1 [] ? xfs_reclaim_inode+0x148/0x190
1 [] ? xfs_bdstrat_cb+0x45/0x50
1 [] ? xfs_vn_setattr+0x16/0x20
1 [] ? xfs_flush_pages+0xad/0xc0
1 [] ? xfs_wait_on_pages+0x23/0x30
1 [] ? xfs_file_release+0x10/0x20
1 [] ? xfs_buf_rele+0x3b/0x100
1 [] ? _xfs_buf_lookup_pages+0x265/0x340
1 [] ? __xfs_get_blocks+0x8f/0x220
1 [] ? xfs_setattr+0x826/0x880
1 [] ? xfs_fsync+0x56/0x190
1 [] ? xfs_release+0x167/0x1d0
1 [] ? xfs_lookup+0x90/0xe0
1 [] ? xfs_create+0x40b/0x530
1 [] ? xfs_trans_iget+0xda/0x100
1 [] ? xfs_trans_ijoin+0x38/0xa0
1 [] ? xfs_trans_log_inode+0x27/0x60
1 [] ? xfs_trans_get_efd+0x28/0x40
1 [] ? xfs_trans_brelse+0x30/0x130
1 [] ? xlog_state_sync+0x49/0x2a0
1 [] ? xfs_iflush+0xdd/0x2f0
1 [] ? xfs_ialloc+0x52f/0x6f0
1 [] ? xfs_ialloc+0xbe/0x6f0
1 [] ? xfs_ialloc+0x7e/0x6f0
1 [] ? xfs_itruncate_finish+0x15a/0x320
1 [] ? __xfs_iunpin_wait+0x95/0xe0
1 [] ? xfs_iget+0xfd/0x480
1 [] ? xfs_iget+0xeb/0x480
1 [] ? xfs_dialloc+0x2e1/0xa70
1 [] ? xfs_ialloc_ag_select+0x222/0x320
1 [] ? xfs_ialloc_read_agi+0x1f/0x80
1 [] ? xfs_read_agi+0x71/0x110
1 [] ? xfs_dir2_sf_addname+0x430/0x5c0
1 [] ? xfs_dir2_sf_to_block+0x9f/0x5c0
1 [] ? xfs_dir_createname+0x17a/0x1d0
1 [] ? xfs_dir2_grow_inode+0x15a/0x3f0
1 [] ? xfs_bmap_finish+0x164/0x1b0
1 [] ? xfs_free_extent+0x7e/0xc0
1 [] ? xfs_alloc_fix_freelist+0x379/0x450
1 [] ? xfs_alloc_read_agf+0x30/0xd0
1 [] ? xfs_read_agf+0x68/0x190
1 [] ? sys_epoll_wait+0x22f/0x2e0
1 [] ? __set_page_dirty+0x66/0xd0
1 [] ? writeback_inodes+0x46/0xe0
1 [] ? generic_sync_sb_inodes+0x2e6/0x4b0
1 [] ? writeback_single_inode+0x1e9/0x460
1 [] ? notify_change+0x101/0x2f0
1 [] ? __d_lookup+0xaa/0x140
1 [] ? __pollwait+0x0/0x120
1 [] ? sys_select+0x51/0x110
1 [] ? core_sys_select+0x1ff/0x310
1 [] ? do_select+0x4ff/0x670
1 [] ? poll_schedule_timeout+0x2c/0x50
1 [] ? do_filp_open+0x6a0/0xac0
1 [] ? may_open+0x1c1/0x1f0
1 [] ? get_write_access+0x20/0x60
1 [] ? __fput+0xcd/0x1e0
1 [] ? sys_write+0x53/0xa0
1 [] ? do_sync_write+0xe3/0x130
1 [] ? do_truncate+0x5e/0x80
1 [] ? sys_close+0xa6/0x100
1 [] ? filp_close+0x56/0x90
1 [] ? cache_alloc_refill+0x96/0x590
1 [] ? pagevec_lookup_tag+0x1a/0x30
1 [] ? pdflush+0x110/0x220
1 [] ? wb_kupdate+0xb6/0x140
1 [] ? wb_kupdate+0x0/0x140
1 [] ? __filemap_fdatawrite_range+0x4d/0x60
1 [] ? wait_on_page_writeback_range+0xc3/0x140
1 [] ? wait_on_page_bit+0x6c/0x80
1 [] ? find_lock_page+0x23/0x80
1 [] ? sync_page+0x35/0x60
1 [] ? sync_page+0x0/0x60
1 [] ? sched_clock_cpu+0x6e/0x250
1 [] ? wake_bit_function+0x0/0x30
1 [] ? autoremove_wake_function+0x9/0x30
1 [] ? sys_setpriority+0x89/0x240
1 [] ? do_fork+0x16e/0x360
1 [] ? try_to_wake_up+0xaf/0x1d0
1 [] ? task_rq_lock+0x47/0x90
1 [] ? __wake_up_common+0x5b/0x90
1 [] ? sched_slice+0x5f/0x90
1 [] ? sys_vfork+0x20/0x30
1 [] ? stub_vfork+0x13/0x20

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
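For anyone who wants to build a similar sorted/uniqued tally from their own saved sysrq-w output, something along these lines should work. It is only a sketch: the exact command used for the list in this message is not shown, `tally_functions` is a made-up helper name, and it assumes the trace lines carry `symbol+0xoffset/0xsize` entries like the ones in the traces here.

```shell
# Hypothetical helper (not from the original message): read saved sysrq
# output on stdin and print each symbol+offset/size entry with a count,
# most frequent first.
tally_functions() {
    # -o prints only the matching part of each line, one match per output line
    grep -oE '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

Usage would be `tally_functions < sysrq-w.txt`. Counts are per symbol+offset, so the same function seen at two different offsets shows up as two separate rows, as it does in the list in this message.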