2003-06-18 20:18:14

by Olivier Nicolas

[permalink] [raw]
Subject: 2.5.7x: processes in D state

After some time from a few seconds after boot to a few hours, the system
locks.

Processes seem to be in D state

Kernel 2.5.72

System
2 x Athlon MP 1800
512 Mb RAM
IDE disk (reiserfs and ext3 partions)



AltSysRq T ....

.....

konqueror D 00000001 4244652160 1965 1943 (NOTLB)
Call Trace:
[<c011c3aa>] default_wake_function+0x2a/0x30
[<c011cb5e>] sleep_on+0x6e/0x130
[<c011c380>] default_wake_function+0x0/0x30
[<c01e6768>] do_journal_begin_r+0x88/0x2a0
[<c01e69d7>] journal_begin+0x27/0x30
[<c01d5a07>] reiserfs_dirty_inode+0x77/0x140
[<c0183a16>] __mark_inode_dirty+0x126/0x160
[<c017c6c2>] update_atime+0xb2/0xd0
[<c01d0edc>] reiserfs_readdir+0x59c/0x610
[<c021a072>] tty_default_put_char+0x32/0x40
[<c014d461>] do_wp_page+0x281/0x410
[<c014e613>] handle_mm_fault+0x193/0x220
[<c0172c2e>] vfs_readdir+0x7e/0xa0
[<c0172f40>] filldir64+0x0/0x120
[<c01730d4>] sys_getdents64+0x74/0xb5
[<c0172f40>] filldir64+0x0/0x120
[<c01097f3>] syscall_call+0x7/0xb
....


SysRq : Show Regs

Pid: 0, comm: swapper
EIP: 0060:[<c010719f>] CPU: 0
EIP is at default_idle+0x2f/0x40
EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c036a000 ECX: 00000000 EDX: c036a000
ESI: c0107170 EDI: ffffe000 EBP: c036bfb0 DS: 007b ES: 007b
CR0: 8005003b CR2: 40018000 CR3: 16b33000 CR4: 000006d0
Call Trace:
[<c0107235>] cpu_idle+0x45/0x60
[<c0105000>] _stext+0x0/0x80
[<c036c93f>] start_kernel+0x16f/0x1a0
[<c036c4e0>] unknown_bootoption+0x0/0x110

SysRq : Show Regs

Pid: 0, comm: swapper
EIP: 0060:[<c010719f>] CPU: 0
EIP is at default_idle+0x2f/0x40
EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c036a000 ECX: 00000000 EDX: c036a000
ESI: c0107170 EDI: ffffe000 EBP: c036bfb0 DS: 007b ES: 007b
CR0: 8005003b CR2: 40018000 CR3: 16b33000 CR4: 000006d0
Call Trace:
[<c0107235>] cpu_idle+0x45/0x60
[<c0105000>] _stext+0x0/0x80
[<c036c93f>] start_kernel+0x16f/0x1a0
[<c036c4e0>] unknown_bootoption+0x0/0x110


SysRq : Show Regs

Pid: 0, comm: swapper
EIP: 0060:[<c010719f>] CPU: 0
EIP is at default_idle+0x2f/0x40
EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c036a000 ECX: 00000000 EDX: c036a000
ESI: c0107170 EDI: ffffe000 EBP: c036bfb0 DS: 007b ES: 007b
CR0: 8005003b CR2: 40018000 CR3: 16b33000 CR4: 000006d0
Call Trace:
[<c0107235>] cpu_idle+0x45/0x60
[<c0105000>] _stext+0x0/0x80
[<c036c93f>] start_kernel+0x16f/0x1a0
[<c036c4e0>] unknown_bootoption+0x0/0x110


SysRq : Terminate All Tasks




Same problem 2.5.70-mm3

bash D 00000010 1464 1349 (NOTLB)
Call Trace:
[<c011cf8b>] sleep_on+0x7b/0x170
[<c011c710>] default_wake_function+0x0/0x30
[<c01ea4a8>] do_journal_begin_r+0x88/0x2a0
[<c01ea717>] journal_begin+0x27/0x30
[<c01d95d7>] reiserfs_dirty_inode+0x77/0x140
[<c013d725>] file_read_actor+0x115/0x120
[<c01856b6>] __mark_inode_dirty+0x126/0x160
[<c013d30a>] do_generic_mapping_read+0xfa/0x400
[<c017e0e2>] update_atime+0xb2/0xd0
[<c013d8ef>] __generic_file_aio_read+0x1bf/0x1e0
[<c013d610>] file_read_actor+0x0/0x120
[<c013da3a>] generic_file_read+0xba/0xe0
[<c0145f34>] kmem_cache_alloc+0x114/0x190
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c011f340>] autoremove_wake_function+0x0/0x50
[<c015e81a>] vfs_read+0xda/0x120
[<c016baca>] kernel_read+0x4a/0x60
[<c016caab>] prepare_binprm+0xcb/0xf0
[<c016d04e>] do_execve+0xfe/0x1e0
[<c0107cbf>] sys_execve+0x3f/0x80
[<c01098a3>] syscall_call+0x7/0xb

bash D 00000000 1467 1410 1432 (NOTLB)
Call Trace:
[<c011cf8b>] sleep_on+0x7b/0x170
[<c011c710>] default_wake_function+0x0/0x30
[<c01ea4a8>] do_journal_begin_r+0x88/0x2a0
[<c01ea717>] journal_begin+0x27/0x30
[<c01d95d7>] reiserfs_dirty_inode+0x77/0x140
[<c013d725>] file_read_actor+0x115/0x120
[<c01856b6>] __mark_inode_dirty+0x126/0x160
[<c013d30a>] do_generic_mapping_read+0xfa/0x400
[<c017e0e2>] update_atime+0xb2/0xd0
[<c013d8ef>] __generic_file_aio_read+0x1bf/0x1e0
[<c013d610>] file_read_actor+0x0/0x120
[<c013da3a>] generic_file_read+0xba/0xe0
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c011f340>] autoremove_wake_function+0x0/0x50
[<c015e81a>] vfs_read+0xda/0x120
[<c016baca>] kernel_read+0x4a/0x60
[<c016caab>] prepare_binprm+0xcb/0xf0
[<c016d04e>] do_execve+0xfe/0x1e0
[<c0107cbf>] sys_execve+0x3f/0x80
[<c01098a3>] syscall_call+0x7/0xb

bash D 00000000 1467 1410 1432 (NOTLB)
Call Trace:
[<c011cf8b>] sleep_on+0x7b/0x170
[<c011c710>] default_wake_function+0x0/0x30
[<c01ea4a8>] do_journal_begin_r+0x88/0x2a0
[<c01ea717>] journal_begin+0x27/0x30
[<c01d95d7>] reiserfs_dirty_inode+0x77/0x140
[<c013d725>] file_read_actor+0x115/0x120
[<c01856b6>] __mark_inode_dirty+0x126/0x160
[<c013d30a>] do_generic_mapping_read+0xfa/0x400
[<c017e0e2>] update_atime+0xb2/0xd0
[<c013d8ef>] __generic_file_aio_read+0x1bf/0x1e0
[<c013d610>] file_read_actor+0x0/0x120
[<c013da3a>] generic_file_read+0xba/0xe0
[<c0145f34>] kmem_cache_alloc+0x114/0x190
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c0119c0b>] pgd_alloc+0x1b/0x20
[<c011f340>] autoremove_wake_function+0x0/0x50
[<c015e81a>] vfs_read+0xda/0x120
[<c016baca>] kernel_read+0x4a/0x60
[<c016caab>] prepare_binprm+0xcb/0xf0
[<c016d04e>] do_execve+0xfe/0x1e0
[<c0107cbf>] sys_execve+0x3f/0x80
[<c01098a3>] syscall_call+0x7/0xb



2003-06-19 05:48:20

by Oleg Drokin

[permalink] [raw]
Subject: Re: 2.5.7x: processes in D state

Hello!

On Wed, Jun 18, 2003 at 10:30:03PM +0200, Olivier NICOLAS wrote:

> AltSysRq T ....
> .....
> konqueror D 00000001 4244652160 1965 1943 (NOTLB)
> Call Trace:
> [<c011c3aa>] default_wake_function+0x2a/0x30
> [<c011cb5e>] sleep_on+0x6e/0x130
> [<c011c380>] default_wake_function+0x0/0x30
> [<c01e6768>] do_journal_begin_r+0x88/0x2a0

So they want to open the journal, but it is held by somebody else already.
Can you please look include stacktraces for all other processes that have
reiserfs bits in the trace and show those traces too? Or just mail me full sysrq-T outputs
at the time of such a happinings.

Thank you.

Bye,
Oleg

2003-06-21 15:38:42

by Olivier Nicolas

[permalink] [raw]
Subject: Re: 2.5.7x: processes in D state

Oleg,

As I can't repoduce the problem, I had to wait for a new freeze.
After 3 days of uptime, 2.5.72 hasn'nt freezed so I decided to reboot
into 2.5.71.

But the freeze occured during the 2.5.72 shutdown scirpts execution !


Olivier


Shutting down SSH daemon done
Umount SMB File SystemSysRq : Show State

free sibling
task PC stack pid father child younger older
init S 00000001 4294956284 1 0 2 (NOTLB)
Call Trace:
[<c012a636>] schedule_timeout+0x66/0xc0
[<c01413ed>] __get_free_pages+0x1d/0x50
[<c012a5c0>] process_timeout+0x0/0x10
[<c01735da>] do_select+0x29a/0x320
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c0169612>] sys_stat64+0x32/0x40
[<c01097f3>] syscall_call+0x7/0xb

migration/0 S 00000000 3780881424 2 1 3 (L-TLB)
Call Trace:
[<c011e381>] migration_thread+0xd1/0x180
[<c011e2b0>] migration_thread+0x0/0x180
[<c01073b5>] kernel_thread_helper+0x5/0x10

ksoftirqd/0 S 00000001 3780874348 3 1 4 2 (L-TLB)
Call Trace:
[<c0125fef>] ksoftirqd+0xff/0x110
[<c0125ef0>] ksoftirqd+0x0/0x110
[<c01073b5>] kernel_thread_helper+0x5/0x10

migration/1 S 00000000 3780868152 4 1 5 3 (L-TLB)

Call Trace:
[<c011e381>] migration_thread+0xd1/0x180
[<c011e2b0>] migration_thread+0x0/0x180
[<c01073b5>] kernel_thread_helper+0x5/0x10

ksoftirqd/1 S 00000000 4294958708 5 1 6 4 (L-TLB)
Call Trace:
[<c0125fef>] ksoftirqd+0xff/0x110
[<c0125ef0>] ksoftirqd+0x0/0x110
[<c01073b5>] kernel_thread_helper+0x5/0x10

events/0 S 00000001 4294950984 6 1 7 5 (L-TLB)
Call Trace:
[<c0132687>] worker_thread+0x2c7/0x380
[<c021fa90>] batch_entropy_process+0x0/0xd0
[<c011c380>] default_wake_function+0x0/0x30
[<c0109706>] ret_from_fork+0x6/0x14
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

events/1 S DFFCBED0 4294945244 7 1 8 6 (L-TLB)
Call Trace:
[<c0132687>] worker_thread+0x2c7/0x380
[<c023f1c0>] blk_unplug_work+0x0/0x20
[<c011c380>] default_wake_function+0x0/0x30
[<c0109706>] ret_from_fork+0x6/0x14
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

kirqd S 00000001 512372740 8 1 9 7 (L-TLB)
Call Trace:
[<c012a636>] schedule_timeout+0x66/0xc0
[<c012a5c0>] process_timeout+0x0/0x10
[<c01185ec>] balanced_irq+0x4c/0x80
[<c01185a0>] balanced_irq+0x0/0x80
[<c01073b5>] kernel_thread_helper+0x5/0x10

pdflush S DFDD9F6C 512366720 9 1 10 8 (L-TLB)
Call Trace:
[<c01229b6>] reparent_to_init+0x106/0x1a0
[<c0122cc2>] daemonize+0xd2/0xe0
[<c0142eb7>] __pdflush+0xc7/0x350
[<c0143140>] pdflush+0x0/0x20
[<c0143151>] pdflush+0x11/0x20
[<c01073b5>] kernel_thread_helper+0x5/0x10

pdflush D 00000001 4294956740 10 1 11 9 (L-TLB)
Call Trace:
[<c011cb5e>] sleep_on+0x6e/0x130
[<c01e3616>] flush_commit_list+0x26/0x440
[<c011c380>] default_wake_function+0x0/0x30
[<c01e38ad>] flush_commit_list+0x2bd/0x440
[<c011c3ea>] __wake_up_common+0x3a/0x60
[<c01e71e0>] flush_old_commits+0x90/0x1c0
[<c01b388a>] log_start_commit+0x7a/0x110
[<c01d4d98>] reiserfs_write_super+0xa8/0xf0
[<c01652a4>] sync_supers+0x164/0x180
[<c0142738>] wb_kupdate+0x48/0x160
[<c011beb4>] schedule+0x114/0x5e0
[<c0142f52>] __pdflush+0x162/0x350
[<c0143140>] pdflush+0x0/0x20
[<c0143151>] pdflush+0x11/0x20
[<c01426f0>] wb_kupdate+0x0/0x160
[<c01073b5>] kernel_thread_helper+0x5/0x10

kswapd0 S 00000001 4294950528 11 1 12 10 (L-TLB)
Call Trace:
[<c014a899>] kswapd+0xd9/0x120
[<c011ece0>] autoremove_wake_function+0x0/0x50
[<c011ece0>] autoremove_wake_function+0x0/0x50
[<c014a7c0>] kswapd+0x0/0x120
[<c01073b5>] kernel_thread_helper+0x5/0x10

aio/0 S DFE01A74 4294937344 12 1 13 11 (L-TLB)
Call Trace:
[<c012ef6b>] do_sigaction+0x23b/0x3f0
[<c0132687>] worker_thread+0x2c7/0x380
[<c011c380>] default_wake_function+0x0/0x30
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380 [<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

aio/1 S 00000001 4294865196 13 1 14 12 (L-TLB)
Call Trace:
[<c012ef6b>] do_sigaction+0x23b/0x3f0
[<c0132687>] worker_thread+0x2c7/0x380
[<c011c380>] default_wake_function+0x0/0x30
[<c0109706>] ret_from_fork+0x6/0x14
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

kseriod S 00000001 4294383316 14 1 15 13 (L-TLB)
Call Trace:
[<c027b035>] serio_thread+0xf5/0x190
[<c011c380>] default_wake_function+0x0/0x30
[<c027af40>] serio_thread+0x0/0x190
[<c01073b5>] kernel_thread_helper+0x5/0x10

reiserfs/0 S DFDA0418 4294958448 15 1 16 14 (L-TLB)
Call Trace:
[<c0132687>] worker_thread+0x2c7/0x380
[<c01e5810>] reiserfs_journal_commit_task_func+0x0/0x120
[<c011c380>] default_wake_function+0x0/0x30
[<c0109706>] ret_from_fork+0x6/0x14
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

reiserfs/1 S 00000001 4294951840 16 1 150 15 (L-TLB)
Call Trace:
[<c0132687>] worker_thread+0x2c7/0x380
[<c01e5810>] reiserfs_journal_commit_task_func+0x0/0x120
[<c011c380>] default_wake_function+0x0/0x30
[<c0109706>] ret_from_fork+0x6/0x14
[<c011c380>] default_wake_function+0x0/0x30
[<c01323c0>] worker_thread+0x0/0x380
[<c01073b5>] kernel_thread_helper+0x5/0x10

kjournald S 00000001 4292344192 150 1 509 16 (L-TLB)
Call Trace:
[<c011c3aa>] default_wake_function+0x2a/0x30
[<c011c8ee>] interruptible_sleep_on+0x6e/0x130
[<c011c380>] default_wake_function+0x0/0x30
[<c01b33f1>] kjournald+0x221/0x290
[<c0150d64>] sys_munmap+0x54/0x70
[<c01b31b0>] commit_timeout+0x0/0x10
[<c01b31d0>] kjournald+0x0/0x290
[<c01073b5>] kernel_thread_helper+0x5/0x10

dhcpcd S 08053460 4294129064 509 1 528 150 (NOTLB)
Call Trace:
[<c0135518>] adjust_abs_time+0x118/0x140
[<c014e613>] handle_mm_fault+0x193/0x220
[<c01361aa>] do_clock_nanosleep+0x14a/0x350
[<c011c380>] default_wake_function+0x0/0x30
[<c011c380>] default_wake_function+0x0/0x30
[<c0135e30>] nanosleep_wake_up+0x0/0x10
[<c0135ee0>] sys_nanosleep+0x80/0xf0
[<c01097f3>] syscall_call+0x7/0xb

syslogd S 00000001 2015492 528 1 531 509 (NOTLB)
Call Trace:
[<c012a682>] schedule_timeout+0xb2/0xc0
[<c0284916>] datagram_poll+0xe6/0xeb
[<c027e8b7>] sock_poll+0x27/0x30
[<c01735da>] do_select+0x29a/0x320
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c01097f3>] syscall_call+0x7/0xb

klogd R 00000001 4294216928 531 1 602 528 (NOTLB)
Call Trace:
[<c01217af>] do_syslog+0x4ff/0x510
[<c011c380>] default_wake_function+0x0/0x30
[<c015de4a>] vfs_read+0xda/0x120
[<c010bfab>] do_IRQ+0x11b/0x240
[<c015e0bf>] sys_read+0x3f/0x60
[<c01097f3>] syscall_call+0x7/0xb

resmgrd S 00000000 4275623656 602 1 616 531 (NOTLB)
Call Trace:
[<c01413ed>] __get_free_pages+0x1d/0x50
[<c012a682>] schedule_timeout+0xb2/0xc0
[<c027e8b7>] sock_poll+0x27/0x30
[<c0173b6c>] do_pollfd+0x5c/0xa0
[<c0173c52>] do_poll+0xa2/0xd0
[<c0173e53>] sys_poll+0x1d3/0x26d
[<c015d4b6>] sys_close+0x86/0x100
[<c01731a0>] __pollwait+0x0/0xc0
[<c01097f3>] syscall_call+0x7/0xb

portmap S 00000001 4283935664 616 1 1444 602 (NOTLB)
Call Trace:
[<c029fe10>] tcp_poll+0x110/0x1a0
[<c012a682>] schedule_timeout+0xb2/0xc0
[<c027e8b7>] sock_poll+0x27/0x30
[<c0173b6c>] do_pollfd+0x5c/0xa0
[<c0173c52>] do_poll+0xa2/0xd0
[<c0173e53>] sys_poll+0x1d3/0x26d
[<c017297a>] sys_ioctl+0x11a/0x339
[<c015d4b6>] sys_close+0x86/0x100
[<c01731a0>] __pollwait+0x0/0xc0
[<c01097f3>] syscall_call+0x7/0xb

kdeinit S 00000001 4281905676 1441 1 2842 9456 1447 (NOTLB)
Call Trace:
[<c012a682>] schedule_timeout+0xb2/0xc0
[<c02da6d4>] unix_poll+0x94/0xd0
[<c027e8b7>] sock_poll+0x27/0x30
[<c01735da>] do_select+0x29a/0x320
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c01097f3>] syscall_call+0x7/0xb

kdeinit S 00000001 4294884696 1444 1 1447 616 (NOTLB)
Call Trace:
[<c012a636>] schedule_timeout+0x66/0xc0
[<c012a5c0>] process_timeout+0x0/0x10
[<c027e8b7>] sock_poll+0x27/0x30
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c01097f3>] syscall_call+0x7/0xb

kdeinit S 00000001 4294293824 1447 1 1441 1444 (NOTLB)
Call Trace:
[<c012a636>] schedule_timeout+0x66/0xc0
[<c012a5c0>] process_timeout+0x0/0x10
[<c027e8b7>] sock_poll+0x27/0x30
[<c01735da>] do_select+0x29a/0x320
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c01097f3>] syscall_call+0x7/0xb

kdeinit S 00000001 70588800 2842 1441 (NOTLB)
Call Trace:
[<c010bfab>] do_IRQ+0x11b/0x240
[<c012a636>] schedule_timeout+0x66/0xc0
[<c012a5c0>] process_timeout+0x0/0x10
[<c027e8b7>] sock_poll+0x27/0x30
[<c01735da>] do_select+0x29a/0x320
[<c017367e>] select_bits_alloc+0x1e/0x20
[<c01731a0>] __pollwait+0x0/0xc0
[<c01738ee>] sys_select+0x25e/0x480
[<c01097f3>] syscall_call+0x7/0xb

rc D 00000001 4268079552 9456 1 9469 1441 (NOTLB)
Call Trace:
[<c01cad36>] reiserfs_read_locked_inode+0xc6/0x110
[<c011cb5e>] sleep_on+0x6e/0x130
[<c011c380>] default_wake_function+0x0/0x30
[<c01e6768>] do_journal_begin_r+0x88/0x2a0
[<c01e69d7>] journal_begin+0x27/0x30
[<c01d5a07>] reiserfs_dirty_inode+0x77/0x140
[<c0183a16>] __mark_inode_dirty+0x126/0x160
[<c017c6c2>] update_atime+0xb2/0xd0
[<c01d0edc>] reiserfs_readdir+0x59c/0x610
[<c014e63d>] handle_mm_fault+0x1bd/0x220
[<c0172c2e>] vfs_readdir+0x7e/0xa0 [<c0172f40>] filldir64+0x0/0x120
[<c01730d4>] sys_getdents64+0x74/0xb5
[<c0172f40>] filldir64+0x0/0x120
[<c01097f3>] syscall_call+0x7/0xb

blogd D 00000001 321552576 9469 1 9456 (NOTLB)
Call Trace:
[<c0240691>] generic_make_request+0xf1/0x1d0
[<c011d8e4>] io_schedule+0x24/0x30
[<c015f629>] __wait_on_buffer+0x99/0xd0
[<c011ece0>] autoremove_wake_function+0x0/0x50
[<c011ece0>] autoremove_wake_function+0x0/0x50
[<c01e393d>] flush_commit_list+0x34d/0x440
[<c01e80ec>] do_journal_end+0x71c/0xbe0
[<c01e708d>] journal_end_sync+0x3d/0xa0
[<c01e7906>] reiserfs_commit_for_inode+0x86/0xb0
[<c01cd8d7>] reiserfs_sync_file+0x77/0x100
[<c015fb51>] sys_fdatasync+0xa1/0xd0
[<c01097f3>] syscall_call+0x7/0xb

SysRq : Emergency Sync

unused
Shutting down RPC portmap daemonEmergency Sync complete





Oleg Drokin wrote:
> Hello!
>
> On Wed, Jun 18, 2003 at 10:30:03PM +0200, Olivier NICOLAS wrote:
>
>
>>AltSysRq T ....
>>.....
>>konqueror D 00000001 4244652160 1965 1943 (NOTLB)
>>Call Trace:
>> [<c011c3aa>] default_wake_function+0x2a/0x30
>> [<c011cb5e>] sleep_on+0x6e/0x130
>> [<c011c380>] default_wake_function+0x0/0x30
>> [<c01e6768>] do_journal_begin_r+0x88/0x2a0
>
>
> So they want to open the journal, but it is held by somebody else already.
> Can you please look include stacktraces for all other processes that have
> reiserfs bits in the trace and show those traces too? Or just mail me full sysrq-T outputs
> at the time of such a happinings.
>
> Thank you.
>
> Bye,
> Oleg
>
>


2003-06-28 14:19:28

by Olivier Nicolas

[permalink] [raw]
Subject: Re: 2.5.7x: processes in D state

Hello


It still hapen in 2.5.73-bk5

See Sys-Rq T output in attached file

Olivier


Oleg Drokin wrote:
> Hello!
>
> On Wed, Jun 18, 2003 at 10:30:03PM +0200, Olivier NICOLAS wrote:
>
>
>>AltSysRq T ....
>>.....
>>konqueror D 00000001 4244652160 1965 1943 (NOTLB)
>>Call Trace:
>> [<c011c3aa>] default_wake_function+0x2a/0x30
>> [<c011cb5e>] sleep_on+0x6e/0x130
>> [<c011c380>] default_wake_function+0x0/0x30
>> [<c01e6768>] do_journal_begin_r+0x88/0x2a0
>
>
> So they want to open the journal, but it is held by somebody else already.
> Can you please look include stacktraces for all other processes that have
> reiserfs bits in the trace and show those traces too? Or just mail me full sysrq-T outputs
> at the time of such a happinings.
>
> Thank you.
>
> Bye,
> Oleg
>
>


Attachments:
2573bk5.gz (11.98 kB)

2003-06-28 23:42:34

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.5.7x: processes in D state

Olivier NICOLAS <[email protected]> wrote:
>
> It still hapen in 2.5.73-bk5
>
> See Sys-Rq T output in attached file

This is the offending process:

pdflush D 00000001 4294957500 10 1 11 9 (L-TLB)
dfdd5d28 00000046 c039d540 00000001 00000003 c02450f3 d4cc7a44 dfdd5d18
dfdd5d18 dfda16a0 dfdd5d1c c03d7380 dfdd7980 00000283 00000246 ce311a0c
c039d540 c03d7a00 dfdd5d64 dfdd5d34 c011ddb4 ce16a888 dfdd5d90 c0160ad9
Call Trace:
[<c02450f3>] generic_unplug_device+0x83/0xc0
[<c011ddb4>] io_schedule+0x24/0x30
[<c0160ad9>] __wait_on_buffer+0x99/0xd0
[<c011f210>] autoremove_wake_function+0x0/0x50
[<c011f210>] autoremove_wake_function+0x0/0x50
[<c01e57dd>] flush_commit_list+0x34d/0x440
[<c01e9f8c>] do_journal_end+0x71c/0xbe0
[<c01e911d>] flush_old_commits+0x12d/0x1c0
[<c01b5521>] __log_start_commit+0x31/0x40
[<c01d6c48>] reiserfs_write_super+0xa8/0xf0
[<c0166784>] sync_supers+0x164/0x180
[<c01434d8>] wb_kupdate+0x48/0x190
[<c011c2e4>] schedule+0x114/0x5e0
[<c0143d32>] __pdflush+0x162/0x350
[<c0143f20>] pdflush+0x0/0x20
[<c0143f31>] pdflush+0x11/0x20
[<c0143490>] wb_kupdate+0x0/0x190
[<c01073b9>] kernel_thread_helper+0x5/0xc

It looks like some IO got submitted and then was simply lost.

I can see you're using SMP. Preempt or not?

What disk controller hardware are you using? And which device drivers
for that hardware?

2003-06-29 11:23:47

by Olivier Nicolas

[permalink] [raw]
Subject: Re: 2.5.7x: processes in D state

Hello,


1. SMP enable

System: dual Athlon 1800 MP
MSI K7D master MB using IDE disk controller


2. Preempt is not enable

3. IDE controller

lspci
...
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE
(rev 04)
....


4. device driver used
CONFIG_BLK_DEV_AMD74XX=y

(full config is attached)


5. dmesg output
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD7441: IDE controller at PCI slot 0000:00:07.1
AMD7441: chipset revision 4
AMD7441: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD_IDE: Advanced Micro Devic AMD-768 [Opus] IDE (rev 04) UDMA100
controller on pci0000:00:07.1
ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:pio, hdd:DMA
hda: IC35L060AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdd: LG DVD-ROM DRD-8160B, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: host protected area => 1
hda: 120103200 sectors (61493 MB) w/1863KiB Cache, CHS=119150/16/63,
UDMA(100)
hda: hda1 hda2 hda4 < hda5 hda6 hda7 >
end_request: I/O error, dev hdd, sector 0
hdd: ATAPI 48X DVD-ROM drive, 512kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12




Andrew Morton wrote:
> Olivier NICOLAS <[email protected]> wrote:
>
>>It still hapen in 2.5.73-bk5
>>
>> See Sys-Rq T output in attached file
>
>
> This is the offending process:
>
> pdflush D 00000001 4294957500 10 1 11 9 (L-TLB)
> dfdd5d28 00000046 c039d540 00000001 00000003 c02450f3 d4cc7a44 dfdd5d18
> dfdd5d18 dfda16a0 dfdd5d1c c03d7380 dfdd7980 00000283 00000246 ce311a0c
> c039d540 c03d7a00 dfdd5d64 dfdd5d34 c011ddb4 ce16a888 dfdd5d90 c0160ad9
> Call Trace:
> [<c02450f3>] generic_unplug_device+0x83/0xc0
> [<c011ddb4>] io_schedule+0x24/0x30
> [<c0160ad9>] __wait_on_buffer+0x99/0xd0
> [<c011f210>] autoremove_wake_function+0x0/0x50
> [<c011f210>] autoremove_wake_function+0x0/0x50
> [<c01e57dd>] flush_commit_list+0x34d/0x440
> [<c01e9f8c>] do_journal_end+0x71c/0xbe0
> [<c01e911d>] flush_old_commits+0x12d/0x1c0
> [<c01b5521>] __log_start_commit+0x31/0x40
> [<c01d6c48>] reiserfs_write_super+0xa8/0xf0
> [<c0166784>] sync_supers+0x164/0x180
> [<c01434d8>] wb_kupdate+0x48/0x190
> [<c011c2e4>] schedule+0x114/0x5e0
> [<c0143d32>] __pdflush+0x162/0x350
> [<c0143f20>] pdflush+0x0/0x20
> [<c0143f31>] pdflush+0x11/0x20
> [<c0143490>] wb_kupdate+0x0/0x190
> [<c01073b9>] kernel_thread_helper+0x5/0xc
>
> It looks like some IO got submitted and then was simply lost.
>
> I can see you're using SMP. Preempt or not?
>
> What disk controller hardware are you using? And which device drivers
> for that hardware?
>
>


Attachments:
config.gz (4.73 kB)