2003-03-10 02:19:25

by Con Kolivas

Subject: 2.5.64-mm2->4 hangs on contest

Tried running contest on 2.5.64-mm2 and mm4 and had the same thing happen. It
will hang reliably during process_load. I tried not running process_load but
it would still get stuck in one of the other loads (either a tar load or list
load). I can simply stop contest at that stage, but then the machine won't work
well and hangs at the console after a minute or so. This started at mm2
(doesn't happen with mm1).

Here is the sysrq-p and sysrq-t output during process_load (which hangs every
time):

SysRq : Show Regs

Pid: 3476, comm: contest
EIP: 0060:[<c0112d5d>] CPU: 0
EIP is at do_schedule+0x2a9/0x338
EFLAGS: 00000286 Not tainted
EAX: ffffe000 EBX: 00000000 ECX: cfbf3620 EDX: 00000000
ESI: cf939b00 EDI: cfbf3300 EBP: cfa03f0c DS: 007b ES: 007b
CR0: 8005003b CR2: 080ac328 CR3: 0fa84000 CR4: 00000690
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0147314>] pipe_write+0x1d0/0x288
[<c013d835>] vfs_write+0xa9/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb


SysRq : Show State
sibling
task PC pid father child younger older
init S 00C03C88 1 0 2 (NOTLB)
Call Trace:
[<c011c568>] schedule_timeout+0x84/0xac
[<c011c4d8>] process_timeout+0x0/0xc
[<c014c780>] do_select+0x1cc/0x208
[<c014c474>] __pollwait+0x0/0x98
[<c014cb2a>] sys_select+0x346/0x480
[<c0108b07>] syscall_call+0x7/0xb

ksoftirqd/0 R C129A000 2 1 3 (L-TLB)
Call Trace:
[<c0118f3e>] ksoftirqd+0x5e/0x9c
[<c0118ee0>] ksoftirqd+0x0/0x9c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

events/0 S CFFF3768 3 1 4 2 (L-TLB)
Call Trace:
[<c0121a9a>] worker_thread+0x102/0x274
[<c0121998>] worker_thread+0x0/0x274
[<c0239584>] flush_to_ldisc+0x0/0xd8
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

pdflush S CFE21FD8 4 1 5 3 (L-TLB)
Call Trace:
[<c012af25>] __pdflush+0x95/0x1ac
[<c012b03c>] pdflush+0x0/0x14
[<c012b047>] pdflush+0xb/0x14
[<c0106f1d>] kernel_thread_helper+0x5/0xc

pdflush S CFE1FFD8 5 1 6 4 (L-TLB)
Call Trace:
[<c012af25>] __pdflush+0x95/0x1ac
[<c012b03c>] pdflush+0x0/0x14
[<c012b047>] pdflush+0xb/0x14
[<c0106f1d>] kernel_thread_helper+0x5/0xc

kswapd0 S CFE0FF48 6 1 7 5 (L-TLB)
Call Trace:
[<c012ef27>] kswapd+0xd3/0xf0
[<c012ee54>] kswapd+0x0/0xf0
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0106f1d>] kernel_thread_helper+0x5/0xc

aio/0 S CFE0BFDC 7 1 8 6 (L-TLB)
Call Trace:
[<c0121a9a>] worker_thread+0x102/0x274
[<c0121998>] worker_thread+0x0/0x274
[<c01089e6>] ret_from_fork+0x6/0x14
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

jfsIO S 00000000 8 1 9 7 (L-TLB)
Call Trace:
[<c01c6d8b>] jfsIOWait+0x10b/0x144
[<c01c6c80>] jfsIOWait+0x0/0x144
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

jfsCommit S C13B3FDC 9 1 10 8 (L-TLB)
Call Trace:
[<c01c9985>] jfs_lazycommit+0x169/0x1a4
[<c01c981c>] jfs_lazycommit+0x0/0x1a4
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

jfsSync S C13B1FDC 10 1 11 9 (L-TLB)
Call Trace:
[<c01c9e56>] jfs_sync+0x1e6/0x21c
[<c01c9c70>] jfs_sync+0x0/0x21c
[<c01c9c70>] jfs_sync+0x0/0x21c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

pagebuf/0 S C13AFFDC 11 1 12 10 (L-TLB)
Call Trace:
[<c0121a9a>] worker_thread+0x102/0x274
[<c0121998>] worker_thread+0x0/0x274
[<c01089e6>] ret_from_fork+0x6/0x14
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

pagebufd S 00000286 12 1 13 11 (L-TLB)
Call Trace:
[<c01130d5>] interruptible_sleep_on+0x5d/0x84
[<c021b450>] pagebuf_daemon+0x0/0x1f0
[<c0112e30>] default_wake_function+0x0/0x1c
[<c021b4d3>] pagebuf_daemon+0x83/0x1f0
[<c021b450>] pagebuf_daemon+0x0/0x1f0
[<c021b424>] pagebuf_daemon_wakeup+0x0/0x2c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

kseriod S CFDBA000 13 1 14 12 (L-TLB)
Call Trace:
[<c02729a7>] serio_thread+0x9b/0x124
[<c027290c>] serio_thread+0x0/0x124
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

kjournald D 00000286 14 1 148 13 (L-TLB)
Call Trace:
[<c01131e7>] sleep_on+0x5b/0x84
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0177d84>] journal_commit_transaction+0x154/0xe2d
[<c0112d40>] do_schedule+0x28c/0x338
[<c0112e30>] default_wake_function+0x0/0x1c
[<c017a5c6>] kjournald+0x106/0x1ec
[<c017a4c0>] kjournald+0x0/0x1ec
[<c017a4b0>] commit_timeout+0x0/0xc
[<c0106f1d>] kernel_thread_helper+0x5/0xc

devfsd S C03050EC 148 1 313 14 (NOTLB)
Call Trace:
[<c0184f07>] devfsd_read+0xe7/0x414
[<c014fe26>] dput+0x1a/0x1a0
[<c0147e51>] path_release+0xd/0x2c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

kjournald S 00000286 313 1 314 148 (L-TLB)
Call Trace:
[<c0108c74>] common_interrupt+0x18/0x20
[<c01130d5>] interruptible_sleep_on+0x5d/0x84
[<c0112e30>] default_wake_function+0x0/0x1c
[<c017a5fb>] kjournald+0x13b/0x1ec
[<c017a4c0>] kjournald+0x0/0x1ec
[<c017a4b0>] commit_timeout+0x0/0xc
[<c0106f1d>] kernel_thread_helper+0x5/0xc

kjournald S 00000286 314 1 315 313 (L-TLB)
Call Trace:
[<c01130d5>] interruptible_sleep_on+0x5d/0x84
[<c0112e30>] default_wake_function+0x0/0x1c
[<c017a5fb>] kjournald+0x13b/0x1ec
[<c017a4c0>] kjournald+0x0/0x1ec
[<c017a4b0>] commit_timeout+0x0/0xc
[<c0106f1d>] kernel_thread_helper+0x5/0xc

reiserfs/0 S CF0E5FDC 315 1 655 314 (L-TLB)
Call Trace:
[<c0121a9a>] worker_thread+0x102/0x274
[<c0121998>] worker_thread+0x0/0x274
[<c01089e6>] ret_from_fork+0x6/0x14
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0106f1d>] kernel_thread_helper+0x5/0xc

dhcpcd S 05224927 655 1 736 315 (NOTLB)
Call Trace:
[<c012518b>] do_clock_nanosleep+0x1cb/0x2c0
[<c0124dd8>] nanosleep_wake_up+0x0/0xc
[<c0124e9a>] sys_nanosleep+0x62/0xb4
[<c0108b07>] syscall_call+0x7/0xb

syslogd D 00000286 736 1 744 655 (NOTLB)
Call Trace:
[<c01131e7>] sleep_on+0x5b/0x84
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0175d36>] start_this_handle+0xfa/0x1a0
[<c0175eb0>] journal_start+0x8c/0xb8
[<c016e9a0>] ext3_dirty_inode+0x68/0x10c
[<c01574e1>] __mark_inode_dirty+0x31/0xdc
[<c0152a89>] inode_update_time+0x85/0x90
[<c012812c>] generic_file_aio_write_nolock+0x360/0x994
[<c0274fd2>] __sock_recvmsg+0x56/0xc8
[<c0112d40>] do_schedule+0x28c/0x338
[<c01287cf>] generic_file_write_nolock+0x6f/0x8c
[<c027613d>] sys_recvfrom+0xad/0x104
[<c0276183>] sys_recvfrom+0xf3/0x104
[<c0112d40>] do_schedule+0x28c/0x338
[<c0128985>] generic_file_writev+0x31/0x44
[<c013d6dc>] do_sync_write+0x0/0xb0
[<c013dbe3>] do_readv_writev+0x1bf/0x2dc
[<c0276864>] sys_socketcall+0x174/0x218
[<c013dd97>] vfs_writev+0x4b/0x50
[<c013de02>] sys_writev+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

klogd S 00000001 744 1 781 736 (NOTLB)
Call Trace:
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c02b6a7c>] unix_wait_for_peer+0xac/0xc8
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c02b74af>] unix_dgram_sendmsg+0x2ff/0x400
[<c0274ec0>] __sock_sendmsg+0xb0/0xdc
[<c0275226>] sock_aio_write+0xae/0xb8
[<c013d75d>] do_sync_write+0x81/0xb0
[<c01319f5>] handle_mm_fault+0x6d/0x124
[<c0111550>] do_page_fault+0x0/0x404
[<c013d848>] vfs_write+0xbc/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

sshd S CFDEB8A0 781 1 916 907 744 (NOTLB)
Call Trace:
[<c028e4d5>] tcp_poll+0x2d/0x154
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c02755d9>] sock_poll+0x1d/0x24
[<c014c6b2>] do_select+0xfe/0x208
[<c014c780>] do_select+0x1cc/0x208
[<c014c474>] __pollwait+0x0/0x98
[<c014cb2a>] sys_select+0x346/0x480
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFDEBD20 907 1 908 781 (NOTLB)
Call Trace:
[<c026fe97>] vgacon_cursor+0x1cf/0x1d8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFDEBF20 908 1 909 907 (NOTLB)
Call Trace:
[<c01318a6>] do_no_page+0x2ea/0x2f8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFDEBE20 909 1 910 908 (NOTLB)
Call Trace:
[<c01318a6>] do_no_page+0x2ea/0x2f8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFDEB820 910 1 911 909 (NOTLB)
Call Trace:
[<c01318a6>] do_no_page+0x2ea/0x2f8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFDEB320 911 1 912 910 (NOTLB)
Call Trace:
[<c01318a6>] do_no_page+0x2ea/0x2f8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

mingetty S CFA781C0 912 1 3431 911 (NOTLB)
Call Trace:
[<c01318a6>] do_no_page+0x2ea/0x2f8
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

sshd S CFDEB3A0 916 781 918 (NOTLB)
Call Trace:
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c0147455>] pipe_poll+0x25/0x64
[<c014c6b2>] do_select+0xfe/0x208
[<c014c780>] do_select+0x1cc/0x208
[<c014c474>] __pollwait+0x0/0x98
[<c014cb2a>] sys_select+0x346/0x480
[<c0108b07>] syscall_call+0x7/0xb

bash S CEEB67BC 918 916 3435 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

agetty S CCA37000 3431 1 912 (NOTLB)
Call Trace:
[<c011c4f8>] schedule_timeout+0x14/0xac
[<c023bc14>] read_chan+0x3b4/0x82c
[<c023bc64>] read_chan+0x404/0x82c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0237784>] tty_read+0xd0/0x138
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

contest S CEEB73FC 3435 918 3475 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

contest S CEEB6DDC 3475 3435 3476 3480 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

contest S CF77210C 3476 3475 3477 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

contest S CF77268C 3477 3476 3478 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

contest S CF7723CC 3478 3476 3479 3477 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0147314>] pipe_write+0x1d0/0x288
[<c013d835>] vfs_write+0xa9/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

contest R current 3479 3476 3478 (NOTLB)
Call Trace:
[<c0112e14>] preempt_schedule+0x28/0x44
[<c0112ef1>] __wake_up+0x55/0x60
[<c0147379>] pipe_write+0x235/0x288
[<c013d835>] vfs_write+0xa9/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3480 3435 3498 3475 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S CC9261DC 3498 3480 3505 3508 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3505 3498 3570 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c0117e27>] sys_wait4+0xab/0x234
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3508 3480 3513 3519 3498 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c0117e27>] sys_wait4+0xab/0x234
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S C4C9871C 3513 3508 3514 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

make S C4C980FC 3514 3513 3564 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

make S CEA127DC 3519 3480 3520 3527 3508 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3520 3519 3574 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c0117e27>] sys_wait4+0xab/0x234
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3527 3480 3533 3519 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c0117e27>] sys_wait4+0xab/0x234
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

make S CF210D5C 3533 3527 3539 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

make S CF735E8C 3539 3533 3540 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c0117e27>] sys_wait4+0xab/0x234
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

gcc S C45199BC 3540 3539 3541 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

cpp0 R C911BE18 3541 3540 3542 (NOTLB)
Call Trace:
[<c0113a7a>] io_schedule+0xe/0x18
[<c01269b4>] __lock_page+0x90/0xac
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0126e2e>] do_generic_mapping_read+0x132/0x328
[<c01272ad>] __generic_file_aio_read+0x195/0x1b0
[<c0127024>] file_read_actor+0x0/0xf4
[<c012730d>] generic_file_aio_read+0x45/0x50
[<c013d569>] do_sync_read+0x81/0xb4
[<c0145121>] sys_fstat64+0x25/0x30
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

cc1 S CF728A6C 3542 3540 3543 3541 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

as S CF728BCC 3543 3540 3542 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

gcc S CF21199C 3564 3514 3565 (NOTLB)
Call Trace:
[<c0117e27>] sys_wait4+0xab/0x234
[<c0117f7d>] sys_wait4+0x201/0x234
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0108b07>] syscall_call+0x7/0xb

cpp0 S CFDE93AC 3565 3564 3566 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0147314>] pipe_write+0x1d0/0x288
[<c013d835>] vfs_write+0xa9/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

cc1 R BFFFEA40 3566 3564 3567 3565 (NOTLB)
Call Trace:
[<c0112fa4>] user_schedule+0x8/0xc
[<c0108b2e>] work_resched+0x5/0x16

as S CFDE90EC 3567 3564 3566 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01470e0>] pipe_read+0x1a0/0x204
[<c013d645>] vfs_read+0xa9/0x140
[<c013d8f6>] sys_read+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

gcc D CDBCBF68 3570 3505 3571 (NOTLB)
Call Trace:
[<c0113035>] wait_for_completion+0x8d/0xd0
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0112e30>] default_wake_function+0x0/0x1c
[<c011568f>] do_fork+0x10f/0x14c
[<c0107627>] sys_vfork+0x1b/0x2c
[<c0108b07>] syscall_call+0x7/0xb

cpp0 Z CFFF9820 3571 3570 3572 (L-TLB)
Call Trace:
[<c01178d1>] do_exit+0x331/0x344
[<c011790a>] sys_exit+0xe/0x10
[<c0108b07>] syscall_call+0x7/0xb

cc1 S CF72EACC 3572 3570 3573 3571 (NOTLB)
Call Trace:
[<c0146f1f>] pipe_wait+0x6f/0x90
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0147314>] pipe_write+0x1d0/0x288
[<c013d835>] vfs_write+0xa9/0x140
[<c013d932>] sys_write+0x2a/0x3c
[<c0108b07>] syscall_call+0x7/0xb

gcc R CD859B60 3573 3570 3572 (NOTLB)
Call Trace:
[<c0113a7a>] io_schedule+0xe/0x18
[<c013e754>] __wait_on_buffer+0x78/0x94
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c0114114>] autoremove_wake_function+0x0/0x38
[<c01761ec>] do_get_write_access+0x68/0x5a4
[<c013f8a0>] __bread+0x14/0x30
[<c0176760>] journal_get_write_access+0x38/0x58
[<c016e88a>] ext3_reserve_inode_write+0x32/0xac
[<c0175d93>] start_this_handle+0x157/0x1a0
[<c016e91e>] ext3_mark_inode_dirty+0x1a/0x34
[<c016e9cf>] ext3_dirty_inode+0x97/0x10c
[<c01574e1>] __mark_inode_dirty+0x31/0xdc
[<c01529c8>] update_atime+0x88/0xc4
[<c0127016>] do_generic_mapping_read+0x31a/0x328
[<c01272ad>] __generic_file_aio_read+0x195/0x1b0
[<c0127024>] file_read_actor+0x0/0xf4
[<c012730d>] generic_file_aio_read+0x45/0x50
[<c013d569>] do_sync_read+0x81/0xb4
[<c012c0b6>] cache_grow+0x156/0x200
[<c012c297>] cache_alloc_refill+0x137/0x174
[<c013d645>] vfs_read+0xa9/0x140
[<c0145aac>] kernel_read+0x40/0x4c
[<c014643d>] prepare_binprm+0xa5/0xb0
[<c0146809>] do_execve+0x125/0x208
[<c0107667>] sys_execve+0x2f/0x68
[<c0108b07>] syscall_call+0x7/0xb

make D 00000286 3574 3520 (NOTLB)
Call Trace:
[<c01131e7>] sleep_on+0x5b/0x84
[<c0112e30>] default_wake_function+0x0/0x1c
[<c0175d36>] start_this_handle+0xfa/0x1a0
[<c0175eb0>] journal_start+0x8c/0xb8
[<c016e9a0>] ext3_dirty_inode+0x68/0x10c
[<c01574e1>] __mark_inode_dirty+0x31/0xdc
[<c01529c8>] update_atime+0x88/0xc4
[<c0148841>] link_path_walk+0x601/0x7fc
[<c0148d25>] path_lookup+0x139/0x140
[<c01459cf>] open_exec+0x1b/0xb8
[<c0146702>] do_execve+0x1e/0x208
[<c0129a5d>] buffered_rmqueue+0xe9/0xf4
[<c0129aea>] __alloc_pages+0x82/0x274
[<c012dc41>] invalidate_vcache+0x19/0x88
[<c0130e91>] do_wp_page+0x325/0x330
[<c0131a4d>] handle_mm_fault+0xc5/0x124
[<c0111682>] do_page_fault+0x132/0x404
[<c0111550>] do_page_fault+0x0/0x404
[<c0129967>] free_hot_page+0x7/0x8
[<c014fe26>] dput+0x1a/0x1a0
[<c011f652>] sys_rt_sigaction+0x7a/0xd4
[<c0147c2d>] getname+0x5d/0x9c
[<c0107667>] sys_execve+0x2f/0x68
[<c0108b07>] syscall_call+0x7/0xb

Con


2003-03-10 09:01:59

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Mon, 10 Mar 2003 18:05, Mike Galbraith wrote:
> At 01:29 PM 3/10/2003 +1100, you wrote:
> >Tried running contest on 2.5.64-mm2 and mm4 and had the same thing happen.
> > It will hang reliably during process_load. I tried not running
> > process_load but it would still get stuck in one of the other loads
> > (either a tar load or list load). I can simply stop contest at that stage
> > but then the machine wont work well hanging at the console after a minute
> > or so. This started at mm2 (doesn't happen with mm1).
> >
> >Here is the sysrq-p and sysrq-t output during process_load (which hangs
> > every time):
>
> hmm, the below looks interesting to me...
>
> >ksoftirqd/0 R C129A000 2 1 3 (L-TLB)
> >Call Trace:
> > [<c0118f3e>] ksoftirqd+0x5e/0x9c
> > [<c0118ee0>] ksoftirqd+0x0/0x9c
> > [<c0106f1d>] kernel_thread_helper+0x5/0xc
>
> I see that too with irman. You could try renicing the shell you start
> contest from to >= +12. With irman, what appears to be cpu starvation
> ceases to be a problem at exactly +12. I also see kapmd constantly wanting
> to run but not being serviced.

Contest uses a modified process load from irman so it exhibits similar
behaviour. Not sure what +12 actually tells me though :-(

My simplistic understanding is that the pipe task in process_load gets
constantly elevated as "interactive" by the new scheduler, and nothing else
ever happens.

Con
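
The kind of load described above can be pictured with a small user-space sketch. This is not the actual process_load source from contest or irman; the function name pipe_ping_pong and the two-pipe layout are assumptions made purely for illustration. It shows two tasks that spend nearly all of their time blocked in pipe reads and writes (compare the pipe_wait frames in the sysrq-t output), which is the sleep-heavy pattern a scheduler might score as "interactive".

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical illustration only: bounce one byte between a parent and
 * a child over two pipes. Each side sleeps in read() until the other
 * side writes. Returns the number of completed round trips, or -1 if
 * the pipes or the child could not be created.
 */
static int pipe_ping_pong(int rounds)
{
	int to_child[2], to_parent[2];
	char token = 'x';
	int done = 0;
	pid_t pid;

	if (pipe(to_child) < 0)
		return -1;
	if (pipe(to_parent) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(to_parent[0]);
		close(to_parent[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: echo each byte back until the parent closes its end. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &token, 1) == 1) {
			if (write(to_parent[1], &token, 1) != 1)
				break;
		}
		_exit(0);
	}

	/* Parent: send a byte, then block in read() until it comes back. */
	close(to_child[0]);
	close(to_parent[1]);
	while (done < rounds) {
		if (write(to_child[1], &token, 1) != 1)
			break;
		if (read(to_parent[0], &token, 1) != 1)
			break;
		done++;
	}

	close(to_child[1]);	/* gives the child EOF so it exits */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);
	return done;
}
```

Calling pipe_ping_pong(100000) from a small main() keeps both tasks alternating between short bursts of work and sleeps on the pipe, which is the behaviour being discussed here.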

2003-03-10 09:50:22

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 08:12 PM 3/10/2003 +1100, Con Kolivas wrote:
>On Mon, 10 Mar 2003 18:05, Mike Galbraith wrote:
> > At 01:29 PM 3/10/2003 +1100, you wrote:
> > >Tried running contest on 2.5.64-mm2 and mm4 and had the same thing happen.
> > > It will hang reliably during process_load. I tried not running
> > > process_load but it would still get stuck in one of the other loads
> > > (either a tar load or list load). I can simply stop contest at that stage
> > > but then the machine wont work well hanging at the console after a minute
> > > or so. This started at mm2 (doesn't happen with mm1).
> > >
> > >Here is the sysrq-p and sysrq-t output during process_load (which hangs
> > > every time):
> >
> > hmm, the below looks interesting to me...
> >
> > >ksoftirqd/0 R C129A000 2 1 3 (L-TLB)
> > >Call Trace:
> > > [<c0118f3e>] ksoftirqd+0x5e/0x9c
> > > [<c0118ee0>] ksoftirqd+0x0/0x9c
> > > [<c0106f1d>] kernel_thread_helper+0x5/0xc
> >
> > I see that too with irman. You could try renicing the shell you start
> > contest from to >= +12. With irman, what appears to be cpu starvation
> > ceases to be a problem at exactly +12. I also see kapmd constantly wanting
> > to run but not being serviced.
>
>Contest uses a modified process load from irman so it exhibits similar
>behaviour. Not sure what +12 actually tells me though :-(

Aha! No wonder your symptoms look so similar. +12 is just a magic number
that works... found by trusty old trial and error method. What I wanted to
see was if your hang would also go away with the same magic number, or if
renicing with any value helped you at all.

>My simplistic understanding is that the pipe task in process_load gets
>constantly elevated as "interactive" by the new scheduler, and nothing else
>ever happens.

Appears so. I can make it "work" by doing a dinky (butt ugly:) tweak in
activate_task().

-Mike

2003-03-10 10:11:38

by William Lee Irwin III

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 08:12 PM 3/10/2003 +1100, Con Kolivas wrote:
>> Contest uses a modified process load from irman so it exhibits similar
>> behaviour. Not sure what +12 actually tells me though :-(

On Mon, Mar 10, 2003 at 11:05:25AM +0100, Mike Galbraith wrote:
> Aha! No wonder your symptoms look so similar. +12 is just a magic number
> that works... found by trusty old trial and error method. What I wanted to
> see was if your hang would also go away with the same magic number, or if
> renicing with any value helped you at all.

At 08:12 PM 3/10/2003 +1100, Con Kolivas wrote:
>> My simplistic understanding is that the pipe task in process_load gets
>> constantly elevated as "interactive" by the new scheduler, and nothing else
>> ever happens.

On Mon, Mar 10, 2003 at 11:05:25AM +0100, Mike Galbraith wrote:
> Appears so. I can make it "work" by doing a dinky (butt ugly:) tweak in
> activate_task().

IMHO directed yields should attempt to prevent priority inversion but
not elevate priorities otherwise. I'd bug mingo about it.

-- wli

2003-03-10 10:14:45

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 11:05 AM 3/10/2003 +0100, Mike Galbraith wrote:
>At 08:12 PM 3/10/2003 +1100, Con Kolivas wrote:
>>On Mon, 10 Mar 2003 18:05, Mike Galbraith wrote:
>> > At 01:29 PM 3/10/2003 +1100, you wrote:
>> > >Tried running contest on 2.5.64-mm2 and mm4 and had the same thing
>> happen.
>> > > It will hang reliably during process_load. I tried not running
>> > > process_load but it would still get stuck in one of the other loads
>> > > (either a tar load or list load). I can simply stop contest at that
>> stage
>> > > but then the machine wont work well hanging at the console after a
>> minute
>> > > or so. This started at mm2 (doesn't happen with mm1).
>> > >
>> > >Here is the sysrq-p and sysrq-t output during process_load (which hangs
>> > > every time):
>> >
>> > hmm, the below looks interesting to me...
>> >
>> > >ksoftirqd/0 R C129A000 2 1 3 (L-TLB)
>> > >Call Trace:
>> > > [<c0118f3e>] ksoftirqd+0x5e/0x9c
>> > > [<c0118ee0>] ksoftirqd+0x0/0x9c
>> > > [<c0106f1d>] kernel_thread_helper+0x5/0xc
>> >
>> > I see that too with irman. You could try renicing the shell you start
>> > contest from to >= +12. With irman, what appears to be cpu starvation
>> > ceases to be a problem at exactly +12. I also see kapmd constantly
>> wanting
>> > to run but not being serviced.
>>
>>Contest uses a modified process load from irman so it exhibits similar
>>behaviour. Not sure what +12 actually tells me though :-(
>
>Aha! No wonder your symptoms look so similar. +12 is just a magic number
>that works... found by trusty old trial and error method. What I wanted
>to see was if your hang would also go away with the same magic number, or
>if renicing with any value helped you at all.
>
>>My simplistic understanding is that the pipe task in process_load gets
>>constantly elevated as "interactive" by the new scheduler, and nothing else
>>ever happens.
>
>Appears so. I can make it "work" by doing a dinky (butt ugly:) tweak in
>activate_task().

Oh, what the heck. Even a butt ugly patch that works has informational
value. Can you try the attached please? If it works for you too, maybe
it'll tell Ingo something.

The numbers irman spits out with this (cough cough) patch are mucho better
than stock, and better than the ones I get once in a while when it (rare)
doesn't starve me to death with combo.

-Mike

2003-03-10 10:20:40

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> Ahem. Attached this time.

I assume this is against bk? I'll massage it into 2.5.64-mm4

Con

2003-03-10 10:16:13

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

Ahem. Attached this time.


Attachments:
buttugly.diff (434.00 B)

2003-03-10 10:16:45

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Mon, 10 Mar 2003 21:29, Mike Galbraith wrote:
> At 11:05 AM 3/10/2003 +0100, Mike Galbraith wrote:
> >At 08:12 PM 3/10/2003 +1100, Con Kolivas wrote:
> >>On Mon, 10 Mar 2003 18:05, Mike Galbraith wrote:
> >> > At 01:29 PM 3/10/2003 +1100, you wrote:
> >> > >Tried running contest on 2.5.64-mm2 and mm4 and had the same thing
> >>
> >> happen.
> >>
> >> > > It will hang reliably during process_load. I tried not running
> >> > > process_load but it would still get stuck in one of the other loads
> >> > > (either a tar load or list load). I can simply stop contest at that
> >>
> >> stage
> >>
> >> > > but then the machine wont work well hanging at the console after a
> >>
> >> minute
> >>
> >> > > or so. This started at mm2 (doesn't happen with mm1).
> >> > >
> >> > >Here is the sysrq-p and sysrq-t output during process_load (which
> >> > > hangs every time):
> >> >
> >> > hmm, the below looks interesting to me...
> >> >
> >> > >ksoftirqd/0 R C129A000 2 1 3 (L-TLB)
> >> > >Call Trace:
> >> > > [<c0118f3e>] ksoftirqd+0x5e/0x9c
> >> > > [<c0118ee0>] ksoftirqd+0x0/0x9c
> >> > > [<c0106f1d>] kernel_thread_helper+0x5/0xc
> >> >
> >> > I see that too with irman. You could try renicing the shell you start
> >> > contest from to >= +12. With irman, what appears to be cpu starvation
> >> > ceases to be a problem at exactly +12. I also see kapmd constantly
> >>
> >> wanting
> >>
> >> > to run but not being serviced.
> >>
> >>Contest uses a modified process load from irman so it exhibits similar
> >>behaviour. Not sure what +12 actually tells me though :-(
> >
> >Aha! No wonder your symptoms look so similar. +12 is just a magic number
> >that works... found by trusty old trial and error method. What I wanted
> >to see was if your hang would also go away with the same magic number, or
> >if renicing with any value helped you at all.
> >
> >>My simplistic understanding is that the pipe task in process_load gets
> >>constantly elevated as "interactive" by the new scheduler, and nothing
> >> else ever happens.
> >
> >Appears so. I can make it "work" by doing a dinky (butt ugly:) tweak in
> >activate_task().
>
> Oh, what the heck. Even a butt ugly patch that works has informational
> value. Can you try the attached please? If it works for you too, maybe
> it'll tell Ingo something.

Sure. How about attaching it?

Con

2003-03-10 10:28:09

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> > Ahem. Attached this time.
>
>I assume this is against bk? I'll massage it into 2.5.64-mm4

It's against 2.5.64-combo.

-Mike

2003-03-10 10:32:03

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Mon, 10 Mar 2003 21:43, Mike Galbraith wrote:
> At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> >On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> > > Ahem. Attached this time.
> >
> >I assume this is against bk? I'll massage it into 2.5.64-mm4
>
> It's against 2.5.64-combo.

Ok tested and it fixes it. Now what?

Just for the record this is the version I have modified it to on 2.5.64-mm4:

sleep_avg = (p->sleep_avg + sleep_time) / (1 + rq->nr_running);
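
A minimal userspace sketch of what that divisor does to the credited sleep time. This is not the kernel code: damped_sleep_avg() is a hypothetical helper and the sample values are made up for illustration; only the arithmetic of the line above is reproduced.

```c
/*
 * Hypothetical helper, not from any kernel tree: same arithmetic as the
 * one-liner above, with the task and runqueue fields passed in as plain
 * integers so it can be tried in userspace.
 */
long damped_sleep_avg(long sleep_avg, long sleep_time, long nr_running)
{
	/* The more runnable tasks there are, the less sleep credit survives. */
	return (sleep_avg + sleep_time) / (1 + nr_running);
}
```

With 500 ticks banked and 100 ticks just slept, nr_running values of 0, 1, 2, 3 and 4 give 600, 300, 200, 150 and 120, so a busy runqueue quickly erodes the bonus. The same erosion hits every waking task, genuinely interactive or not.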


Con

2003-03-10 10:39:12

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 09:42 PM 3/10/2003 +1100, Con Kolivas wrote:
>On Mon, 10 Mar 2003 21:43, Mike Galbraith wrote:
> > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> > >On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> > > > Ahem. Attached this time.
> > >
> > >I assume this is against bk? I'll massage it into 2.5.64-mm4
> >
> > It's against 2.5.64-combo.
>
>Ok tested and it fixes it. Now what?
>
>Just for the record this is the version I have modified it to on 2.5.64-mm4:
>
> sleep_avg = (p->sleep_avg + sleep_time) / (1 + rq->nr_running);

Wait for Ingo to say "why in GOD's name would anyone do something _so_
stupid!" ?:)))

2003-03-10 12:32:19

by Ed Tomlinson

Subject: Re: 2.5.64-mm2->4 hangs on contest

Mike Galbraith wrote:

> At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
>>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
>> > Ahem. Attached this time.
>>
>>I assume this is against bk? I'll massage it into 2.5.64-mm4

Suspect that the interactivity changes have made the problem that my
ptg patch is designed to fix easier to hit. Con where is the latest
contest (a quick google does not help)? Mike what version of irman
are you using? The one I have has problems parsing /proc/mem in mm.

Here is the ptg patch for 64-mm4. To apply it reverse the
schedule-tunables-fix and schedule-tuneables and then apply
the ptg and update tuneables patch.

Ed Tomlinson

----------- ptg-D3-mm2 (applies to mm4) ----------
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1087 -> 1.1088
# include/linux/sched.h 1.137 -> 1.138
# kernel/fork.c 1.112 -> 1.113
# kernel/user.c 1.6 -> 1.7
# kernel/sched.c 1.163 -> 1.164
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/03/07 [email protected] 1.1088
# Add user and thread group governors to prevent either from monopolizing
# the system. The governors work by limiting the sum of the timeslices
# of active tasks in a group to <n> timeslices. The defaults set <n> to
# 1.5 for thread groups and to 10 for user tasks.
#
# For NUMA systems the governors are per node. Ideally the storage
# should also be local to the node; however, we do not have dynamic per
# node storage yet...
# --------------------------------------------
#
diff -Nru a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h Sat Mar 8 14:18:48 2003
+++ b/include/linux/sched.h Sat Mar 8 14:18:48 2003
@@ -178,6 +178,11 @@

#include <linux/aio.h>

+struct ptg_struct { /* pseudo thread groups */
+ atomic_t active[MAX_NUMNODES];
+ atomic_t count; /* number of refs */
+};
+
struct mm_struct {
struct vm_area_struct * mmap; /* list of VMAs */
struct rb_root mm_rb;
@@ -278,6 +283,7 @@
struct user_struct {
atomic_t __count; /* reference count */
atomic_t processes; /* How many processes does this user have? */
+ atomic_t active[MAX_NUMNODES];
atomic_t files; /* How many open files does this user have? */

/* Hash table maintenance information */
@@ -344,6 +350,8 @@
struct list_head ptrace_list;

struct mm_struct *mm, *active_mm;
+ struct ptg_struct * ptgroup; /* pseudo thread group for this task */
+ atomic_t *governor; /* the atomic_t that governs this task */

/* task state */
struct linux_binfmt *binfmt;
diff -Nru a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c Sat Mar 8 14:18:48 2003
+++ b/kernel/fork.c Sat Mar 8 14:18:48 2003
@@ -96,12 +96,24 @@
}
}

+void free_ptgroup(struct task_struct *tsk)
+{
+ if (tsk->ptgroup && atomic_sub_and_test(1,&tsk->ptgroup->count)) {
+ kfree(tsk->ptgroup);
+ tsk->ptgroup = NULL;
+ tsk->governor = &tsk->user->active[cpu_to_node(task_cpu(tsk))];
+ if (tsk == current)
+ atomic_inc(tsk->governor);
+ }
+}
+
void __put_task_struct(struct task_struct *tsk)
{
WARN_ON(!(tsk->state & (TASK_DEAD | TASK_ZOMBIE)));
WARN_ON(atomic_read(&tsk->usage));
WARN_ON(tsk == current);

+ free_ptgroup(tsk);
security_task_free(tsk);
free_uid(tsk->user);
free_task_struct(tsk);
@@ -469,6 +481,7 @@

tsk->mm = NULL;
tsk->active_mm = NULL;
+ tsk->ptgroup = NULL;

/*
* Are we cloning a kernel thread?
@@ -734,6 +747,32 @@
p->flags = new_flags;
}

+static inline int setup_governor(unsigned long clone_flags, struct task_struct *p)
+{
+ if ( ((clone_flags & CLONE_VM) && (clone_flags & CLONE_FILES)) ||
+ (clone_flags & CLONE_THREAD)) {
+ if (current->ptgroup)
+ atomic_inc(&current->ptgroup->count);
+ else {
+ int i;
+ current->ptgroup = kmalloc(sizeof(struct ptg_struct), GFP_ATOMIC);
+ if (!current->ptgroup)
+ return 1;
+ /* printk(KERN_INFO "ptgroup - pid %u\n",current->pid); */
+ atomic_set(&current->ptgroup->count,2);
+ for(i=0; i < MAX_NUMNODES; i++)
+ atomic_set(&current->ptgroup->active[i], 0);
+ atomic_set(&current->ptgroup->active[numa_node_id()], 1);
+ atomic_dec(current->governor);
+ current->governor = &current->ptgroup->active[numa_node_id()];
+ }
+ p->ptgroup = current->ptgroup;
+ p->governor = &p->ptgroup->active[numa_node_id()];
+ } else
+ p->governor = &p->user->active[numa_node_id()];
+ return 0;
+}
+
asmlinkage int sys_set_tid_address(int *tidptr)
{
current->clear_child_tid = tidptr;
@@ -876,6 +915,12 @@
goto bad_fork_cleanup_mm;
retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
if (retval)
+ goto bad_fork_cleanup_namespace;
+ /*
+ * Setup the governor pointer for the new process, allocating a new ptg as
+ * required if the process is a thread.
+ */
+ if (setup_governor(clone_flags, p))
goto bad_fork_cleanup_namespace;

if (clone_flags & CLONE_CHILD_SETTID)
diff -Nru a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c Sat Mar 8 14:18:48 2003
+++ b/kernel/sched.c Sat Mar 8 14:18:48 2003
@@ -68,6 +68,9 @@
#define MAX_SLEEP_AVG (10*HZ)
#define STARVATION_LIMIT (10*HZ)
#define NODE_THRESHOLD 125
+#define THREAD_GOVERNOR 15 /* allow threads groups 1.5 full timeslices */
+#define USER_GOVERNOR 100 /* allow user 10 full timeslices */
+

/*
* If a task is 'interactive' then we reinsert it in the active
@@ -123,7 +126,26 @@

static inline unsigned int task_timeslice(task_t *p)
{
- return BASE_TIMESLICE(p);
+ int slice = BASE_TIMESLICE(p);
+ int threads = atomic_read(p->governor) * 10;
+ int govern = threads;
+ if (p->user->uid)
+ govern = (p->ptgroup) ? THREAD_GOVERNOR : USER_GOVERNOR;
+ if (threads > govern) {
+ slice = (slice * govern) / threads;
+ slice = (slice > MIN_TIMESLICE) ? slice : MIN_TIMESLICE;
+ }
+#if 1
+ {
+ static int next;
+ if (time_after(jiffies, next)) {
+ printk(KERN_INFO "uid %d pid %d nod %d ptg %x gov %x threads %d lim %d slice %d\n",
+ p->uid, p->pid, numa_node_id(), p->ptgroup, p->governor, threads/10, govern, slice);
+ next = jiffies + HZ*300;
+ }
+ }
+#endif
+ return slice;
}

/*
@@ -197,16 +219,18 @@
rq->node_nr_running = &node_nr_running[0];
}

-static inline void nr_running_inc(runqueue_t *rq)
+static inline void nr_running_inc(task_t *p, runqueue_t *rq)
{
atomic_inc(rq->node_nr_running);
rq->nr_running++;
+ atomic_inc(p->governor);
}

-static inline void nr_running_dec(runqueue_t *rq)
+static inline void nr_running_dec(task_t *p, runqueue_t *rq)
{
atomic_dec(rq->node_nr_running);
rq->nr_running--;
+ atomic_dec(p->governor);
}

__init void node_nr_running_init(void)
@@ -220,8 +244,8 @@
#else /* !CONFIG_NUMA */

# define nr_running_init(rq) do { } while (0)
-# define nr_running_inc(rq) do { (rq)->nr_running++; } while (0)
-# define nr_running_dec(rq) do { (rq)->nr_running--; } while (0)
+# define nr_running_inc(p, rq) do { (rq)->nr_running++; atomic_inc((p)->governor); } while (0)
+# define nr_running_dec(p, rq) do { (rq)->nr_running--; atomic_dec((p)->governor); } while (0)

#endif /* CONFIG_NUMA */

@@ -326,7 +350,7 @@
static inline void __activate_task(task_t *p, runqueue_t *rq)
{
enqueue_task(p, rq->active);
- nr_running_inc(rq);
+ nr_running_inc(p, rq);
}

/*
@@ -387,7 +411,7 @@
*/
static inline void deactivate_task(struct task_struct *p, runqueue_t *rq)
{
- nr_running_dec(rq);
+ nr_running_dec(p, rq);
if (p->state == TASK_UNINTERRUPTIBLE)
rq->nr_uninterruptible++;
dequeue_task(p, p->array);
@@ -586,7 +610,7 @@
list_add_tail(&p->run_list, &current->run_list);
p->array = current->array;
p->array->nr_active++;
- nr_running_inc(rq);
+ nr_running_inc(p, rq);
}
task_rq_unlock(rq, &flags);
}
@@ -1004,9 +1028,15 @@
static inline void pull_task(runqueue_t *src_rq, prio_array_t *src_array, task_t *p, runqueue_t *this_rq, int this_cpu)
{
dequeue_task(p, src_array);
- nr_running_dec(src_rq);
+ nr_running_dec(p, src_rq);
set_task_cpu(p, this_cpu);
- nr_running_inc(this_rq);
+#ifdef CONFIG_NUMA
+ if (p->ptgroup)
+ p->governor = &p->ptgroup->active[cpu_to_node(this_cpu)];
+ else
+ p->governor = &p->user->active[cpu_to_node(this_cpu)];
+#endif
+ nr_running_inc(p, this_rq);
enqueue_task(p, this_rq->active);
/*
* Note that idle threads have a prio of MAX_PRIO, for this test
@@ -2521,6 +2551,8 @@
rq->curr = current;
rq->idle = current;
set_task_cpu(current, smp_processor_id());
+ current->governor = &current->user->active[numa_node_id()];
+ atomic_inc(current->governor);
wake_up_forked_process(current);
current->prio = MAX_PRIO;

diff -Nru a/kernel/user.c b/kernel/user.c
--- a/kernel/user.c Sat Mar 8 14:18:48 2003
+++ b/kernel/user.c Sat Mar 8 14:18:48 2003
@@ -30,6 +30,7 @@
struct user_struct root_user = {
.__count = ATOMIC_INIT(1),
.processes = ATOMIC_INIT(1),
+ .active = {[0 ...MAX_NUMNODES-1] = ATOMIC_INIT(0)},
.files = ATOMIC_INIT(0)
};

@@ -89,6 +90,7 @@

if (!up) {
struct user_struct *new;
+ int i;

new = kmem_cache_alloc(uid_cachep, SLAB_KERNEL);
if (!new)
@@ -96,6 +98,8 @@
new->uid = uid;
atomic_set(&new->__count, 1);
atomic_set(&new->processes, 0);
+ for(i=0; i < MAX_NUMNODES; i++)
+ atomic_set(&new->active[i], 0);
atomic_set(&new->files, 0);

/*
@@ -130,6 +134,11 @@
atomic_inc(&new_user->processes);
atomic_dec(&old_user->processes);
current->user = new_user;
+ if (!current->ptgroup) {
+ atomic_dec(current->governor);
+ current->governor = &current->user->active[numa_node_id()];
+ atomic_inc(current->governor);
+ }
free_uid(old_user);
}

----------- schedule-tunables for ptg -------------
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1088 -> 1.1089
# kernel/sysctl.c 1.39 -> 1.40
# Documentation/filesystems/proc.txt 1.14 -> 1.15
# include/linux/sysctl.h 1.42 -> 1.43
# kernel/sched.c 1.164 -> 1.165
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/03/07 [email protected] 1.1089
# + schedule tunables for ptg
# --------------------------------------------
#
diff -Nru a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
--- a/Documentation/filesystems/proc.txt Sat Mar 8 14:19:12 2003
+++ b/Documentation/filesystems/proc.txt Sat Mar 8 14:19:12 2003
@@ -37,6 +37,7 @@
2.8 /proc/sys/net/ipv4 - IPV4 settings
2.9 Appletalk
2.10 IPX
+ 2.11 /proc/sys/sched - scheduler tunables

------------------------------------------------------------------------------
Preface
@@ -1666,6 +1667,113 @@
The /proc/net/ipx_route table holds a list of IPX routes. For each route it
gives the destination network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.
+
+2.11 /proc/sys/sched - scheduler tunables
+-----------------------------------------
+
+Useful knobs for tuning the scheduler live in /proc/sys/sched.
+
+child_penalty
+-------------
+
+Percentage of the parent's sleep_avg that children inherit. sleep_avg is
+a running average of the time a process spends sleeping. Tasks with high
+sleep_avg values are considered interactive and given a higher dynamic
+priority and a larger timeslice. You typically want this some value just
+under 100.
+
+exit_weight
+-----------
+
+When a CPU hog task exits, its parent's sleep_avg is reduced by a factor of
+exit_weight against the exiting task's sleep_avg.
+
+interactive_delta
+-----------------
+
+If a task is "interactive" it is reinserted into the active array after it
+has expired its timeslice, instead of being inserted into the expired array.
+How "interactive" a task must be in order to be deemed interactive is a
+function of its nice value. This interactive limit is scaled linearly by nice
+value and is offset by the interactive_delta.
+
+max_sleep_avg
+-------------
+
+max_sleep_avg is the largest value (in ms) stored for a task's running sleep
+average. The larger this value, the longer a task needs to sleep to be
+considered interactive (maximum interactive bonus is a function of
+max_sleep_avg).
+
+max_timeslice
+-------------
+
+Maximum timeslice, in milliseconds. This is the value given to tasks of the
+highest dynamic priority.
+
+min_timeslice
+-------------
+
+Minimum timeslice, in milliseconds. This is the value given to tasks of the
+lowest dynamic priority. Every task gets at least this slice of the processor
+per array switch.
+
+parent_penalty
+--------------
+
+Percentage of the parent's sleep_avg that it retains across a fork().
+sleep_avg is a running average of the time a process spends sleeping. Tasks
+with high sleep_avg values are considered interactive and given a higher
+dynamic priority and a larger timeslice. Normally, this value is 100 and thus
+tasks retain their sleep_avg on fork. If you want to punish interactive
+tasks for forking, set this below 100.
+
+prio_bonus_ratio
+----------------
+
+Middle percentage of the priority range that tasks can receive as a dynamic
+priority. The default value of 25% ensures that nice values at the
+extremes are still enforced. For example, nice +19 interactive tasks will
+never be able to preempt a nice 0 CPU hog. Setting this higher will increase
+the size of the priority range the tasks can receive as a bonus. Setting
+this lower will decrease this range, making the interactivity bonus less
+apparent and user nice values more applicable.
+
+starvation_limit
+----------------
+
+Sufficiently interactive tasks are reinserted into the active array when they
+run out of timeslice. Normally, tasks are inserted into the expired array.
+Reinserting interactive tasks into the active array allows them to remain
+runnable, which is important to interactive performance. This could starve
+expired tasks, however, since the interactive task could prevent the array
+switch. To prevent starving the tasks on the expired array for too long, the
+starvation_limit is the longest (in ms) we will let the expired array starve
+at the expense of reinserting interactive tasks back into active. Higher
+values here give more preference to running interactive tasks, at the expense
+of expired tasks. Lower values provide more fair scheduling behavior, at the
+expense of interactivity. The units are in milliseconds.
+
+thread_governor
+---------------
+
+When the number of active threads in a threadgroup exceeds the limit, reduce
+the timeslices of active members by thread_governor / active_threads_in_group.
+In the NUMA case the limits are per node. The thread_governor is applied before
+the user_governor. Units are number of threads times 10.
+
+user_governor
+-------------
+
+When the number of active threads of a user exceeds the limit, reduce the
+timeslices of the user's active processes by user_governor / active_threads_of_user.
+In the NUMA case the limits are per node. Units are number of threads times 10.
+
+node_threshold
+--------------
+
+Consider NUMA nodes imbalanced when there is a difference of more than this
+percentage.

------------------------------------------------------------------------------
Summary
diff -Nru a/include/linux/sysctl.h b/include/linux/sysctl.h
--- a/include/linux/sysctl.h Sat Mar 8 14:19:12 2003
+++ b/include/linux/sysctl.h Sat Mar 8 14:19:12 2003
@@ -66,7 +66,8 @@
CTL_DEV=7, /* Devices */
CTL_BUS=8, /* Busses */
CTL_ABI=9, /* Binary emulation */
- CTL_CPU=10 /* CPU stuff (speed scaling, etc) */
+ CTL_CPU=10, /* CPU stuff (speed scaling, etc) */
+ CTL_SCHED=11, /* scheduler tunables */
};

/* CTL_BUS names: */
@@ -157,6 +158,21 @@
VM_LOWER_ZONE_PROTECTION=20,/* Amount of protection of lower zones */
};

+/* Tunable scheduler parameters in /proc/sys/sched/ */
+enum {
+ SCHED_MIN_TIMESLICE=1, /* minimum process timeslice */
+ SCHED_MAX_TIMESLICE=2, /* maximum process timeslice */
+ SCHED_CHILD_PENALTY=3, /* penalty on fork to child */
+ SCHED_PARENT_PENALTY=4, /* penalty on fork to parent */
+ SCHED_EXIT_WEIGHT=5, /* penalty to parent of CPU hog child */
+ SCHED_PRIO_BONUS_RATIO=6, /* percent of max prio given as bonus */
+ SCHED_INTERACTIVE_DELTA=7, /* delta used to scale interactivity */
+ SCHED_MAX_SLEEP_AVG=8, /* maximum sleep avg attainable */
+ SCHED_STARVATION_LIMIT=9, /* no re-active if expired is starved */
+ SCHED_NODE_THRESHOLD=10, /* NUMA imbalance threshold */
+ SCHED_THREAD_GOVERNOR=11, /* govern threadgroups when needed */
+ SCHED_USER_GOVERNOR=12, /* govern users when needed */
+};

/* CTL_NET names: */
enum
diff -Nru a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c Sat Mar 8 14:19:12 2003
+++ b/kernel/sched.c Sat Mar 8 14:19:12 2003
@@ -57,19 +57,35 @@
* Minimum timeslice is 10 msecs, default timeslice is 100 msecs,
* maximum timeslice is 200 msecs. Timeslices get refilled after
* they expire.
+ *
+ * They are configurable via /proc/sys/sched
*/
-#define MIN_TIMESLICE ( 10 * HZ / 1000)
-#define MAX_TIMESLICE (200 * HZ / 1000)
-#define CHILD_PENALTY 50
-#define PARENT_PENALTY 100
-#define EXIT_WEIGHT 3
-#define PRIO_BONUS_RATIO 25
-#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (10*HZ)
-#define STARVATION_LIMIT (10*HZ)
-#define NODE_THRESHOLD 125
-#define THREAD_GOVERNOR 15 /* allow threads groups 1.5 full timeslices */
-#define USER_GOVERNOR 100 /* allow user 10 full timeslices */
+
+int min_timeslice = (10 * HZ) / 1000;
+int max_timeslice = (200 * HZ) / 1000;
+int child_penalty = 50;
+int parent_penalty = 100;
+int exit_weight = 3;
+int prio_bonus_ratio = 25;
+int interactive_delta = 2;
+int max_sleep_avg = 10 * HZ;
+int starvation_limit = 10 * HZ;
+int node_threshold = 125;
+int thread_governor = 15;
+int user_governor = 100;
+
+#define MIN_TIMESLICE (min_timeslice)
+#define MAX_TIMESLICE (max_timeslice)
+#define CHILD_PENALTY (child_penalty)
+#define PARENT_PENALTY (parent_penalty)
+#define EXIT_WEIGHT (exit_weight)
+#define PRIO_BONUS_RATIO (prio_bonus_ratio)
+#define INTERACTIVE_DELTA (interactive_delta)
+#define MAX_SLEEP_AVG (max_sleep_avg)
+#define STARVATION_LIMIT (starvation_limit)
+#define NODE_THRESHOLD (node_threshold)
+#define THREAD_GOVERNOR (thread_governor)
+#define USER_GOVERNOR (user_governor)


/*
diff -Nru a/kernel/sysctl.c b/kernel/sysctl.c
--- a/kernel/sysctl.c Sat Mar 8 14:19:12 2003
+++ b/kernel/sysctl.c Sat Mar 8 14:19:12 2003
@@ -55,6 +55,18 @@
extern int cad_pid;
extern int pid_max;
extern int sysctl_lower_zone_protection;
+extern int min_timeslice;
+extern int max_timeslice;
+extern int child_penalty;
+extern int parent_penalty;
+extern int exit_weight;
+extern int prio_bonus_ratio;
+extern int interactive_delta;
+extern int max_sleep_avg;
+extern int starvation_limit;
+extern int node_threshold;
+extern int thread_governor;
+extern int user_governor;

/* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
static int maxolduid = 65535;
@@ -112,6 +124,7 @@

static ctl_table kern_table[];
static ctl_table vm_table[];
+static ctl_table sched_table[];
#ifdef CONFIG_NET
extern ctl_table net_table[];
#endif
@@ -156,6 +169,7 @@
{CTL_FS, "fs", NULL, 0, 0555, fs_table},
{CTL_DEBUG, "debug", NULL, 0, 0555, debug_table},
{CTL_DEV, "dev", NULL, 0, 0555, dev_table},
+ {CTL_SCHED, "sched", NULL, 0, 0555, sched_table},
{0}
};

@@ -358,7 +372,47 @@

static ctl_table dev_table[] = {
{0}
-};
+};
+
+static ctl_table sched_table[] = {
+ {SCHED_MAX_TIMESLICE, "max_timeslice", &max_timeslice,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &one, NULL},
+ {SCHED_MIN_TIMESLICE, "min_timeslice", &min_timeslice,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &one, NULL},
+ {SCHED_CHILD_PENALTY, "child_penalty", &child_penalty,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_PARENT_PENALTY, "parent_penalty", &parent_penalty,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_EXIT_WEIGHT, "exit_weight", &exit_weight,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_PRIO_BONUS_RATIO, "prio_bonus_ratio", &prio_bonus_ratio,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_INTERACTIVE_DELTA, "interactive_delta", &interactive_delta,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_MAX_SLEEP_AVG, "max_sleep_avg", &max_sleep_avg,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &one, NULL},
+ {SCHED_STARVATION_LIMIT, "starvation_limit", &starvation_limit,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_NODE_THRESHOLD, "node_threshold", &node_threshold,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_THREAD_GOVERNOR, "thread_governor", &thread_governor,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {SCHED_USER_GOVERNOR, "user_governor", &user_governor,
+ sizeof(int), 0644, NULL, &proc_dointvec_minmax,
+ &sysctl_intvec, NULL, &zero, NULL},
+ {0}
+};

extern void init_irq_proc (void);

-----------

2003-03-10 12:36:58

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Mon, 10 Mar 2003 23:43, Ed Tomlinson wrote:
> Mike Galbraith wrote:
> > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> >>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> >> > Ahem. Attached this time.
> >>
> >>I assume this is against bk? I'll massage it into 2.5.64-mm4
>
> Suspect that the interactivity changes have made the problem that my
> ptg patch is designed to fix easier to hit. Con where is the latest
> contest (a quick google does not help)? Mike what version of irman


http://contest.kolivas.org

Con

2003-03-10 13:26:02

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 07:43 AM 3/10/2003 -0500, Ed Tomlinson wrote:
>Mike Galbraith wrote:
>
> > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> >>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> >> > Ahem. Attached this time.
> >>
> >>I assume this is against bk? I'll massage it into 2.5.64-mm4
>
>Suspect that the interactivity changes have made the problem that my
>ptg patch is designed to fix easier to hit. Con where is the latest
>contest (a quick google does not help)? Mike what version of irman
>are you using? The one I have has problems parsing /proc/mem in mm.

Version 0.5. (also has parse problem)

2003-03-10 13:30:08

by Ed Tomlinson

Subject: Re: 2.5.64-mm2->4 hangs on contest

On March 10, 2003 08:41 am, Mike Galbraith wrote:
> At 07:43 AM 3/10/2003 -0500, Ed Tomlinson wrote:
> >Mike Galbraith wrote:
> > > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> > >>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> > >> > Ahem. Attached this time.
> > >>
> > >>I assume this is against bk? I'll massage it into 2.5.64-mm4
> >
> >Suspect that the interactivity changes have made the problem that my
> >ptg patch is designed to fix easier to hit. Con where is the latest
> >contest (a quick google does not help)? Mike what version of irman
> >are you using? The one I have has problems parsing /proc/mem in mm.
>
> Version 0.5. (also has parse problem)

If you guys play with the ptg patch there are two tunables you should
be aware of. The first is thread_governor, which probably will not need
to be touched. The second is user_governor. This one you may want to
reduce from 100 to 50 - meaning start reducing a user's timeslices when
there are more than 5 tasks in the run queue for that user.
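
The arithmetic behind those numbers can be tried in userspace. The sketch below mirrors the scaling in the patch's task_timeslice(), but governed_slice() is a hypothetical name, the inputs are plain integers rather than task fields, and the uid 0 exemption in the patch is left out.

```c
/*
 * Hypothetical helper mirroring the slice scaling in the ptg patch.
 *   base_slice - the ungoverned timeslice (BASE_TIMESLICE in the patch)
 *   active     - active tasks counted by the governor on this node
 *   governor   - thread_governor or user_governor, in tasks times 10
 *   min_slice  - floor for the result (MIN_TIMESLICE in the patch)
 * The patch skips governing for uid 0; that case is not modelled here.
 */
int governed_slice(int base_slice, int active, int governor, int min_slice)
{
	int threads = active * 10;	/* same times-10 units as the tunable */
	int slice = base_slice;

	if (threads > governor) {
		slice = (slice * governor) / threads;
		if (slice < min_slice)
			slice = min_slice;
	}
	return slice;
}
```

With a base slice of 100 and user_governor at 50, five active tasks still get 100, six get 83, and ten get 50. At the default of 100, nothing changes until an eleventh task becomes active.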

Ed

2003-03-10 13:49:03

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 07:43 AM 3/10/2003 -0500, Ed Tomlinson wrote:
>Mike Galbraith wrote:
>
> > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> >>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> >> > Ahem. Attached this time.
> >>
> >>I assume this is against bk? I'll massage it into 2.5.64-mm4
>
>Suspect that the interactivity changes have made the problem that my
>ptg patch is designed to fix easier to hit.

I think the problem I'm hitting is that the HUGE context switching that
irman does causes the process load proggies to constantly max out their
sleep average. There are no threads involved in irman, it's just plain old
fork().
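
A toy model of that hypothesis, under assumed rules that only approximate the scheduler: each wakeup credits the whole sleep to sleep_avg and clamps it at a cap, and each tick spent running debits one. simulate_sleep_avg() and its parameters are hypothetical, not kernel code.

```c
/*
 * Toy model, not kernel code. Assumed rules: the full sleep is credited
 * at wakeup and clamped at max_sleep_avg; every tick spent running then
 * debits one tick, never going below zero.
 */
long simulate_sleep_avg(long cycles, long sleep_ticks, long run_ticks,
			long max_sleep_avg)
{
	long sleep_avg = 0;
	long i;

	for (i = 0; i < cycles; i++) {
		sleep_avg += sleep_ticks;	/* credited at wakeup */
		if (sleep_avg > max_sleep_avg)
			sleep_avg = max_sleep_avg;
		sleep_avg -= run_ticks;		/* debited while running */
		if (sleep_avg < 0)
			sleep_avg = 0;
	}
	return sleep_avg;
}
```

Under these assumptions a task that sleeps 2 ticks and runs 1 per cycle climbs by one tick each cycle and then sits one tick below the cap indefinitely, while one that runs longer than it sleeps stays at zero. A pipe ping-pong that blocks slightly more than it computes would therefore look maximally interactive no matter how much CPU the group consumes in total.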

-Mike

2003-03-10 13:57:44

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 08:41 AM 3/10/2003 -0500, Ed Tomlinson wrote:
>On March 10, 2003 08:41 am, Mike Galbraith wrote:
> > At 07:43 AM 3/10/2003 -0500, Ed Tomlinson wrote:
> > >Mike Galbraith wrote:
> > > > At 09:31 PM 3/10/2003 +1100, Con Kolivas wrote:
> > > >>On Mon, 10 Mar 2003 21:31, Mike Galbraith wrote:
> > > >> > Ahem. Attached this time.
> > > >>
> > > >>I assume this is against bk? I'll massage it into 2.5.64-mm4
> > >
> > >Suspect that the interactivity changes have made the problem that my
> > >ptg patch is designed to fix easier to hit. Con where is the latest
> > >contest (a quick google does not help)? Mike what version of irman
> > >are you using? The one I have has problems parsing /proc/mem in mm.
> >
> > Version 0.5. (also has parse problem)
>
>If you guys play with the ptg patch there are two tunables you should
>be aware of. The first is thread_governor, which probably will not need
>to be touched. The second is user_governor. This one you may want to
>reduce from 100 to 50 - meaning start reducing a user's timeslices when
>there are more than 5 tasks in the run queue for that user.

In my experiments, reducing timeslice does horrible things to
throughput. The kind of testing I do is mostly throughput oriented...
latency is secondary when you're running full throttle ;-)

-Mike

2003-03-12 09:06:04

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

I can't help myself... the attached is just too simple and works too darn well.

Somebody stop me! :)

-Mike


Attachments:
bk5_sched.diff (571.00 B)

2003-03-12 10:15:06

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Wed, 12 Mar 2003 20:21, Mike Galbraith wrote:
> I can't help myself... the attached is just too simple and works too darn
> well.
>
> Somebody stop me! :)

Sssssssssssmokin are ya Mike?

Is this in addition to your previous errr hack or instead of?

Con

2003-03-12 10:22:31

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 09:25 PM 3/12/2003 +1100, Con Kolivas wrote:
>On Wed, 12 Mar 2003 20:21, Mike Galbraith wrote:
> > I can't help myself... the attached is just too simple and works too darn
> > well.
> >
> > Somebody stop me! :)
>
>Sssssssssssmokin are ya Mike?

;-)

>Is this in addition to your previous errr hack or instead of?

Instead of. The buttugly patch destroyed interactivity. This one cures
starvation, and interactivity is really nice.

-Mike

2003-03-12 11:09:11

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Wed, 12 Mar 2003 21:37, Mike Galbraith wrote:
> At 09:25 PM 3/12/2003 +1100, Con Kolivas wrote:
> >On Wed, 12 Mar 2003 20:21, Mike Galbraith wrote:
> > > I can't help myself... the attached is just too simple and works too
> > > darn well.
> > >
> > > Somebody stop me! :)
> >
> >Sssssssssssmokin are ya Mike?
>
> ;-)
>
> >Is this in addition to your previous errr hack or instead of?
>
> Instead of. The buttugly patch destroyed interactivity. This one cures
> starvation, and interactivity is really nice.

Ok that fixes the "getting stuck in process load" but it still hangs on
contest. I'll just have to give mm5 a go and see if whatever problem that was
went away in the mean time.

Con

2003-03-12 12:15:41

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 10:19 PM 3/12/2003 +1100, Con Kolivas wrote:
>On Wed, 12 Mar 2003 21:37, Mike Galbraith wrote:
> > >Is this in addition to your previous errr hack or instead of?
> >
> > Instead of. The buttugly patch destroyed interactivity. This one cures
> > starvation, and interactivity is really nice.
>
>Ok that fixes the "getting stuck in process load" but it still hangs on
>contest. I'll just have to give mm5 a go and see if whatever problem that was
>went away in the mean time.

(%$&#!!)

Oh well, Ingo probably has it nailed already anyway.

-Mike

(but meanwhile, where's your website again?)

2003-03-13 01:11:32

by Con Kolivas

Subject: Re: 2.5.64-mm2->4 hangs on contest

On Wed, 12 Mar 2003 23:30, Mike Galbraith wrote:
> At 10:19 PM 3/12/2003 +1100, Con Kolivas wrote:
> >On Wed, 12 Mar 2003 21:37, Mike Galbraith wrote:
> > > >Is this in addition to your previous errr hack or instead of?
> > >
> > > Instead of. The buttugly patch destroyed interactivity. This one
> > > cures starvation, and interactivity is really nice.
> >
> >Ok that fixes the "getting stuck in process load" but it still hangs on
> >contest. I'll just have to give mm5 a go and see if whatever problem that
> > was went away in the mean time.
>
> (%$&#!!)

No need to curse. Turns out this is an unrelated bug with the anticipatory
scheduler which akpm is onto. Your fix worked fine for the scheduler based
hang.

> Oh well, Ingo probably has it nailed already anyway.

> (but meanwhile, where's your website again?)

contest?
http://contest.kolivas.org

Con

2003-03-13 03:23:55

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 12:22 PM 3/13/2003 +1100, Con Kolivas wrote:
>On Wed, 12 Mar 2003 23:30, Mike Galbraith wrote:
> > At 10:19 PM 3/12/2003 +1100, Con Kolivas wrote:
> > >On Wed, 12 Mar 2003 21:37, Mike Galbraith wrote:
> > > > >Is this in addition to your previous errr hack or instead of?
> > > >
> > > > Instead of. The buttugly patch destroyed interactivity. This one
> > > > cures starvation, and interactivity is really nice.
> > >
> > >Ok that fixes the "getting stuck in process load" but it still hangs on
> > >contest. I'll just have to give mm5 a go and see if whatever problem that
> > > was went away in the mean time.
> >
> > (%$&#!!)
>
>No need to curse. Turns out this is an unrelated bug with the anticipatory
>scheduler which akpm is onto. Your fix worked fine for the scheduler based
>hang.

Nope (drat), not quite. I fixed the parse Mem: booboo, and see occasional
hangs doing complete irman test runs. The process load works fine, but the
other two loads will hang once in a while.

> > Oh well, Ingo probably has it nailed already anyway.
>
> > (but meanwhile, where's your website again?)
>
>contest?
>http://contest.kolivas.org

Yeah, thanks.

-Mike

2003-03-13 04:16:29

by Mike Galbraith

Subject: Re: 2.5.64-mm2->4 hangs on contest

At 04:39 AM 3/13/2003 +0100, Mike Galbraith wrote:
>At 12:22 PM 3/13/2003 +1100, Con Kolivas wrote:
>>On Wed, 12 Mar 2003 23:30, Mike Galbraith wrote:
>> > At 10:19 PM 3/12/2003 +1100, Con Kolivas wrote:
>> > >On Wed, 12 Mar 2003 21:37, Mike Galbraith wrote:
>> > > > >Is this in addition to your previous errr hack or instead of?
>> > > >
>> > > > Instead of. The buttugly patch destroyed interactivity. This one
>> > > > cures starvation, and interactivity is really nice.
>> > >
>> > >Ok that fixes the "getting stuck in process load" but it still hangs on
>> > >contest. I'll just have to give mm5 a go and see if whatever problem that
>> > > was went away in the mean time.
>> >
>> > (%$&#!!)
>>
>>No need to curse. Turns out this is an unrelated bug with the anticipatory
>>scheduler which akpm is onto. Your fix worked fine for the scheduler based
>>hang.
>
>Nope (drat), not quite. I fixed the parse Mem: booboo, and see occasional
>hangs doing complete irman test runs. The process load works fine, but
>the other two loads will hang once in a while.

Well shoot, that was easy to "fix". If you want to give it a try, set the
#if 0 to #if 1. Instead of _moving_ the decay to prevent high switch rate
tasks from getting too much boost, you need to add it instead. This may
not be the right fix, but it works for me. X still retains its boost just
fine.

-Mike