2002-03-02 18:19:21

by Chris Rankin

Subject: NOW have 'D-state' processes in 2.4.17 !!!

Hi,

[Linux 2.4.17, SMP, devfs, 1.2 GB memory, compiled with gcc-2.95.3,
root partition using EXT3]

I upgraded to 2.4.18 a few days ago, but immediately downgraded
because I suddenly had lots of 'D-state' processes. Well I have now
produced a suspiciously-similar-looking D-state process using 2.4.17,
and I strongly suspect that either EXT3 or ALSA is somehow involved
because mounting my root partition as EXT3 and adding the latest CVS
ALSA modules are the only changes that I have made from my previous
reliable 2.4.17 setup.
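
For readers who want to check a machine for such processes, here is a minimal sketch that is not part of the original report. It assumes Python 3 and a Linux-style /proc where each /proc/<pid>/stat line begins with "pid (comm) state"; the function names parse_stat and d_state_processes are made up for illustration.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the last ')' as its boundaries.
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start])
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for tasks in uninterruptible sleep."""
    if not os.path.isdir(proc_root):
        return []
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the line was not as expected
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

A task that shows up here only briefly is usually just waiting on I/O; the ones worth reporting are those that stay in the list across repeated runs.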

The trace of the misbehaving process looks almost exactly like the
last trace from 2.4.18, except this time I have run it through
ksymoops:

Proc; wine
>>EIP; f6b2c780 <_end+36829cb4/38556534> <=====
Trace; c0105af4 <__down+6c/c8>
Trace; c0105c90 <__down_failed+8/c>
Trace; fb3297c6 <[snd-pcm].text.end+238/612>
Trace; fb323c0c <[snd-pcm]snd_pcm_playback_ioctl1+6c/340>
Trace; c0143474 <kill_fasync+2c/48>
Trace; c015a710 <ext3_get_block_handle+bc/2a8>
Trace; c015a710 <ext3_get_block_handle+bc/2a8>
Trace; c012eeae <__alloc_pages+32/164>
Trace; fb326214 <[snd-pcm]snd_pcm_hw_constraint_minmax+34/40>
Trace; fb3230d8 <[snd-pcm]snd_pcm_hw_constraints_complete+138/160>
Trace; fb3e72a0 <[snd-pcm-oss]snd_pcm_oss_open_file+100/220>
Trace; fb3e751e <[snd-pcm-oss]snd_pcm_oss_open+15e/270>
Trace; fb3e7540 <[snd-pcm-oss]snd_pcm_oss_open+180/270>
Trace; c013f600 <link_path_walk+6c0/850>
Trace; c013ead0 <vfs_permission+74/f0>
Trace; c0170b08 <devfs_open+b8/168>
Trace; fb324284 <[snd-pcm]snd_pcm_kernel_playback_ioctl+34/40>
Trace; fb3e6388 <[snd-pcm-oss]snd_pcm_oss_reset+18/50>
Trace; c01437a6 <sys_ioctl+1ba/214>
Trace; c0106dba <system_call+32/38>

Even more interestingly, this process was freed when I killed the
second wine process. This second process's trace looks like this:

Proc; wine
>>EIP; e0ce3c58 <_end+209e118c/38556534> <=====
Trace; c011388a <schedule_timeout+7a/9c>
Trace; c01137b0 <process_timeout+0/60>
Trace; fb3222b2 <[snd-pcm]snd_pcm_playback_drain+162/280>
Trace; fb323c5c <[snd-pcm]snd_pcm_playback_ioctl1+bc/340>
Trace; fb3304f2 <[snd-emu10k1]snd_emu10k1_capture_prepare+52/130>
Trace; fb330590 <[snd-emu10k1]snd_emu10k1_capture_prepare+f0/130>
Trace; fb322010 <[snd-pcm]snd_pcm_prepare+e0/1b0>
Trace; c01ffdd6 <__delay+12/28>
Trace; c01ffe44 <__const_udelay+28/34>
Trace; f88da65a <[eepro100]speedo_start_xmit+162/1f0>
Trace; c0162a5e <do_get_write_access+5f6/61c>
Trace; c0163d64 <__journal_file_buffer+e4/21c>
Trace; c016312c <journal_dirty_metadata+1a4/1cc>
Trace; c015cb9e <ext3_do_update_inode+2fa/398>
Trace; c015cc06 <ext3_do_update_inode+362/398>
Trace; c015d00e <ext3_mark_iloc_dirty+22/48>
Trace; c015d01e <ext3_mark_iloc_dirty+32/48>
Trace; c015d108 <ext3_mark_inode_dirty+28/34>
Trace; c015d1c2 <ext3_dirty_inode+ae/118>
Trace; c0148c12 <__mark_inode_dirty+2e/98>
Trace; c01577e8 <ext3_free_blocks+5a0/5ac>
Trace; c011fa74 <wake_up_parent+1c/30>
Trace; c011fb3a <do_notify_parent+b2/bc>
Trace; c0128984 <filemap_nopage+bc/1f8>
Trace; c01137a6 <reschedule_idle+25e/268>
Trace; fb324284 <[snd-pcm]snd_pcm_kernel_playback_ioctl+34/40>
Trace; c01c53ae <sock_def_wakeup+32/40>
Trace; fb3e64a6 <[snd-pcm-oss]snd_pcm_oss_sync+e6/180>
Trace; fb3e7648 <[snd-pcm-oss]snd_pcm_oss_release+18/80>
Trace; c01362b4 <fput+4c/e8>
Trace; c013514a <filp_close+aa/b4>
Trace; c01191f8 <put_files_struct+58/c0>
Trace; c01199ce <do_exit+12e/27c>
Trace; c0119b42 <sys_exit+e/10>
Trace; c0106dba <system_call+32/38>

Is any of this useful to anybody?
Cheers,
Chris


2002-03-02 18:24:51

by Robert Love

Subject: Re: NOW have 'D-state' processes in 2.4.17 !!!

On Sat, 2002-03-02 at 13:18, Chris Rankin wrote:

> [Linux 2.4.17, SMP, devfs, 1.2 GB memory, compiled with gcc-2.95.3,
> root partition using EXT3]
>
> I upgraded to 2.4.18 a few days ago, but immediately downgraded
> because I suddenly had lots of 'D-state' processes. Well I have now
> produced a suspiciously-similar-looking D-state process using 2.4.17,
> and I strongly suspect that either EXT3 or ALSA is somehow involved
> because mounting my root partition as EXT3 and adding the latest CVS
> ALSA modules are the only changes that I have made from my previous
> reliable 2.4.17 setup.
>
> The trace of the misbehaving process looks almost exactly like the
> last trace from 2.4.18, except this time I have run it through
> ksymoops:

Pretty clear from these traces it is ALSA - the tasks are going to sleep
on some ALSA method and are not waking up. Bug the ALSA people.

A good test would be to not use ALSA and see if it goes away.
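
One rough way to see which subsystem dominates a trace is to tally the ksymoops "Trace;" lines by module. The sketch below is an illustration only: it assumes Python 3 and the line layout shown in this thread, and parse_trace and count_by_module are hypothetical helper names. Traces from these kernels can include stale addresses left on the stack, so the counts are a hint, not proof.

```python
import re
from collections import Counter

# Matches e.g. "Trace; fb323c0c <[snd-pcm]snd_pcm_playback_ioctl1+6c/340>",
# optionally preceded by email quote markers ("> ").
TRACE_RE = re.compile(
    r"^\s*(?:>\s*)*Trace;\s+([0-9a-f]+)\s+"
    r"<(?:\[([^\]]+)\])?([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)


def parse_trace(lines):
    """Return (module, symbol) pairs; symbols without a module tag are 'kernel'."""
    entries = []
    for line in lines:
        match = TRACE_RE.match(line)
        if match:
            module = match.group(2) or "kernel"
            entries.append((module, match.group(3)))
    return entries


def count_by_module(lines):
    """Count trace entries per module."""
    return Counter(module for module, _ in parse_trace(lines))


if __name__ == "__main__":
    sample = [
        "Trace; c0105c90 <__down_failed+8/c>",
        "Trace; fb323c0c <[snd-pcm]snd_pcm_playback_ioctl1+6c/340>",
    ]
    for module, count in count_by_module(sample).most_common():
        print(module, count)
```

The entries nearest the top of each trace matter most, since they show where the task actually went to sleep.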

> Proc; wine
> >>EIP; f6b2c780 <_end+36829cb4/38556534> <=====
> Trace; c0105af4 <__down+6c/c8>
> Trace; c0105c90 <__down_failed+8/c>
> Trace; fb3297c6 <[snd-pcm].text.end+238/612>
> Trace; fb323c0c <[snd-pcm]snd_pcm_playback_ioctl1+6c/340>
> Trace; c0143474 <kill_fasync+2c/48>
> Trace; c015a710 <ext3_get_block_handle+bc/2a8>
> Trace; c015a710 <ext3_get_block_handle+bc/2a8>
> Trace; c012eeae <__alloc_pages+32/164>
> Trace; fb326214 <[snd-pcm]snd_pcm_hw_constraint_minmax+34/40>
> Trace; fb3230d8 <[snd-pcm]snd_pcm_hw_constraints_complete+138/160>
> Trace; fb3e72a0 <[snd-pcm-oss]snd_pcm_oss_open_file+100/220>
> Trace; fb3e751e <[snd-pcm-oss]snd_pcm_oss_open+15e/270>
> Trace; fb3e7540 <[snd-pcm-oss]snd_pcm_oss_open+180/270>
> Trace; c013f600 <link_path_walk+6c0/850>
> Trace; c013ead0 <vfs_permission+74/f0>
> Trace; c0170b08 <devfs_open+b8/168>
> Trace; fb324284 <[snd-pcm]snd_pcm_kernel_playback_ioctl+34/40>
> Trace; fb3e6388 <[snd-pcm-oss]snd_pcm_oss_reset+18/50>
> Trace; c01437a6 <sys_ioctl+1ba/214>
> Trace; c0106dba <system_call+32/38>
>
> Even more interestingly, this process was freed when I killed the
> second wine process. This second process's trace looks like this:
>
> Proc; wine
> >>EIP; e0ce3c58 <_end+209e118c/38556534> <=====
> Trace; c011388a <schedule_timeout+7a/9c>
> Trace; c01137b0 <process_timeout+0/60>
> Trace; fb3222b2 <[snd-pcm]snd_pcm_playback_drain+162/280>
> Trace; fb323c5c <[snd-pcm]snd_pcm_playback_ioctl1+bc/340>
> Trace; fb3304f2 <[snd-emu10k1]snd_emu10k1_capture_prepare+52/130>
> Trace; fb330590 <[snd-emu10k1]snd_emu10k1_capture_prepare+f0/130>
> Trace; fb322010 <[snd-pcm]snd_pcm_prepare+e0/1b0>
> Trace; c01ffdd6 <__delay+12/28>
> Trace; c01ffe44 <__const_udelay+28/34>
> Trace; f88da65a <[eepro100]speedo_start_xmit+162/1f0>
> Trace; c0162a5e <do_get_write_access+5f6/61c>
> Trace; c0163d64 <__journal_file_buffer+e4/21c>
> Trace; c016312c <journal_dirty_metadata+1a4/1cc>
> Trace; c015cb9e <ext3_do_update_inode+2fa/398>
> Trace; c015cc06 <ext3_do_update_inode+362/398>
> Trace; c015d00e <ext3_mark_iloc_dirty+22/48>
> Trace; c015d01e <ext3_mark_iloc_dirty+32/48>
> Trace; c015d108 <ext3_mark_inode_dirty+28/34>
> Trace; c015d1c2 <ext3_dirty_inode+ae/118>
> Trace; c0148c12 <__mark_inode_dirty+2e/98>
> Trace; c01577e8 <ext3_free_blocks+5a0/5ac>
> Trace; c011fa74 <wake_up_parent+1c/30>
> Trace; c011fb3a <do_notify_parent+b2/bc>
> Trace; c0128984 <filemap_nopage+bc/1f8>
> Trace; c01137a6 <reschedule_idle+25e/268>
> Trace; fb324284 <[snd-pcm]snd_pcm_kernel_playback_ioctl+34/40>
> Trace; c01c53ae <sock_def_wakeup+32/40>
> Trace; fb3e64a6 <[snd-pcm-oss]snd_pcm_oss_sync+e6/180>
> Trace; fb3e7648 <[snd-pcm-oss]snd_pcm_oss_release+18/80>
> Trace; c01362b4 <fput+4c/e8>
> Trace; c013514a <filp_close+aa/b4>
> Trace; c01191f8 <put_files_struct+58/c0>
> Trace; c01199ce <do_exit+12e/27c>
> Trace; c0119b42 <sys_exit+e/10>
> Trace; c0106dba <system_call+32/38>
>
> Is any of this useful to anybody?

Robert Love

2002-03-02 19:50:28

by Andrea Arcangeli

Subject: Re: NOW have 'D-state' processes in 2.4.17 !!!

On Sat, Mar 02, 2002 at 01:24:27PM -0500, Robert Love wrote:
> On Sat, 2002-03-02 at 13:18, Chris Rankin wrote:
>
> > [Linux 2.4.17, SMP, devfs, 1.2 GB memory, compiled with gcc-2.95.3,
> > root partition using EXT3]
> >
> > I upgraded to 2.4.18 a few days ago, but immediately downgraded
> > because I suddenly had lots of 'D-state' processes. Well I have now
> > produced a suspiciously-similar-looking D-state process using 2.4.17,
> > and I strongly suspect that either EXT3 or ALSA is somehow involved
> > because mounting my root partition as EXT3 and adding the latest CVS
> > ALSA modules are the only changes that I have made from my previous
> > reliable 2.4.17 setup.
> >
> > The trace of the misbehaving process looks almost exactly like the
> > last trace from 2.4.18, except this time I have run it through
> > ksymoops:
>
> Pretty clear from these traces it is ALSA - the tasks are going to sleep
> on some ALSA method and are not waking up. Bug the ALSA people.
>
> A good test would be to not use ALSA and see if it goes away.

Indeed.

Also, please don't call that 2.4.18 or 2.4.17 (as in the subject; in an
earlier message you said 2.4.18 isn't going to be a keeper, but you never
ran 2.4.18, so you cannot say that). 2.4.18 is only this one:

ftp://ftp.kernel.org/pub/linux/kernel/linux-2.4.18.tar.gz

As soon as you apply a patch to it, it's no longer 2.4.18, it's
2.4.18+patch.

It is very important to get accurate feedback. Linux isn't a microkernel
architecture: anything you change in any kernel subsystem can completely
destabilize the rest of the kernel, so we need to know exactly what
changes you made to the kernel before we can debug it.
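
As an illustration of the kind of detail that helps, the sketch below gathers the running kernel release and the loaded module names into a short report. It assumes Python 3 and that the first field of each /proc/modules line is the module name; read_module_names and format_report are hypothetical names, and the list of patches still has to be supplied by hand because the kernel cannot report it.

```python
import platform


def read_module_names(path="/proc/modules"):
    """Return the sorted names of loaded modules, or [] if unreadable."""
    names = []
    try:
        with open(path) as f:
            for line in f:
                fields = line.split()
                if fields:
                    names.append(fields[0])
    except OSError:
        pass
    return sorted(names)


def format_report(release, module_names, patches=()):
    """Build a plain-text summary suitable for pasting into a bug report."""
    lines = ["Kernel release: " + release]
    lines.append(
        "Patches applied: " + (", ".join(patches) if patches else "none stated")
    )
    lines.append(
        "Loaded modules: "
        + (", ".join(module_names) if module_names else "none found")
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(platform.release(), read_module_names()))
```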

Andrea
