2005-11-25 11:12:14

by Tarkan Erimer

Subject: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Hi,

I'm seeing a strange software/package compile problem under Gentoo
with kernels 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2. When I install or
update a package via emerge, the compile occasionally hangs. When this
happens, the rest of the system keeps working: no error messages, no
interruption, just the compile process hangs. Killing the hung process
does not work; it immediately becomes a zombie. Reboot and poweroff
hang too; only a hard reboot/poweroff gets out of it. I never had this
problem under 2.6.14 and earlier.
My ver_linux output is attached.

PS: I found a way to reproduce this: installing/updating the "man-pages"
package under Gentoo always hangs.


Regards.


Attachments:
ver_linux.out (1.18 kB)

2005-11-25 11:19:28

by Arjan van de Ven

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On Fri, 2005-11-25 at 11:12 +0000, Tarkan Erimer wrote:
> Hi,
>
> I'm having some strange software/package compile problem under Gentoo
> and kernels with 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2. When I
> install/update a package via emerge, I got occasional hangs at compile
> time. When this happenned, system continues to work. No error
> messages, no interruption. Just the compile process hangs. Killing
> this hanged process is impossible. Immediately, it becomes Zombie
> process. Also, Reboot and poweroff hangs, too. Just hard
> reboot/poweroff solves it. I've never had this problem under 2.6.14
> and downwards.
> My ver_linux is attached.
>
> PS: I found a way to reproduce this; installing/updating "man-pages"
> package under Gentoo always hangs.

What is probably needed to diagnose this is for you to do a

echo "t" > /proc/sysrq-trigger

while the process is hung, then find the hung process in the resulting
output and send that part to this list.
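
A minimal sketch of that workflow, for illustration only. The helper names (dump_tasks, show_task), the file name sysrq-t.txt and the process name cc1 are assumptions, not something taken from this report. It assumes root access, a kernel built with CONFIG_MAGIC_SYSRQ, and that the dump is still in the kernel log buffer when dmesg reads it.

```shell
#!/bin/sh
# Sketch only: helper names and file names are made up for illustration.

# Ask the kernel to dump the state of every task into the kernel log.
# Needs root and a kernel built with CONFIG_MAGIC_SYSRQ.
dump_tasks() {
    echo t > /proc/sysrq-trigger
}

# Print every line of a saved log that mentions the task name, plus the
# lines that follow each match (30 by default), where the call trace sits.
# usage: show_task LOGFILE TASKNAME [CONTEXT_LINES]
show_task() {
    grep -A "${3:-30}" -e "$2" "$1"
}

# Typical use, as root, while the compile is hung
# (cc1 is only a guess at the name of the hung process):
#   dump_tasks
#   dmesg > sysrq-t.txt
#   show_task sysrq-t.txt cc1
```

If the dump is larger than the kernel log buffer, the start of it may be missing from the dmesg output; the copy that syslog wrote to disk is then the better source.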


2005-11-27 11:17:26

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 11/25/05, Arjan van de Ven wrote:
> what is probably needed to diagnose this is that you do a
>
> echo "t" > /proc/sysrq-trigger
>
> and then find the process that hangs in that... and send it to this
> list.

Previously, kernel debugging was not enabled in my 2.6.15-rc2 build. I
enabled it and compiled some software to reproduce the problem. This
time I got hard system lockups instead of the earlier process hangs.
My log file is attached; I hope it helps to diagnose the problem.

Regards.


Attachments:
syslog.bz2 (14.27 kB)

2005-11-27 11:58:25

by Arjan van de Ven

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On Sun, 2005-11-27 at 11:17 +0000, Tarkan Erimer wrote:
> On 11/25/05, Arjan van de Ven wrote:
> > what is probably needed to diagnose this is that you do a
> >
> > echo "t" > /proc/sysrq-trigger
> >
> > and then find the process that hangs in that... and send it to this
> > list.
>
> Previously; In 2.6.15-rc2 kernel debug is not enabled. I enabled
> kernel debug and tried software compiling to reproduce the problem.
> But this time, I got hard system lock ups instead of previous process
> hangs. I attached my log file. Hope this helps to diagnose this
> problem.

which process again was hanging?

(and maybe also post an lsmod output just to get an idea of which
modules/drivers are in play)

2005-11-27 19:08:25

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Hi again,

On 11/27/05, Arjan van de Ven wrote:
> which process again was hanging?
>
> (and maybe also post an lsmod output just to get an idea of which
> modules/drivers are in play)

I investigated the issue a bit more. When the system is under heavy load
(100% CPU usage), a hard lockup/complete freeze occurs a few minutes later.
For example, compiling software or running "updatedb" causes this.
At that point the system does not respond to anything. I only just
managed to catch this while it was freezing:

~#>echo "t" > /proc/sysrq-trigger

[ 849.651134 ] SysRq : Show State

My syslog and lsmod output are attached.


Regards


Attachments:
lsmod.out (1.96 kB)
syslog.bz2 (25.35 kB)

2005-11-27 19:12:32

by Arjan van de Ven

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On Sun, 2005-11-27 at 19:08 +0000, Tarkan Erimer wrote:
> Hi again,
>
> On 11/27/05, Arjan van de Ven wrote:
> > which process again was hanging?
> >
> > (and maybe also post an lsmod output just to get an idea of which
> > modules/drivers are in play)
>
> I investigated the issue a bit more. When the system under heavy load
> (%100 cpu usage), hard lock up/complete freeze occures a few minutes later.
hmm this in theory could also be a thermal issue...


2005-11-27 19:18:39

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 11/27/05, Arjan van de Ven wrote:
> hmm this in theory could also be a thermal issue...

By the way, I'm using an IBM R40 machine. It may be a thermal issue,
as you mentioned. But interestingly, this never happens with 2.6.14
and earlier.

2005-11-28 00:57:47

by Andrew Morton

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Tarkan Erimer wrote:
>
> My syslog and lsmod output attached

XFS went nuts. Please test the latest git snapshot which has fixes for
this.

2005-11-29 21:57:49

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 11/28/05, Andrew Morton wrote:
> XFS went nuts. Please test the latest git snapshot which has fixes for
> this.

I tried 2.6.15-rc2-git6 and the just-released 2.6.15-rc3. The result is
the same: I still get occasional hangs. When I checked my syslog, I found
no error messages. Note, though, that the XFS-related errors are gone.
The last few lines of my syslog are pasted below.

----syslog ----
Nov 29 23:22:43 hightemple kernel: [ 518.648894] NTFS-fs warning (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x516d.
Nov 29 23:22:54 hightemple kernel: [ 529.059660] printk: 36 messages suppressed.
Nov 29 23:22:54 hightemple kernel: [ 529.059669] NTFS-fs error (device hda1): ntfs_ucstonls(): Unicode name contains characters that cannot be converted to character set iso8859-1. You might want to try to use the mount option nls=utf8.
Nov 29 23:22:54 hightemple kernel: [ 529.059676] NTFS-fs warning (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x57db.
Nov 29 23:23:57 hightemple gconfd (root-11625): starting (version 2.12.1), pid 11625 user 'root'
Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
Nov 29 23:41:57 hightemple syslogd 1.4.1: restart.
----syslog----


Regards

2005-11-29 22:09:29

by Andrew Morton

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Tarkan Erimer wrote:
>
> On 11/28/05, Andrew Morton wrote:
> > XFS went nuts. Please test the latest git snapshot which has fixes for
> > this.
>
> I tried 2.6.15-rc2-git6 and just released 2.6.15-rc3. Result is same.
> I still got occasional hangs.

Please generate the sysrq-T trace when the system hangs.

> When I check my syslog, I found no error
> messages. But notice, XFS related errors have gone.

OK, we might have fixed XFS.

> I paste last few
> lines of my syslog.
>
> ----syslog ----
> Nov 29 23:22:43 hightemple kernel: [ 518.648894] NTFS-fs warning
> (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x516d.
> Nov 29 23:22:54 hightemple kernel: [ 529.059660] printk: 36 messages
> suppressed.
> Nov 29 23:22:54 hightemple kernel: [ 529.059669] NTFS-fs error
> (device hda1): ntfs_ucstonls(): Unicode name contains characters that
> cannot be converted to character set iso8859-1. You might want to try
> to use the mount option nls=utf8.
> Nov 29 23:22:54 hightemple kernel: [ 529.059676] NTFS-fs warning
> (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x57db.

Anton is the man.

> Nov 29 23:23:57 hightemple gconfd (root-11625): starting (version
> 2.12.1), pid 11625 user 'root'
> Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only
> configuration source at position 0
> Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> "xml:readwrite:/root/.gconf" to a writable configuration source at
> position 1
> Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only
> configuration source at position 2

I assume the above isn't kernel-related?

2005-11-30 13:53:46

by Anton Altaparmakov

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Hi,

On Tue, 29 Nov 2005, Andrew Morton wrote:
> Tarkan Erimer wrote:
> > On 11/28/05, Andrew Morton wrote:
> > > XFS went nuts. Please test the latest git snapshot which has fixes for
> > > this.
> >
> > I tried 2.6.15-rc2-git6 and just released 2.6.15-rc3. Result is same.
> > I still got occasional hangs.
>
> Please generate the sysrq-T trace when the system hangs.
>
> > When I check my syslog, I found no error
> > messages. But notice, XFS related errors have gone.
>
> OK, we might have fixed XFS.
>
> > I paste last few
> > lines of my syslog.
> >
> > ----syslog ----
> > Nov 29 23:22:43 hightemple kernel: [ 518.648894] NTFS-fs warning
> > (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x516d.
> > Nov 29 23:22:54 hightemple kernel: [ 529.059660] printk: 36 messages
> > suppressed.
> > Nov 29 23:22:54 hightemple kernel: [ 529.059669] NTFS-fs error
> > (device hda1): ntfs_ucstonls(): Unicode name contains characters that
> > cannot be converted to character set iso8859-1. You might want to try
> > to use the mount option nls=utf8.
> > Nov 29 23:22:54 hightemple kernel: [ 529.059676] NTFS-fs warning
> > (device hda1): ntfs_filldir(): Skipping unrepresentable inode 0x57db.
>
> Anton is the man.

Yes. (-:

These just mean that you have mounted with a bad default code page (or
whatever you want to call it) and the NTFS volume contains characters
for which the conversion from Unicode (i.e. NTFS) to your code page fails
(the NLS conversion returns an error because the character does not exist
in your code page). As the message suggests, if you adjust your mount
options to include the "nls=utf8" option, the errors will go away and
everything will work, except that your terminal/GUI may display some
garbage characters if it does not understand UTF-8. But at least you
will see all files/directories.
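
A minimal sketch of the mount option in use, assuming the kernel NTFS driver. /dev/hda1 is the device named in the log; the mount point /mnt/windows and the helper names are made up for illustration. The helpers only print the command and the fstab entry, so running this mounts nothing.

```shell
#!/bin/sh
# Sketch only. /dev/hda1 comes from the log; /mnt/windows is a
# hypothetical mount point and the function names are made up.

# Print the mount command so it can be reviewed before being run as root.
# usage: ntfs_mount_cmd DEVICE MOUNTPOINT
ntfs_mount_cmd() {
    printf 'mount -t ntfs -o nls=utf8 %s %s\n' "$1" "$2"
}

# Print the matching /etc/fstab entry for the same device and mount point.
# usage: ntfs_fstab_line DEVICE MOUNTPOINT
ntfs_fstab_line() {
    printf '%s %s ntfs nls=utf8 0 0\n' "$1" "$2"
}

ntfs_mount_cmd /dev/hda1 /mnt/windows
ntfs_fstab_line /dev/hda1 /mnt/windows
```

An already-mounted volume would need to be unmounted first for the new option to take effect in this sketch.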

> > Nov 29 23:23:57 hightemple gconfd (root-11625): starting (version
> > 2.12.1), pid 11625 user 'root'
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only
> > configuration source at position 0
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readwrite:/root/.gconf" to a writable configuration source at
> > position 1
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only
> > configuration source at position 2
>
> I assume the above isn't kernel-related?

Correct. That is just Gnome and in particular gconf stuff (i.e. the Gnome
daemon providing access to the Gnome version of the Windows registry)...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2005-12-01 21:04:55

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 11/30/05, Anton Altaparmakov wrote:
> Yes. (-:
>
> These just means that you have mounted with a bad default code page or
> whatever you want to call it and the ntfs volume contains characters
> whethe the Unicode (i.e. NTFS) to your code page conversion fails (NLS
> conversion returns error due to non-existant character in your code page).
> As the message suggests if you adjust your mount options to include the
> "nls=utf8" option the errors will go away and everything will work except
> maybe your terminal/gui may dislay some garbage characters if it does not
> understand utf8 characters but at least you will see all
> files/directories.

Hi,

I mounted with the "nls=utf8" option as you suggested and all the related
error messages disappeared. Thanks :)

Regards

2005-12-01 21:05:53

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 11/29/05, Andrew Morton wrote:
> Tarkan Erimer wrote:
> >
> > On 11/28/05, Andrew Morton wrote:
> > > XFS went nuts. Please test the latest git snapshot which has fixes for
> > > this.
> >
> > I tried 2.6.15-rc2-git6 and just released 2.6.15-rc3. Result is same.
> > I still got occasional hangs.
>
> Please generate the sysrq-T trace when the system hangs.

I tried to get a sysrq-T trace, but when I hit the bug the system freezes
completely. Alt+sysrq+t (which normally works perfectly) and every other
combination get no response. Is there any other way to trace this?
Also, I will try the just-released 2.6.15-rc4 and let you know the result.

> > When I check my syslog, I found no error
> > messages. But notice, XFS related errors have gone.
>
> OK, we might have fixed XFS.

Yes, thanks

> > Nov 29 23:23:57 hightemple gconfd (root-11625): starting (version
> > 2.12.1), pid 11625 user 'root'
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only
> > configuration source at position 0
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readwrite:/root/.gconf" to a writable configuration source at
> > position 1
> > Nov 29 23:23:57 hightemple gconfd (root-11625): Resolved address
> > "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only
> > configuration source at position 2
>
> I assume the above isn't kernel-related?

Yes, above is not related to the kernel. It is a Gnome thing.

2005-12-02 12:49:59

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

On 12/1/05, Tarkan Erimer wrote:
> On 11/29/05, Andrew Morton wrote:
> > Please generate the sysrq-T trace when the system hangs.
>
> I tried sysrq-T trace. But, When hit the bug, system completely freezes.
> Alt+sysrq+t (Normally Alt+sysrq+t works perfectly) or any other
> combination does not respond. Is there any other way to trace this?
> Also, I will try just-released 2.6.15-rc4 and let know the result.

Today I tried 2.6.15-rc4. The result is the same: it still hangs,
Alt+sysrq+t does not respond, and there is nothing related to the issue
in syslog.

2005-12-05 14:30:26

by Tarkan Erimer

Subject: Re: [BUG]: Software compiling occasionlly hangs under 2.6.15-rc1/rc2 and 2.6.15-rc1-mm2

Hi again,

On 12/2/05, Tarkan Erimer wrote:
> Today, I tried 2.6.15-rc4. The result is same. Still hangs,
> Alt+sysrq+t does not respond and there is nothing related to the issue
> in syslog.
>

I also tried 2.6.15-rc5. This time things are a bit different. The bug
still exists, but now Alt+sysrq+t works while the system is hung (I can
see the sysrq-t message on the console). However, the call traces do not
appear in syslog, where they normally would.


Regards