2003-03-19 09:38:36

by Philippe Gramoullé

Subject: Hard freeze with 2.5.65-mm1


Hi,

I experienced a hard freeze a minute or so after starting X and a few apps (xmms, gnomeicu, pan, a few Eterms, ...).

The machine is a 1.5GHz SMP Dell MT530 workstation running Debian unstable, with a vanilla 2.5.65 kernel plus the -mm1 patches only.

Loaded modules are: usbcore, uhci-hcd, hid, soundcore, ac97_codec and emu10k1.

.config is below.

Thanks,

Philippe

CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

CONFIG_EXPERIMENTAL=y

CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15

CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_KMOD=y

CONFIG_X86_BIGSMP=y
CONFIG_MPENTIUMIII=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_PREFETCH=y
CONFIG_SMP=y
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_NR_CPUS=2
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_NOHIGHMEM=y
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y

CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y

CONFIG_KCORE_ELF=y
CONFIG_BINFMT_ELF=y

CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_CML1=y
CONFIG_PARPORT_PC_FIFO=y
CONFIG_PARPORT_1284=y

CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=y

CONFIG_IDE=y

CONFIG_BLK_DEV_IDE=y

CONFIG_BLK_DEV_IDECD=y


CONFIG_SCSI=y

CONFIG_BLK_DEV_SD=y

CONFIG_SCSI_REPORT_LUNS=y
CONFIG_SCSI_CONSTANTS=y

CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=5000
CONFIG_AIC7XXX_DEBUG_MASK=0


CONFIG_IEEE1394=m

CONFIG_IEEE1394_OUI_DB=y


CONFIG_IEEE1394_OHCI1394=m

CONFIG_IEEE1394_VIDEO1394=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
CONFIG_IEEE1394_CMP=m
CONFIG_IEEE1394_AMDTP=m


CONFIG_NET=y

CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETFILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y

CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m

CONFIG_IPV6_SCTP__=y

CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_CSZ=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y

CONFIG_NETDEVICES=y


CONFIG_NET_ETHERNET=y
CONFIG_MII=y

CONFIG_NET_PCI=y
CONFIG_E100=y

CONFIG_INPUT=y

CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768

CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y

CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y

CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y

CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y

CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_PRINTER=y

CONFIG_RTC=y

CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=m
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_JBD=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
CONFIG_REISERFS_PROC_INFO=y
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=y
CONFIG_XFS_QUOTA=y
CONFIG_QUOTACTL=y

CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_UDF_FS=y

CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_NTFS_FS=y

CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y

CONFIG_CRAMFS=y

CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SMB_FS=y
CONFIG_SUNRPC=y

CONFIG_MSDOS_PARTITION=y
CONFIG_SMB_NLS=y
CONFIG_NLS=y

CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_850=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_15=y
CONFIG_NLS_UTF8=y


CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y

CONFIG_SOUND=m


CONFIG_SOUND_PRIME=m
CONFIG_SOUND_EMU10K1=m

CONFIG_USB=m

CONFIG_USB_DEVICEFS=y

CONFIG_USB_EHCI_HCD=m
CONFIG_USB_UHCI_HCD=m


CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT=y


CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_X86_EXTRA_IRQS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y

CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y

CONFIG_ZLIB_INFLATE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y


2003-03-19 10:08:31

by Alexander Hoogerhuis

Subject: Re: Hard freeze with 2.5.65-mm1


Philippe Gramoullé <[email protected]> writes:

> Hi,
>
> [SNIPPED HANG]
>

Did your machine have any disk activity? I had a very similar thing
happen, but could still hear my disk move.

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-19 12:20:23

by Philippe Gramoullé

Subject: Re: Hard freeze with 2.5.65-mm1


Hi Alexander,

On 19 Mar 2003 11:19:32 +0100
Alexander Hoogerhuis <[email protected]> wrote:


|
| Philippe Gramoullé <[email protected]> writes:
|
| > Hi,
| >
| > [SNIPPED HANG]
| >
|
| Did your machine have any disk activity? I had a very similar thing
| happen, but could still hear my disk move.

Hmm, I don't really remember, but I don't think that was the case.

I have now rebooted into 2.5.65-mm1 with a serial console attached and tried to reproduce the freeze
by doing roughly the same things. So far the system has been stable (2 hours uptime) :)

Still, Pan was hogging both CPUs for reasons I don't know, and I did experience
some xmms skips.

Thanks,

Philippe

2003-03-19 17:13:46

by Philippe Gramoullé

Subject: Re: Hard freeze with 2.5.65-mm1


Hi,

My workstation froze hard again, this time after a few hours.

There was no disk activity and nothing on the serial console :-(

The machine didn't freeze even when I ran several programs at the same time
(opera, mozilla, sylpheed, pan, etc.) but froze when the box was almost idle.

Thanks,

Philippe

On 19 Mar 2003 11:19:32 +0100
Alexander Hoogerhuis <[email protected]> wrote:

|
| Philippe Gramoullé <[email protected]> writes:
|
| > Hi,
| >
| > [SNIPPED HANG]
| >
|
| Did your machine have any disk activity? I had a very similar thing
| happen, but could still hear my disk move.
|
| mvh,
| A

2003-03-19 18:04:14

by Alexander Hoogerhuis

Subject: Re: Hard freeze with 2.5.65-mm1

I've had I/O stall a few times while watching movies, but only the
mplayer process hung, and I could break it off and restart and it
would run again for a few minutes.

This could all be due to the fact that my laptop is rejecting "Married
with children" on grounds of being too stupid? >:)

mvh,
A

Philippe Gramoullé <[email protected]> writes:

> Hi,
>
> My workstation froze hard again, this time after a few hours.
>
> There was no disk activity and nothing on the serial console :-(
>
> The machine didn't freeze even when I ran several programs at the same time
> (opera, mozilla, sylpheed, pan, etc.) but froze when the box was almost idle.
>
> Thanks,
>
> Philippe
>
> On 19 Mar 2003 11:19:32 +0100
> Alexander Hoogerhuis <[email protected]> wrote:
>
> |
> | Philippe Gramoullé <[email protected]> writes:
> |
> | > Hi,
> | >
> | > [SNIPPED HANG]
> | >
> |
> | Did your machine have any disk activity? I had a very similar thing
> | happen, but could still hear my disk move.
> |
> | mvh,
> | A

--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-19 20:08:20

by Andrew Morton

Subject: Re: Hard freeze with 2.5.65-mm1

Alexander Hoogerhuis <[email protected]> wrote:
>
> I've had I/O stall a few times while watching movies, but only the
> mplayer process hung, and I could break it off and restart and it
> would run again for a few minutes.

This is a bug in the new nanosleep code. mplayer asks the kernel for a 50
millisecond sleep and the kernel gives it a two month sleep instead.

Please set INITIAL_JIFFIES to zero and retest.

With what compiler are you building your kernels?
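
As a rough illustration of how a 50 millisecond request can turn into a sleep of about two months: with a 32-bit tick counter running at 1000 ticks per second, 2^32 ticks is roughly 49.7 days. The sketch below is not the kernel's nanosleep code and does not claim to reproduce the real bug. HZ = 1000, the 32-bit counter width, and all function names are assumptions made for this example. It only shows that if a deadline is missed by a single tick and the remaining time is computed as an unsigned difference, the result is a wait of nearly the whole counter range.

```python
HZ = 1000          # assumed tick rate (ticks per second), hypothetical for this sketch
WRAP = 1 << 32     # assumed 32-bit tick counter

def remaining_unsigned(expiry, now):
    """Remaining ticks as an unsigned 32-bit difference (wraps when expiry < now)."""
    return (expiry - now) % WRAP

def remaining_signed(expiry, now):
    """Remaining ticks as a signed 32-bit difference (negative once expiry has passed)."""
    delta = (expiry - now) % WRAP
    return delta - WRAP if delta >= WRAP // 2 else delta

def ticks_to_days(ticks):
    """Convert a tick count to days at the assumed HZ."""
    return ticks / HZ / 86400

start = 5000
expiry = start + 50        # a 50 ms sleep at HZ = 1000
late = expiry + 1          # the sleeper is examined one tick after the deadline

bad = remaining_unsigned(expiry, late)
good = remaining_signed(expiry, late)

print("unsigned remaining ticks:", bad, "=", round(ticks_to_days(bad), 1), "days")
print("signed remaining ticks:  ", good)
```

With the signed interpretation the deadline is correctly seen as already passed; with the unsigned one the caller would be put back to sleep for almost the full range of the counter.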

2003-03-19 20:40:18

by Alexander Hoogerhuis

Subject: Re: Hard freeze with 2.5.65-mm1

Andrew Morton <[email protected]> writes:

> Alexander Hoogerhuis <[email protected]> wrote:
> >
> > I've had I/O stall a few times while watching movies, but only the
> > mplayer process hung, and I could break it off and restart and it
> > would run again for a few minutes.
>
> This is a bug in the new nanosleep code. mplayer asks the kernel for a 50
> millisecond sleep and the kernel gives it a two month sleep instead.
>
> Please set INITIAL_JIFFIES to zero and retest.
>
> With what compiler are you building your kernels?
>

From Gentoo's unstable "branch":

alexh@lapper ~/src/linux/linux-2.5.65-mm2 $ gcc -v
Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.2/specs
Configured with: /var/tmp/portage/gcc-3.2.2-r1/work/gcc-3.2.2/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.2 --includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.2/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.2 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.2/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.2/info --enable-shared --host=i686-pc-linux-gnu --target=i686-pc-linux-gnu
--with-system-zlib --enable-languages=c,c++,ada,f77,objc,java --enable-threads=posix --enable-long-long --disable-checking --enable-cstdio=stdio --enable-clocale=generic --enable-__cxa_atexit --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.2/include/g++-v3 --with-local-prefix=/usr/local --enable-shared --enable-nls --without-included-gettext
Thread model: posix
gcc version 3.2.2
alexh@lapper ~/src/linux/linux-2.5.65-mm2 $ ld -v
GNU ld version 2.13.90.0.18 20030206
alexh@lapper ~/src/linux/linux-2.5.65-mm2 $

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-19 22:31:27

by Philippe Gramoullé

Subject: Re: Hard freeze with 2.5.65-mm1


Hi,

Just for info, here are my gcc and ld versions:

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/3.2.3/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,proto,pascal,objc,ada --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.2 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-java-gc=boehm --enable-objc-gc i386-linux
Thread model: posix
gcc version 3.2.3 20030309 (Debian prerelease)
$ ld -v
GNU ld version 2.13.90.0.18 20030121 Debian GNU/Linux

Thanks,

Philippe

On Wed, 19 Mar 2003 12:19:09 -0800
Andrew Morton <[email protected]> wrote:

| With what compiler are you building your kernels?

2003-03-20 00:53:11

by William Lee Irwin III

Subject: Re: Hard freeze with 2.5.65-mm1

Alexander Hoogerhuis <[email protected]> wrote:
>> I've had I/O stall a few times while watching movies, but only the
>> mplayer process hung, and I could break it off and restart and it
>> would run again for a few minutes.

On Wed, Mar 19, 2003 at 12:19:09PM -0800, Andrew Morton wrote:
> This is a bug in the new nanosleep code. mplayer asks the kernel for a 50
> millisecond sleep and the kernel gives it a two month sleep instead.
> Please set INITIAL_JIFFIES to zero and retest.
> With what compiler are you building your kernels?

Just hit it with xmms:

$ less /proc/1284/wchan
sys_rt_sigsuspend
$ less /proc/1285/wchan
schedule_timeout
$ less /proc/1286/wchan
schedule_timeout
$ less /proc/16656/wchan
do_clock_nanosleep
$ less /proc/16657/wchan
do_clock_nanosleep

kill -STOP `pidof xmms` ; kill -CONT `pidof xmms` gets it unstuck so
it's not lethal, but still...


-- wli
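
The /proc/<pid>/wchan check above can be scripted when there are many threads to look at. This is a minimal sketch, assuming a Linux /proc layout in which each process directory contains a wchan file; the helper names and the proc_root parameter are made up for this example and are not part of any existing tool.

```python
import os

def read_wchan(pid, proc_root="/proc"):
    """Return the wait channel name for pid, or None if it cannot be read."""
    path = os.path.join(proc_root, str(pid), "wchan")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # Process gone, no permission, or no wchan file on this system.
        return None

def wchan_report(pids, proc_root="/proc"):
    """Map each pid to its wait channel (None where unreadable)."""
    return {pid: read_wchan(pid, proc_root) for pid in pids}
```

Calling wchan_report() with the pids of the stuck player's threads gives the same information as running less on each wchan file by hand. The proc_root argument exists only so the helper can be pointed at a test directory.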

2003-03-20 01:00:02

by Andrew Morton

Subject: Re: Hard freeze with 2.5.65-mm1

William Lee Irwin III <[email protected]> wrote:
>
> $ less /proc/16657/wchan
> do_clock_nanosleep

There is a bug in do_clock_nanosleep(). I can reproduce it. I'll fix it up
later today.

2003-03-20 16:50:36

by Philippe Gramoullé

Subject: Re: Hard freeze with 2.5.65-mm1


Just a side note.

I rebooted with elevator=deadline this morning and haven't experienced any
hard freeze since. Uptime is about 8 hours now (both earlier freezes were with
elevator=as).

No xmms audio skips, very good overall feel, very good responsiveness
(openoffice, mozilla, news feed update, etc.)

Thanks,

Philippe


On Wed, 19 Mar 2003 19:15:36 -0800
Andrew Morton <[email protected]> wrote:

| William Lee Irwin III <[email protected]> wrote:
| >
| > $ less /proc/16657/wchan
| > do_clock_nanosleep
|
| There is a bug in do_clock_nanosleep(). I can reproduce it. I'll fix it up
| later today.