Hello,
We are experiencing some lockup problems with our SMP configuration.
Here are the details:
- The computers lock up with no relevant log entries.
- The kernel still replies to ping, but higher-level services are not
responding.
- After a few hours (5-8), the system responds again; the load is around
40 and then decreases.
We managed to get some SysRq showPc output (screenshot:
http://www.elonex.ch/shot/).
According to basic SysRq debugging, the problem seems to be related
to the function flush_tlb_all, and it is triggered by a write or a read
(local, or sometimes over NFS).
I searched the LKML archives and didn't find any known issues.
Maybe it has been fixed upstream but not backported by Red Hat.
I would appreciate any help.
Thank you in advance.
Detailed configuration:
---------------
Processor: 2 x 2.8 GHz Pentium Xeon
Motherboard: Intel SE7501CW2
Memory: 4 x 512 MB DDR 266 ECC registered
Kernel: 2.4.20-31 (Red Hat 7.3 with updates)
Please CC me on any answers/comments.
--
Thomas OULEVEY System Engineer
Elonex Switzerland Email: [email protected]
Switzerland
On Wed, Nov 03, 2004 at 12:36:47PM +0100, Thomas Oulevey wrote:
> Hello,
>
> We are experiencing some lockup problems with our SMP configuration.
> Here are the details:
> - The computers lock up with no relevant log entries.
> - The kernel still replies to ping, but higher-level services are not
> responding.
> - After a few hours (5-8), the system responds again; the load is around
> 40 and then decreases.
>
> We managed to get some SysRq showPc output (screenshot:
> http://www.elonex.ch/shot/).
> According to basic SysRq debugging, the problem seems to be related
> to the function flush_tlb_all, and it is triggered by a write or a read
> (local, or sometimes over NFS).
>
> I searched the LKML archives and didn't find any known issues.
> Maybe it has been fixed upstream but not backported by Red Hat.
> I would appreciate any help.
>
> Thank you in advance.
>
> Detailed configuration:
> ---------------
> Processor: 2 x 2.8 GHz Pentium Xeon
> Motherboard: Intel SE7501CW2
> Memory: 4 x 512 MB DDR 266 ECC registered
> Kernel: 2.4.20-31 (Red Hat 7.3 with updates)
You should report this one to the Red Hat people, but I think RH 7.3
isn't supported anymore.
Upgrading the kernel is a good idea.