2005-01-05 18:40:12

by Jose Luis Domingo Lopez

Subject: Linux 2.6.10-rc3-bk15 hanged under high load (i386)

Hi all:

I have been experiencing kernel crashes for the last couple of weeks, and
now it has happened again, but this time I have some info, hope it helps.

* Problem: box freezes (no panic or other kind of messages in the logs),
under high load (loadavg up to 10, several "spamassassin" processes
doing their job).

* Description: the box just hangs and nothing gets printed in the logs at the
time of the crash. SysRq+T and SysRq+M don't dump anything to disk (I can't
verify whether they reach the local console, because X was running when the
box hung). However, some time before the box freezes (on the order of
seconds, maybe a minute or so), I get the following in the logs:

Jan 5 19:09:52 dardhal kernel: swap_dup: Bad swap file entry 10000000
Jan 5 19:09:52 dardhal kernel: swap_free: Bad swap file entry 10000000
Jan 5 19:09:52 dardhal kernel: swap_free: Bad swap file entry 10000000
Jan 5 19:10:08 dardhal kernel: VM: killing process spamassassin
Jan 5 19:10:08 dardhal kernel: swap_free: Bad swap offset entry 00730000

(I noticed that my incoming email, or at least some of it, was no longer
being tagged as spam by "spamassassin", which is configured to run several
copies in parallel.)

* When I noticed the box didn't respond, I tried several SysRq (T, M), and
then tried an emergency reboot via SysRq+S, SysRq+U, SysRq+B. It worked
as expected (root file system seemed to be synced to disk, unmounted,
and the box rebooted), at least upon reboot the root filesystem was
clean, as it should be.


I have searched old kernel logs to check whether any earlier hang happened
just after a message similar to the ones above. In fact, this is the case,
but I am not sure whether the following was the last time the box crashed
before today. What I know for sure is that the following message (and the
crash that happened just after it) also occurred with kernel version
2.6.10-rc3-bk15, the same one as today.

Dec 29 20:41:38 dardhal kernel: swap_free: Bad swap offset entry 003f0000
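
For anyone who wants to check their own logs for the same pattern, here is a
minimal Python sketch. It assumes syslog-style lines formatted like the ones
quoted in this report; the function name `find_swap_errors` and the log path
mentioned in the comment are illustrative only and not part of the original
report.

```python
import re

# Matches the kernel messages quoted in this report, for example:
#   "... kernel: swap_free: Bad swap offset entry 003f0000"
SWAP_ERR = re.compile(
    r"kernel: (swap_dup|swap_free): Bad swap (file|offset) entry ([0-9a-fA-F]+)"
)


def find_swap_errors(lines):
    """Return (function, kind, entry_value) for every matching log line."""
    hits = []
    for line in lines:
        m = SWAP_ERR.search(line)
        if m:
            # The entry is printed in hexadecimal by the kernel.
            hits.append((m.group(1), m.group(2), int(m.group(3), 16)))
    return hits


# Lines copied from the report above.
sample = [
    "Jan 5 19:09:52 dardhal kernel: swap_dup: Bad swap file entry 10000000",
    "Jan 5 19:10:08 dardhal kernel: VM: killing process spamassassin",
    "Jan 5 19:10:08 dardhal kernel: swap_free: Bad swap offset entry 00730000",
]

for func, kind, entry in find_swap_errors(sample):
    print(f"{func}: bad swap {kind} entry 0x{entry:08x}")

# To scan a real log, pass an open file instead of the sample list, e.g.
# (hypothetical path, adjust for your distribution):
#   with open("/var/log/kern.log") as f:
#       find_swap_errors(f)
```

This only locates the messages; it does not say anything about their cause.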


Could it be just a defective sector under my swap partition? Any help is
greatly appreciated; please ask for more information if needed.

Greetings,

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.10-rc3-bk15)



2005-01-05 20:10:29

by jurriaan

Subject: Re: Linux 2.6.10-rc3-bk15 hanged under high load (i386)

On Wed, Jan 05, 2005 at 07:39:48PM +0100, Jose Luis Domingo Lopez wrote:
> Hi all:
>
> 2.6.10-rc3-bk15
>
> Dec 29 20:41:38 dardhal kernel: swap_free: Bad swap offset entry 003f0000
>
Did you have memtest86 run some loops over your memory?

HTH,
Jurriaan

2005-01-05 20:58:36

by Jose Luis Domingo Lopez

Subject: Re: Linux 2.6.10-rc3-bk15 hanged under high load (i386)

On Wednesday, 05 January 2005, at 20:04:40 +0100,
Jurriaan on adsl-gate wrote:

> Did you have memtest86 run some loops over your memory?
>
Yes, some time ago. I was having occasional lock-ups and suspected bad
(or low quality) RAM, so I left memtest86 running for a day, and it
reported no problems at all. However, at that time I couldn't find anything
in the logs.

Maybe it is just another case of crappy hardware that, from time to time,
seems to be more sensitive to load and starts giving errors more frequently.

Greetings,

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.10-rc3-bk15)

