2008-08-05 18:47:58

by Michael Madore

Subject: task blocked for more than 120 seconds (2.6.26.1)

Hi,

Running 2.6.26.1 I am receiving errors similar to the following when
stress testing systems:

INFO: task kjournald:1004 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D ffff81013d347000 0 1004 2
ffff81013cce9de0 0000000000000046 ffff81013cce9d90 ffffffff810478c5
ffff810001069378 ffff81013cdd2cc0 ffff810120454320 ffff81013cdd3010
ffff81013cce9de0 ffffffff81048096 ffff81013cce9db0 0000000000000246
Call Trace:
[<ffffffff810478c5>] ? enqueue_hrtimer+0xd1/0xdf
[<ffffffff81048096>] ? hrtimer_start+0x122/0x144
[<ffffffffa0023be3>] :jbd:journal_commit_transaction+0xe9/0xd1e
[<ffffffff8103bb2d>] ? lock_timer_base+0x26/0x4a
[<ffffffff81045534>] ? autoremove_wake_function+0x0/0x38
[<ffffffff8103bba7>] ? try_to_del_timer_sync+0x56/0x62
[<ffffffffa002713e>] :jbd:kjournald+0xc3/0x1fb
[<ffffffff81045534>] ? autoremove_wake_function+0x0/0x38
[<ffffffffa002707b>] ? :jbd:kjournald+0x0/0x1fb
[<ffffffff810453ff>] kthread+0x49/0x76
[<ffffffff8100cd08>] child_rip+0xa/0x12
[<ffffffff810453b6>] ? kthread+0x0/0x76
[<ffffffff8100ccfe>] ? child_rip+0x0/0x12
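When a stress run produces many of these reports, it can help to pull the task name, PID, and timeout out of each header line. The following is only a minimal sketch, assuming the log has been saved as text and that the header line has the same form as the one above; the names HUNG_TASK_RE and find_hung_tasks are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches the hung-task header line in the form shown in the report above.
# Assumption: other kernel versions may word this line differently.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, timeout in seconds) for each report found."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "INFO: task kjournald:1004 blocked for more than 120 seconds."
    for comm, pid, secs in find_hung_tasks(sample):
        print(comm, pid, secs)
```

Counting the resulting tuples per task name would show whether kjournald is the only task that stalls or just the first one reported.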

This is on a 16-core Intel system with a single SATA disk and 4GB of
memory. This particular instance of the error occurred after about an
hour.

I was previously using 2.6.24.7 from Fedora 8 and that did not exhibit
the problem.

Mike Madore


2009-12-17 19:54:47

by Michael Madore

Subject: Re: task blocked for more than 120 seconds (2.6.26.1)

Hi,

I am experiencing a strange problem running Linux on the Intel DP55WG
motherboard. Installation goes fine, but when the system is warm
booted, I get the following error message from the BIOS POST:

"The system BIOS has detected unsuccessful POST attempt(s). Possible
causes include recent changes to BIOS Performance options or recent
hardware changes. Press 'Y' to enter BIOS Setup or press 'N' to cancel
and attempt to boot with previous settings."

Selecting either option allows the system to boot, but the same error
will be displayed on the next warm boot. If the system is powered off
and then on instead, the error is not displayed.

When the error is displayed, the port 80h POST code is 69. According
to the motherboard documentation:

Boot Device Selection (BDS): 60-6F BDS driver entry
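
As a quick check that code 69 falls in that range, here is a small sketch. It assumes the POST codes are hexadecimal, which the "60-6F" notation suggests, and it covers only the one range quoted above; BDS_RANGE and in_bds_phase are hypothetical names used for illustration.

```python
# Range copied from the documentation line quoted above.
# Assumption: port 80h POST codes are hexadecimal values.
BDS_RANGE = (0x60, 0x6F)


def in_bds_phase(post_code_hex):
    """Return True if the hex POST code string lies in the BDS range."""
    code = int(post_code_hex, 16)
    return BDS_RANGE[0] <= code <= BDS_RANGE[1]


if __name__ == "__main__":
    print(in_bds_phase("69"))
```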

The details of the system:

Intel DP55WG motherboard
BIOS tested: 3206, 3878
Memory tested: 4G and 8G
CPU: Core i7 870 @ 2.93GHz

Hard disks tested:
- Western Digital Model WD20EADS (2 TB)
- Seagate Model ST380811AS (80GB)
- 3Ware 9650SE-4LPML with 4 x Seagate Model ST31500341AS (1.5TB)

Linux distributions tested:
- RHEL 5.3
- RHEL 5.4
- Fedora 10
- Fedora 11
- Fedora 12

In addition, I have tested 2.6.32 from kernel.org and today's git.

Here is a posting on the Intel community forums from a user with the
same problem:

http://communities.intel.com/message/71164

I attempted to disable the Failsafe Watchdog as suggested in that
thread, but then the kernel gets stuck during reboot and the system
has to be powered off.

Normally I would try to bisect, but I haven't found a kernel yet that
will boot on this board and doesn't exhibit the behaviour.

Thanks,

Mike Madore