Hello all,
Has anybody seen this problem before? I keep hitting this issue with a
2.6.31 guest kernel, even with a simple network test.
INFO: task kjournald:337 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D 00000041 0 337 2 0x00000000
My test is completely blocked.
Thanks
Shirley
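
For anyone collecting these reports from a guest log, here is a minimal Python sketch that pulls the task name, PID and timeout out of hung-task warnings. The regular expression is based only on the wording of the message quoted in this thread, and the function name find_hung_tasks is made up for illustration.

```python
import re

# Pattern modelled on the warning text quoted in this thread; other kernel
# versions may word the message differently (assumption).
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than "
    r"(?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task name, pid, seconds) for each warning found."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


# Sample input: a single line in the same shape as the report above.
sample = "INFO: task kjournald:337 blocked for more than 120 seconds.\n"
print(find_hung_tasks(sample))
```

On a live system the input would typically be the dmesg output or a saved console log.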
On Wed, Sep 30, 2009 at 02:11:52PM -0700, Shirley Ma wrote:
> Hello all,
>
> Has anybody seen this problem before? I keep hitting this issue with a
> 2.6.31 guest kernel, even with a simple network test.
>
> INFO: task kjournald:337 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> kjournald D 00000041 0 337 2 0x00000000
>
> My test is completely blocked.
I've hit this in the past with ext3; mounting with data=writeback made it
disappear.
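
To confirm which journalling mode a guest filesystem is actually mounted with, something like the following Python sketch could be used. It assumes the usual /proc/mounts field layout (device, mount point, type, options, ...); the function name ext3_data_modes and the sample line are illustrative only.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= option, or None if not listed."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and filesystems other than ext3.
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mount_point = fields[1]
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[mount_point] = mode
    return modes


# Sample input in /proc/mounts format (made-up device and options).
sample = "/dev/vda1 / ext3 rw,errors=continue,data=writeback 0 0\n"
print(ext3_data_modes(sample))
```

On a running guest the text would come from reading /proc/mounts. A value of None only means that no data= option was listed for that mount.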
On Thu, 2009-10-01 at 10:20 -0300, Marcelo Tosatti wrote:
> I've hit this in the past with ext3; mounting with data=writeback made
> it disappear.
Thanks, I will give it a try. Someone should fix this.
Shirley
I talked to Mingming, and she suggested using a different I/O scheduler.
The default scheduler is cfq; after I switched to noop, the problem was
gone. So there seems to be a bug in the cfq scheduler. It is easily
reproduced when running the guest kernel; so far I haven't hit this
problem on the host side.
If I need to file a bug for someone to look at, please let me know.
Thanks
Shirley
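
As a side note on checking which scheduler is in effect: the kernel exposes it per block device in /sys/block/<dev>/queue/scheduler, with the active one shown in square brackets. A minimal Python sketch for reading that format follows. The helper name parse_scheduler is hypothetical, and the device name in the usage note is an assumption about a virtio guest.

```python
def parse_scheduler(line):
    """Return (active, available) from a line such as 'noop deadline [cfq]'."""
    active = None
    available = []
    for token in line.split():
        # The active scheduler is the one wrapped in square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


# Sample input in the sysfs format.
print(parse_scheduler("noop deadline [cfq]"))
```

On a guest with a virtio disk the line would typically be read from /sys/block/vda/queue/scheduler (vda is assumed here). Writing a scheduler name to the same file as root switches it at runtime.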
On Thu, Oct 1, 2009 at 4:00 PM, Shirley Ma <[email protected]> wrote:
> I talked to Mingming, and she suggested using a different I/O scheduler.
> The default scheduler is cfq; after I switched to noop, the problem was
> gone.
deadline is the most recommended one for virtualization hosts. some
distros set it as default if you select Xen or KVM at installation
time. (and noop for the guests)
--
Javier
On Thu, 2009-10-01 at 16:03 -0500, Javier Guerra wrote:
> deadline is the most recommended one for virtualization hosts. some
> distros set it as default if you select Xen or KVM at installation
> time. (and noop for the guests)
I spoke too soon: after a while the noop scheduler hit the same issue. I
am switching to deadline to test again.
Thanks
Shirley
Switching to a different scheduler doesn't make the problem go away.
Shirley
On 09/30/09 14:11, Shirley Ma wrote:
> Has anybody seen this problem before? I keep hitting this issue with a
> 2.6.31 guest kernel, even with a simple network test.
>
> INFO: task kjournald:337 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> kjournald D 00000041 0 337 2 0x00000000
>
> My test is completely blocked.
I'm assuming from the lists you've posted to that this is under KVM?
What disk drivers are you using (virtio or emulated)?
Can you get a full stack backtrace of kjournald?
Kevin Bowling submitted a RH bug against Xen with apparently the same
symptoms (https://bugzilla.redhat.com/show_bug.cgi?id=526627). I'm
wondering if there's a core kernel bug here, which is perhaps more
easily triggered by the changed timing in a virtual machine.
Thanks,
J
On Fri, 2009-10-02 at 11:30 -0700, Jeremy Fitzhardinge wrote:
> I'm assuming from the lists you've posted to that this is under KVM?
> What disk drivers are you using (virtio or emulated)?
>
> Can you get a full stack backtrace of kjournald?
Yes, it's under KVM, and the disk driver is virtio. Since I/O is stuck, the
stack trace can't be saved to disk. I have attached the image file here.
Thanks
Shirley
On 10/02/09 12:06, Shirley Ma wrote:
> On Fri, 2009-10-02 at 11:30 -0700, Jeremy Fitzhardinge wrote:
>
>> I'm assuming from the lists you've posted to that this is under KVM?
>> What disk drivers are you using (virtio or emulated)?
>>
>> Can you get a full stack backtrace of kjournald?
>>
> Yes, it's under KVM, and the disk driver is virtio. Since I/O is stuck, the
> stack trace can't be saved to disk. I have attached the image file here.
>
Ah, thank you. The backtrace does indeed look very similar.
(BTW, you could get a serial console with "qemu-kvm -nographic -append
console=ttyS0 ...")
J
On 10/2/2009 2:30 PM, Jeremy Fitzhardinge wrote:
> On 09/30/09 14:11, Shirley Ma wrote:
>
>> Has anybody seen this problem before? I keep hitting this issue with a
>> 2.6.31 guest kernel, even with a simple network test.
>>
>> INFO: task kjournald:337 blocked for more than 120 seconds.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>
>> kjournald D 00000041 0 337 2 0x00000000
>>
>> My test is completely blocked.
>>
> I'm assuming from the lists you've posted to that this is under KVM?
> What disk drivers are you using (virtio or emulated)?
>
> Can you get a full stack backtrace of kjournald?
>
> Kevin Bowling submitted a RH bug against Xen with apparently the same
> symptoms (https://bugzilla.redhat.com/show_bug.cgi?id=526627). I'm
> wondering if there's a core kernel bug here, which is perhaps more
> easily triggered by the changed timing in a virtual machine.
>
> Thanks,
> J
>
I've had a stable system thus far by appending "clocksource=jiffies" to
the kernel boot line. The default clocksource is otherwise "xen".
The dmesg boot warnings in my bugzilla report still occur.
Regards,
Kevin Bowling
http://www.analograils.com/
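
To verify that a clocksource= boot option actually took effect, the kernel reports the active and available clocksources under /sys/devices/system/clocksource/clocksource0/ (files current_clocksource and available_clocksource). A minimal Python sketch for summarising their contents follows. The function name clocksource_summary and the sample strings are illustrative assumptions, not output captured from any of the systems in this thread.

```python
def clocksource_summary(current_text, available_text, wanted="jiffies"):
    """Summarise the contents of the two clocksource sysfs files."""
    current = current_text.strip()
    available = available_text.split()
    return {
        "current": current,
        "available": available,
        # True when the requested clocksource is the one in use.
        "wanted_active": current == wanted,
    }


# Made-up sample contents for the two sysfs files.
print(clocksource_summary("jiffies\n", "xen tsc jiffies\n"))
```

On a running guest the two arguments would be the contents of current_clocksource and available_clocksource respectively.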