2010-01-18 23:58:12

by bugzilla-daemon

Subject: [Bug 14830] When other IO is running sync times go to 10 to 20 minutes

http://bugzilla.kernel.org/show_bug.cgi?id=14830

--- Comment #3 from Michael Godfrey <[email protected]> 2010-01-18 23:58:09 ---
>This problem prevents production use of systems using this kernel.

>evokes a question: Do you have a kernel which behaved better for you? Which
>one?

Yes. RHEL5.4 does not show this problem. It is the production
system that works in this environment.

The response above is disappointing. Is a sync response of 20 minutes,
including several task timeouts, to be considered "normal"?

>If you think the time is inappropriately long, we can have a look at it
>but for that we'd need much more details like amount and nature of data written
>(many small files vs a few large ones), time it takes sync to complete, speed
>of disks for sequential IO...

I am sorry to have to tell you that in this environment we do not
deal in exclusively small or large files, we actually have quite a
few of both. When an rsync which transfers about 50GB of files of
various sizes is running, the hung condition is continuous until the rsync
completes. This is just a pretty typical load. You could try it
yourself. No special sizes of files are required. I think I
mentioned that the ext4 LVM is a RAID 50 3ware 9650SE-8LPML,
with eight 2 TB drives. Its throughput for reading and writing is good
when the system is not locked up.

--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


2010-01-19 17:24:40

by Chris Lee

Subject: Re: [Bug 14830] When other IO is running sync times go to 10 to 20 minutes



[email protected] wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14830
>
>
>
>
>
> --- Comment #3 from Michael Godfrey <[email protected]> 2010-01-18 23:58:09 ---
>
>> This problem prevents production use of systems using this kernel.
>>
>
>
>> evokes a question: Do you have a kernel which behaved better for you? Which
>> one?
>>
>
> Yes. RHEL5.4 does not show this problem. It is the production
> system that works in this environment.
>
> The response above is disappointing. Is a sync response of 20 minutes,
> including several task timeouts, to be considered "normal"?
>
>
>> If you think the time is inappropriately long, we can have a look at it
>> but for that we'd need much more details like amount and nature of data written
>> (many small files vs a few large ones), time it takes sync to complete, speed
>> of disks for sequential IO...
>>
>
> I am sorry to have to tell you that in this environment we do not
> deal in exclusively small or large files, we actually have quite a
> few of both. When an rsync which transfers about 50GB of files of
> various sizes is running, the hung condition is continuous until the rsync
> completes. This is just a pretty typical load. You could try it
> yourself. No special sizes of files are required. I think I
> mentioned that the ext4 LVM is a RAID 50 3ware 9650SE-8LPML,
> with eight 2 TB drives. Its throughput for reading and writing is good
> when the system is not locked up.
>
>
Is it possible that it is something along the lines of what is
described at this link:
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html

If so, a runtime adjustment might help you out.
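
As a rough sketch of what checking this could look like, assuming the
tunables in question are the kernel's dirty page writeback settings
(vm.dirty_background_ratio and vm.dirty_ratio; this message does not name
them), the Python below reads the current values and estimates how much
dirty data could pile up and how long it would take to drain. The function
names are hypothetical, and the estimate is an approximation: the kernel
applies the percentage to dirtyable memory rather than total RAM, so treat
the result as an upper bound.

```python
import os

# Assumed tunables; the thread itself does not say which ones are meant.
TUNABLES = ("dirty_background_ratio", "dirty_ratio")


def read_vm_tunables(base="/proc/sys/vm", names=TUNABLES):
    """Return {name: int or None} for each tunable file found under base."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                values[name] = int(handle.read().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a plain integer.
            values[name] = None
    return values


def dirty_limit_bytes(total_ram_bytes, ratio_percent):
    """Approximate ceiling on dirty page cache for a percentage of RAM."""
    return total_ram_bytes * ratio_percent // 100


def drain_seconds(dirty_bytes, write_bytes_per_sec):
    """Time to flush dirty_bytes at a sustained write rate."""
    return dirty_bytes / write_bytes_per_sec
```

Calling read_vm_tunables() on the affected machine shows the current
settings. Feeding the machine's RAM size and the array's measured
sequential write rate (neither is given in this thread) into the other two
functions gives a ballpark for how long a sync could block when the cache
is full of dirty pages. Lowering the ratios is done as root with sysctl or
by writing to the same files.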

Chris.