LinuxLists.cc - 2.2.19pre3 and poor response to RT-scheduled processes?

2000-12-29 21:15:45

Subject: 2.2.19pre3 and poor response to RT-scheduled processes?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[...Please CC me on any replies, as I'm not on the list(s)...]

Folks:
I was experiencing problems with 2.2.16 where the box would go out
to lunch for a few seconds flushing buffers or paging at inopportune
times (is there ever an opportune time for the box to become
non-responsive for 5 seconds? 8-).

2.2.19pre3 makes the behaviour much better, but I still see ~ 2sec
pauses at times. I'm sending this to the MM list as well, since I
believe the poor behaviour in 2.2.16 was an MM issue... I don't
know where the slowdowns are happening this time around.

The box in question is running the linux-ha.org heartbeat package,
which is an RT-scheduled, mlock()'ed process, and as such should
get as good service as the box is able to manage. Often, under
high disk (and/or MM) loads, the box becomes unresponsive for a
period of time from ~ 1 sec to a high of ~ 2.8sec.

The test is simply running a 'dd if=/dev/zero of=/u1/big-empty-file
bs=1k count=512000 && date'. Generally, the box will seize up around
the same time as the 'dd' finishes (maybe trying to exec date?).

I'd appreciate any hints on how to shrink the non-responsiveness
window as much as possible. I haven't yet looked to see if
there is a version of the low-latency patches for 2.2.18 or 19pre,
but I'd appreciate other ideas on tracking this down as well.

Thanks!
- --rafal

- ----
Rafal Boni [email protected]
PGP key C7D3024C, print EA49 160D F5E4 C46A 9E91 524E 11E0 7133 C7D3 024C
Need to get a hold of me? http://800.eDial.com/[email protected]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.0 (GNU/Linux)
Comment: Exmh version 2.1.1 10/15/1999

iD8DBQE6TPfjEeBxM8fTAkwRAiPaAKDSp1udFSypqq838fwAjQnlFW0m2wCgtycm
xF7xuBroSl3YXCTqUXGDAy0=
=JHLL
-----END PGP SIGNATURE-----
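
The kind of stall described in the message above can be put into numbers with a small probe that sleeps for a fixed interval and records how late each wakeup is. What follows is only a rough sketch, not the heartbeat package's own code: the names try_realtime and measure_stalls are made up for illustration, the SCHED_FIFO request needs root and is skipped when it fails, and the sketch does not mlock() its memory the way heartbeat does.

```python
import os
import time


def try_realtime(priority=1):
    """Ask for SCHED_FIFO; return True on success, False if unavailable or not permitted."""
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except (AttributeError, OSError):
        # AttributeError: platform lacks the call; OSError: usually missing privileges.
        return False


def measure_stalls(interval=0.01, duration=1.0):
    """Sleep repeatedly for `interval` seconds; return each wakeup's overshoot in seconds."""
    overshoots = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        start = time.monotonic()
        time.sleep(interval)
        late = time.monotonic() - start - interval
        overshoots.append(max(0.0, late))
    return overshoots


if __name__ == "__main__":
    realtime = try_realtime()
    worst = max(measure_stalls(interval=0.01, duration=10.0))
    print(f"realtime={realtime} worst overshoot: {worst * 1000:.1f} ms")
```

Running the probe in one terminal while the 'dd' test runs in another should show whether the worst overshoot approaches the 1 to 2.8 second range reported above.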

2000-12-29 21:50:18

by Gregory Maxwell

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

On Fri, Dec 29, 2000 at 03:45:23PM -0500, Rafal Boni wrote:
[snip]
> The box in question is running the linux-ha.org heartbeat package,
> which is an RT-scheduled, mlock()'ed process, and as such should
> get as good service as the box is able to manage. Often, under
> high disk (and/or MM) loads, the box becomes unresponsive for a
> period of time from ~ 1 sec to a high of ~ 2.8sec.
[snip]

You are running IDE, aren't you?

Enable DMA and/or unmask interrupts.

man hdparm

Good luck.
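
As a hedged illustration of the advice above: hdparm's -d and -u flags control DMA and interrupt unmasking, and "hdparm -d -u <device>" reports the current values. The sketch below parses a report and builds the command that would switch both on. The sample text is an assumed example of what such a report looks like, not output captured from the poster's machine, and the helper names are hypothetical.

```python
import re


def parse_hdparm(text):
    """Extract 'name = value' settings from hdparm-style output into a dict of ints."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(\d+)", line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings


def enable_args(device):
    """Command line that turns on DMA (-d1) and interrupt unmasking (-u1)."""
    return ["hdparm", "-d1", "-u1", device]


# Assumed sample report; real output may carry additional lines.
SAMPLE = """
/dev/hda:
 unmaskirq    =  0 (off)
 using_dma    =  0 (off)
"""

if __name__ == "__main__":
    current = parse_hdparm(SAMPLE)
    if not (current.get("using_dma") and current.get("unmaskirq")):
        # Only print the command; running it needs root and real hardware.
        print("would run:", " ".join(enable_args("/dev/hda")))
```

Checking the reported values rather than the boot messages avoids mistaking the BIOS-configured modes for what the kernel is actually using.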

2000-12-29 22:24:48

by Rafal Boni

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In message <[email protected]>, Greg Maxwell wrote:

- -> You are running IDE aren't you?
- ->
- -> Enable DMA and/or unmask interrupts.

D'oh! Thanks to Greg for the clue-by-four! I *am* running IDE and I had
both DMA (due to misreading the kernel boot messages) and interrupt unmasking
(since I had forgotten about that one) off....

I had assumed that DMA was on from the mention of it in kernel messages
(which on closer reading do indicate CMOS/BIOS configured default modes,
not what the kernel is using), and the lack of an explicit message on
the order of "I know it's there, but I'm not going to use it all the
same" 8-)

Now my box behaves much more reasonably... I'll just have to beat harder
on it and see what happens.

Thanks for the help,
- --rafal

- ----
Rafal Boni [email protected]
PGP key C7D3024C, print EA49 160D F5E4 C46A 9E91 524E 11E0 7133 C7D3 024C
Need to get a hold of me? http://800.edial.com/[email protected]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.0 (GNU/Linux)
Comment: Exmh version 2.1.1 10/15/1999

iD8DBQE6TQgOEeBxM8fTAkwRArCFAKDVrzaWxGtRFR0pbyNwvIF20bOSiwCfdhg9
wK1ZAhaCfK5qcrQezDECiK4=
=9x6E
-----END PGP SIGNATURE-----

2000-12-30 18:47:58

by Andrea Arcangeli

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

On Fri, Dec 29, 2000 at 04:54:23PM -0500, Rafal Boni wrote:
> Now my box behaves much more reasonably... I'll just have to beat harder
> on it and see what happens.

Another thing: if you want low-latency readers while writing to disk, you can
do:

elvtune -r 1 /dev/hd[abcd]

The 1-2 second stalls you see could simply be caused by applications that wait
synchronously for I/O while the elevator is reordering I/O requests (and even if
the elevator didn't reorder anything, new requests would go to the end of the
I/O queue, so they would still see somewhat higher latency). That's normal, and
if that's the case, the only ways to avoid those stalls are to decrease the I/O
load or increase disk throughput ;). The important thing is that the kernel is
not sitting in a tight loop without rescheduling during those 2 seconds.

However, 2.2.19pre3aa4 also includes the low-latency bugfixes, in case you have
tons of RAM and you're sending huge buffers to syscalls.

Andrea
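
The point above about new requests landing at the end of the I/O queue comes down to simple arithmetic: a request at the tail waits for everything ahead of it. Here is a back-of-the-envelope sketch; the queue depth and per-request service time are hypothetical numbers chosen for illustration, not measurements from this thread.

```python
def tail_wait_ms(requests_ahead, service_ms):
    """Wait time of a request queued behind `requests_ahead` others in a FIFO queue."""
    return requests_ahead * service_ms


if __name__ == "__main__":
    # Hypothetical: 128 queued writes at 10 ms each ahead of one synchronous read.
    print(tail_wait_ms(128, 10), "ms")  # 1280 ms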

2000-12-30 19:40:31

by Linus Torvalds

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

In article <[email protected]>,
Andrea Arcangeli <[email protected]> wrote:
>On Fri, Dec 29, 2000 at 04:54:23PM -0500, Rafal Boni wrote:
>> Now my box behaves much more reasonably... I'll just have to beat harder
>> on it and see what happens.
>
>Another thing: while writing to disk if you want low latency readers you can
>do:
>
> elvtune -r 1 /dev/hd[abcd]
>
>The 1-2 second stalls you see could be just because of applications that wait
>for I/O synchronously while the elevator is reordering I/O requests (and even if the
>elevator wouldn't reorder anything the new requests would go to the end of the
>I/O queue so they would have some higher latency anyways).

That sounds like too long a stall to be due to elevator ordering except
with some _really_ unlucky access patterns (or with slow disks).

There are other, equally likely, candidates for these kinds of stalls:

 - filesystem locks. Especially the ext2 superblock lock. You can easily
   hit this one, as some ext2 functions actually do a lot of IO while
   holding the lock.

 - synchronously waiting for bdflush with balance_dirty_buffers().
   Especially mixed with the above.

A mixture of the two above will basically stall the whole machine: almost
any non-cached file access ends up waiting for the superblock lock and
bdflush, and it can easily get quite unfair.

Linus
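
The effect of doing slow I/O while holding a lock, as described above, can be shown with a user-space toy. This is not kernel code: a threading.Lock stands in for the superblock lock, time.sleep() stands in for disk I/O, and the function name is made up for the sketch. It measures how long a second thread waits for the lock in each case.

```python
import threading
import time


def waiter_delay(io_under_lock, io_time=0.2):
    """Return how long the caller waits for a lock that a 'writer' thread takes first."""
    lock = threading.Lock()
    holding = threading.Event()

    def writer():
        if io_under_lock:
            with lock:
                holding.set()
                time.sleep(io_time)  # simulated I/O while the lock is held
        else:
            with lock:
                holding.set()        # only the bookkeeping is protected
            time.sleep(io_time)      # simulated I/O after the lock is dropped

    thread = threading.Thread(target=writer)
    thread.start()
    holding.wait()                   # make sure the writer got the lock first
    begin = time.monotonic()
    with lock:
        delay = time.monotonic() - begin
    thread.join()
    return delay


if __name__ == "__main__":
    print(f"I/O under lock:      waited {waiter_delay(True) * 1000:.0f} ms")
    print(f"I/O outside the lock: waited {waiter_delay(False) * 1000:.0f} ms")
```

With the I/O inside the locked region the waiter is delayed by roughly the full I/O time; with the I/O moved outside, the wait is close to zero.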

2000-12-30 19:56:48

by Alexander Viro

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

On 30 Dec 2000, Linus Torvalds wrote:

> There are other, equally likely, candidates for these kinds of stalls:
>
> - filesystem locks. Especially the ext2 superblock lock. You can easily
> hit this one, as some ext2 functions actually do a lot of IO while
> holding the lock.

Hmm... In 2.4 we can make the situation with the superblock lock on ext2
much better. I didn't go the whole way down to spinlocks, but right now
I'm sitting on a box with a modified ext2 that doesn't do _any_ IO in the
protected parts of ext2_new_inode()/ext2_new_block(). I can try to
extract the relevant parts of the patch if you are interested (it also
has the directories-in-pagecache stuff and better SMP threading of
get_block()/truncate()). The thing seems to be working fine and I see
no serious contention on lock_super(). Dunno if it's worth doing before
2.4.0, but since it has zero impact on the rest of the tree (OK, zero except
that write_on_page() had been exported, but I could trivially get rid
of that)... Maybe 2.4.early would be a good idea.
Cheers,
Al

2000-12-30 20:02:30

by Linus Torvalds

Subject: Re: 2.2.19pre3 and poor response to RT-scheduled processes?

On Sat, 30 Dec 2000, Alexander Viro wrote:
> On 30 Dec 2000, Linus Torvalds wrote:
>
> > There are other, equally likely, candidates for these kinds of stalls:
> >
> > - filesystem locks. Especially the ext2 superblock lock. You can easily
> > hit this one, as some ext2 functions actually do a lot of IO while
> > holding the lock.
>
> Hmm... In 2.4 we can make the situation with superblock lock on ext2
> much better.

Actually, 2.4.x right now is worse than 2.2.x in this regard, for a really
simple reason: 2.2.x will only do the equivalent of "rebalance_dirty" when
it dirties a previously clean buffer. The current 2.4.x code does that
regardless of whether the buffer was dirty before or not.

I want to see your patches to fix this for good in a 2.5.x timeframe (or,
if they are really clean and obvious, at a later 2.4.x date), but for
2.4.x I think that we'll do either "remove rebalance dirty completely" or
at the very least we'll not re-balance for re-dirtying a dirty buffer.

Re-dirtying a dirty buffer is the common case for the superblock
stuff: bitmap blocks etc. are often dirty already, _especially_ in the case
of an active writer. So 2.4.x is actually more likely to hit the
superblock/bdflush contention.

Of course, 2.4.x has had so many improvements in file writing memory
pressure that it might not end up being that noticeable, but even so..

Linus
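
The difference between the two policies described above can be modelled in a few lines. This is a toy model of the behaviour as stated in the message, not the kernel's actual code: the Buffer class and the function names are hypothetical, and balance() merely counts how often a rebalance would be triggered.

```python
class Buffer:
    """Stand-in for a buffer head; only tracks the dirty flag."""

    def __init__(self):
        self.dirty = False


def mark_dirty_2_2(buf, balance):
    """2.2.x-style policy: rebalance only on the clean -> dirty transition."""
    if not buf.dirty:
        buf.dirty = True
        balance()


def mark_dirty_2_4(buf, balance):
    """Policy attributed above to current 2.4.x: rebalance on every call."""
    buf.dirty = True
    balance()


def count_balances(mark_dirty, writes):
    """Dirty one buffer `writes` times and count the rebalance calls."""
    calls = 0

    def balance():
        nonlocal calls
        calls += 1

    buf = Buffer()
    for _ in range(writes):
        mark_dirty(buf, balance)
    return calls


if __name__ == "__main__":
    print(count_balances(mark_dirty_2_2, 1000))  # 1
    print(count_balances(mark_dirty_2_4, 1000))  # 1000
```

For a block that an active writer keeps re-dirtying, such as a bitmap block, the first policy triggers one rebalance and the second triggers one per write.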