User-Agent: Microsoft-Entourage/12.15.0.081119
Date: Fri, 20 Mar 2009 19:26:06 +0100
Subject: Page Cache writeback too slow,   SSD/noop scheduler/ext2
From: Jos Houtman <jos@hyves.nl>
To: <linux-kernel@vger.kernel.org>
Message-ID: <C5E99E4E.C0E0%jos@hyves.nl>
Thread-Topic: Page Cache writeback too slow,   SSD/noop scheduler/ext2
Thread-Index: AcmpfTHn4SEn3CUtJ0mwkgr/mpVpUAADCMzZ
In-Reply-To: <C5E989F1.C0C7%jos@hyves.nl>
Mime-version: 1.0
Content-type: text/plain;
	charset="ISO-8859-1"
Content-transfer-encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1911
Lines: 54

Hi,

We have hit a problem where the page-cache writeback algorithm is not
keeping up.
When memory gets low this will result in very irregular performance drops.

Our setup is as follows:
30 quad-core machines with 64 GB of RAM.
These are single-purpose machines running MySQL.
Kernel version: 2.6.28.7
A dedicated SSD for the ext2 database partition.
Noop scheduler for the SSD.


The current hypothesis is as follows:
The wb_kupdate function does not write enough dirty pages, which allows the
number of dirty pages to grow to the dirty_background limit.
When memory is low, background_writeout() comes around and "forcefully"
writes dirty pages to disk.
This forced write fills the disk queue and starves the reads that MySQL is
trying to do, basically killing performance for a few seconds.
This pattern repeats as soon as the freed memory is filled again.

Decreasing dirty_writeback_centisecs to 100 doesn't help.

I don't know why this is, but I did some preliminary tracing using systemtap
and it seems that the majority of wb_kupdate calls decide to do nothing.
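
To complement the tracing, the growth and sudden drop of dirty pages
described in the hypothesis can be watched from user space. Below is a
minimal sketch, assuming the nr_dirty and nr_writeback counters in
/proc/vmstat and a 4 KiB page size; the function names and the one-second
interval are illustrative choices, not part of our setup.

```python
import time

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual size on x86


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


def sample_dirty(path="/proc/vmstat"):
    """Return (nr_dirty, nr_writeback) in pages from the given file."""
    with open(path) as f:
        stats = parse_vmstat(f.read())
    return stats.get("nr_dirty", 0), stats.get("nr_writeback", 0)


def watch(interval=1.0, count=60):
    """Print one timestamped sample per interval, count times."""
    for _ in range(count):
        dirty, writeback = sample_dirty()
        print(
            f"{time.strftime('%H:%M:%S')} "
            f"dirty={pages_to_mib(dirty):.1f}MiB "
            f"writeback={pages_to_mib(writeback):.1f}MiB"
        )
        time.sleep(interval)
```

Calling watch() while the database is under load should show whether
nr_dirty climbs steadily and then drops sharply at the moments the read
latency spikes.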

Doubling /sys/block/sdb/queue/nr_requests to 256 seems to help a bit: the
number of dirty pages (nr_dirty) is increasing more slowly.
But I am unsure of the side effects and am afraid of making the starvation
problem worse for MySQL.
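
For anyone comparing against their own machines, here is a minimal sketch
that collects the settings mentioned in this mail in one place. It assumes
the usual /proc/sys/vm and /sys/block/<device>/queue locations; the helper
names and the default device "sdb" are illustrative, and files that do not
exist are reported as None.

```python
import os

# Writeback-related sysctls under /proc/sys/vm.
VM_TUNABLES = [
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
]


def read_value(path):
    """Return the stripped contents of a file, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def collect_settings(device="sdb", proc_root="/proc/sys/vm",
                     sys_root="/sys/block"):
    """Collect the vm dirty_* sysctls and the block queue settings."""
    settings = {
        name: read_value(os.path.join(proc_root, name))
        for name in VM_TUNABLES
    }
    queue = os.path.join(sys_root, device, "queue")
    settings["nr_requests"] = read_value(os.path.join(queue, "nr_requests"))
    settings["scheduler"] = read_value(os.path.join(queue, "scheduler"))
    return settings
```

Printing collect_settings("sdb") before and after changing a value makes
it easier to report exactly which configuration a measurement was taken
with.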


I am very much willing to work on this issue and see it fixed, but would
like to tap into the knowledge of people here.
So: 
* Have more people seen this or similar issues?
* Is the hypothesis above a viable one?
* Are there any suggestions or pointers for further research, or statistics
I should measure to improve the understanding of this problem?


With regards,

Jos
