From: Andy Lutomirski
Subject: Re: [PATCHSET v3.1 0/7] data integrity: Stabilize pages during
 writeback for various fses
Date: Sun, 23 Oct 2011 09:38:50 -0700
Message-ID: <4EA4431A.3010104@amacapital.net>
In-Reply-To: <20110510162237.GM4402@quack.suse.cz>
References: <20110509230318.19566.66202.stgit@elm3c44.beaverton.ibm.com>
 <87tyd31fkc.fsf@devron.myhome.or.jp> <20110510123819.GB4402@quack.suse.cz>
 <87hb924s2x.fsf@devron.myhome.or.jp> <20110510132953.GE4402@quack.suse.cz>
 <878vue4qjb.fsf@devron.myhome.or.jp> <87zkmu3b2i.fsf@devron.myhome.or.jp>
 <20110510145421.GJ4402@quack.suse.cz> <87zkmupmaq.fsf@devron.myhome.or.jp>
 <20110510162237.GM4402@quack.suse.cz>
To: Jan Kara
Cc: OGAWA Hirofumi, "Darrick J. Wong", Theodore Tso, Alexander Viro,
 Jens Axboe, "Martin K. Petersen", Jeff Layton, Dave Chinner,
 linux-kernel, Dave Hansen, Christoph Hellwig, linux-mm@kvack.org,
 Chris Mason, Joel Becker, linux-scsi, linux-fsdevel,
 linux-ext4@vger.kernel.org, Mingming Cao

On 05/10/2011 09:22 AM, Jan Kara wrote:
> On Wed 11-05-11 01:12:13, OGAWA Hirofumi wrote:
>> Jan Kara writes:
>>
>>>> Did you already consider, to copy only if page was writeback (like
>>>> copy-on-write)? I.e. if page is on I/O, copy, then switch the page for
>>>> writing new data.
>>> Yes, that was considered as well. We'd have to essentially migrate the
>>> page that is under writeback and should be written to. You are going to pay
>>> the cost of page allocation, copy, increased memory & cache pressure.
>>> Depending on your backing storage and workload this may or may not be better
>>> than waiting for IO...
>>
>> Maybe possible, but you really think on usual case just blocking is
>> better?
> Define usual case... As Christoph noted, we don't currently have a real
> practical case where blocking would matter (since frequent rewrites are
> rather rare). So defining what is usual when we don't have a single real
> case is kind of tough ;)
>

I'm a bit late to the party, but I have such a use case. I have a
real-time program that generates logs. There's a thread that makes sure
that there are always mlocked, MAP_SHARED, writable pages for the logs,
and under normal (or even very heavy) load, the mlocked pages always
stay far ahead of the logs.

On 2.6.39, it works great [1]. On 3.0, it's unusable -- latencies of
30-100 ms are very common.

In this case, neither throughput nor available memory matter at all --
I'm not stressing either. So copying the pages (especially if they're
mlocked) would be more than a small percentage win -- it would be the
difference between great performance and unusability.

I wonder if we want a stronger version of mlock that says "this page
must not be swapped out and, in addition, ptes must always be mapped
with all appropriate permission bits set". (This is only possible with
hardware dirty and accessed bits, but we could come close even without
them.)

[1] file_update_time is a problem. patches coming.

--Andy
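
For readers following the thread, the kind of setup described in the
message (a MAP_SHARED file mapping that is mlocked and made writable
ahead of the logging thread) could be sketched roughly as below. This is
only an illustration under stated assumptions, not the actual program:
the names struct log_map, log_map_open, log_append and log_map_close are
hypothetical, mlock is treated as best effort, and the pre-faulting loop
is one plausible way to keep writable pages "ahead of the logs".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout; not taken from the program in the message. */
struct log_map {
    char  *base;   /* start of the shared mapping */
    size_t size;   /* mapped length in bytes */
    size_t used;   /* bytes appended so far */
};

/* Map `size` bytes of `path` MAP_SHARED, try to pin them, and write-touch
 * every page so the first write fault is taken here, not in the hot path. */
static int log_map_open(struct log_map *lm, const char *path, size_t size)
{
    assert(lm != NULL && path != NULL && size > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping keeps the file referenced */
    if (p == MAP_FAILED)
        return -1;

    /* Best effort: mlock can fail if RLIMIT_MEMLOCK is too small. */
    (void)mlock(p, size);

    /* Write one byte per page. Writeback may later write-protect a page
     * again, which is where the latency discussed in the thread shows up. */
    long page = sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < size; off += (size_t)page)
        ((volatile char *)p)[off] = 0;

    lm->base = p;
    lm->size = size;
    lm->used = 0;
    return 0;
}

/* Copy `len` bytes into the mapping; returns -1 if the record does not fit. */
static int log_append(struct log_map *lm, const void *buf, size_t len)
{
    if (len > lm->size - lm->used)
        return -1;
    memcpy(lm->base + lm->used, buf, len);
    lm->used += len;
    return 0;
}

/* Unmap the region; dirty pages are left to normal writeback. */
static int log_map_close(struct log_map *lm)
{
    return munmap(lm->base, lm->size);
}
```

A caller would open the map once at startup, call log_append from the
real-time thread, and leave flushing to normal writeback. On a kernel
that waits for writeback before allowing a page to be re-dirtied, the
memcpy in log_append is the point where the stall would be observed.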