Date: Thu, 18 Mar 2004 16:28:16 +0100
Message-ID: <s5hlllycgz3.wl@alsa2.suse.de>
From: Takashi Iwai <tiwai@suse.de>
To: Andrea Arcangeli <andrea@suse.de>
Cc: "Marinos J. Yannikos" <mjy@geizhals.at>, Andrew Morton <akpm@osdl.org>,
       linux-kernel@vger.kernel.org
Subject: Re: CONFIG_PREEMPT and server workloads
In-Reply-To: <20040318060358.GC29530@dualathlon.random>
References: <40591EC1.1060204@geizhals.at>
	<20040318060358.GC29530@dualathlon.random>
User-Agent: Wanderlust/2.10.1 (Watching The Wheels) SEMI/1.14.5 (Awara-Onsen) FLIM/1.14.5 (Demachiyanagi) APEL/10.6 MULE XEmacs/21.4 (patch 13) (Rational FORTRAN) (i386-suse-linux)
MIME-Version: 1.0 (generated by SEMI 1.14.5 - "Awara-Onsen")
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3614
Lines: 100

At Thu, 18 Mar 2004 07:03:58 +0100,
Andrea Arcangeli wrote:
> 
> On Thu, Mar 18, 2004 at 05:00:01AM +0100, Marinos J. Yannikos wrote:
> > Hi,
> > 
> > we upgraded a few production boxes from 2.4.x to 2.6.4 recently and the 
> > default .config setting was CONFIG_PREEMPT=y. To get straight to the 
> > point: according to our measurements, this results in severe performance 
> > degradation with our typical and some artificial workload. By "severe" I 
> > mean this:
> 
> this is expected (see the below email, I predicted it on Mar 2000), keep
> preempt turned off always, it's useless. Worst of all we're now taking
> spinlocks earlier than needed, and the preempt_count stuff isn't
> optmized away by PREEMPT=n, once those bits will be fixed too it'll go
> even faster.
> 
> preempt just wastes cpu with tons of branches in fast paths that should
> take one cycle instead.
> 
> Takashi Iwai did lots of research on the preempt vs lowlatency and
> he found that preempt buys nothing and he confirmed my old theories

well, i personally am not against the current preempt mechanism from
the viewpoint of the audio-processing purpose :)  the implementation
is relatively clean and easy.

but i agree with Andrea, that surely we can achieve the alsmo same
RT-performance even without preemption, i.e. with less perempt
overhead.  it's not necessary to be default.

(snip)
> These fixes from Takashi Iwai brings 2.6 back in line with 2.4, I
> suggested to use EIP dumps from interrupts to get the hotspots, he
> promptly used the RTC for that and he could fixup all the spots, great
> job he did since now we've a very low worst case sched latency in 2.6
> too:
> 
> --- linux/fs/mpage.c-dist	2004-03-10 16:26:54.293647478 +0100
> +++ linux/fs/mpage.c	2004-03-10 16:27:07.405673634 +0100
> @@ -695,6 +695,7 @@ mpage_writepages(struct address_space *m
>  			unlock_page(page);
>  		}
>  		page_cache_release(page);
> +		cond_resched();
>  		spin_lock(&mapping->page_lock);
>  	}
>  	/*

the above one is the major source of RT-latency.
only this oneliner will reduce more than 90% of RT-latencies.
in my case with reiserfs, i got 0.4ms RT-latency with my test suite
(with athlon 2200+).

there is another point to be fixed in the reiserfs journal
transaction.  then you'll get 0.1ms RT-latency without preemption.

for ext3, these two spots are relevant.

--- linux-2.6.4-8/fs/jbd/commit.c-dist	2004-03-16 23:00:40.000000000 +0100
+++ linux-2.6.4-8/fs/jbd/commit.c	2004-03-18 02:42:41.043448624 +0100
@@ -290,6 +290,9 @@ write_out_data_locked:
 			commit_transaction->t_sync_datalist = jh;
 			break;
 		}
+
+		if (need_resched())
+			break;
 	} while (jh != last_jh);
 
 	if (bufs || need_resched()) {
--- linux-2.6.4-8/fs/ext3/inode.c-dist	2004-03-18 02:33:38.000000000 +0100
+++ linux-2.6.4-8/fs/ext3/inode.c	2004-03-18 02:33:40.000000000 +0100
@@ -1987,6 +1987,7 @@ static void ext3_free_branches(handle_t 
 	if (is_handle_aborted(handle))
 		return;
 
+	cond_resched();
 	if (depth--) {
 		struct buffer_head *bh;
 		int addr_per_block = EXT3_ADDR_PER_BLOCK(inode->i_sb);


i think the first one is needed for preemptive kernel, too.
with these patches, also 0.1-0.2ms RT-latency is achieved.


BTW, my measurement tool is found at

	http://www.alsa-project.org/~iwai/latencytest-0.5.2.tar.gz


--
Takashi Iwai <tiwai@suse.de>		ALSA Developer - www.alsa-project.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/