From: Anton Blanchard
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, Linus Torvalds, Alan Cox
Subject: Re: [announce] [patch] ultra-scalable O(1) SMP and UP scheduler
Date: Fri, 4 Jan 2002 21:30:18 +1100
Message-ID: <20020104103018.GA22745@krispykreme>

Great stuff Ingo!

> now that new-year's parties are over things are getting boring again. For
> those who want to see and perhaps even try something more complex, i'm
> announcing this patch that is a pretty radical rewrite of the Linux
> scheduler for 2.5.2-pre6:

Good timing :) We were just looking at an application that hit sched_yield
heavily on a large SMP machine (we don't have the source, so we are fixing
the symptoms, not the cause). Performance was pretty bad with the standard
scheduler.

We managed to get a 4 way ppc64 machine to boot, but unfortunately the 18
way hung somewhere after SMP initialisation and before execing init. More
testing to come :)

Is the idle loop optimisation (atomic exchange of -1 into need_resched to
avoid the IPI) gone forever?

Is my understanding of this right?

	#define BITMAP_SIZE (MAX_RT_PRIO/8+1)

	...

	char bitmap[BITMAP_SIZE+1];
	list_t queue[MAX_RT_PRIO];

You have a bitmap of size MAX_RT_PRIO+1 (actually I think you have one too
many +1 here :) and you zero the last bit to terminate it. You use
find_first_zero_bit to search the entire priority list and
sched_find_first_zero_bit to search only the first MAX_PRIO (64) bits.

> the SMP load-balancer can be extended/switched with additional parallel
> computing and cache hierarchy concepts: NUMA scheduling, multi-core CPUs
> can be supported easily by changing the load-balancer. Right now it's
> tuned for my SMP systems.

It will be interesting to test this on our HMT hardware.

> - the SMP idle-task startup code was still racy and the new scheduler
> triggered this. So i streamlined the idle-setup code a bit. We do not call
> into schedule() before all processors have started up fully and all idle
> threads are in place.

I like this cleanup; it pushes more stuff out of the arch-specific code
into init_idle().

> another benchmark measures sched_yield() performance. (which the pthreads
> code relies on pretty heavily.)

Can you send me this benchmark and I'll get some more results :) I don't
think pthreads uses sched_yield on spinlocks any more, but there seems to
be enough userspace code that does.

Anton
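
For reference, the idle loop optimisation asked about above is the trick
sketched below. This is only a user-space model of my understanding of it,
written with C11 atomics; the struct and function names are invented for
illustration and it is not the real idle/reschedule code:

#include <stdatomic.h>
#include <stdio.h>

/*
 * The idle loop parks with need_resched == -1 and polls it.  A remote
 * CPU that wants to reschedule it just sets need_resched to 1, and only
 * sends the (expensive) reschedule IPI when the old value was 0, i.e.
 * the target was not sitting in the polling idle loop.
 */

struct cpu {
	_Atomic int need_resched;
};

static void send_resched_ipi(struct cpu *c)
{
	/* stands in for smp_send_reschedule() */
	printf("IPI to cpu %p\n", (void *)c);
}

/* runs on the idle CPU */
static void cpu_idle_poll(struct cpu *c)
{
	atomic_store(&c->need_resched, -1);	/* advertise "I am polling" */
	while (atomic_load(&c->need_resched) == -1)
		;				/* cpu_relax() in reality */
	/* fall through into schedule() */
}

/* runs on a remote CPU that wants the target to reschedule */
static void resched_cpu(struct cpu *c)
{
	int old = atomic_exchange(&c->need_resched, 1);

	if (old == 0)
		send_resched_ipi(c);	/* not polling, must interrupt it */
}

int main(void)
{
	struct cpu c = { 0 };

	resched_cpu(&c);	/* old value was 0, so an IPI goes out */
	(void)cpu_idle_poll;	/* would be running on the idle CPU itself */
	return 0;
}

The point of the exchange is that an idle, polling CPU never needs the IPI
at all, which is what the question above is getting at.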
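
To spell out that reading of the bitmap, here is a small user-space model
of the lookup as described above: MAX_PRIO priority levels, a zero bit
meaning "tasks queued at this priority", and one extra always-zero bit to
terminate the search. Only sched_find_first_zero_bit and the general array
layout come from the patch; everything else below is made up to illustrate
the question:

#include <stdio.h>
#include <string.h>

#define MAX_PRIO	64
#define BITMAP_SIZE	(MAX_PRIO / 8 + 1)

struct prio_array {
	unsigned char bitmap[BITMAP_SIZE];
	int nr_queued[MAX_PRIO];	/* stand-in for list_t queue[MAX_PRIO] */
};

static void init_array(struct prio_array *a)
{
	memset(a->nr_queued, 0, sizeof(a->nr_queued));
	memset(a->bitmap, 0xff, sizeof(a->bitmap));
	/* the sentinel bit just past the last priority stays zero */
	a->bitmap[MAX_PRIO / 8] &= ~(1 << (MAX_PRIO % 8));
}

static void enqueue(struct prio_array *a, int prio)
{
	a->nr_queued[prio]++;
	a->bitmap[prio / 8] &= ~(1 << (prio % 8));	/* 0 == non-empty */
}

static void dequeue(struct prio_array *a, int prio)
{
	if (--a->nr_queued[prio] == 0)
		a->bitmap[prio / 8] |= 1 << (prio % 8);	/* 1 == empty again */
}

/* dumb stand-in for sched_find_first_zero_bit(): scan the first
 * MAX_PRIO + 1 bits, so the sentinel is returned when nothing is queued */
static int first_zero_bit(const struct prio_array *a)
{
	int i;

	for (i = 0; i <= MAX_PRIO; i++)
		if (!(a->bitmap[i / 8] & (1 << (i % 8))))
			return i;
	return MAX_PRIO;
}

int main(void)
{
	struct prio_array a;

	init_array(&a);
	enqueue(&a, 50);
	enqueue(&a, 10);
	printf("next prio: %d\n", first_zero_bit(&a));	/* 10 */
	dequeue(&a, 10);
	dequeue(&a, 50);
	printf("next prio: %d\n", first_zero_bit(&a));	/* 64: nothing queued */
	return 0;
}

If that reading is right, the +1 inside BITMAP_SIZE already leaves room
for the sentinel bit, which is why the extra +1 on the declaration looks
like one too many.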
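
On the sched_yield() numbers, the sort of throwaway test one could run in
the meantime is the program below: a few processes spinning on
sched_yield() while the parent times the whole run. This is purely a
hypothetical quick hack, not the benchmark referred to above:

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS	8
#define LOOPS	100000

int main(void)
{
	struct timeval start, end;
	int i, j;

	gettimeofday(&start, NULL);

	/* fork NPROCS children, each calls sched_yield() LOOPS times */
	for (i = 0; i < NPROCS; i++) {
		if (fork() == 0) {
			for (j = 0; j < LOOPS; j++)
				sched_yield();
			exit(0);
		}
	}
	for (i = 0; i < NPROCS; i++)
		wait(NULL);

	gettimeofday(&end, NULL);
	printf("%d procs x %d yields: %.3f seconds\n", NPROCS, LOOPS,
	       (end.tv_sec - start.tv_sec) +
	       (end.tv_usec - start.tv_usec) / 1e6);
	return 0;
}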