2009-11-26 22:49:45

by Buck

Subject: Re: observe and act upon workload parallelism: PERF_TYPE_PARALLELISM (Was: [RFC][PATCH] sched_wait_block: wait for blocked threads)

Hi,

I think our research on scheduling is somewhat related to this topic.

I posted to the kernel list a while back, but I'm not sure anyone
noticed. Here is a link to
that post, which contains links to slides and our paper.

http://lwn.net/Articles/358295/

-- Buck

On Nov 21, 3:27 am, Stijn Devriendt <[email protected]> wrote:
> > Think of it like a classic user-level threading package, where one process
> > implements multiple threads entirely in user space, and switches between
> > them. Except we'd do the exact reverse: create multiple threads in the
> > kernel, but only run _one_ of them at a time. So as far as the scheduler
> > is concerned, it acts as just a single thread - except it's a single
> > thread that has multiple instances associated with it.
>
> > And every time the "currently active" thread in that group runs out of CPU
> > time - or any time it sleeps - we'd just go on to the next thread in the
> > group.
>
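> If I read this correctly, a rough user-space analogy (purely for
> illustration; this is not the proposed kernel mechanism, and all names
> below are made up) would be N threads sharing a single "run token",
> so only one of them ever executes and the token changes hands whenever
> the holder is about to block:
>
>     #include <pthread.h>
>     #include <semaphore.h>
>     #include <unistd.h>
>
>     /* call sem_init(&run_token, 0, 1) before spawning the members */
>     static sem_t run_token;            /* one token: one active thread */
>
>     static void blocking_call(void)
>     {
>         sem_post(&run_token);          /* hand over before blocking     */
>         sleep(1);                      /* stand-in for any blocking op  */
>         sem_wait(&run_token);          /* re-acquire before continuing  */
>     }
>
>     static void *group_member(void *unused)
>     {
>         for (;;) {
>             sem_wait(&run_token);      /* only one member runs at a time */
>             /* ... CPU-bound work ... */
>             blocking_call();           /* any sleep lets a sibling run   */
>             sem_post(&run_token);
>         }
>         return NULL;
>     }
>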
> Without trying to sound selfish: after some thinking I can't see how this
> solves my problem. This is fine for the case you mentioned later on,
> like UI threads, but it's not powerful enough for what I'm trying to achieve.
>
> Let's walk through the thread pool case, starting with an empty
> thread pool queue:
> - All threads are blocked on the queue condition variable until new work
>   is queued.
> - Thread 1 happily wakes up and runs the work item until it blocks.
> - A new work item arrives and Thread 2 is woken to handle the new work
>   item.
> - As long as new work arrives and Thread 2 is not blocked (regardless
>   of preemption, because the deal was that they will not preempt each
>   other), Thread 2 keeps running this work.
>   Even when Thread 1 is woken, it will not preempt Thread 2.
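>
> For illustration, a minimal worker loop of this kind might look as
> follows (plain pthreads; all names are made up and error handling is
> omitted):
>
>     #include <pthread.h>
>     #include <stdlib.h>
>
>     struct work_item {
>         struct work_item *next;
>         void (*fn)(void *arg);
>         void *arg;
>     };
>
>     static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
>     static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;
>     static struct work_item *queue_head;
>
>     static void *worker(void *unused)
>     {
>         for (;;) {
>             pthread_mutex_lock(&queue_lock);
>             while (!queue_head)                  /* idle workers block here */
>                 pthread_cond_wait(&queue_cond, &queue_lock);
>             struct work_item *item = queue_head; /* one worker takes an item */
>             queue_head = item->next;
>             pthread_mutex_unlock(&queue_lock);
>
>             item->fn(item->arg);                 /* may block mid-item */
>             free(item);
>         }
>         return NULL;
>     }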
>
> One solution would be to let Thread 2 call sched_yield, but the
> question then is "when" and "how often". Every time a lightweight
> thread yields, you incur context switches. Because you don't
> know when or how often to yield, you'll be penalized for context
> switching even when it's not needed. (Consider 1 blocked thread and
> 4 extra threads sched_yield'ing every 5 work items.)
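>
> For illustration, such a yield heuristic boils down to something like
> this (pool_has_work/run_one_item are hypothetical pool helpers and
> YIELD_EVERY is an arbitrary guess):
>
>     #include <sched.h>
>
>     #define YIELD_EVERY 5              /* arbitrary; no obviously right value */
>
>     extern int  pool_has_work(void);   /* hypothetical pool helpers */
>     extern void run_one_item(void);
>
>     static void worker_body(void)
>     {
>         unsigned long done = 0;
>
>         while (pool_has_work()) {
>             run_one_item();
>             if (++done % YIELD_EVERY == 0)
>                 sched_yield();         /* pay a context switch, needed or not */
>         }
>     }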
>
> Another option is to have a group leader. Non-leader threads call
> sched_yield once in a while in order to try to jump back to the group leader.
> The group leader always continues work without sched_yield'ing.
> There's no preemption between these threads.
> The downside is that when multiple of these threads are waiting for
> an event, wake-ups must wake the group leader rather than the other
> coop-scheduled threads for this model to work.
> Another downside is that when a non-leader thread is blocked and the
> group leader is run, the non-leader thread is treated unfairly.
>
> Either solution ends up with a very unfair threadpool: one cannot
> guarantee even a loose FIFO model where items are handled more or
> less in the order they are queued, and the library has to trade away
> performance to get that behaviour back.
>
> The solution is great when the threads are blocked most of the time
> and have little CPU processing to do (like UI threads), but it doesn't
> scale beyond that.
>
> As ever, enlighten me when you have a great solution to this problem.
>
> Stijn


2009-11-27 12:10:07

by Stijn Devriendt

Subject: Re: observe and act upon workload parallelism: PERF_TYPE_PARALLELISM (Was: [RFC][PATCH] sched_wait_block: wait for blocked threads)

Hi Buck,

Thanks for taking the time to inform me about your research.
I've just read up on it, but as far as I can tell there's still
quite a difference between our goals. My goal was to utilize the
CPU to its fullest, regardless of blocking calls, in the context of a
threadpool. Your approach seems to aim at doing away with scheduling
latency and is better suited when timing information is available.

A generic threadpool doesn't have timing information about the work-items
it is processing, so it cannot take part in this type of cooperative scheduling.
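
To make that concrete: all a generic threadpool typically knows about a
work item is a callback and an argument, roughly (field names are just
for illustration):

    struct work_item {
        void (*fn)(void *arg);   /* opaque user code: may compute, may block */
        void *arg;
        /* no deadline, period or expected runtime for a scheduler to use */
    };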

Secondly, you're using coop_poll() to wait for timeouts and/or fds. In my case
I'm trying to let every blocking syscall cause the wake-up of the next
thread. This means that the threadpool should make no assumptions
about work items, and that threadpool users can write any code they want.
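
Very roughly, and purely as a sketch of the direction (PERF_TYPE_PARALLELISM
does not exist; it is the event type being proposed in this thread, and
wake_one_pool_thread() is a hypothetical pool function), the threadpool side
could look like:

    #include <poll.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    #define PERF_TYPE_PARALLELISM 0x10000   /* proposed, not in any kernel */

    extern void wake_one_pool_thread(void); /* hypothetical pool function */

    static int open_parallelism_counter(pid_t pid)
    {
        struct perf_event_attr attr = {
            .type   = PERF_TYPE_PARALLELISM,
            .size   = sizeof(attr),
            .config = 1,   /* assumed: POLLIN when fewer than this many
                              threads of the process are runnable */
        };
        return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
    }

    static void dispatch_loop(pid_t pool_pid)
    {
        struct pollfd pfd = { .fd = open_parallelism_counter(pool_pid),
                              .events = POLLIN };

        for (;;) {
            poll(&pfd, 1, -1);        /* a worker blocked in a syscall */
            wake_one_pool_thread();   /* keep the CPU busy with another item */
        }
    }

Whether the threshold lives in .config or somewhere else is exactly the
kind of detail that would still need discussion.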

Regards,
Stijn

On Thu, Nov 26, 2009 at 10:08 PM, Buck <[email protected]> wrote:
> Hi,
>
> I think our research on scheduling is somewhat related to this topic.
>
> I posted to the kernel list a while back, but I'm not sure anyone
> noticed. Here is a link to
> that post, which contains links to slides and our paper.
>
> http://lwn.net/Articles/358295/
>
> -- Buck
>
> On Nov 21, 3:27 am, Stijn Devriendt <[email protected]> wrote:
>> [Stijn's message quoted in full; snipped here, see above]