Just bringing up a latency issue I've noticed recently.
In or around 2.6.14-rc4 some changes were made to have the call to
kmem_cache_free() from file_free() in the Linux kernel be deferred, running
as a tasklet via file_free_rcu(), rather than running kmem_cache_free()
right from file_free() directly.
I've noticed that rcu_process_callbacks() can take quite a while to run
now that it routinely calls file_free_rcu() to run kmem_cache_free().
This can make the cpu unavailable for 100's of usec on 1GHz machines, with
or without preemption configured on (much of this path is non-preemptible).
This can result in some unpredictable periods of fairly long cpu latency,
such as when a thread is waiting to be woken by an interrupt handler on a
'now quiet' cpu. Changing file_free() to call kmem_cache_free() directly
completely eliminates this unexpected latency.
Here's the stack trace that illustrates what I'm talking about:
[<a0000001001154a0>] kmem_cache_free+0x140/0x3c0
    sp=e00000307bc27dc0 bsp=e00000307bc21070
[<a000000100153950>] file_free_rcu+0x30/0x60
    sp=e00000307bc27dd0 bsp=e00000307bc21050
[<a0000001000d89c0>] __rcu_process_callbacks+0x2c0/0x5e0
    sp=e00000307bc27dd0 bsp=e00000307bc21010
[<a0000001000d8d40>] rcu_process_callbacks+0x60/0xc0
    sp=e00000307bc27dd0 bsp=e00000307bc20fe8
[<a0000001000baae0>] tasklet_action+0x2c0/0x320
    sp=e00000307bc27dd0 bsp=e00000307bc20f98
[<a0000001000ba0d0>] __do_softirq+0x130/0x240
    sp=e00000307bc27dd0 bsp=e00000307bc20ef8
[<a0000001000ba260>] do_softirq+0x80/0xe0
    sp=e00000307bc27dd0 bsp=e00000307bc20e98
[<a0000001000ba4a0>] ksoftirqd+0x140/0x1a0
    sp=e00000307bc27dd0 bsp=e00000307bc20e68
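
For readers following the trace, here is a rough userspace sketch of the pattern, not the kernel code itself. The free is queued as a callback, and every queued callback later runs in one uninterrupted batch, so the length of that batch grows with the number of files freed since the last run. All names (deferred_cb, fake_file, defer_call, process_deferred) are made up for illustration and only loosely mirror struct rcu_head, call_rcu() and __rcu_process_callbacks(); the sketch does no grace-period tracking at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct deferred_cb {
    struct deferred_cb *next;
    void (*func)(struct deferred_cb *cb);
};

/* Hypothetical stand-in for struct file with an embedded callback head. */
struct fake_file {
    int fd;
    struct deferred_cb cb;
};

static struct deferred_cb *pending_head;
static struct deferred_cb **pending_tail = &pending_head;
static size_t pending_count;

/* Queue a callback instead of running it now (roughly the role of call_rcu()). */
static void defer_call(struct deferred_cb *cb,
                       void (*func)(struct deferred_cb *cb))
{
    assert(cb != NULL && func != NULL);
    cb->func = func;
    cb->next = NULL;
    *pending_tail = cb;
    pending_tail = &cb->next;
    pending_count++;
}

/* Deferred free, analogous to file_free_rcu() calling kmem_cache_free(). */
static void fake_file_free_cb(struct deferred_cb *cb)
{
    /* Recover the enclosing object from its embedded callback head. */
    struct fake_file *f =
        (struct fake_file *)((char *)cb - offsetof(struct fake_file, cb));
    free(f);
}

/* Analogous to file_free(): it only queues the free, it does not free. */
void fake_file_free(struct fake_file *f)
{
    defer_call(&f->cb, fake_file_free_cb);
}

/*
 * Run every queued callback back to back and return how many ran.
 * Nothing else gets to run in between, so the time spent here scales
 * with the number of frees queued since the previous call.
 */
size_t process_deferred(void)
{
    struct deferred_cb *list = pending_head;
    size_t ran = 0;

    pending_head = NULL;
    pending_tail = &pending_head;
    pending_count = 0;

    while (list != NULL) {
        struct deferred_cb *next = list->next; /* read before the callback frees it */
        list->func(list);
        list = next;
        ran++;
    }
    return ran;
}
```

Queueing a thousand fake_file_free() calls and then calling process_deferred() once runs a thousand frees in a single pass, which is the shape of the stall described above. Capping the number of callbacks handled per pass would bound each stall at the cost of freeing memory later; that is a design option in this sketch, not a statement about what the kernel does.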
Dimitri Sivanich
Dimitri Sivanich wrote:
> Just bringing up a latency issue I've noticed recently.
>
> In or around 2.6.14-rc4 some changes were made to have the call to
> kmem_cache_free() from file_free() in the Linux kernel be deferred, running
> as a tasklet via file_free_rcu(), rather than running kmem_cache_free()
> right from file_free() directly.
>
> I've noticed that rcu_process_callbacks() can take quite a while to run
> now that it routinely calls file_free_rcu() to run kmem_cache_free().
> This can make the cpu unavailable for 100's of usec on 1GHz machines, with
> or without preemption configured on (much of this path is non-preemptible).
>
> This can result in some unpredictable periods of fairly long cpu latency,
> such as when a thread is waiting to be woken by an interrupt handler on a
> 'now quiet' cpu. Changing file_free() to call kmem_cache_free() directly
> completely eliminates this unexpected latency.
Well, you cannot change file_free() to call kmem_cache_free() directly
without risking corruption or a crash.
See Documentation/RCU/UP.txt
Don't you notice latency issues with other RCU-protected data, like dentries?
BTW a change in 2.6.14-rc5 might give different latency results.
Eric
On Thu, Oct 20, 2005 at 05:56:12PM +0200, Eric Dumazet wrote:
> Dimitri Sivanich wrote:
> >Just bringing up a latency issue I've noticed recently.
> >
> >In or around 2.6.14-rc4 some changes were made to have the call to
> >kmem_cache_free() from file_free() in the Linux kernel be deferred, running
> >as a tasklet via file_free_rcu(), rather than running kmem_cache_free()
> >right from file_free() directly.
> >
> >I've noticed that rcu_process_callbacks() can take quite a while to run
> >now that it routinely calls file_free_rcu() to run kmem_cache_free().
> >This can make the cpu unavailable for 100's of usec on 1GHz machines, with
> >or without preemption configured on (much of this path is non-preemptible).
> >
> >This can result in some unpredictable periods of fairly long cpu latency,
> >such as when a thread is waiting to be woken by an interrupt handler on a
> >'now quiet' cpu. Changing file_free() to call kmem_cache_free() directly
> >completely eliminates this unexpected latency.
>
> Well, you cannot change file_free() to call kmem_cache_free() directly
> without risking corruption or a crash.
>
> See Documentation/RCU/UP.txt
OK, I'll have to look at this more closely. I ran across this as a
substantial change between this kernel and earlier ones, and tested
against the original file_free()->kmem_cache_free() code to confirm that this
change alone was the cause (in the circumstance I describe below).
>
> Don't you notice latency issues with other RCU-protected data, like dentries?
No, but here's the circumstance under which I notice this:
I'm running on a single CPU of a 4-CPU SMP system. When I hit this, I've
written some file data and am now sleeping, waiting to be woken up. No other
threads are running on that CPU apart from a few kernel threads, so all is
fairly quiet.
With the simple one-line change (file_free() calling kmem_cache_free()
directly again), I'm always woken up very quickly. Too bad we cannot go back
to that with the RCU changes in place.
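
For anyone who wants to look for a similar effect, here is a minimal userspace sketch of a wakeup-latency probe. It is only an approximation of the setup described above: it sleeps on a timer deadline rather than waiting on a device interrupt, and the function names are made up. It assumes Linux with POSIX clock_gettime() and clock_nanosleep(), and is best run pinned to the otherwise quiet CPU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Nanoseconds from a to b (b - a). */
static int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * NSEC_PER_SEC
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/*
 * Sleep until an absolute deadline sleep_ns from now and return how many
 * nanoseconds past that deadline the thread actually resumed.
 * Returns -1 if a clock call fails.
 */
int64_t wakeup_overshoot_ns(int64_t sleep_ns)
{
    struct timespec deadline, now;
    int rc;

    assert(sleep_ns >= 0);

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return -1;
    deadline.tv_sec += sleep_ns / NSEC_PER_SEC;
    deadline.tv_nsec += sleep_ns % NSEC_PER_SEC;
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec++;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }

    /* clock_nanosleep() returns the error number directly; retry on signals. */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    } while (rc == EINTR);
    if (rc != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;
    return ts_diff_ns(&deadline, &now);
}

/* Worst overshoot seen over a number of sleeps, or -1 on error. */
int64_t worst_overshoot_ns(int iterations, int64_t sleep_ns)
{
    int64_t worst = 0;

    for (int i = 0; i < iterations; i++) {
        int64_t over = wakeup_overshoot_ns(sleep_ns);
        if (over < 0)
            return -1;
        if (over > worst)
            worst = over;
    }
    return worst;
}
```

Calling worst_overshoot_ns() with a few thousand iterations while another task closes files in bulk would show whether the worst case moves; the numbers also include ordinary timer and scheduler overhead, so they are only meaningful when compared between two kernels on the same machine.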
>
> BTW a change in 2.6.14-rc5 might give different latency results.
I'll look at this as soon as I get a chance.
>
> Eric