are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm.git slub-linus
(includes the cmpxchg_local fastpath since the cmpxchg_local work
by Mathieu is in now, and the non atomic unlock by Nick. Verified that
this is not doing any harm after some other patches had been removed.
cmpxchg_local fastpath was stripped of support for CONFIG_PREEMPT since
that uglified the code and did not seem to work right. We will be
able to handle preempt much better in the future with some upcoming
patches)
Christoph Lameter (4):
SLUB: Deal with annoying gcc warning on kfree()
SLUB: Use unique end pointer for each slab page.
SLUB: Alternate fast paths using cmpxchg_local
SLUB: Support for performance statistics
Ingo Molnar (1):
SLUB: fix checkpatch warnings
Nick Piggin (1):
Use non atomic unlock
Documentation/vm/slabinfo.c | 149 ++++++++++++++++++--
arch/x86/Kconfig | 4 +
include/linux/mm_types.h | 5 +-
include/linux/slub_def.h | 23 +++
lib/Kconfig.debug | 13 ++
mm/slub.c | 326 ++++++++++++++++++++++++++++++++++++-------
6 files changed, 457 insertions(+), 63 deletions(-)
On Friday 08 February 2008 13:13, Christoph Lameter wrote:
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm.git slub-linus
>
> (includes the cmpxchg_local fastpath since the cmpxchg_local work
> by Mathieu is in now, and the non atomic unlock by Nick. Verified that
> this is not doing any harm after some other patches had been removed.
Ah, good. I think it is always a good thing to be able to remove atomics.
They place quite a bit of burden on the CPU, especially on x86, where they
also have implicit memory ordering semantics (although x86 can speculatively
get around much of the problem, it's obviously worse than no restriction).
Even if perhaps some cache coherency or timing quirk makes the non-atomic
version slower (all else being equal), then I'd still say that the non
atomic version should be preferred.
Thanks,
Nick
Nick Piggin wrote:
> On Friday 08 February 2008 13:13, Christoph Lameter wrote:
>> are available in the git repository at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm.git slub-linus
>>
>> (includes the cmpxchg_local fastpath since the cmpxchg_local work
>> by Mathieu is in now, and the non atomic unlock by Nick. Verified that
>> this is not doing any harm after some other patches had been removed.
>
> Ah, good. I think it is always a good thing to be able to remove atomics.
> They place quite a bit of burden on the CPU, especially x86 where it also
> has implicit memory ordering semantics (although x86 can speculatively
> get around much of the problem, it's obviously worse than no restriction)
>
> Even if perhaps some cache coherency or timing quirk makes the non-atomic
> version slower (all else being equal), then I'd still say that the non
> atomic version should be preferred.
>
What about IRQ masking, then?
Many CPUs pay a high cost for a cli/sti pair...
And the SLAB/SLUB allocators, even if only used from process context, want to
disable/re-enable interrupts...
I understand that kmalloc() wants generic pools, but dedicated pools could
avoid this cli/sti.
On Friday 08 February 2008 18:29, Eric Dumazet wrote:
> Nick Piggin wrote:
> > On Friday 08 February 2008 13:13, Christoph Lameter wrote:
> >> are available in the git repository at:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm.git slub-linus
> >>
> >> (includes the cmpxchg_local fastpath since the cmpxchg_local work
> >> by Mathieu is in now, and the non atomic unlock by Nick. Verified that
> >> this is not doing any harm after some other patches had been removed.
> >
> > Ah, good. I think it is always a good thing to be able to remove atomics.
> > They place quite a bit of burden on the CPU, especially x86 where it also
> > has implicit memory ordering semantics (although x86 can speculatively
> > get around much of the problem, it's obviously worse than no restriction)
> >
> > Even if perhaps some cache coherency or timing quirk makes the non-atomic
> > version slower (all else being equal), then I'd still say that the non
> > atomic version should be preferred.
>
> What about IRQ masking then ?
I really did mean all else being equal, e.g. "clear_bit" vs "__clear_bit".
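To make the "clear_bit" vs "__clear_bit" comparison concrete, here is a minimal user-space sketch in C11. It is not the kernel's implementation: the function names and the SKETCH_BITS_PER_LONG macro are hypothetical stand-ins, and C11 atomics are used only as an approximation of the kernel's atomic bitops.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical helper for this sketch; not a kernel macro. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Atomic variant (stand-in for clear_bit): a single indivisible
 * read-modify-write, safe when other CPUs may update the same word.
 */
static void sketch_clear_bit_atomic(unsigned int nr, _Atomic unsigned long *word)
{
	assert(nr < SKETCH_BITS_PER_LONG);
	atomic_fetch_and(word, ~(1UL << nr));
}

/*
 * Non-atomic variant (stand-in for __clear_bit): plain load, mask, store.
 * Only correct when the caller already has exclusive access to the word,
 * for example because it holds the lock that protects it.
 */
static void sketch_clear_bit_plain(unsigned int nr, unsigned long *word)
{
	assert(nr < SKETCH_BITS_PER_LONG);
	*word &= ~(1UL << nr);
}
```

Both functions leave the same value in the word. The difference is that the atomic form typically costs a locked instruction on x86, while the plain form does not, which is the "all else being equal" case described above.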
> Many CPU pay high cost for cli/sti pair...
True, and many UP architectures have to implement atomic operations
with cli/sti pairs... so those are more reasons to use non-atomics.
> And SLAB/SLUB allocators, even if only used from process context, want to
> disable/re-enable interrupts...
>
> I understand kmalloc() want generic pools, but dedicated pools could avoid
> this cli/sti
Sure, I guess that would be possible. I've kind of toyed with doing
some cli/sti mitigation in the page allocator, but in that case I
found that it wasn't a win outside microbenchmarks: the cache
characteristics of the returned pages are just as important if not
more so than cli/sti costs (although that balance would change
depending on the CPU and workload I guess).
For SLUB, yes, you could do it with fewer downsides using process context
pools.
Is it possible instead for architectures where cli/sti is so expensive
to change their lowest level of irq handling to do this by setting and
clearing a soft flag somewhere? That's what I'd rather see, if possible.
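The soft flag idea can be sketched as a small user-space simulation in C. Everything here is an assumption made for illustration: the struct, the function names, and the single "pending" bit are hypothetical, and a real architecture would also have to mask the hardware interrupt source when it defers an interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for this sketch. */
struct soft_irq_state {
	bool soft_disabled;	/* set instead of executing cli */
	bool pending;		/* an interrupt arrived while soft-disabled */
	int handled;		/* interrupts actually handled so far */
};

/* "Disable interrupts" becomes a plain store; no cli is executed. */
static void soft_irq_disable(struct soft_irq_state *s)
{
	s->soft_disabled = true;
}

/* Stand-in for the lowest-level interrupt entry point. */
static void soft_irq_arrive(struct soft_irq_state *s)
{
	if (s->soft_disabled) {
		s->pending = true;	/* defer the work until re-enable */
		return;
	}
	assert(!s->pending);	/* nothing can be deferred while enabled */
	s->handled++;
}

/* "Enable interrupts" clears the flag and replays anything deferred. */
static void soft_irq_enable(struct soft_irq_state *s)
{
	s->soft_disabled = false;
	if (s->pending) {
		s->pending = false;
		s->handled++;
	}
}
```

In this sketch the common case, where no interrupt arrives inside the critical section, costs only two stores. Note that a single pending bit collapses several deferred interrupts into one replay; that simplification belongs to the sketch, not to any particular architecture.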
On Fri, 8 Feb 2008, Eric Dumazet wrote:
> And SLAB/SLUB allocators, even if only used from process context, want to
> disable/re-enable interrupts...
Not any more... The new fastpath does allow avoiding interrupt
enable/disable, and we will hopefully be able to increase the scope of that
over time.
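As a rough illustration of how a compare-and-exchange fastpath can replace an interrupt disable/enable pair, here is a user-space sketch in C11. It is not the code from mm/slub.c: the struct and function names are hypothetical, and C11's atomic_compare_exchange_weak is a stronger, cross-CPU operation standing in for cmpxchg_local, which only has to be atomic with respect to the local CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical free object: its first word links to the next free object. */
struct free_obj {
	struct free_obj *next;
};

/*
 * Pop the head of the freelist. Instead of disabling interrupts around the
 * update, read the head, compute the new head, and install it only if the
 * head has not changed in the meantime.
 */
static struct free_obj *sketch_fastpath_alloc(_Atomic(struct free_obj *) *head)
{
	struct free_obj *old = atomic_load(head);

	while (old != NULL) {
		/* On failure, old is refreshed with the current head; retry. */
		if (atomic_compare_exchange_weak(head, &old, old->next))
			return old;
	}
	return NULL;	/* empty: a real allocator would take a slow path */
}

/* Push an object back onto the freelist with the same retry pattern. */
static void sketch_fastpath_free(_Atomic(struct free_obj *) *head,
				 struct free_obj *obj)
{
	struct free_obj *old = atomic_load(head);

	assert(obj != NULL);
	do {
		obj->next = old;
	} while (!atomic_compare_exchange_weak(head, &old, obj));
}
```

If an interrupt (or, in this user-space analogue, another thread) changes the head between the load and the exchange, the exchange fails and the loop retries with the new head. The sketch ignores details such as ABA hazards and the slow path.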
Eric Dumazet writes:
>
> What about IRQ masking then ?
>
> Many CPU pay high cost for cli/sti pair...
Many? In the x86 world only P4. On the other cores cli/sti (and even
pushf ; cli ; popf) is reasonably fast.
>
> And SLAB/SLUB allocators, even if only used from process context, want
> to disable/re-enable interrupts...
>
> I understand kmalloc() want generic pools, but dedicated pools could
> avoid this cli/sti
While there are a lot of P4s around they are obsolete by now and I would
advise against major redesigns for tuning obsolete CPUs.
-Andi
Christoph Lameter wrote:
> On Fri, 8 Feb 2008, Eric Dumazet wrote:
>
>> And SLAB/SLUB allocators, even if only used from process context, want to
>> disable/re-enable interrupts...
>
> Not any more..... The new fastpath does allow avoiding interrupt
> enable/disable and we will be hopefully able to increase the scope of that
> over time.
>
>
Oh, I missed this new SLUB_FASTPATH stuff (not yet in net-2.6), thanks Christoph!