> > Do you have benchmarks or something to show that this is actually a
> > _significant_ problem?
>
> you need benchmarks to tell that pure per-IRQ stacks are bad for SMP
> performance?

No. I asked how _significant_ the difference is, as compared to the rest of
the accesses it makes.

> per-IRQ+per-CPU and pure per-CPU IRQ stacks should perform roughly equally
> well on SMP - with per-CPU IRQ stacks having lower runtime setup cost.

There are problems with purely per-CPU stacks if you run with interrupts
enabled. You theoretically ought to have a stack big enough to allow all
possible interrupts to be nested on one CPU.
And having non-uniform stack sizes of course introduces other
problems... notably the fact that you can no longer locate the thread_info
struct by means of AND-ing the stack pointer.
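
For illustration, the AND-ing trick can be sketched in user-space C roughly
as follows. This is only a sketch under stated assumptions: THREAD_SIZE, the
one-field struct thread_info, and the helpers thread_info_from_sp() and
alloc_stack() are hypothetical stand-ins, not the kernel's real definitions.
It assumes every stack is exactly THREAD_SIZE bytes, aligned to THREAD_SIZE,
with the struct stored at the base.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed for illustration: 8 KiB stacks, each aligned to its own size. */
#define THREAD_SIZE 8192UL

/* Hypothetical stand-in for the kernel's thread_info. */
struct thread_info {
    int cpu;
};

/* Clear the low bits of any address inside the stack to get its base,
 * which is where this sketch keeps the thread_info. */
static struct thread_info *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

/* Allocate one THREAD_SIZE-aligned stack and fill in the struct at its
 * base. Returns NULL if the allocation fails. */
static void *alloc_stack(int cpu)
{
    struct thread_info *ti = aligned_alloc(THREAD_SIZE, THREAD_SIZE);

    if (ti)
        ti->cpu = cpu;
    return ti;
}
```

Any address inside such a stack masks back to the same base. If one stack
were, say, twice THREAD_SIZE, an address in its upper half would mask to the
middle of the stack rather than to the base, which is the problem with
non-uniform sizes.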
> there's a difference between bouncing 1-2 cachelines and bouncing a *full,
> dirtied stack*. The irq_desc[] bouncing is pretty much unavoidable (IRQs do
> need some global state) - the stack bouncing is just plain stupid and
> perfectly avoidable.

I wonder if it might be possible to invalidate just that bit of the
cache... though I suspect that's not worth it, even if it is.

David