2006-01-04 18:52:31

by Chuck Ebbert

[permalink] [raw]
Subject: [patch 2.6.15] i386: Optimize local APIC timer interrupt code

Local APIC timer interrupt happens HZ times per second on each CPU.

Optimize it for the case where profile multiplier equals one and does
not change (99+% of cases); this saves about 20 CPU cycles on Pentium II.

Also update the old multiplier immediately after noticing it changed,
while values are register-hot, saving eight bytes of stack depth.

Signed-off-by: Chuck Ebbert <[email protected]>

diff -up 2.6.15a.orig/arch/i386/kernel/apic.c 2.6.15a/arch/i386/kernel/apic.c
--- 2.6.15a.orig/arch/i386/kernel/apic.c
+++ 2.6.15a/arch/i386/kernel/apic.c
@@ -1137,7 +1137,7 @@ inline void smp_local_timer_interrupt(st
int cpu = smp_processor_id();

profile_tick(CPU_PROFILING, regs);
- if (--per_cpu(prof_counter, cpu) <= 0) {
+ if (likely(--per_cpu(prof_counter, cpu)) <= 0) {
/*
* The multiplier may have changed since the last time we got
* to this point as a result of the user writing to
@@ -1147,13 +1147,13 @@ inline void smp_local_timer_interrupt(st
* Interrupts are already masked off at this point.
*/
per_cpu(prof_counter, cpu) = per_cpu(prof_multiplier, cpu);
- if (per_cpu(prof_counter, cpu) !=
- per_cpu(prof_old_multiplier, cpu)) {
+ if (unlikely(per_cpu(prof_counter, cpu) !=
+ per_cpu(prof_old_multiplier, cpu))) {
+ per_cpu(prof_old_multiplier, cpu) =
+ per_cpu(prof_counter, cpu);
__setup_APIC_LVTT(
calibration_result/
per_cpu(prof_counter, cpu));
- per_cpu(prof_old_multiplier, cpu) =
- per_cpu(prof_counter, cpu);
}

#ifdef CONFIG_SMP
--
Chuck


2006-01-04 23:01:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch 2.6.15] i386: Optimize local APIC timer interrupt code

Chuck Ebbert <[email protected]> wrote:
>
> Local APIC timer interrupt happens HZ times per second on each CPU.
>
> Optimize it for the case where profile multiplier equals one and does
> not change (99+% of cases); this saves about 20 CPU cycles on Pentium II.
>
> Also update the old multiplier immediately after noticing it changed,
> while values are register-hot, saving eight bytes of stack depth.

The code which you're patching is cheerfully nuked by a patch in Andi's
tree:
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/no-subjiffy-profile

I don't immediately understand that patch and I don't recall seeing it
discussed - maybe I was asleep.

It removes the profile multiplier (readprofile -M). I've used that
occasionally, but can't say that I noticed much benefit from it.

What's the thinking here?

2006-01-05 00:22:44

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [patch 2.6.15] i386: Optimize local APIC timer interrupt code

In-Reply-To: <[email protected]>


Andrew Morton wrote:

> The code which you're patching is cheerfully nuked by a patch in Andi's
> tree:
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/no-subjiffy-profile
>
> I don't immediately understand that patch and I don't recall seeing it
> discussed - maybe I was asleep.
>
> It removes the profile multiplier (readprofile -M). I've used that
> occasionally, but can't say that I noticed much benefit from it.

It's probably required for the i386-timer-broadcast patch from the same
patchset. Apparently some Intel CPUs stop their local APIC timer when in
certain ACPI C-states, so that patch switches them to use an IPI broadcast
from the main timer interrupt instead. Supporting subjiffy profiling
is impossible in that case, so the code was removed.

--
Chuck

2006-01-05 06:03:24

by Willy Tarreau

[permalink] [raw]
Subject: Re: [patch 2.6.15] i386: Optimize local APIC timer interrupt code

On Wed, Jan 04, 2006 at 01:48:09PM -0500, Chuck Ebbert wrote:
> Local APIC timer interrupt happens HZ times per second on each CPU.
>
> Optimize it for the case where profile multiplier equals one and does
> not change (99+% of cases); this saves about 20 CPU cycles on Pentium II.
>
> Also update the old multiplier immediately after noticing it changed,
> while values are register-hot, saving eight bytes of stack depth.
>
> Signed-off-by: Chuck Ebbert <[email protected]>
>
> diff -up 2.6.15a.orig/arch/i386/kernel/apic.c 2.6.15a/arch/i386/kernel/apic.c
> --- 2.6.15a.orig/arch/i386/kernel/apic.c
> +++ 2.6.15a/arch/i386/kernel/apic.c
> @@ -1137,7 +1137,7 @@ inline void smp_local_timer_interrupt(st
> int cpu = smp_processor_id();
>
> profile_tick(CPU_PROFILING, regs);
> - if (--per_cpu(prof_counter, cpu) <= 0) {
> + if (likely(--per_cpu(prof_counter, cpu)) <= 0) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
are you sure about this ? it looks suspicious to me. I would have
expected something like this instead :

+ if (likely(--per_cpu(prof_counter, cpu) <= 0)) {

Willy

2006-01-05 06:41:43

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [patch 2.6.15] i386: Optimize local APIC timer interrupt code

In-Reply-To: <[email protected]>

On Thu, 5 Jan 2006 at 07:03:14 +0100, Willy Tarreau wrote:

> + if (likely(--per_cpu(prof_counter, cpu)) <= 0) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
are you sure about this ? it looks suspicious to me. I would have
expected something like this instead :

+ if (likely(--per_cpu(prof_counter, cpu) <= 0)) {


Nice catch, proving yet again the value of posting code for review!


--
Chuck
Currently reading: _Thud!_ by Terry Pratchett