MIME-Version: 1.0
In-Reply-To: <551EC5CD.3050901@redhat.com>
References: <1428059634-11782-1-git-send-email-dvlasenk@redhat.com>
	<20150403140336.GA16422@gmail.com>
	<551EC5CD.3050901@redhat.com>
Date: Fri, 3 Apr 2015 11:08:58 -0700
Message-ID: <CA+55aFxHSCUmsWZR4ey_NfZep-h5oyEoUL15GamQt9v17znyvQ@mail.gmail.com>
Subject: Re: [PATCH] x86/asm/entry/64: pack interrupt dispatch table tighter
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
        Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
        Andy Lutomirski <luto@amacapital.net>, Oleg Nesterov <oleg@redhat.com>,
        Frederic Weisbecker <fweisbec@gmail.com>,
        Alexei Starovoitov <ast@plumgrid.com>, Will Drewry <wad@chromium.org>,
        Kees Cook <keescook@chromium.org>,
        "the arch/x86 maintainers" <x86@kernel.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3036
Lines: 84

On Fri, Apr 3, 2015 at 9:54 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
> How about this version?
> It still isn't a star of readability,
> but the structure of the 32-byte code block is more visible now...

Do we really even want to be this clever in the first place?

The thing is, when we take an interrupt:

 (a) the L1 I$ is always cold

 (b) the instruction decoder has never had time to run ahead

 (c) there are usually not that many different interrupts anyway, even
under load (ie you'd have maybe disk and networking)

 (d) we intentionally spread out the different interrupt vector numbers

 (e) the 32-byte block thing is questionable: most older
micro-architectures fetch in 16-byte blocks iirc.

So what this tells me is that:

 - (a+b) the jump-to-jump is likely fairly expensive, because even
though they are in the same cacheline, the front end hasn't gotten
ahead of anything, so there's no hiding any front end pipeline
hiccups.

 - (c+d) there is likely very little advantage to trying to "pack"
things in cachelines

 - (d+e) the 7-instructions-in-one-32-byte-block doesn't really sound
like all that big of a win, and it does cause a 16-byte split for some
interrupts.

In other words, I'd suggest that we just use a simple unconditional
5-byte branch instead. Add the two-byte "push" instruction, and you have 7
bytes per interrupt. Align those 7 bytes up to 8, and none of them ever
cross a 16-byte boundary.
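
To sanity-check that arithmetic, here is a small Python sketch (not
kernel code). It assumes the table base is at least 16-byte aligned,
that each stub is a 2-byte push plus a 5-byte jmp, and that the fetch
block is 16 bytes. All names are made up for illustration.

```python
STUB_BYTES = 2 + 5   # 2-byte push imm8 + 5-byte jmp rel32
SLOT_BYTES = 8       # each stub padded up to the next 8-byte boundary
FETCH_BYTES = 16     # assumed instruction fetch block size


def crosses_fetch_block(start, length=STUB_BYTES, block=FETCH_BYTES):
    """Return True if bytes [start, start + length) span two fetch blocks."""
    return start // block != (start + length - 1) // block


def stubs_crossing(n_stubs, slot=SLOT_BYTES):
    """Indices of stubs whose code bytes straddle a fetch-block boundary.

    Offsets are relative to a table base assumed to be 16-byte aligned.
    """
    return [i for i in range(n_stubs) if crosses_fetch_block(i * slot)]
```

With the default 8-byte slots, stubs_crossing() returns an empty list
for any table size. Packing the stubs back to back (slot=7) makes the
third stub, at offset 14, straddle a boundary.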

Simple, clean, and slightly bigger in memory footprint, but probably
not noticeably more so in cache footprint, simply because there
usually aren't that many active interrupts anyway.

The people who do millions of networking interrupts per second and
have network cards that steer things to many different interrupts
already try to make sure that the steering goes to different CPUs -
otherwise there wouldn't be any *point* to steering things. So that
particular case of "lots of active interrupts" doesn't have a bigger
cache footprint *either*, since any particular CPU's L1 I$ will still
only handle a few interrupts.

So you get "only" 4 interrupt cases per 32 bytes rather than 7. But is
that odd double jump and all this complexity really worth it?
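
A rough way to put numbers on the footprint trade-off, as a Python
sketch. The vector count (224) and the three "active" stub indices are
made-up assumptions, the 64-byte cacheline is assumed, and the packed
layout is approximated by the start of each 32-byte block.

```python
import math

CACHELINE = 64  # assumed L1 I$ line size in bytes


def table_bytes_packed(n_vectors, per_block=7, block=32):
    """Table size when 7 stubs share each 32-byte block."""
    return math.ceil(n_vectors / per_block) * block


def table_bytes_simple(n_vectors, slot=8):
    """Table size when every stub gets its own 8-byte slot."""
    return n_vectors * slot


def lines_touched(offsets, line=CACHELINE):
    """Number of distinct cachelines covering the given stub offsets."""
    return len({off // line for off in offsets})


# Hypothetical set of active stubs, spread out as in point (d).
active = [0, 64, 128]
packed_offsets = [(i // 7) * 32 for i in active]  # start of each block
simple_offsets = [i * 8 for i in active]
```

For 224 vectors the table grows from 1024 to 1792 bytes, but the three
spread-out active stubs touch three cachelines in either layout.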

So I really suggest just doing something stupid and straightforward
(and completely untested) like this:

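    .macro push_vector
        /* for vector <= 255 the immediate fits in a signed byte */
        pushq_cfi $(~vector+0x80)
        jmp common_interrupt
        .align 8
        /* advance so that each stub pushes its own vector */
        vector=vector+1
    .endm

    vector=FIRST_EXTERNAL_VECTOR
    .align 64
    ENTRY(irq_entries_start)
    .rept 256 /* this number does not need to be exact, just big enough */
         push_vector
    .endr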

and just be done with it.

(Of course, you have to change the code that knows about the "7
entries in 32 bytes" patterns too, but that's just going to be much
simpler now).
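
As an illustration of why that gets simpler, here is a Python sketch
of the address arithmetic for fixed 8-byte slots, plus the
~vector+0x80 encoding from the macro. The base address used in any
call and the FIRST_EXTERNAL_VECTOR value of 0x20 are assumptions, and
the function names are hypothetical, not kernel interfaces.

```python
FIRST_EXTERNAL_VECTOR = 0x20  # assumed value, for illustration only
SLOT_BYTES = 8                # one aligned stub per vector


def stub_address(base, vector, first=FIRST_EXTERNAL_VECTOR, slot=SLOT_BYTES):
    """Address of the stub for `vector` in a table of fixed-size slots."""
    return base + slot * (vector - first)


def encode_vector(vector):
    """Immediate pushed by the stub: ~vector + 0x80."""
    return ~vector + 0x80


def decode_vector(pushed):
    """Invert encode_vector()."""
    return ~(pushed - 0x80)


def fits_signed_byte(value):
    """True if `value` can be encoded as an 8-bit signed immediate."""
    return -128 <= value <= 127
```

Every vector from 0 to 255 encodes to a value in the signed byte
range, which is what keeps the push at two bytes.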

                     Linus