Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751941AbcDPVI5 (ORCPT ); Sat, 16 Apr 2016 17:08:57 -0400
Received: from mail-qk0-f180.google.com ([209.85.220.180]:36591 "EHLO mail-qk0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751621AbcDPVIz (ORCPT ); Sat, 16 Apr 2016 17:08:55 -0400
MIME-Version: 1.0
In-Reply-To: <555DC39C.8060502@redhat.com>
References: <20150410121808.GA19918@gmail.com> <20150517055551.GB17002@gmail.com> <20150519213820.GA31688@gmail.com> <555C7C57.1070608@redhat.com> <555DC39C.8060502@redhat.com>
From: Denys Vlasenko
Date: Sat, 16 Apr 2016 23:08:34 +0200
Message-ID:
Subject: Re: [RFC PATCH] x86/64: Optimize the effective instruction cache footprint of kernel functions
To: Denys Vlasenko
Cc: Linus Torvalds , Ingo Molnar , Andy Lutomirski , Davidlohr Bueso , Peter Anvin , Linux Kernel Mailing List , Tim Chen , Borislav Petkov , Peter Zijlstra , "Chandramouleeswaran, Aswin" , Peter Zijlstra , Brian Gerst , Paul McKenney , Thomas Gleixner , Jason Low , "linux-tip-commits@vger.kernel.org" , Arjan van de Ven , Andrew Morton
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1547
Lines: 42

On Thu, May 21, 2015 at 1:38 PM, Denys Vlasenko wrote:
> On 05/20/2015 02:21 PM, Denys Vlasenko wrote:
>> So what we need is to put something like ".p2align 64,,7"
>> before every function.
>>
>> (
>> Why 7?
>>
>> defconfig vmlinux (w/o FRAME_POINTER) has 42141 functions.
>> 6923 of them have 1st insn 5 or more bytes long,
>> 5841 of them have 1st insn 6 or more bytes long,
>> 5095 of them have 1st insn 7 or more bytes long,
>> 786 of them have 1st insn 8 or more bytes long,
>> 548 of them have 1st insn 9 or more bytes long,
>> 375 of them have 1st insn 10 or more bytes long,
>> 73 of them have 1st insn 11 or more bytes long,
>> one of them has 1st insn 12 bytes long:
>> this "heroic" instruction is in local_touch_nmi()
>>    65 48 c7 05 44 3c 00 7f 00 00 00 00
>>    movq $0x0,%gs:0x7f003c44(%rip)
>>
>> Thus ensuring that at least seven first bytes do not cross
>> 64-byte boundary would cover >98% of all functions.
>> )
>>
>> gcc can't do that right now. With -falign-functions=N,
>> it emits ".p2align next_power_of_2(N),,N-1"
>>
>> We need to make it just a tiny bit smarter.
>>
>>> We'd need toolchain help to do saner alignment.
>>
>> Yep.
>> I'm going to create a gcc BZ with a feature request,
>> unless you disagree with my musings above.
>
> The BZ is here:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66240

...and now this BZ has a working patch, which implements e.g. -falign-functions=64,7
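
For anyone who wants to check the arithmetic quoted above, here is a rough Python model. It is only a sketch and not part of the gcc patch. The helper names (p2align, bytes_left_in_line) are made up for illustration, and a 64-byte cache line is assumed. Note that gas's .p2align takes the alignment as a power-of-two exponent, so the directive modelled here would be spelled ".p2align 6,,7": align to 64 bytes, but only when that costs at most 7 bytes of padding.

```python
CACHE_LINE = 64  # bytes; the cache line size assumed throughout the thread


def p2align(offset, align=CACHE_LINE, max_skip=7):
    """Model a gas-style conditional alignment.

    Pads 'offset' up to the next multiple of 'align', but only if that
    needs at most 'max_skip' padding bytes. Otherwise the offset is left
    unchanged. (Hypothetical helper, written for this illustration.)
    """
    pad = -offset % align
    return offset + pad if pad <= max_skip else offset


def bytes_left_in_line(offset, line=CACHE_LINE):
    """Number of bytes from 'offset' to the end of its cache line."""
    return line - offset % line


# Figures quoted in the thread: 42141 functions in a defconfig vmlinux,
# 786 of which start with an instruction of 8 or more bytes.
TOTAL_FUNCTIONS = 42141
FIRST_INSN_8_OR_MORE = 786
coverage = 1 - FIRST_INSN_8_OR_MORE / TOTAL_FUNCTIONS

if __name__ == "__main__":
    # Worst case over every possible starting offset within a few lines.
    worst = min(bytes_left_in_line(p2align(o)) for o in range(4 * CACHE_LINE))
    print("fewest bytes before a line boundary:", worst)
    print(f"functions whose first insn is at most 7 bytes: {coverage:.2%}")
```

Under this model a function start never lands within the last 7 bytes of a line, and the ">98%" figure follows directly from the 786-out-of-42141 count.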