Date: Thu, 21 May 2015 15:28:18 +0200
From: Ingo Molnar
To: Denys Vlasenko
Cc: Linus Torvalds, Andy Lutomirski, Davidlohr Bueso, Peter Anvin,
    Linux Kernel Mailing List, Tim Chen, Borislav Petkov, Peter Zijlstra,
    "Chandramouleeswaran, Aswin", Brian Gerst, Paul McKenney,
    Thomas Gleixner, Jason Low, "linux-tip-commits@vger.kernel.org",
    Arjan van de Ven, Andrew Morton
Subject: Re: [RFC PATCH] x86/64: Optimize the effective instruction cache footprint of kernel functions
Message-ID: <20150521132818.GA544@gmail.com>
References: <20150410121808.GA19918@gmail.com> <20150517055551.GB17002@gmail.com> <20150519213820.GA31688@gmail.com> <555C7012.3040806@redhat.com>
In-Reply-To: <555C7012.3040806@redhat.com>

* Denys Vlasenko wrote:

> Can you post your .config for the test?
> If you have CONFIG_OPTIMIZE_INLINING=y in your -Os test,
> consider re-testing with it turned off.

Yes, I had CONFIG_OPTIMIZE_INLINING=y.

With that turned off, on GCC 4.9.2, I'm seeing:

 fomalhaut:~/linux/linux-____CC_OPTIMIZE_FOR_SIZE=y> size vmlinux.OPTIMIZE_INLINING\=*
     text    data     bss      dec    hex filename
 12150606 2565544 1634304 16350454 f97cf6 vmlinux.OPTIMIZE_INLINING=y
 12354814 2572520 1634304 16561638 fcb5e6 vmlinux.OPTIMIZE_INLINING=n

I.e. forcing the inlining increases the kernel size again, by about 1.7%.
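
The size growth can be re-derived from the two text-segment sizes in the `size` output above. A minimal Python sketch of that arithmetic follows; the variable and function names are illustrative and not part of the original message:

```python
# Text-segment sizes in bytes, copied from the `size` output above.
text_inlining_y = 12150606  # vmlinux.OPTIMIZE_INLINING=y
text_inlining_n = 12354814  # vmlinux.OPTIMIZE_INLINING=n


def percent_increase(base, new):
    """Return how much larger `new` is than `base`, in percent."""
    return (new - base) / base * 100.0


# Hypothetical helper names; only the two sizes come from the message.
delta_bytes = text_inlining_n - text_inlining_y
growth = percent_increase(text_inlining_y, text_inlining_n)
print(f"text grows by {delta_bytes} bytes ({growth:.1f}%)")
```

Only the `text` column is compared here, since that is the part of the image that competes for instruction cache.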

I re-ran the tests on the Intel system, and got these I$ miss rates:

 linux-falign-functions=_64-bytes:                  647,853,942  L1-icache-load-misses  ( +- 0.07% )  (100.00%)
 linux-falign-functions=_16-bytes:                  706,080,917  L1-icache-load-misses  ( +- 0.05% )  (100.00%)
 linux-CC_OPTIMIZE_FOR_SIZE=y+OPTIMIZE_INLINING=y:  921,910,808  L1-icache-load-misses  ( +- 0.05% )  (100.00%)
 linux-CC_OPTIMIZE_FOR_SIZE=y+OPTIMIZE_INLINING=n:  792,395,265  L1-icache-load-misses  ( +- 0.05% )  (100.00%)

So yeah, it got better - but the I$ miss rate is still 22.3% higher than
that of the 64-byte aligned kernel and 12.2% higher than that of the
vanilla (16-byte aligned) kernel.

Elapsed time had this original OPTIMIZE_FOR_SIZE result:

     8.531418784 seconds time elapsed   ( +- 0.19% )

this now improved to:

     7.686174880 seconds time elapsed   ( +- 0.18% )

but it's still much worse than the 64-byte aligned one:

     7.154816369 seconds time elapsed   ( +- 0.03% )

and the 16-byte aligned one:

     7.333597250 seconds time elapsed   ( +- 0.48% )

> You may be seeing this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

Yeah, disabling OPTIMIZE_INLINING made a difference - but it didn't
recover the performance loss: -Os is still 4.8% slower in this workload
than the vanilla kernel.

Thanks,

        Ingo
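
The relative differences quoted in this message can be recomputed from the raw counters. The sketch below is a minimal Python example, assuming that the "vanilla" kernel is the 16-byte aligned build (an inference from the figures, not something the message states outright). The dictionary keys and function names are illustrative only:

```python
# (L1-icache-load-misses, elapsed seconds) per kernel build, copied from
# the measurements in this message. The key names are made up for this sketch.
results = {
    "falign-64": (647_853_942, 7.154816369),
    "falign-16": (706_080_917, 7.333597250),  # assumed to be "vanilla"
    "Os+inlining=y": (921_910_808, 8.531418784),
    "Os+inlining=n": (792_395_265, 7.686174880),
}


def percent_over(value, baseline):
    """Return how far `value` exceeds `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


def compare(name, baseline):
    """Return (miss increase %, elapsed-time increase %) of `name` over `baseline`."""
    misses, secs = results[name]
    base_misses, base_secs = results[baseline]
    return percent_over(misses, base_misses), percent_over(secs, base_secs)


for baseline in ("falign-64", "falign-16"):
    miss_pct, time_pct = compare("Os+inlining=n", baseline)
    print(f"-Os (inlining=n) vs {baseline}: "
          f"{miss_pct:+.1f}% I$ misses, {time_pct:+.1f}% elapsed")
```

The comparison ignores the reported run-to-run noise (the `+-` columns), which is well below the differences between builds.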