Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753089AbbEKSyS (ORCPT ); Mon, 11 May 2015 14:54:18 -0400
Received: from relay3-d.mail.gandi.net ([217.70.183.195]:50312 "EHLO
        relay3-d.mail.gandi.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755892AbbEKSxx (ORCPT );
        Mon, 11 May 2015 14:53:53 -0400
X-Originating-IP: 173.246.103.110
Date: Mon, 11 May 2015 11:53:25 -0700
From: Josh Triplett
To: Denys Vlasenko
Cc: Linus Torvalds, Thomas Graf, "David S. Miller", Bart Van Assche,
        Peter Zijlstra, David Rientjes, Andrew Morton,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH] force inlining of spinlock ops
Message-ID: <20150511185324.GA3671@jtriplet-mobl1>
References: <1431367042-31475-1-git-send-email-dvlasenk@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1431367042-31475-1-git-send-email-dvlasenk@redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2629
Lines: 60

On Mon, May 11, 2015 at 07:57:22PM +0200, Denys Vlasenko wrote:
> With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously doesn't inline
> very small functions we expect to be inlined.
> In particular, with this config: http://busybox.net/~vda/kernel_config
> there are more than a thousand copies of tiny spinlock-related functions:
>
> $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn | grep ' spin'
>     473 000000000000000b t spin_unlock_irqrestore
>     292 000000000000000b t spin_unlock
>     215 000000000000000b t spin_lock
>     134 000000000000000b t spin_unlock_irq
>     130 000000000000000b t spin_unlock_bh
>     120 000000000000000b t spin_lock_irq
>     106 000000000000000b t spin_lock_bh
>
> Disassembly:
>
> ffffffff81004720 :
> ffffffff81004720:       55                      push   %rbp
> ffffffff81004721:       48 89 e5                mov    %rsp,%rbp
> ffffffff81004724:       e8 f8 4e e2 02          callq  <_raw_spin_lock>
> ffffffff81004729:       5d                      pop    %rbp
> ffffffff8100472a:       c3                      retq

Frame pointers make this even more awful, since without them this could
just become a single jmp.  (Assuming _raw_spin_lock shouldn't be inlined
too.)

> This patch fixes this via s/inline/__always_inline/ in spinlock.h.
> This decreases vmlinux by about 30k:
>
>     text     data      bss       dec     hex filename
> 82375570 22255544 20627456 125258570 7774b4a vmlinux.before
> 82335059 22255416 20627456 125217931 776ac8b vmlinux

Nice improvement.  Given that this actually makes the kernel *smaller*,
presumably in addition to faster, this forced inlining seems completely
reasonable.

> Signed-off-by: Denys Vlasenko
> Cc: Thomas Graf
> Cc: David S. Miller
> Cc: Bart Van Assche
> Cc: Peter Zijlstra
> Cc: David Rientjes
> Cc: David S. Miller
> Cc: Andrew Morton
> Cc: Linus Torvalds
> Cc: Oleg Nesterov
> Cc: Paul E. McKenney
> Cc: Ingo Molnar
> Cc: Paul E. McKenney
> CC: linux-kernel@vger.kernel.org

Reviewed-by: Josh Triplett
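
For anyone following along outside the kernel tree, here is a minimal
user-space sketch of the distinction being discussed: a plain "inline"
wrapper that gcc may still emit as a tiny out-of-line function, next to
one that is forced inline.  All toy_* names and TOY_ALWAYS_INLINE are
hypothetical stand-ins invented for illustration; this is not the
spinlock.h code, and the macro only assumes the usual GCC spelling of the
always_inline attribute.

```c
/* Hypothetical stand-in for a lock; not the kernel's spinlock_t. */
struct toy_lock {
	int locked;
};

/*
 * Assumed spelling: "inline" plus GCC's always_inline attribute.
 * A distinct name is used to avoid clashing with any libc macro.
 */
#define TOY_ALWAYS_INLINE inline __attribute__((always_inline))

/* Out-of-line workers, playing the role of _raw_spin_lock() and friends. */
static __attribute__((noinline)) void toy_raw_lock(struct toy_lock *l)
{
	l->locked = 1;
}

static __attribute__((noinline)) void toy_raw_unlock(struct toy_lock *l)
{
	l->locked = 0;
}

/*
 * Plain "inline" is only a hint: the compiler is free to emit this
 * wrapper as a separate small function in each translation unit.
 */
static inline void toy_lock_hint(struct toy_lock *l)
{
	toy_raw_lock(l);
}

/* Forced inlining: the wrapper body is expanded at every call site. */
static TOY_ALWAYS_INLINE void toy_lock_forced(struct toy_lock *l)
{
	toy_raw_lock(l);
}

static TOY_ALWAYS_INLINE void toy_unlock_forced(struct toy_lock *l)
{
	toy_raw_unlock(l);
}
```

Both wrappers behave identically; the only difference is whether a
separate copy of the wrapper can end up in the object file.  Whether gcc
actually emits one for toy_lock_hint depends on the compiler version and
flags, so inspecting the output with nm is the way to check.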