Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750890AbbDFE06 (ORCPT ); Mon, 6 Apr 2015 00:26:58 -0400 Received: from hofr.at ([212.69.189.236]:47922 "EHLO mail.hofr.at" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750776AbbDFE05 (ORCPT ); Mon, 6 Apr 2015 00:26:57 -0400 Date: Mon, 6 Apr 2015 06:26:54 +0200 From: Nicholas Mc Guire To: Joe Perches Cc: Nicholas Mc Guire , Michal Marek , Masahiro Yamada , Sam Ravnborg , Thomas Gleixner , "H. Peter Alvin" , John Stultz , Andrew Hunter , Paul Turner , linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/3] time: allow gcc to fold constants when using msecs_to_jiffies Message-ID: <20150406042654.GA28155@opentech.at> References: <1428218636-3780-1-git-send-email-hofrat@osadl.org> <1428218636-3780-3-git-send-email-hofrat@osadl.org> <1428278627.2775.75.camel@perches.com> <20150406010025.GA5956@opentech.at> <1428286502.2775.92.camel@perches.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1428286502.2775.92.camel@perches.com> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5675 Lines: 141 On Sun, 05 Apr 2015, Joe Perches wrote: > On Mon, 2015-04-06 at 03:00 +0200, Nicholas Mc Guire wrote: > > On Sun, 05 Apr 2015, Joe Perches wrote: > > > > > On Sun, 2015-04-05 at 09:23 +0200, Nicholas Mc Guire wrote: > > > > The majority of the msecs_to_jiffies() users in the kernel are passing in > > > > constants which would allow gcc to do constant folding by checking with > > > > __builtin_constant_p() in msecs_to_jiffies(). > > > > > > > > The original msecs_to_jiffies is renamed to __msecs_to_jiffies and aside > > > > from the removal of the check for negative values being moved out, is > > > > unaltered. > > > > > > At least for gcc 4.9, this doesn't allow the compiler > > > to optimize / precalculation msecs_to_jiffies calls > > > with a constant. > > > > > > This does: (on top of your patch x86-64 defconfig) > > > > > > $ size vmlinux.o.* > > > text data bss dec hex filename > > > 11770523 1505971 1018454 14294948 da1fa4 vmlinux.o.next-b0a12fb5bc8 > > > 11770530 1505971 1018454 14294955 da1fab vmlinux.o.next-b0a12fb5bc8-inline > > > 11768734 1505971 1018454 14293159 da18a7 vmlinux.o.next-b0a12fb5bc8-macro > > > > > > I think this should still move the if (m) < 0 back into the > > > original __msecs_to_jiffies function. > > > > > > > could you check if you can reproduce the results below ? > > my assumption was that gcc would always optimize out an > > if(CONST < 0) return CONST; reducing it to the return CONST; > > only and thus this should not make any difference but Im not > > that familiar with gcc. > > > > gcc versions here are: > > for x86 gcc version 4.7.2 (Debian 4.7.2-5) > > for powerpc it is a gcc version 4.9.2 (crosstool-NG 1.20.0) > > for arm gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) > > > > Procedure used: > > root@debian:~/linux-next# make distclean > > root@debian:~/linux-next# make defconfig > > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.lst > > root@debian:~/linux-next# make drivers/net/wireless/p54/p54usb.s > > > > same setup in unpatched /usr/src/linux-next/ > > > > e.g: > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.c > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > > > So both calls are constants and should be optimized out if it works as > > expected. > > > > without the patch applied: > > > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > > call msecs_to_jiffies # > > call msecs_to_jiffies # > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > > timeout = jiffies + msecs_to_jiffies(1000); > > e19: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > fd8: R_X86_64_PC32 msecs_to_jiffies+0xfffffffffffffffc > > timeout = jiffies + msecs_to_jiffies(1000); > > > > > > with the patch applied this then gives me: > > > > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.s > > root@debian:~/linux-next# grep msecs_to_jiffies drivers/net/wireless/p54/p54usb.lst > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > timeout = jiffies + msecs_to_jiffies(1000); > > > > Conversely in kernel/sched/core.c the msecs_to_jiffies is not a constant > > and the result is that it calls __msecs_to_jiffies > > > > patched: > > root@debian:~/linux-next# grep msecs_to_jiffies kernel/sched/core.s > > call __msecs_to_jiffies # > > > > unpatched: > > root@debian:/usr/src/linux-next# grep msecs_to_jiffies kernel/sched/core.s > > call msecs_to_jiffies # > > > > > > Could you check if you get these results for this test-case ? > > If this really were compiler dependant that would be very bad. > > Hi Nicholas. > > Your inline version has not worked with any of > x86-64 gcc 4.4, 4.6, 4.7, or 4.9 > > I suggest you add some lines to > lib/test_module.c/test_module_init like: > > unsigned int m; > > for (m = 10; m < 200; m += 10) > pr_info("msecs_to_jiffies(%u) is %lu\n", > m, msecs_to_jiffies(m)); > > pr_info("msecs_to_jiffies(%u) is %lu\n", > 10, msecs_to_jiffies(10)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 100, msecs_to_jiffies(100)); > pr_info("msecs_to_jiffies(%u) is %lu\n", > 1000, msecs_to_jiffies(1000)); > > Then it's pretty easy to look at the assembly/.lst file > > Your inline function doesn't allow gcc to precompute > the msecs_to_jiffies value. The macro one does for all > those gcc versions. > > Try it and look at the generated .lst files with and > without the patch I sent. > will do that - thanks ! Managed to fool my self - the difference in the .lst/.s files is simply due to chaning msecs_to_jiffies being inline (which it was not) and thus it "vanished" - kind of stupid - sorry back to gcc manual first - need to understand __builtin_constant_p better and the constraints - from all that I understood it should be doable both as macro and inline. thx! hofrat -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/