Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755334AbcJQTna (ORCPT ); Mon, 17 Oct 2016 15:43:30 -0400 Received: from mail-it0-f51.google.com ([209.85.214.51]:38801 "EHLO mail-it0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932682AbcJQTnV (ORCPT ); Mon, 17 Oct 2016 15:43:21 -0400 MIME-Version: 1.0 In-Reply-To: <20161017183806.GG5601@arm.com> References: <20161017183806.GG5601@arm.com> From: Ard Biesheuvel Date: Mon, 17 Oct 2016 20:43:19 +0100 Message-ID: Subject: Re: Build failure with v4.9-rc1 and GCC trunk -- compiler weirdness To: Will Deacon Cc: "linux-arm-kernel@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Linus Torvalds , Peter Zijlstra , Ingo Molnar , james.greenhalgh@arm.com, Gregory CLEMENT , Stephen Boyd Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3970 Lines: 122 On 17 October 2016 at 19:38, Will Deacon wrote: > Hi all, > > I'm seeing an arm64 build failure with -rc1 and GCC trunk, although I > believe that the new compiler behaviour at the heart of the problem > has the potential to affect other architectures and other pieces of > kernel code relying on dead-code elimination to remove deliberately > undefined functions. > > The failure looks like: > > | drivers/built-in.o: In function `armada_3700_add_composite_clk': > | > | linux/drivers/clk/mvebu/armada-37xx-periph.c:351: > | undefined reference to `____ilog2_NaN' > | > | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:(.text+0xc72e0): > | relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol > | `____ilog2_NaN' > | > | make: *** [vmlinux] Error 1 > > and if we look at the source for armada_3700_add_composite_clk, we see > that this is caused by: > > int table_size = 0; > > rate->reg = reg + (u64)rate->reg; > for (clkt = rate->table; clkt->div; clkt++) > table_size++; > rate->width = order_base_2(table_size); > > order_base_2 calls ilog2, which has the ____ilog2_NaN call: > > #define ilog2(n) \ > ( \ > __builtin_constant_p(n) ? ( \ > (n) < 1 ? ____ilog2_NaN() : \ > > This is because we're in a curious case where GCC has emitted a > special-cased version of armada_3700_add_composite_clk, with table_size > effectively constant-folded as 0. Whilst we shouldn't see this in a > non-buggy kernel (hence the deliberate call to the undefined function > ____ilog2_NaN), it means that the final link fails because we have a > ____ilog2_NaN in the code, with a runtime check on table_size. > This is indeed an unintended side effect, but I would not call it weird behaviour at all. The code in its current form does not handle the case where it could end up passing 0 into order_base_2(), and we simply need to handle that case. If order_base_2() is not defined for input 0, it should BUG() in that case, and the associated __builtin_unreachable() should prevent the special version from being emitted. If order_base_2() is defined for input 0, it should not invoke ilog2() with that argument, and the problem should go away as well. -- Ard. > In other words, __builtin_constant_p appears to be weaker than we've > been assuming. Talking to the compiler guys here, this is due to the > "jump-threading" optimisation pass, so the patch below disables that. > > A simpler example is: > > int foo(); > int bar(); > > int count(int *argc) > { > int table_size = 0; > > for (; *argc; argc++) > table_size++; > > if (__builtin_constant_p(table_size)) > return table_size == 0 ? foo() : bar(); > > return bar(); > } > > which compiles to: > > count: > ldr w0, [x0] > cbz w0, .L4 > b bar > .p2align 3 > .L4: > b foo > > and, with the "optimisation" disabled: > > count: > b bar > > Thoughts? It feels awfully fragile disabling passes like this, but with > GCC transforming the code like this, I can't immediately think of a way > to preserve the intended behaviour of the code. > > Will > > --->8 > > diff --git a/Makefile b/Makefile > index 512e47a53e9a..750873d6d11e 100644 > --- a/Makefile > +++ b/Makefile > @@ -641,6 +641,11 @@ endif > # Tell gcc to never replace conditional load with a non-conditional one > KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) > > +# Stop gcc from converting switches into a form that defeats dead code > +# elimination and can subsequently lead to calls to intentionally > +# undefined functions appearing in the final link. > +KBUILD_CFLAGS += $(call cc-option,--param=max-fsm-thread-path-insns=1) > + > include scripts/Makefile.gcc-plugins > > ifdef CONFIG_READABLE_ASM