Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751793AbdILReN (ORCPT ); Tue, 12 Sep 2017 13:34:13 -0400 Received: from mail-io0-f178.google.com ([209.85.223.178]:37808 "EHLO mail-io0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751681AbdILReA (ORCPT ); Tue, 12 Sep 2017 13:34:00 -0400 X-Google-Smtp-Source: AOwi7QDktF1nXvQkbbPcj9nYBuzn6SWGizvdUd+/OUyNlV4jgSbXbj+zH1PaiSwY5ZMxboLiRu+omHduGgqYUsMJPns= MIME-Version: 1.0 In-Reply-To: <74260214-ade2-48d8-9b18-d00e31b11421@redhat.com> References: <20170912151851.GA24313@flask> <5900628c-222c-fa71-ae11-00ce611c56dd@redhat.com> <74260214-ade2-48d8-9b18-d00e31b11421@redhat.com> From: Dmitry Vyukov Date: Tue, 12 Sep 2017 19:33:38 +0200 Message-ID: Subject: Re: "KVM: x86: generalize guest_cpuid_has_ helpers" breaks clang To: Paolo Bonzini Cc: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , David Hildenbrand , LKML , KVM list , llvmlinux@lists.linuxfoundation.org, Alexander Potapenko , andreyknvl , Michael Davidson , Greg Hackmann , Nick Desaulniers Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5046 Lines: 159 On Tue, Sep 12, 2017 at 6:33 PM, Paolo Bonzini wrote: > On 12/09/2017 18:16, Dmitry Vyukov wrote: >> On Tue, Sep 12, 2017 at 6:03 PM, Paolo Bonzini wrote: >>> On 12/09/2017 17:54, Dmitry Vyukov wrote: >>>>> I guess clang still eliminates dead branches. Clang optimizer does >>>>> know that these are constant, it just does not allow build >>>>> success/failure nor runtime behavior depend on optimization level and >>>>> compiler version. I.e. with gcc you can get build failure with only >>>>> some compiler flags and/or compiler versions. Clang gives stable >>>>> result. But the optimizer does use constant propagation, etc during >>>>> optimization. >>> >>> I can reproduce it: >>> >>> $ cat f.c >>> int bad_code(); >>> >>> static inline void __attribute__((always_inline)) f(int x) >>> { >>> if (!__builtin_constant_p(x)) >>> bad_code(); >>> } >>> >>> int main() >>> { >>> f(0); >>> f(1); >>> f(100); >>> } >>> >>> $ clang --version >>> clang version 4.0.0 (tags/RELEASE_400/final) >>> $ clang f.c -O2 -c -o f.o >>> $ nm f.o >>> U bad_code >>> 0000000000000000 T main >>> >>> $ gcc f.c -O2 -c -o f.o >>> $ nm f.o >>> 0000000000000000 T main >>> >>> ... but I don't know, it seems very weird. The purpose of >>> __builtin_constant_p is to be resolved only relatively late in the >>> optimization pipeline, and it has been like this for at least 15 years >>> in GCC. >>> >>> The docs say what to expect: >>> >>> You may use this built-in function in either a macro or an inline >>> function. However, if you use it in an inlined function and pass an >>> argument of the function as the argument to the built-in, GCC never >>> returns 1 when you call the inline function with a string constant or >>> compound literal (see Compound Literals) and does not return 1 when >>> you pass a constant numeric value to the inline function **unless you >>> specify the -O option**. >>> >>> (emphasis mine). >> >> >> Yes, I know. This difference was surprising for me and lots of other >> people as well. But this is a fundamental position for clang and is >> related to some implementation choices. Namely, C/C++ frontend needs >> to know values of compile-time const expressions in order to verify >> correctness and generate middle-end representation. But for gcc's >> meaning of __builtin_constant_p, its value only becomes known deep >> inside of middle-end. Which kinda creates a cycle. In gcc it's all >> somehow mixed together (front-end/middle-end) and somehow works. Can't >> possibly work for clang with strict separation between front-end and >> middle-end. > > This is nonsense, GCC is also separating front-end and middle-end. The > front-end only ever produces a 0 value for __builtin_constant_p if an > integer constant expression is syntactically required. > > When entering the middle-end a __builtin_constant_p with non-constant > argument is lowered to a builtin function when optimization is on, or 0 > when optimization is off. > > The middle-end knows about __builtin_constant_p and can fold it to 1 > when the argument is a constant. At some point, GCC decides it's had > enough and changes all remaining calls to return 0. There's no reason > why LLVM couldn't have such a builtin. That's still build breakages/behavior differences depending on optimization level and compiler version. Also these funny effects: #include void f1(int x) { char a[__builtin_constant_p(x) ? 2 : 1]; printf("%d\n", (int)sizeof(a)); } void f2(int x) { const int y = __builtin_constant_p(x) ? 2 : 1; char a[y]; printf("%d\n", (int)sizeof(a)); } void f3(int x) { char a[__builtin_constant_p(x) ? 2 : 1]; printf("%d %d\n", (int)sizeof(a), (int)__builtin_constant_p(x) ? 2 : 1); } void f4(int x) { const int y = __builtin_constant_p(x) ? 2 : 1; char a[y]; printf("%d %d\n", (int)sizeof(a), y); } int main() { f1(1); f2(1); f3(1); f4(1); return 0; } $ clang-3.9 /tmp/test.c -O2 && ./a.out 1 1 1 1 1 1 $ gcc /tmp/test.c -O2 && ./a.out 2 2 1 1 2 2 Print value -- different behavior. Print value but assign to a temp -- again different behavior. There are merits for not doing this. But I admit clang effectively breaks most practical uses of __builtin_constant_p. >> I proposed to introduce another builtin that returns a value that is >> constant from optimizer point of view (e.g. it can eliminate dead code >> on branches), but is not constant from language/front-end point of >> view (e.g. you can't declare a stack array using the value as size). >> It should do in such cases and should be implementable in clang. But >> we don't have it yet, and again it's not __builtin_constant_p, because >> gcc's __builtin_constant_p returns a compile-time constant. > > I think this has to be fixed at the include/linux/ level. I'm okay with > warning instead of erroring, so maybe add WARN_IF_NONCONSTANT() and make > it do nothing (or live with the warning) on clang? > > Paolo