Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751462AbdILQdr (ORCPT ); Tue, 12 Sep 2017 12:33:47 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35110 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751054AbdILQdo (ORCPT ); Tue, 12 Sep 2017 12:33:44 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 965C33680B Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=pbonzini@redhat.com Subject: Re: "KVM: x86: generalize guest_cpuid_has_ helpers" breaks clang To: Dmitry Vyukov Cc: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , David Hildenbrand , LKML , KVM list , llvmlinux@lists.linuxfoundation.org, Alexander Potapenko , andreyknvl , Michael Davidson , Greg Hackmann , Nick Desaulniers References: <20170912151851.GA24313@flask> <5900628c-222c-fa71-ae11-00ce611c56dd@redhat.com> From: Paolo Bonzini Message-ID: <74260214-ade2-48d8-9b18-d00e31b11421@redhat.com> Date: Tue, 12 Sep 2017 18:33:24 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Tue, 12 Sep 2017 16:33:44 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3855 Lines: 95 On 12/09/2017 18:16, Dmitry Vyukov wrote: > On Tue, Sep 12, 2017 at 6:03 PM, Paolo Bonzini wrote: >> On 12/09/2017 17:54, Dmitry Vyukov wrote: >>>> I guess clang still eliminates dead branches. Clang optimizer does >>>> know that these are constant, it just does not allow build >>>> success/failure nor runtime behavior depend on optimization level and >>>> compiler version. I.e. with gcc you can get build failure with only >>>> some compiler flags and/or compiler versions. Clang gives stable >>>> result. But the optimizer does use constant propagation, etc during >>>> optimization. >> >> I can reproduce it: >> >> $ cat f.c >> int bad_code(); >> >> static inline void __attribute__((always_inline)) f(int x) >> { >> if (!__builtin_constant_p(x)) >> bad_code(); >> } >> >> int main() >> { >> f(0); >> f(1); >> f(100); >> } >> >> $ clang --version >> clang version 4.0.0 (tags/RELEASE_400/final) >> $ clang f.c -O2 -c -o f.o >> $ nm f.o >> U bad_code >> 0000000000000000 T main >> >> $ gcc f.c -O2 -c -o f.o >> $ nm f.o >> 0000000000000000 T main >> >> ... but I don't know, it seems very weird. The purpose of >> __builtin_constant_p is to be resolved only relatively late in the >> optimization pipeline, and it has been like this for at least 15 years >> in GCC. >> >> The docs say what to expect: >> >> You may use this built-in function in either a macro or an inline >> function. However, if you use it in an inlined function and pass an >> argument of the function as the argument to the built-in, GCC never >> returns 1 when you call the inline function with a string constant or >> compound literal (see Compound Literals) and does not return 1 when >> you pass a constant numeric value to the inline function **unless you >> specify the -O option**. >> >> (emphasis mine). > > > Yes, I know. This difference was surprising for me and lots of other > people as well. But this is a fundamental position for clang and is > related to some implementation choices. Namely, C/C++ frontend needs > to know values of compile-time const expressions in order to verify > correctness and generate middle-end representation. But for gcc's > meaning of __builtin_constant_p, its value only becomes known deep > inside of middle-end. Which kinda creates a cycle. In gcc it's all > somehow mixed together (front-end/middle-end) and somehow works. Can't > possibly work for clang with strict separation between front-end and > middle-end. This is nonsense, GCC is also separating front-end and middle-end. The front-end only ever produces a 0 value for __builtin_constant_p if an integer constant expression is syntactically required. When entering the middle-end a __builtin_constant_p with non-constant argument is lowered to a builtin function when optimization is on, or 0 when optimization is off. The middle-end knows about __builtin_constant_p and can fold it to 1 when the argument is a constant. At some point, GCC decides it's had enough and changes all remaining calls to return 0. There's no reason why LLVM couldn't have such a builtin. > I proposed to introduce another builtin that returns a value that is > constant from optimizer point of view (e.g. it can eliminate dead code > on branches), but is not constant from language/front-end point of > view (e.g. you can't declare a stack array using the value as size). > It should do in such cases and should be implementable in clang. But > we don't have it yet, and again it's not __builtin_constant_p, because > gcc's __builtin_constant_p returns a compile-time constant. I think this has to be fixed at the include/linux/ level. I'm okay with warning instead of erroring, so maybe add WARN_IF_NONCONSTANT() and make it do nothing (or live with the warning) on clang? Paolo