Date: Mon, 8 Oct 2018 02:31:28 -0500
From: Segher Boessenkool
To: Michael Matz
Cc: Borislav Petkov, gcc@gcc.gnu.org, Richard Biener, Nadav Amit,
    Ingo Molnar, linux-kernel@vger.kernel.org, x86@kernel.org,
    Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li,
    Greg Kroah-Hartman, "H. Peter Anvin", Jan Beulich, Josh Poimboeuf,
    Juergen Gross, Kate Stewart, Kees Cook, linux-sparse@vger.kernel.org,
    Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner,
    virtualization@lists.linux-foundation.org, Linus Torvalds,
    Chris Zankel, Max Filippov, linux-xtensa@linux-xtensa.org
Subject: Re: PROPOSAL: Extend inline asm syntax with size spec
Message-ID: <20181008073128.GL29268@gate.crashing.org>
References: <20181003213100.189959-1-namit@vmware.com>
    <20181007091805.GA30687@zn.tnic>
    <20181007132228.GJ29268@gate.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi!

On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote:
> On Sun, 7 Oct 2018, Segher Boessenkool wrote:
> > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote:
> > > this is an attempt to see whether gcc's inline asm heuristic when
> > > estimating inline asm statements' cost for better inlining can be
> > > improved.
> >
> > GCC already estimates the *size* of inline asm, and this is required
> > *for correctness*. So any workaround that works against this will only
> > end in tears.
>
> You're right and wrong. GCC can't even estimate the size of mildly
> complicated inline asms right now, so your claim of it being necessary for
> correctness can't be true in this absolute form. I know what you try to
> say, but still, consider inline asms like this:
>
>     insn1
>     .section bla
>     insn2
>     .previous
>
> or
>     invoke_asm_macro foo,bar
>
> in both cases GCCs size estimate will be wrong however you want to count
> it. This is actually the motivating example for the kernel guys, the
> games they play within their inline asms make the estimates be wildly
> wrong to a point it interacts with the inliner.

Right.
The manual says:

"""
Some targets require that GCC track the size of each instruction used
in order to generate correct code. Because the final length of the
code produced by an @code{asm} statement is only known by the
assembler, GCC must make an estimate as to how big it will be. It
does this by counting the number of instructions in the pattern of the
@code{asm} and multiplying that by the length of the longest
instruction supported by that processor. (When working out the number
of instructions, it assumes that any occurrence of a newline or of
whatever statement separator character is supported by the assembler --
typically @samp{;} --- indicates the end of an instruction.)
Normally, GCC's estimate is adequate to ensure that correct
code is generated, but it is possible to confuse the compiler if you use
pseudo instructions or assembler macros that expand into multiple real
instructions, or if you use assembler directives that expand to more
space in the object file than is needed for a single instruction. If
this happens then the assembler may produce a diagnostic saying that
a label is unreachable.
"""

It *is* necessary for correctness, except you can do things that can
confuse the compiler and then you are on your own anyway.

> > So I guess the real issue is that the inline asm size estimate for x86
> > isn't very good (since it has to be pessimistic, and x86 insns can be
> > huge)?
>
> No, see above, even if we were to improve the size estimates (e.g. based
> on some average instruction size) the kernel examples would still be off
> because they switch sections back and forth, use asm macros and computed
> .fill directives and maybe further stuff. GCC will never be able to
> accurately calculate these sizes

What *is* such a size, anyway? If it can be spread over multiple sections
(some of which support section merging), and you can have huge alignments,
etc.
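For illustration, the estimate that the manual text above describes can be
sketched roughly as follows. This is not GCC's actual code; the function
names are made up for this example, and ';' is assumed to be the only
statement separator the assembler supports.

```c
/* Hypothetical sketch of the estimate described in the manual text:
   one "instruction" per newline or statement separator, plus one for
   the first line.  ';' as the separator is an assumption.  */
int count_asm_insns(const char *templ)
{
    if (*templ == '\0')
        return 0;

    int count = 1;
    for (; *templ != '\0'; templ++)
        if (*templ == '\n' || *templ == ';')
            count++;
    return count;
}

/* Estimated size = counted instructions * longest instruction length.
   max_insn_len is supplied by the caller; what a given target really
   uses is not something this sketch knows.  */
int estimate_asm_size(const char *templ, int max_insn_len)
{
    return count_asm_insns(templ) * max_insn_len;
}
```

With the four-line template quoted earlier ("insn1\n.section bla\ninsn2\n.previous")
and an assumed longest instruction of 15 bytes, this yields 4 * 15 = 60,
even though only two of the four lines are instructions and one of those
lands in a different section.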
What is needed here is not knowing the maximum size of the binary
output (however you want to define that), but some way for the compiler
to understand how bad it is to inline some assembler. Maybe manual
direction, maybe just the current heuristics can be tweaked a bit, maybe
we need to invent some attribute or two.

> (without an built-in assembler which hopefully noone proposes).

Not me, that's for sure.

> So, there is a case for extending the inline-asm facility to say
> "size is complicated here, assume this for inline decisions".

Yeah, that's an option. It may be too complicated though, or just not
useful in its generality, so that everyone will use "1" (or "1 normal
size instruction"), and then we are better off just making something
for _that_ (or making it the default).

> > > Now, Richard suggested doing something like:
> > >
> > > 1) inline asm ("...")
> >
> > What would the semantics of this be?
>
> The size of the inline asm wouldn't be counted towards the inliner size
> limits (or be counted as "1").

That sounds like a good option.

> > I don't like 2) either. But 1) looks interesting, depends what its
> > semantics would be? "Don't count this insn's size for inlining decisions",
> > maybe?
>
> TBH, I like the inline asm (...) suggestion most currently, but what if we
> want to add more attributes to asms? We could add further special
> keywords to the clobber list:
>     asm ("...." : : : "cc,memory,inline");
> sure, it might seem strange to "clobber" inline, but if we reinterpret the
> clobber list as arbitrary set of attributes for this asm, it'd be fine.

All of a target's register names and alternative register names are
allowed in the clobber list. Will that never conflict with an attribute
name? We already *have* syntax for specifying attributes on an asm (on
*any* statement even), so mixing these two things has no advantage.

Both "cc" and "memory" have their own problems of course, adding more
things to this just feels bad.
It may not be so bad ;-)

> > Another option is to just force inlining for those few functions where
> > GCC currently makes an inlining decision you don't like. Or are there
> > more than a few?
>
> I think the examples I saw from Boris were all indirect inlines:
>
>     static inline void foo() { asm("large-looking-but-small-asm"); }
>     static void bar1() { ... foo() ... }
>     static void bar2() { ... foo() ... }
>     void goo (void) { bar1(); } // bar1 should have been inlined
>
> So, while the immediate asm user was marked as always inline that in turn
> caused users of it to become non-inlined. I'm assuming the kernel guys
> did proper measurements that they _really_ get some non-trivial speed
> benefit by inlining bar1/bar2, but for some reasons (I didn't inquire)
> didn't want to mark them all as inline as well.

Yeah that makes sense, like if this happens with the fixup stuff, it will
quickly spiral out of control.


Segher