Date: Mon, 8 Oct 2018 09:46:49 +0200 (CEST)
From: Richard Biener
To: Nadav Amit
cc: Borislav Petkov, "gcc@gcc.gnu.org", Michael Matz, Ingo Molnar,
    "linux-kernel@vger.kernel.org", "x86@kernel.org", Masahiro Yamada,
    Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman,
    "H. Peter Anvin", Jan Beulich, Josh Poimboeuf, Juergen Gross,
    Kate Stewart, Kees Cook, "linux-sparse@vger.kernel.org",
    Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner,
    "virtualization@lists.linux-foundation.org", Linus Torvalds,
    Chris Zankel, Max Filippov, "linux-xtensa@linux-xtensa.org"
Subject: Re: PROPOSAL: Extend inline asm syntax with size spec
In-Reply-To: <56EB8A07-8D24-40F1-8CCE-614D7B712519@vmware.com>
References: <20181003213100.189959-1-namit@vmware.com>
    <20181007091805.GA30687@zn.tnic>
    <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com>
    <56EB8A07-8D24-40F1-8CCE-614D7B712519@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 7 Oct 2018, Nadav Amit wrote:

> at 9:46 AM, Richard Biener wrote:
>
> > On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit wrote:
> >> at 2:18 AM, Borislav Petkov wrote:
> >>
> >>> Hi people,
> >>>
> >>> this is an attempt to see whether gcc's inline asm heuristic when
> >>> estimating inline asm statements' cost for better inlining can be
> >>> improved.
> >>>
> >>> AFAIU, the problem arises when one ends up using a lot of inline
> >>> asm statements in the kernel but due to the inline asm cost estimation
> >>> heuristic which counts lines, I think, for example like in this here
> >>> macro:
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/cpufeature.h#n162
> >>> the resulting code ends up not inlining the functions themselves which
> >>> use this macro. I.e., you see a CALL instead of its body
> >>> getting inlined directly.
> >>>
> >>> Even though it should be because the actual instructions are only a
> >>> couple in most cases and all those other directives end up in another
> >>> section anyway.
> >>>
> >>> The issue is explained below in the forwarded mail in greater detail
> >>> too.
> >>>
> >>> Now, Richard suggested doing something like:
> >>>
> >>> 1) inline asm ("...")
> >>> 2) asm ("..." : : : : )
> >>> 3) asm ("...") __attribute__((asm_size()));
> >>>
> >>> with which the user can tell gcc what the size of that inline asm
> >>> statement is and thus allow for more precise cost estimation and in
> >>> the end better inlining.
> >>>
> >>> And FWIW 3) looks pretty straight-forward to me because attributes are
> >>> pretty common anyways.
> >>>
> >>> But I'm sure there are other options and I'm sure people will have
> >>> better/different ideas so feel free to chime in.
> >>
> >> Thanks for taking care of it. I would like to mention a second issue,
> >> since you may want to resolve both with a single solution: not inlining
> >> conditional __builtin_constant_p(), in which there are two code-paths -
> >> one for constants and one for variables.
> >>
> >> Consider for example the Linux kernel ilog2 macro, which has a condition
> >> based on __builtin_constant_p() (
> >> https://elixir.bootlin.com/linux/v4.19-rc7/source/include/linux/log2.h#L160
> >> ). The compiler mistakenly considers the “heavy” code-path that is
> >> supposed to be evaluated only at compilation time when it estimates the
> >> code size.
> >
> > But this is a misconception about __builtin_constant_p. It doesn't
> > guard something like 'constexpr' regions. If you try to use it with
> > those semantics you'll fail (apparently you do).
> >
> > Of course IPA CP code size estimates when seeing a constant fed to
> > __builtin_constant_p might not be optimal, but that's another issue.
>
> I understand that this might not be the right way to implement macros
> such as ilog2() and test_bit(), but this code has been around for some
> time.
>
> I thought of using __builtin_choose_expr() instead of the ternary
> operator, but this introduces a different problem, as the variable version
> is used instead of the constant one in many cases. From my brief
> experiments with llvm, it appears that llvm has neither of these issues
> (wrong cost attributed to inline asm and conditions based on
> __builtin_constant_p()).
>
> So what alternative do you propose to implement ilog2() like behavior? I was
> digging through the gcc code to find a workaround with no success.

1) Don't try to cheat the compiler's constant propagation abilities
2) Use a language that allows this (C++)
3) Define (and implement) the corresponding GNU C extension

__builtin_constant_p() isn't the right fit (I wonder what it was
implemented for in the first place though...).

I suppose you want something like

  if (__builtin_constant_p (x))
    return __constexpr ...;

or use a call and have constexpr functions. Note it wouldn't be
C++-constexpr like since you want the constexpr evaluation to happen
very late in the compilation to benefit from optimizations and you
are fine with the non-constexpr path.

Properly defining a language extension is hard.

Richard.

>
> Thanks,
> Nadav
>

-- 
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
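
For readers who have not seen the pattern this thread argues about, here is
a minimal sketch in GNU C of an ilog2()-style macro that uses
__builtin_constant_p() to pick between a constant path and a runtime path.
It is not the kernel's macro: the names sketch_ilog2, SKETCH_CONST_ILOG2,
runtime_ilog2 and log2_of are hypothetical, and the constant path is cut
down to values below 256 to keep the example short.

```c
#include <assert.h>
#include <limits.h>

/* Runtime path: floor(log2(n)) for n > 0, using the GCC/Clang builtin. */
static inline int runtime_ilog2(unsigned int n)
{
    return (int)(sizeof(n) * CHAR_BIT) - 1 - __builtin_clz(n);
}

/*
 * "Heavy" constant path (hypothetical, shortened to n < 256): a chain of
 * conditionals that folds to a single number when n is a constant.
 */
#define SKETCH_CONST_ILOG2(n)   \
    ((n) & (1u << 7) ? 7 :      \
     (n) & (1u << 6) ? 6 :      \
     (n) & (1u << 5) ? 5 :      \
     (n) & (1u << 4) ? 4 :      \
     (n) & (1u << 3) ? 3 :      \
     (n) & (1u << 2) ? 2 :      \
     (n) & (1u << 1) ? 1 : 0)

/*
 * Select a path with the ternary operator. Both arms are ordinary code
 * that the compiler sees; __builtin_constant_p() does not mark the first
 * arm as compile-time only.
 */
#define sketch_ilog2(n)                         \
    (__builtin_constant_p(n) && (n) < 256u      \
         ? SKETCH_CONST_ILOG2(n)                \
         : runtime_ilog2(n))

/* Hypothetical wrapper: whether n counts as constant here depends on
 * inlining and optimization level, but both paths give the same result. */
static inline int log2_of(unsigned int n)
{
    return sketch_ilog2(n);
}
```

A caller would write sketch_ilog2(64u) for a literal or log2_of(x) for a
variable. As the thread describes it, the concern is that a wrapper such as
log2_of() still contains both arms of the conditional when the compiler
estimates its size, even though only one arm survives after folding.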