Date: Sun, 07 Oct 2018 18:46:24 +0200
In-Reply-To: <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com>
References: <20181003213100.189959-1-namit@vmware.com>
    <20181007091805.GA30687@zn.tnic>
    <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com>
Subject: Re: PROPOSAL: Extend inline asm syntax with size spec
To: Nadav Amit, Borislav Petkov, "gcc@gcc.gnu.org", Michael Matz
CC: Ingo Molnar, "linux-kernel@vger.kernel.org", "x86@kernel.org",
    Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li,
    Greg Kroah-Hartman, "H.
    Peter Anvin", Jan Beulich, Josh Poimboeuf, Juergen Gross,
    Kate Stewart, Kees Cook, "linux-sparse@vger.kernel.org",
    Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner,
    "virtualization@lists.linux-foundation.org", Linus Torvalds,
    Chris Zankel, Max Filippov, "linux-xtensa@linux-xtensa.org"
From: Richard Biener
X-Mailing-List: linux-kernel@vger.kernel.org

On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit wrote:
>at 2:18 AM, Borislav Petkov wrote:
>
>> Hi people,
>>
>> this is an attempt to see whether gcc's inline asm heuristic when
>> estimating inline asm statements' cost for better inlining can be
>> improved.
>>
>> AFAIU, the problem arises when one ends up using a lot of inline
>> asm statements in the kernel but due to the inline asm cost estimation
>> heuristic which counts lines, I think, for example like in this here
>> macro:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/cpufeature.h#n162
>>
>> the resulting code ends up not inlining the functions themselves which
>> use this macro. I.e., you see a CALL instead of its body
>> getting inlined directly.
>>
>> Even though it should be because the actual instructions are only a
>> couple in most cases and all those other directives end up in another
>> section anyway.
>>
>> The issue is explained below in the forwarded mail in greater detail
>> too.
>>
>> Now, Richard suggested doing something like:
>>
>> 1) inline asm ("...")
>> 2) asm ("..." : : : : )
>> 3) asm ("...") __attribute__((asm_size()));
>>
>> with which the user can tell gcc what the size of that inline asm
>> statement is and thus allow for more precise cost estimation and in
>> the end better inlining.
>>
>> And FWIW 3) looks pretty straightforward to me because attributes are
>> pretty common anyways.
>>
>> But I'm sure there are other options and I'm sure people will have
>> better/different ideas so feel free to chime in.
>
>Thanks for taking care of it. I would like to mention a second issue,
>since you may want to resolve both with a single solution: not inlining
>conditional __builtin_constant_p(), in which there are two code-paths -
>one for constants and one for variables.
>
>Consider for example the Linux kernel ilog2 macro, which has a condition
>based on __builtin_constant_p() (
>https://elixir.bootlin.com/linux/v4.19-rc7/source/include/linux/log2.h#L160
>). The compiler mistakenly considers the “heavy” code-path that is
>supposed to be evaluated only at compile time to evaluate the code
>size.

But this is a misconception about __builtin_constant_p. It doesn't
guard something like 'constexpr' regions. If you try to use it with
those semantics you'll fail (apparently you do).

Of course, IPA CP code size estimates when seeing a constant fed to
__builtin_constant_p might not be optimal, but that's another issue.

Richard.

>This causes the kernel to consider functions such as kmalloc() as “big”.
>kmalloc() is marked with always_inline attribute, so instead the
>calling functions, such as kzalloc(), are not inlined.
>
>When I thought about hacking gcc to solve this issue, I considered an
>intrinsic that would override the cost of a given statement. This
>solution is not too nice, but may solve both issues.
>
>In addition, note that AFAIU the impact of a wrong cost of code
>estimation can also impact loop and other optimizations.
>
>Regards,
>Nadav
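
For readers following the __builtin_constant_p() part of the thread, here is a minimal sketch of the two-code-path pattern under discussion. It is not the kernel's ilog2; every name in it (sketch_ilog2, sketch_ilog2_runtime, SKETCH_ILOG2_CONST and the SKETCH_L2_* helpers) is hypothetical, and it assumes a GCC-compatible compiler and 32-bit inputs. The point it tries to show is that both arms of the conditional are ordinary expressions in the source: the "constant" arm is large when written out, and nothing in the language marks it as compile-time only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical runtime path: find the highest set bit with a loop. */
static inline unsigned int sketch_ilog2_runtime(uint32_t n)
{
    unsigned int log = 0;

    assert(n != 0);             /* log2(0) is undefined */
    while (n >>= 1)
        log++;
    return log;
}

/*
 * Hypothetical "constant" path: a nested conditional expression that a
 * compiler can fold when n is a literal. Each level repeats its argument
 * several times, so the fully expanded expression is large. Only meant
 * for side-effect-free arguments that fit in 32 bits.
 */
#define SKETCH_L2_2(n)  ((n) & 2 ? 1 : 0)
#define SKETCH_L2_4(n)  ((n) >> 2 ? 2 + SKETCH_L2_2((n) >> 2) : SKETCH_L2_2(n))
#define SKETCH_L2_8(n)  ((n) >> 4 ? 4 + SKETCH_L2_4((n) >> 4) : SKETCH_L2_4(n))
#define SKETCH_L2_16(n) ((n) >> 8 ? 8 + SKETCH_L2_8((n) >> 8) : SKETCH_L2_8(n))
#define SKETCH_ILOG2_CONST(n) \
    ((n) >> 16 ? 16 + SKETCH_L2_16((n) >> 16) : SKETCH_L2_16(n))

/*
 * Two code paths selected by __builtin_constant_p(). Both arms are
 * present in the expression; which one survives depends on how the
 * compiler folds the condition.
 */
#define sketch_ilog2(n)                                      \
    (__builtin_constant_p(n)                                 \
         ? (unsigned int)SKETCH_ILOG2_CONST(n)               \
         : sketch_ilog2_runtime(n))
```

Both paths are written to return the same value, so sketch_ilog2(1024) and sketch_ilog2(x) with x == 1024 agree regardless of which arm the compiler keeps; what differs is how much code the expression represents before folding, which is where the size estimate comes in.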