From: Nadav Amit
To: Ingo Molnar
CC: Nadav Amit, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, "H. 
Peter Anvin", Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, Linus Torvalds, Chris Zankel, Max Filippov
Subject: [PATCH v8 00/10] x86: macrofying inline asm for better compilation
Date: Tue, 18 Sep 2018 14:28:37 -0700
Message-ID: <20180918212847.199085-1-namit@vmware.com>
X-Mailer: git-send-email 2.17.1
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

This patch-set deals with an interesting yet stupid problem: kernel code
that does not get inlined despite its simplicity. There are several
causes for this behavior: the "cold" attribute on __init; different
function optimization levels; conditional constant computations based on
__builtin_constant_p(); and finally large inline assembly blocks.

This patch-set deals with the inline assembly problem. I separated these
patches from the others (that were sent in the RFC) for easier
inclusion. I also separated out the removal of unnecessary new-lines,
which will be sent separately.

The problem with inline assembly is that the kernel often uses it for
things other than code, for example assembly directives and data. GCC,
however, is oblivious to the content of the blocks and assumes that
their cost in space and time is proportional to the number of perceived
assembly "instructions", which it derives from the number of newlines
and semicolons. Alternatives, paravirt and other mechanisms are
affected, causing code not to be inlined and degrading compilation
quality in general.

The solution that this patch-set carries for this problem is to create
an assembly macro, and then call it from the inline assembly block.
As a result, the compiler sees a single "instruction" and assigns a
more appropriate cost to the code.

To avoid uglification of the code, as many noted, the macros are first
precompiled into an assembly file, which is later assembled together
with the C files. This also avoids the duplicate implementations that
previously existed for the asm and C code, as can be seen in the
exception table changes.

Overall this patch-set slightly increases the kernel size (my build was
done using my Ubuntu 18.04 config + localyesconfig for the record):

    text      data      bss       dec      hex  filename
18140829  10224724  2957312  31322865  1ddf2f1  ./vmlinux before
18163608  10227348  2957312  31348268  1de562c  ./vmlinux after (+0.1%)

The number of static functions in the image is reduced by 379, but the
actual inlining improvement is even better, which does not always show
in these numbers: a function may be inlined, causing its calling
function not to be inlined.

I ran a limited number of benchmarks, and in general the performance
impact is not very notable. You can still see >10 cycles shaved off some
syscalls that manipulate page-tables (e.g., mprotect()), in which
paravirt caused many functions not to be inlined. In addition, this
patch-set can prevent issues such as [1], and improves code readability
and maintainability.
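To make the cost argument above concrete, here is a minimal sketch in C.
It assumes the estimate is simply one plus the number of newlines and
semicolons in the asm template, as described in this cover letter; it is
a simplified model, not GCC's code. The function name
estimated_asm_insns(), the section name __example_table and the
EXAMPLE_ENTRY macro are hypothetical and do not come from the patches.

```c
#include <stddef.h>

/*
 * Hypothetical helper: a simplified model of the size estimate described
 * in the text. It starts at one "instruction" and adds one more for every
 * newline or semicolon found in the asm template.
 */
static size_t estimated_asm_insns(const char *templ)
{
	size_t count = 1;

	for (; *templ != '\0'; templ++) {
		if (*templ == '\n' || *templ == ';')
			count++;
	}
	return count;
}

/*
 * Illustrative open-coded template: one real instruction followed by
 * directives that only emit a table entry into another section.
 * The layout is made up for this example.
 */
static const char open_coded_template[] =
	"1:\tud2\n"
	".pushsection __example_table, \"a\"\n"
	"\t.long 1b - .\n"
	"\t.word 0\n"
	".popsection";

/* The same content hidden behind a single (hypothetical) assembler macro. */
static const char macro_template[] = "EXAMPLE_ENTRY flags=0";
```

Under this model, estimated_asm_insns(open_coded_template) yields 5 even
though only one machine instruction is emitted inline, while
estimated_asm_insns(macro_template) yields 1, which is the effect the
series relies on.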
[1] https://patchwork.kernel.org/patch/10450037/

v7->v8:
* Add acks (Masahiro, Max)
* Rebase on 4.19 (Ingo)

v6->v7:
* Fix context switch tracking (Ingo)
* Fix xtensa build error (Ingo)
* Rebase on 4.18-rc8

v5->v6:
* Removing more code from jump-labels (PeterZ)
* Fix build issue on i386 (0-day, PeterZ)

v4->v5:
* Makefile fixes (Masahiro, Sam)

v3->v4:
* Changed naming of macros in 2 patches (PeterZ)
* Minor cleanup of the paravirt patch

v2->v3:
* Several build issues resolved (0-day)
* Wrong comments fix (Josh)
* Change asm vs C order in refcount (Kees)

v1->v2:
* Compiling the macros into a separate .s file, improving readability
  (Linus)
* Improving assembly formatting, applying most of the comments according
  to my judgment (Jan)
* Adding exception-table, cpufeature and jump-labels
* Removing new-line cleanup; to be submitted separately

Cc: Masahiro Yamada
Cc: Sam Ravnborg
Cc: Alok Kataria
Cc: Christopher Li
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Jan Beulich
Cc: Josh Poimboeuf
Cc: Juergen Gross
Cc: Kate Stewart
Cc: Kees Cook
Cc: linux-sparse@vger.kernel.org
Cc: Peter Zijlstra
Cc: Philippe Ombredanne
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: Linus Torvalds
Cc: x86@kernel.org
Cc: Chris Zankel
Cc: Max Filippov
Cc: linux-xtensa@linux-xtensa.org

Nadav Amit (10):
  xtensa: defining LINKER_SCRIPT for the linker script
  Makefile: Prepare for using macros for inline asm
  x86: objtool: use asm macro for better compiler decisions
  x86: refcount: prevent gcc distortions
  x86: alternatives: macrofy locks for better inlining
  x86: bug: prevent gcc distortions
  x86: prevent inline distortion by paravirt ops
  x86: extable: use macros instead of inline assembly
  x86: cpufeature: use macros instead of inline assembly
  x86: jump-labels: use macros instead of inline assembly

 Makefile                               |  9 ++-
 arch/x86/Makefile                      | 11 ++-
 arch/x86/entry/calling.h               |  2 +-
 arch/x86/include/asm/alternative-asm.h | 20 ++++--
 arch/x86/include/asm/alternative.h     | 11 +--
 arch/x86/include/asm/asm.h             | 61 +++++++---------
 arch/x86/include/asm/bug.h             | 98 +++++++++++++++-----------
 arch/x86/include/asm/cpufeature.h      | 82 ++++++++++++---------
 arch/x86/include/asm/jump_label.h      | 77 ++++++++------------
 arch/x86/include/asm/paravirt_types.h  | 56 +++++++--------
 arch/x86/include/asm/refcount.h        | 74 +++++++++++--------
 arch/x86/kernel/macros.S               | 16 +++++
 arch/xtensa/kernel/Makefile            |  4 +-
 include/asm-generic/bug.h              |  8 +--
 include/linux/compiler.h               | 56 +++++++++++----
 scripts/Kbuild.include                 |  4 +-
 scripts/mod/Makefile                   |  2 +
 17 files changed, 333 insertions(+), 258 deletions(-)
 create mode 100644 arch/x86/kernel/macros.S

-- 
2.17.1