From: Nadav Amit <namit@vmware.com>
Cc: Nadav Amit, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Josh Poimboeuf
Subject: [PATCH 3/6] x86: alternative: macrofy locks for better inlining
Date: Thu, 17 May 2018 09:13:59 -0700
Message-ID: <20180517161402.78089-4-namit@vmware.com>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180517161402.78089-1-namit@vmware.com>
References: <20180517161402.78089-1-namit@vmware.com>
MIME-Version: 1.0
Content-Type: text/plain
X-Mailing-List: linux-kernel@vger.kernel.org

GCC uses the number of statements in an inline assembly block, counted
by newlines and semicolons, as an indication of the block's cost in
time and space. This estimate is distorted by kernel code that stores
information in alternative sections, so the compiler may make
incorrect inlining and branch-optimization decisions.

The solution is to define an assembly macro and call it from the
inline assembly block. GCC then regards the inline assembly block as
a single instruction. This patch handles the LOCK prefix, allowing
more aggressive inlining.

    text    data     bss      dec     hex  filename
18127205 10068388 2936832 31132425 1db0b09  ./vmlinux before
18131468 10068488 2936832 31136788 1db1c14  ./vmlinux after (+4363)

Static text symbols:
Before:	39860
After:	39788	(-72)

Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: x86@kernel.org
Cc: Josh Poimboeuf
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/alternative.h | 34 +++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 4cd6a3b71824..1dc47c9fd480 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -28,17 +28,35 @@
  * The very common lock prefix is handled as special case in a
  * separate table which is a pure address list without replacement ptr
  * and size information. That keeps the table sizes small.
+ *
+ * Saving the lock data is encapsulated within an assembly macro, which is then
+ * called on each use. This hack is necessary to prevent GCC from considering
+ * the inline assembly blocks as costly in time and space, which can prevent
+ * function inlining and lead to other bad compilation decisions. GCC computes
+ * the inline assembly cost according to the perceived number of assembly
+ * instructions, based on the number of new-lines and semicolons in the
+ * block. The macro will eventually be compiled into a single instruction (and
+ * some data). This scheme allows GCC to better understand the inline asm cost.
  */

 #ifdef CONFIG_SMP
-#define LOCK_PREFIX_HERE \
-	".pushsection .smp_locks,\"a\"\n"	\
-	".balign 4\n"				\
-	".long 671f - .\n" /* offset */		\
-	".popsection\n"				\
-	"671:"
-
-#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "
+
+asm(".macro __LOCK_PREFIX_HERE\n\t"
+    ".pushsection .smp_locks,\"a\"\n\t"
+    ".balign 4\n\t"
+    ".long 671f - .\n\t"		/* offset */
+    ".popsection\n"
+    "671:\n\t"
+    ".endm");
+
+#define LOCK_PREFIX_HERE "__LOCK_PREFIX_HERE\n\t"
+
+asm(".macro __LOCK_PREFIX ins:vararg\n\t"
+    "__LOCK_PREFIX_HERE\n\t"
+    "lock; \\ins\n\t"
+    ".endm");
+
+#define LOCK_PREFIX "__LOCK_PREFIX "

 #else /* ! CONFIG_SMP */
 #define LOCK_PREFIX_HERE ""
-- 
2.17.0