From: Nadav Amit
To: "linux-kernel@vger.kernel.org"
CC: Alok Kataria, Christopher Li, "H. Peter Anvin", Ingo Molnar, Jan Beulich, Jonathan Corbet, Josh Poimboeuf, Juergen Gross, Kees Cook, "linux-sparse@vger.kernel.org", Peter Zijlstra, Randy Dunlap, Thomas Gleixner, "virtualization@lists.linux-foundation.org", "x86@kernel.org"
Subject: Re: [RFC 0/8] Improving compiler inlining decisions
Date: Tue, 15 May 2018 22:14:08 +0000
References: <20180515141124.84254-1-namit@vmware.com>
In-Reply-To: <20180515141124.84254-1-namit@vmware.com>

Nadav Amit wrote:

> This patch-set deals with an interesting yet stupid problem: code that
> does not get inlined despite its simplicity.
>
> I find 5 classes of causes:
>
> 1. Inline assembly blocks in which code and data are added to
>    alternative sections. The compiler is oblivious to the content of the
>    blocks and assumes their cost in space and time is proportional to the
>    number of perceived assembly "instructions", according to the number
>    of newlines and semicolons. Alternatives, paravirt and other mechanisms
>    are affected.
>
> 2. Inline assembly with redundant new-lines and semicolons. Similarly to
>    (1), this code is considered "heavier" than it actually is.
>
> 3. Code with constant value optimizations. Quite a few parts of the
>    kernel check whether a variable is constant (using
>    __builtin_constant_p()) and perform heavy computations in that case.
>    These computations are eventually optimized out so they do not land in
>    the binary. However, the cost of these computations is also associated
>    with the calling function, which might prevent inlining of the calling
>    function. ilog2() is an example of such a case.
>
> 4. Code that is marked with the "cold" attribute, including all the
>    __init functions. Some may consider it the desired behavior.
>
> 5. Code that is marked with a different optimization level. This
>    affects for example vmx_vcpu_run(), inducing overheads of up to 10% on
>    exit.
>
>
> This patch-set deals with some instances of the first 3 classes.
>
> For (1) we insert an assembly macro, and call it from the inline
> assembly block. As a result, the compiler sees a single "instruction"
> and assigns the more appropriate cost to the code.
>
> For (2) the solution is trivial: just remove the newlines.
>
> (3) is somewhat tricky. The proposed solution is to use
> __builtin_choose_expr() to check whether a variable is actually constant
> instead of using an if-condition or the C ternary operator.
> __builtin_choose_expr() is evaluated earlier in the compilation, so it
> allows the compiler to associate the right cost with the variable case
> before the inlining decisions take place. So far so good.
>
> Still, there is a drawback. Since __builtin_choose_expr() is evaluated
> earlier, it can fail to recognize constants, which an if-condition would
> recognize correctly. As a result, this patch-set only applies it to the
> simplest cases.
>
> Overall this patch-set slightly increases the kernel size (my build was
> done using localmodconfig + localyesconfig, for the record):
>
>     text      data      bss       dec      hex  filename
> 18126699  10066728  2936832  31130259  1db0293  ./vmlinux before
> 18149210  10064048  2936832  31150090  1db500a  ./vmlinux after (+0.06%)
>
> The patch-set eliminates many of the static text symbols:
> Before: 40033
> After:  39632 (-10%)

Oops. Should be -1%...
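
To make classes (1) and (2) from the quoted cover letter concrete, here is a rough model of the size estimate it describes: one perceived "instruction", plus one more for every newline or semicolon in the inline assembly string. This is only a sketch of the heuristic as the message words it; the compiler's real accounting may differ. The function name, both sample strings and the EXAMPLE_ALT_NOP macro name are made up for illustration and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rough model of the estimate described in the message: an empty template
 * costs nothing, anything else counts as one "instruction" plus one per
 * newline or semicolon. Hypothetical helper, not a compiler API.
 */
static size_t ex_estimate_asm_insns(const char *tmpl)
{
	size_t count;

	if (*tmpl == '\0')
		return 0;

	count = 1;
	for (; *tmpl != '\0'; tmpl++)
		if (*tmpl == '\n' || *tmpl == ';')
			count++;

	return count;
}

/*
 * Illustrative template only: a single real instruction, followed by
 * bookkeeping data that is emitted into a different section.
 */
static const char ex_verbose_block[] =
	"661:\n\t"
	"nop\n"
	"662:\n"
	".pushsection .altinstructions,\"a\"\n\t"
	".long 661b - .\n\t"
	".byte 662b - 661b\n"
	".popsection\n";

/* The same work hidden behind a (hypothetical) assembler macro. */
static const char ex_macro_call[] = "EXAMPLE_ALT_NOP";
```

Under this model the verbose template is charged for 8 "instructions" although it emits a single nop into the text section, while the macro invocation is charged for 1. A stray trailing newline alone doubles the estimate of a one-instruction template ("nop\n" is 2, "nop" is 1), which is the class (2) case.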
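
For class (3), the difference between the ternary form and the __builtin_choose_expr() form can be sketched as follows. This is a simplified illustration, not the kernel's ilog2() and not the code from the patches: every EX_/ex_ name is hypothetical, the "heavy" constant expression is a small stand-in, and the sketch assumes a reasonably recent GCC or clang and a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for a "heavy" constant expression: a branch tree
 * that yields floor(log2(n)) for a 32-bit n (and 0 for n == 0). The
 * compiler folds it away entirely when n is a compile-time constant.
 */
#define EX_LOG2_2(n)	(((n) & 2) ? 1 : 0)
#define EX_LOG2_4(n)	(((n) >> 2) ? 2 + EX_LOG2_2((n) >> 2) : EX_LOG2_2(n))
#define EX_LOG2_8(n)	(((n) >> 4) ? 4 + EX_LOG2_4((n) >> 4) : EX_LOG2_4(n))
#define EX_LOG2_16(n)	(((n) >> 8) ? 8 + EX_LOG2_8((n) >> 8) : EX_LOG2_8(n))
#define EX_LOG2_32(n)	(((n) >> 16) ? 16 + EX_LOG2_16((n) >> 16) : EX_LOG2_16(n))

/* Run-time path: index of the highest set bit; n must be non-zero. */
static inline int ex_runtime_ilog2(unsigned int n)
{
	return (int)(sizeof(n) * CHAR_BIT) - 1 - __builtin_clz(n);
}

/*
 * Ternary form: both arms stay in the function body until the optimizer
 * removes one, so the constant arm can still count toward the caller's
 * estimated size when inlining is decided.
 */
#define EX_ILOG2_TERNARY(n) \
	(__builtin_constant_p(n) ? EX_LOG2_32(n) : ex_runtime_ilog2(n))

/*
 * __builtin_choose_expr() form: the arm is selected while parsing, so only
 * the chosen expression remains in the function body.
 */
#define EX_ILOG2_CHOOSE(n) \
	__builtin_choose_expr(__builtin_constant_p(n), \
			      EX_LOG2_32(n), ex_runtime_ilog2(n))

/*
 * nr_pages is not a constant at parse time, so the run-time arm is chosen
 * here even if a caller later passes a literal. This is the drawback the
 * message mentions.
 */
static inline int ex_order_of(unsigned int nr_pages)
{
	return EX_ILOG2_CHOOSE(nr_pages);
}
```

Both forms return the same values; what differs is when the compiler discards the unused arm. In ex_order_of() the early decision means a call such as ex_order_of(4096) takes the run-time path, whereas the ternary form could still fold to a constant once the function has been inlined.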