Date: Thu, 10 Jun 2021 11:14:51 -0700
From: Nathan Chancellor
To: Peter Zijlstra
Cc: x86@kernel.org, jpoimboe@redhat.com, jbaron@akamai.com, rostedt@goodmis.org, ardb@kernel.org, linux-kernel@vger.kernel.org, samitolvanen@google.com, ndesaulniers@google.com, clang-built-linux@googlegroups.com
Subject: Re: [PATCH 01/13] objtool: Rewrite hashtable sizing
References:
 <20210506193352.719596001@infradead.org> <20210506194157.452881700@infradead.org>
In-Reply-To: <20210506194157.452881700@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On Thu, May 06, 2021 at 09:33:53PM +0200, Peter Zijlstra wrote:
> Currently objtool has 5 hashtables and sizes them 16 or 20 bits
> depending on the --vmlinux argument.
>
> However, a single side doesn't really work well for the 5 tables,
> which among them, cover 3 different uses. Also, while vmlinux is
> larger, there is still a very wide difference between a defconfig and
> allyesconfig build, which again isn't optimally covered by a single
> size.
>
> Another aspect is the cost of elf_hash_init(), which for large tables
> dominates the runtime for small input files. It turns out that all it
> does it assign NULL, something that is required when using malloc().
> However, when we allocate memory using mmap(), we're guaranteed to get
> zero filled pages.
>
> Therefore, rewrite the whole thing to:
>
> 1) use more dynamic sized tables, depending on the input file,
> 2) avoid the need for elf_hash_init() entirely by using mmap().
>
> This speeds up a regular kernel build (100s to 98s for
> x86_64-defconfig), and potentially dramatically speeds up vmlinux
> processing.
>
> Signed-off-by: Peter Zijlstra (Intel)

This patch as commit 25cf0d8aa2a3 ("objtool: Rewrite hashtable sizing")
in -tip causes a massive compile time regression with allmodconfig +
ThinLTO.
At v5.13-rc1, the performance penalty is only about 23%, as measured with
hyperfine for two runs [1]:

Benchmark #1: allmodconfig
  Time (mean ± σ):     625.173 s ±  2.198 s    [User: 35120.895 s, System: 2176.868 s]
  Range (min … max):   623.619 s … 626.727 s    2 runs

Benchmark #2: allmodconfig with ThinLTO
  Time (mean ± σ):     771.034 s ±  0.369 s    [User: 39706.084 s, System: 2326.166 s]
  Range (min … max):   770.773 s … 771.295 s    2 runs

Summary
  'allmodconfig' ran
    1.23 ± 0.00 times faster than 'allmodconfig with ThinLTO'

However, at 25cf0d8aa2a3, it is almost 150% on a 64-core server.

Benchmark #1: allmodconfig
  Time (mean ± σ):     624.759 s ±  2.153 s    [User: 35114.379 s, System: 2145.456 s]
  Range (min … max):   623.237 s … 626.281 s    2 runs

Benchmark #2: allmodconfig with ThinLTO
  Time (mean ± σ):     1555.377 s ± 12.806 s    [User: 40558.463 s, System: 2310.139 s]
  Range (min … max):   1546.321 s … 1564.432 s    2 runs

Summary
  'allmodconfig' ran
    2.49 ± 0.02 times faster than 'allmodconfig with ThinLTO'

Adding Sami because I am not sure why this patch would have much of an impact
in relation to LTO. https://git.kernel.org/tip/25cf0d8aa2a3 is the patch in
question.

If I can provide any further information or help debug, please let me know.

If you are interested in reproducing this locally, you will need a fairly
recent LLVM stack (I used the stable release/12.x branch) and to cherry-pick
commit 976aac5f8829 ("kcsan: Fix debugfs initcall return type") to fix an
unrelated build failure. My script [2] can build a self-contained toolchain
fairly quickly if you cannot get one from your package manager.
A command like below will speed up the build a bit:

$ ./build-llvm.py \
    --branch "release/12.x" \
    --build-stage1-only \
    --install-stage1-only \
    --projects "clang;lld" \
    --targets X86

After adding the "install/bin" directory to PATH:

$ echo "CONFIG_GCOV_KERNEL=n
CONFIG_KASAN=n
CONFIG_LTO_CLANG_THIN=y" >allmod.config

$ make -skj"$(nproc)" LLVM=1 LLVM_IAS=1 allmodconfig all

[1]: https://github.com/sharkdp/hyperfine
[2]: https://github.com/ClangBuiltLinux/tc-build

Cheers,
Nathan
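
For anyone who wants to check the "about 23%" and "almost 150%" figures, the arithmetic can be sketched roughly as below. This is only an illustration and not part of the measurement setup: the helper name slowdown_percent is made up here, and the inputs are the mean wall-clock times copied from the two hyperfine summaries in this message.

```python
def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """Return how much slower candidate_s is than baseline_s, in percent."""
    return (candidate_s / baseline_s - 1.0) * 100.0


# Mean wall-clock times in seconds, copied from the hyperfine output:
# (allmodconfig, allmodconfig with ThinLTO) for each revision.
runs = {
    "v5.13-rc1": (625.173, 771.034),
    "25cf0d8aa2a3": (624.759, 1555.377),
}

for rev, (plain_s, thinlto_s) in runs.items():
    penalty = slowdown_percent(plain_s, thinlto_s)
    ratio = thinlto_s / plain_s
    print(f"{rev}: ThinLTO is {penalty:.1f}% slower ({ratio:.2f}x)")
```

The ratio printed for each revision should line up with the "times faster" figure in the corresponding hyperfine summary, since both are the quotient of the two means.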