Date: Tue, 1 May 2018 08:40:20 -0500
From: Josh Poimboeuf
To: Nadav Amit
Cc: Peter Zijlstra, Ingo Molnar, LKML, Thomas Gleixner, Linus Torvalds
Subject: Re: Suboptimal inline heuristics due to non-code sections
Message-ID: <20180501134020.fonel3x6plea5xdt@treble>

On Tue, May 01, 2018 at 06:50:14AM +0000, Nadav Amit wrote:
> When gcc considers the size of a function for inlining decisions, it
> apparently considers *all* sections. Since the kernel extensively uses
> sections for things other than code (e.g., exception-table, bug-table), the
> optimality of these decisions seem questionable to me.
>
> The objtool’s sections may be the most extreme case, as these sections are
> discarded, while their size still appears to be considered by the inlining
> heuristics. It may be beneficial not to consider (some) the other sections
> as well, as they do not affect code-caching but only increase the kernel
> size.
>
> To illustrate the issue, consider the function copy_overflow():
>
>    0xffffffff819315e0 <+0>:     push   %rbp
>    0xffffffff819315e1 <+1>:     mov    %rsi,%rdx
>    0xffffffff819315e4 <+4>:     mov    %edi,%esi
>    0xffffffff819315e6 <+6>:     mov    $0xffffffff820bc4b8,%rdi
>    0xffffffff819315ed <+13>:    mov    %rsp,%rbp
>    0xffffffff819315f0 <+16>:    callq  0xffffffff81089b70 <__warn_printk>
>    0xffffffff819315f5 <+21>:    ud2
>    0xffffffff819315f7 <+23>:    pop    %rbp
>    0xffffffff819315f8 <+24>:    retq
>
> This function seems to me as a great candidate for inlining. Yet, in my 4.16
> build (using gcc 7.2), I get 38 non-inlined instances of this function in
> vmlinux. Forcing CONFIG_STACK_VALIDATION to be disabled reduces the number
> non-inlined instances to 35. Removing, in addition, the data which is saved
> in the __bug_table makes all the instances of the function to be inlined.
>
> Obviously this certain function can be set as __always_inline, but the inline
> heuristics seems to me as wrongfully biased.
>
> What do you think?
>
> Is there a way to make gcc to ignore sections for its inlining heuristics?

Good find. Playing around with one of the affected files
(crypto/af_alg.o), if I make the .discard.reachable section empty by
removing the text reference from the annotate_reachable() macro, then
copy_overflow() still isn't inlined. But if I remove the section
completely by removing the pushsection/popsection, then copy_overflow()
gets inlined.

So GCC's inlining decisions are somehow influenced by the existence of
some random empty section. This definitely seems like a GCC bug to me.

-- 
Josh
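
The shape of the experiment above can be sketched outside the kernel as follows. This is a minimal illustration under stated assumptions, not the kernel's code: the macro annotate_empty_section() and the clamp_*/caller_* functions are hypothetical names invented here, and only the section name .discard.reachable is taken from the thread. The asm statement opens and closes the section without emitting anything, which mirrors the "empty section" case; the second helper drops the asm statement entirely.

```c
/*
 * Two copies of the same helper: one contains an inline asm that only
 * opens and closes a section, the other has no asm at all.
 * All macro and function names are hypothetical (assumptions for this
 * sketch); only the section name .discard.reachable is from the thread.
 */
#if defined(__GNUC__) && defined(__ELF__)
/* Emits no instructions: switches to the section and straight back. */
#define annotate_empty_section() \
	__asm__ volatile(".pushsection .discard.reachable\n\t" \
			 ".popsection\n\t")
#else
/* Fallback so the file still builds on non-ELF or non-GNU toolchains. */
#define annotate_empty_section() do { } while (0)
#endif

/* Variant whose body contains the pushsection/popsection pair. */
static inline int clamp_with_section(int len, int max)
{
	if (len > max) {
		annotate_empty_section();
		return max;
	}
	return len;
}

/* Same logic with the asm statement removed entirely. */
static inline int clamp_plain(int len, int max)
{
	if (len > max)
		return max;
	return len;
}

/*
 * Non-static callers, so the compiler has to emit them and the
 * generated assembly can be checked for calls to the helpers.
 */
int caller_with_section(int len)
{
	return clamp_with_section(len, 64);
}

int caller_plain(int len)
{
	return clamp_plain(len, 64);
}
```

Compiling this with something like `gcc -O2 -S` and comparing the two callers in the generated assembly shows whether each helper was inlined. Helpers this small may well be inlined in both variants; the result depends on the compiler version, the flags, and how much surrounding code competes for the inlining budget, so the sketch only reproduces the structure of the test, not its outcome.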