Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp330062ybn; Tue, 1 Oct 2019 21:59:16 -0700 (PDT) X-Google-Smtp-Source: APXvYqxBcOV3LSTZukBcrpnEZm1l9Jd1t85EfrZCWus+i5sasxbqc1xm3wBEKFJ+vPlcEL40YR0I X-Received: by 2002:a05:6402:35b:: with SMTP id r27mr1815743edw.140.1569992356307; Tue, 01 Oct 2019 21:59:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569992356; cv=none; d=google.com; s=arc-20160816; b=zfDhZrD2gCGWB9p18/RyqmppU4wUkAvcjuP9iapUItU4qooi4PVL0oiLxVRKZrVd2Q fa040NIQjjpERTlO6hRlGtpTLvya+mnCfQ0FhE6ustfeBzB7zgq4bfCvyIaRoXCKr+AJ 0eFgnvys7UU9OqIhTZ+Rs3VpcER5+c77YR3kQapz1TinjCHlro8MqWYyaCO6+rmNE4qf KL15weUM6ucQK6hkKfO6AGMh58wrDQdmaAO0ig5J0RPR39kQCvPQOqYX0WG7HVPMR3Mz PydxvPFI9cIRVv7Fh0S3C3+dd0d/2qT24uBECg6a6zwIWM95n4Gg9c6LdcK3UA56n3oO /2pQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=he2KQaLN15Ukf5d3JOb59B6IuidQ3MGDFSAF4GATnaE=; b=xSUiSrQkYdad7W+KEiPqdfWs+rZX+Mg0mYpCcdoulNGYXumd25EHZYVfsnixMjVT0w BIdVvFIOcg0kO4sxUhOQ5omsejgdBi9yub/nosKYgUI/Y387tXF/zIuw79/WTIcvQuel 9aZy32NmlixQoolFvaKm6u7O784EKU9TZJLJ38Xjr+jzOOukUHJB3b57uxHfmLfDmr9K I9CZ3GkeG4JI/G3NEUVx+ew3dpE+SHyl/93wTq+gbVhOFv8SUpmnXif5A62x9kUYPL4H Ea9PjEvooRPxg+bo+2itQgUe037n4ezX0BEU1w6Cwec07PcEf+G+EHcAv5FtbGqlOvRw cf7w== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@armlinux.org.uk header.s=pandora-2019 header.b=FWKCS+Bw; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=armlinux.org.uk Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x29si10138870eda.297.2019.10.01.21.58.51; Tue, 01 Oct 2019 21:59:16 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@armlinux.org.uk header.s=pandora-2019 header.b=FWKCS+Bw; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=armlinux.org.uk Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727832AbfJAV0X (ORCPT + 99 others); Tue, 1 Oct 2019 17:26:23 -0400 Received: from pandora.armlinux.org.uk ([78.32.30.218]:44316 "EHLO pandora.armlinux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727822AbfJAV0X (ORCPT ); Tue, 1 Oct 2019 17:26:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=armlinux.org.uk; s=pandora-2019; h=Sender:In-Reply-To: Content-Transfer-Encoding:Content-Type:MIME-Version:References:Message-ID: Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=ZcyJFaU199HQGTOCbuTWwERJGL8QNllHqpGJn20ojIM=; b=FWKCS+Bwiqq5ZsX1/9d8F12vD tuooA0OvfZXCfpJZF3S24mg2oLjsVBsTB5Pwy0c4l2/wk3PwEODS45XUhOxL4C+JbDrzo8dY5adxZ WQdb5jXXXZpzmvB9nuq0uZKkrq8HFHmyIyomYvrELiPKm61MomZCB5e5DvWoKlV+7h0vGmAE4y1yI h+oPQ6Mgqbf7pmVJ4xPKKS7+nrX/DmLLrNxeTFvSnsz6iYII67NAAgF0iqVA8Jz3eU4EZ8uDhqa+V hmodnzNt7/Ycz47gm9XbqaqmkH7hTZJTyvz6D+Z1niBGtzHab0YwqD3BT9O8A04C7PBEsPQNKxE14 0o/pEzK2A==; Received: from shell.armlinux.org.uk ([2002:4e20:1eda:1:5054:ff:fe00:4ec]:46420) by pandora.armlinux.org.uk with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.90_1) (envelope-from ) id 1iFPet-0004pL-7S; Tue, 01 Oct 2019 22:26:11 +0100 Received: from linux by shell.armlinux.org.uk with local (Exim 4.92) (envelope-from ) id 1iFPeq-0008WV-Vu; Tue, 01 Oct 2019 22:26:09 +0100 Date: Tue, 1 Oct 2019 22:26:08 +0100 From: Russell King - ARM Linux admin To: Nick Desaulniers Cc: Will Deacon , Masahiro Yamada , Linus Torvalds , Nicolas Saenz Julienne , Andrew Morton , Ingo Molnar , Borislav Petkov , Miguel Ojeda , linux-arch , LKML , Catalin Marinas , Stefan Wahren , Kees Cook , Arnd Bergmann , clang-built-linux Subject: Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly Message-ID: <20191001212608.GN25745@shell.armlinux.org.uk> References: <20191001092823.z4zhlbwvtwnlotwc@willie-the-truck> <20191001170142.x66orounxuln7zs3@willie-the-truck> <20191001175512.GK25745@shell.armlinux.org.uk> <20191001181438.GL25745@shell.armlinux.org.uk> <20191001205938.GM25745@shell.armlinux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20191001205938.GM25745@shell.armlinux.org.uk> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 01, 2019 at 09:59:38PM +0100, Russell King - ARM Linux admin wrote: > On Tue, Oct 01, 2019 at 01:21:44PM -0700, Nick Desaulniers wrote: > > On Tue, Oct 1, 2019 at 11:14 AM Russell King - ARM Linux admin > > wrote: > > > > > > On Tue, Oct 01, 2019 at 11:00:11AM -0700, Nick Desaulniers wrote: > > > > On Tue, Oct 1, 2019 at 10:55 AM Russell King - ARM Linux admin > > > > wrote: > > > > > > > > > > On Tue, Oct 01, 2019 at 10:44:43AM -0700, Nick Desaulniers wrote: > > > > > > I apologize; I don't mean to be difficult. I would just like to avoid > > > > > > surprises when code written with the assumption that it will be > > > > > > inlined is not. It sounds like we found one issue in arm32 and one in > > > > > > arm64 related to outlining. If we fix those two cases, I think we're > > > > > > close to proceeding with Masahiro's cleanup, which I view as a good > > > > > > thing for the health of the Linux kernel codebase. > > > > > > > > > > Except, using the C preprocessor for this turns the arm32 code into > > > > > yuck: > > > > > > > > > > 1. We'd need to turn get_domain() and set_domain() into multi-line > > > > > preprocessor macro definitions, using the GCC ({ }) extension > > > > > so that get_domain() can return a value. > > > > > > > > > > 2. uaccess_save_and_enable() and uaccess_restore() also need to > > > > > become preprocessor macro definitions too. > > > > > > > > > > So, we end up with multiple levels of nested preprocessor macros. > > > > > When something goes wrong, the compiler warning/error message is > > > > > going to be utterly _horrid_. > > > > > > > > That's why I preferred V1 of Masahiro's patch, that fixed the inline > > > > asm not to make use of caller saved registers before calling a > > > > function that might not be inlined. > > > > > > ... which I objected to based on the fact that this uaccess stuff is > > > supposed to add protection against the kernel being fooled into > > > accessing userspace when it shouldn't. The whole intention there is > > > that [sg]et_domain(), and uaccess_*() are _always_ inlined as close > > > as possible to the call site of the accessor touching userspace. > > > > Then use the C preprocessor to force the inlining. I'm sorry it's not > > as pretty as static inline functions. > > > > > > > > Moving it before the assignments mean that the compiler is then free > > > to issue memory loads/stores to load up those registers, which is > > > exactly what we want to avoid. > > > > > > > > > In any case, I violently disagree with the idea that stuff we have > > > in header files should be permitted not to be inlined because we > > > have soo much that is marked inline. > > > > So there's a very important subtly here. There's: > > 1. code that adds `inline` cause "oh maybe it would be nice to inline > > this, but if it isn't no big deal" > > 2. code that if not inlined is somehow not correct. > > 3. avoid ODR violations via `static inline` > > > > I'll posit that "we have soo much that is marked inline [is > > predominantly case 1 or 3, not case 2]." Case 2 is a code smell, and > > requires extra scrutiny. > > > > > Having it moved out of line, > > > and essentially the same function code appearing in multiple C files > > > is really not an improvement over the current situation with excessive > > > use of inlining. Anyone who has looked at the code resulting from > > > dma_map_single() will know exactly what I'm talking about, which is > > > way in excess of the few instructions we have for the uaccess_* stuff > > > here. > > > > > > The right approach is to move stuff out of line - and by that, I > > > mean _actually_ move the damn code, so that different compilation > > > units can use the same instructions, and thereby gain from the > > > whole point of an instruction cache. > > > > And be marked __attribute__((noinline)), otherwise might be inlined via LTO. > > > > > > > > The whole "let's make inline not really mean inline" is nothing more > > > than a band-aid to the overuse (and abuse) of "inline". > > > > Let's triple check the ISO C11 draft spec just to be sure: > > ? 6.7.4.6: A function declared with an inline function specifier is an > > inline function. Making a > > function an inline function suggests that calls to the function be as > > fast as possible. > > The extent to which such suggestions are effective is > > implementation-defined. 139) > > 139) For example, an implementation might never perform inline > > substitution, or might only perform inline > > substitutions to calls in the scope of an inline declaration. > > ? J.3.8 [Undefined Behavior] Hints: The extent to which suggestions > > made by using the inline function specifier are effective (6.7.4). > > > > My translation: > > "Please don't assume inline means anything." > > > > For the unspecified GNU C extension __attribute__((always_inline)), it > > seems to me like it's meant more for performing inlining (an > > optimization) at -O0. Whether the compiler warns or not seems like a > > nice side effect, but provides no strong guarantee otherwise. > > > > I'm sorry that so much code may have been written with that > > assumption, and I'm sorry to be the bearer of bad news, but this isn't > > a recent change. If code was written under false assumptions, it > > should be rewritten. Sorry. > > You may quote C11, but that is not relevent. The kernel is coded to > gnu89 standard - see the -std=gnu89 flag. There's more to this and why C11 is entirely irrelevant. The "inline" you see in our headers is not the compiler keyword that you find in various C standards, it is a macro that gets expanded to either: #define inline inline __attribute__((__always_inline__)) __gnu_inline \ __maybe_unused notrace or #define inline inline __gnu_inline \ __maybe_unused notrace __gnu_inline is defined as: #define __gnu_inline __attribute__((__gnu_inline__)) So this attaches the gnu_inline attribute to the function: `gnu_inline' This attribute should be used with a function that is also declared with the `inline' keyword. It directs GCC to treat the function as if it were defined in gnu90 mode even when compiling in C99 or gnu99 mode. ... Since ISO C99 specifies a different semantics for `inline', this function attribute is provided as a transition measure and as a useful feature in its own right. This attribute is available in GCC 4.1.3 and later. It is available if either of the preprocessor macros `__GNUC_GNU_INLINE__' or `__GNUC_STDC_INLINE__' are defined. *Note An Inline Function is As Fast As a Macro: Inline. which is quite clear that C99 semantics do not apply to _this_ inline. The manual goes on to explain: GCC implements three different semantics of declaring a function inline. One is available with `-std=gnu89' or `-fgnu89-inline' or when `gnu_inline' attribute is present on all inline declarations, another when `-std=c99', `-std=c11', `-std=gnu99' or `-std=gnu11' (without `-fgnu89-inline'), and the third is used when compiling C++. I'd suggest gnu90 mode is pretty similar to gnu89 mode, and as we build the kernel in gnu89 mode, that is the inlining definition that is appropriate. When it comes to __always_inline, the GCC manual is the definitive reference, since we use the GCC attribute for that: #define __always_inline inline __attribute__((__always_inline__)) and I've already quoted what the GCC manual says for always_inline. Arguing about what the C11 spec says about inlining when we aren't using C11 dialect in the kernel, but are using GCC features, does not move the discussion on. Thanks anyway, maybe it will become relevent in the future if we decide to move to C11. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up