2019-03-20 06:23:34

by Masahiro Yamada

[permalink] [raw]
Subject: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.

The idea is obviously arch-agnostic although we need some code fixups.
This commit moves the config entry from arch/x86/Kconfig.debug to
lib/Kconfig.debug so that all architectures (except MIPS for now) can
benefit from it.

At this moment, I added "depends on !MIPS" because fixing 0day bot reports
for MIPS was complex to me.

I tested this patch on my arm/arm64 boards.

This can make a huge difference in kernel image size especially when
CONFIG_OPTIMIZE_FOR_SIZE is enabled.

For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.

dec file
18983424 arch/arm64/boot/Image.before
18321920 arch/arm64/boot/Image.after

This also slightly improves the "Kernel hacking" Kconfig menu.
Commit e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen')
mentioned this config option would be a good fit in the "compiler option"
menu. I did so.

I fixed up some files to avoid build warnings/errors.

[1] arch/arm64/include/asm/cpufeature.h

In file included from ././include/linux/compiler_types.h:68,
from <command-line>:
./arch/arm64/include/asm/jump_label.h: In function 'cpus_have_const_cap':
./include/linux/compiler-gcc.h:120:38: warning: asm operand 0 probably doesn't match constraints
#define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
^~~
./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
asm_volatile_goto(
^~~~~~~~~~~~~~~~~
./include/linux/compiler-gcc.h:120:38: error: impossible constraint in 'asm'
#define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
^~~
./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
asm_volatile_goto(
^~~~~~~~~~~~~~~~~

[2] arch/mips/kernel/cpu-bugs64.c

arch/mips/kernel/cpu-bugs64.c: In function 'mult_sh_align_mod.constprop':
arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
asm volatile(
^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
asm volatile(
^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
asm volatile(
^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
asm volatile(
^~~

[3] arch/powerpc/mm/tlb-radix.c

arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
^~~
arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'
CC arch/powerpc/perf/hv-gpci.o

[4] arch/s390/include/asm/cpacf.h

In file included from arch/s390/crypto/des_s390.c:19:
./arch/s390/include/asm/cpacf.h: In function 'cpacf_query':
./arch/s390/include/asm/cpacf.h:170:2: warning: asm operand 3 probably doesn't match constraints
asm volatile(
^~~
./arch/s390/include/asm/cpacf.h:170:2: error: impossible constraint in 'asm'

[5] arch/powerpc/kernel/prom_init.c

WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in reference from the function .prom_getprop() to the function .init.text:.call_prom()
The function .prom_getprop() references
the function __init .call_prom().
This is often because .prom_getprop lacks a __init
annotation or the annotation of .call_prom is wrong.

WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in reference from the function .prom_getproplen() to the function .init.text:.call_prom()
The function .prom_getproplen() references
the function __init .call_prom().
This is often because .prom_getproplen lacks a __init
annotation or the annotation of .call_prom is wrong.

[6] drivers/mtd/nand/raw/vf610_nfc.c

drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nfc->regs + NFC_MAIN_AREA(0) + offset,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
trfr_sz, !nfc->data_access);
~~~~~~~~~~~~~~~~~~~~~~~~~~~

[7] arch/arm/kernel/smp.c

arch/arm/kernel/smp.c: In function ‘raise_nmi’:
arch/arm/kernel/smp.c:522:2: warning: array subscript is above array bounds [-Warray-bounds]
trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The fixup is not included in this. The patch is available in ML:

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409393.html

Signed-off-by: Masahiro Yamada <[email protected]>
---

arch/arm64/include/asm/cpufeature.h | 4 ++--
arch/mips/kernel/cpu-bugs64.c | 4 ++--
arch/powerpc/kernel/prom_init.c | 6 +++---
arch/powerpc/mm/tlb-radix.c | 2 +-
arch/s390/include/asm/cpacf.h | 2 +-
arch/x86/Kconfig | 3 ---
arch/x86/Kconfig.debug | 14 --------------
drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
include/linux/compiler_types.h | 3 +--
lib/Kconfig.debug | 15 +++++++++++++++
10 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index e505e1f..77d1aa5 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -406,7 +406,7 @@ static inline bool cpu_have_feature(unsigned int num)
}

/* System capability check for constant caps */
-static inline bool __cpus_have_const_cap(int num)
+static __always_inline bool __cpus_have_const_cap(int num)
{
if (num >= ARM64_NCAPS)
return false;
@@ -420,7 +420,7 @@ static inline bool cpus_have_cap(unsigned int num)
return test_bit(num, cpu_hwcaps);
}

-static inline bool cpus_have_const_cap(int num)
+static __always_inline bool cpus_have_const_cap(int num)
{
if (static_branch_likely(&arm64_const_caps_ready))
return __cpus_have_const_cap(num);
diff --git a/arch/mips/kernel/cpu-bugs64.c b/arch/mips/kernel/cpu-bugs64.c
index bada74a..c04b97a 100644
--- a/arch/mips/kernel/cpu-bugs64.c
+++ b/arch/mips/kernel/cpu-bugs64.c
@@ -42,8 +42,8 @@ static inline void align_mod(const int align, const int mod)
: "n"(align), "n"(mod));
}

-static inline void mult_sh_align_mod(long *v1, long *v2, long *w,
- const int align, const int mod)
+static __always_inline void mult_sh_align_mod(long *v1, long *v2, long *w,
+ const int align, const int mod)
{
unsigned long flags;
int m1, m2;
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index f33ff41..241fe6b 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -501,14 +501,14 @@ static int __init prom_next_node(phandle *nodep)
}
}

-static inline int prom_getprop(phandle node, const char *pname,
- void *value, size_t valuelen)
+static inline int __init prom_getprop(phandle node, const char *pname,
+ void *value, size_t valuelen)
{
return call_prom("getprop", 4, 1, node, ADDR(pname),
(u32)(unsigned long) value, (u32) valuelen);
}

-static inline int prom_getproplen(phandle node, const char *pname)
+static inline int __init prom_getproplen(phandle node, const char *pname)
{
return call_prom("getproplen", 2, 1, node, ADDR(pname));
}
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 6a23b9e..a2b2848 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -928,7 +928,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
tlb->need_flush_all = 0;
}

-static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
+static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
unsigned long start, unsigned long end,
int psize, bool also_pwc)
{
diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
index 3cc52e3..f316de4 100644
--- a/arch/s390/include/asm/cpacf.h
+++ b/arch/s390/include/asm/cpacf.h
@@ -202,7 +202,7 @@ static inline int __cpacf_check_opcode(unsigned int opcode)
}
}

-static inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
+static __always_inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
{
if (__cpacf_check_opcode(opcode)) {
__cpacf_query(opcode, mask);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c1f9b3c..1a3e2b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -310,9 +310,6 @@ config ZONE_DMA32
config AUDIT_ARCH
def_bool y if X86_64

-config ARCH_SUPPORTS_OPTIMIZED_INLINING
- def_bool y
-
config ARCH_SUPPORTS_DEBUG_PAGEALLOC
def_bool y

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 15d0fbe..f730680 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -266,20 +266,6 @@ config CPA_DEBUG
---help---
Do change_page_attr() self-tests every 30 seconds.

-config OPTIMIZE_INLINING
- bool "Allow gcc to uninline functions marked 'inline'"
- ---help---
- This option determines if the kernel forces gcc to inline the functions
- developers have marked 'inline'. Doing so takes away freedom from gcc to
- do what it thinks is best, which is desirable for the gcc 3.x series of
- compilers. The gcc 4.x series have a rewritten inlining algorithm and
- enabling this option will generate a smaller kernel there. Hopefully
- this algorithm is so good that allowing gcc 4.x and above to make the
- decision will become the default in the future. Until then this option
- is there to test gcc for this.
-
- If unsure, say N.
-
config DEBUG_ENTRY
bool "Debug low-level entry code"
depends on DEBUG_KERNEL
diff --git a/drivers/mtd/nand/raw/vf610_nfc.c b/drivers/mtd/nand/raw/vf610_nfc.c
index a662ca1..19792d7 100644
--- a/drivers/mtd/nand/raw/vf610_nfc.c
+++ b/drivers/mtd/nand/raw/vf610_nfc.c
@@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
{
const struct nand_op_instr *instr;
struct vf610_nfc *nfc = chip_to_nfc(chip);
- int op_id = -1, trfr_sz = 0, offset;
+ int op_id = -1, trfr_sz = 0, offset = 0;
u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
bool force8bit = false;

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index ba814f1..19e58b9 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -140,8 +140,7 @@ struct ftrace_likely_data {
* Do not use __always_inline here, since currently it expands to inline again
* (which would break users of __always_inline).
*/
-#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
- !defined(CONFIG_OPTIMIZE_INLINING)
+#if !defined(CONFIG_OPTIMIZE_INLINING)
#define inline inline __attribute__((__always_inline__)) __gnu_inline \
__maybe_unused notrace
#else
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0d9e817..20f3dfc 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -310,6 +310,21 @@ config HEADERS_CHECK
exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
your build tree), to make sure they're suitable.

+config OPTIMIZE_INLINING
+ bool "Allow compiler to uninline functions marked 'inline'"
+ depends on !MIPS # TODO: look into MIPS code
+ help
+ This option determines if the kernel forces gcc to inline the functions
+ developers have marked 'inline'. Doing so takes away freedom from gcc to
+ do what it thinks is best, which is desirable for the gcc 3.x series of
+ compilers. The gcc 4.x series have a rewritten inlining algorithm and
+ enabling this option will generate a smaller kernel there. Hopefully
+ this algorithm is so good that allowing gcc 4.x and above to make the
+ decision will become the default in the future. Until then this option
+ is there to test gcc for this.
+
+ If unsure, say N.
+
config DEBUG_SECTION_MISMATCH
bool "Enable full Section mismatch analysis"
help
--
2.7.4



2019-03-20 06:42:27

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 3:21 PM Masahiro Yamada
<[email protected]> wrote:
>
> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
>
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
>
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.

BTW, I got the following error if I enabled CONFIG_OPTIMIZE_INLINING for MIPS.

It is unclear to me how to fix it.
That's why I ended up with "depends on !MIPS".


MODPOST vmlinux.o
arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined
references to `mips_gcr_base'


Perhaps, MIPS folks may know how to fix it.




> I tested this patch on my arm/arm64 boards.
>
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
>
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
>
> dec file
> 18983424 arch/arm64/boot/Image.before
> 18321920 arch/arm64/boot/Image.after
>
> This also slightly improves the "Kernel hacking" Kconfig menu.
> Commit e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen')
> mentioned this config option would be a good fit in the "compiler option"
> menu. I did so.
>
> I fixed up some files to avoid build warnings/errors.
>
> [1] arch/arm64/include/asm/cpufeature.h
>
> In file included from ././include/linux/compiler_types.h:68,
> from <command-line>:
> ./arch/arm64/include/asm/jump_label.h: In function 'cpus_have_const_cap':
> ./include/linux/compiler-gcc.h:120:38: warning: asm operand 0 probably doesn't match constraints
> #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> ^~~
> ./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
> asm_volatile_goto(
> ^~~~~~~~~~~~~~~~~
> ./include/linux/compiler-gcc.h:120:38: error: impossible constraint in 'asm'
> #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> ^~~
> ./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
> asm_volatile_goto(
> ^~~~~~~~~~~~~~~~~
>
> [2] arch/mips/kernel/cpu-bugs64.c
>
> arch/mips/kernel/cpu-bugs64.c: In function 'mult_sh_align_mod.constprop':
> arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
> asm volatile(
> ^~~
>
> [3] arch/powerpc/mm/tlb-radix.c
>
> arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
> arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
> asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
> ^~~
> arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'
> CC arch/powerpc/perf/hv-gpci.o
>
> [4] arch/s390/include/asm/cpacf.h
>
> In file included from arch/s390/crypto/des_s390.c:19:
> ./arch/s390/include/asm/cpacf.h: In function 'cpacf_query':
> ./arch/s390/include/asm/cpacf.h:170:2: warning: asm operand 3 probably doesn't match constraints
> asm volatile(
> ^~~
> ./arch/s390/include/asm/cpacf.h:170:2: error: impossible constraint in 'asm'
>
> [5] arch/powerpc/kernel/prom_init.c
>
> WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in reference from the function .prom_getprop() to the function .init.text:.call_prom()
> The function .prom_getprop() references
> the function __init .call_prom().
> This is often because .prom_getprop lacks a __init
> annotation or the annotation of .call_prom is wrong.
>
> WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in reference from the function .prom_getproplen() to the function .init.text:.call_prom()
> The function .prom_getproplen() references
> the function __init .call_prom().
> This is often because .prom_getproplen lacks a __init
> annotation or the annotation of .call_prom is wrong.
>
> [6] drivers/mtd/nand/raw/vf610_nfc.c
>
> drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
> drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> nfc->regs + NFC_MAIN_AREA(0) + offset,
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> trfr_sz, !nfc->data_access);
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> [7] arch/arm/kernel/smp.c
>
> arch/arm/kernel/smp.c: In function ‘raise_nmi’:
> arch/arm/kernel/smp.c:522:2: warning: array subscript is above array bounds [-Warray-bounds]
> trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The fixup is not included in this. The patch is available in ML:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409393.html
>
> Signed-off-by: Masahiro Yamada <[email protected]>
> ---
>
> arch/arm64/include/asm/cpufeature.h | 4 ++--
> arch/mips/kernel/cpu-bugs64.c | 4 ++--
> arch/powerpc/kernel/prom_init.c | 6 +++---
> arch/powerpc/mm/tlb-radix.c | 2 +-
> arch/s390/include/asm/cpacf.h | 2 +-
> arch/x86/Kconfig | 3 ---
> arch/x86/Kconfig.debug | 14 --------------
> drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
> include/linux/compiler_types.h | 3 +--
> lib/Kconfig.debug | 15 +++++++++++++++
> 10 files changed, 26 insertions(+), 29 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index e505e1f..77d1aa5 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -406,7 +406,7 @@ static inline bool cpu_have_feature(unsigned int num)
> }
>
> /* System capability check for constant caps */
> -static inline bool __cpus_have_const_cap(int num)
> +static __always_inline bool __cpus_have_const_cap(int num)
> {
> if (num >= ARM64_NCAPS)
> return false;
> @@ -420,7 +420,7 @@ static inline bool cpus_have_cap(unsigned int num)
> return test_bit(num, cpu_hwcaps);
> }
>
> -static inline bool cpus_have_const_cap(int num)
> +static __always_inline bool cpus_have_const_cap(int num)
> {
> if (static_branch_likely(&arm64_const_caps_ready))
> return __cpus_have_const_cap(num);
> diff --git a/arch/mips/kernel/cpu-bugs64.c b/arch/mips/kernel/cpu-bugs64.c
> index bada74a..c04b97a 100644
> --- a/arch/mips/kernel/cpu-bugs64.c
> +++ b/arch/mips/kernel/cpu-bugs64.c
> @@ -42,8 +42,8 @@ static inline void align_mod(const int align, const int mod)
> : "n"(align), "n"(mod));
> }
>
> -static inline void mult_sh_align_mod(long *v1, long *v2, long *w,
> - const int align, const int mod)
> +static __always_inline void mult_sh_align_mod(long *v1, long *v2, long *w,
> + const int align, const int mod)
> {
> unsigned long flags;
> int m1, m2;
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index f33ff41..241fe6b 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -501,14 +501,14 @@ static int __init prom_next_node(phandle *nodep)
> }
> }
>
> -static inline int prom_getprop(phandle node, const char *pname,
> - void *value, size_t valuelen)
> +static inline int __init prom_getprop(phandle node, const char *pname,
> + void *value, size_t valuelen)
> {
> return call_prom("getprop", 4, 1, node, ADDR(pname),
> (u32)(unsigned long) value, (u32) valuelen);
> }
>
> -static inline int prom_getproplen(phandle node, const char *pname)
> +static inline int __init prom_getproplen(phandle node, const char *pname)
> {
> return call_prom("getproplen", 2, 1, node, ADDR(pname));
> }
> diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
> index 6a23b9e..a2b2848 100644
> --- a/arch/powerpc/mm/tlb-radix.c
> +++ b/arch/powerpc/mm/tlb-radix.c
> @@ -928,7 +928,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
> tlb->need_flush_all = 0;
> }
>
> -static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
> +static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
> unsigned long start, unsigned long end,
> int psize, bool also_pwc)
> {
> diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
> index 3cc52e3..f316de4 100644
> --- a/arch/s390/include/asm/cpacf.h
> +++ b/arch/s390/include/asm/cpacf.h
> @@ -202,7 +202,7 @@ static inline int __cpacf_check_opcode(unsigned int opcode)
> }
> }
>
> -static inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
> +static __always_inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
> {
> if (__cpacf_check_opcode(opcode)) {
> __cpacf_query(opcode, mask);
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c1f9b3c..1a3e2b5 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -310,9 +310,6 @@ config ZONE_DMA32
> config AUDIT_ARCH
> def_bool y if X86_64
>
> -config ARCH_SUPPORTS_OPTIMIZED_INLINING
> - def_bool y
> -
> config ARCH_SUPPORTS_DEBUG_PAGEALLOC
> def_bool y
>
> diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> index 15d0fbe..f730680 100644
> --- a/arch/x86/Kconfig.debug
> +++ b/arch/x86/Kconfig.debug
> @@ -266,20 +266,6 @@ config CPA_DEBUG
> ---help---
> Do change_page_attr() self-tests every 30 seconds.
>
> -config OPTIMIZE_INLINING
> - bool "Allow gcc to uninline functions marked 'inline'"
> - ---help---
> - This option determines if the kernel forces gcc to inline the functions
> - developers have marked 'inline'. Doing so takes away freedom from gcc to
> - do what it thinks is best, which is desirable for the gcc 3.x series of
> - compilers. The gcc 4.x series have a rewritten inlining algorithm and
> - enabling this option will generate a smaller kernel there. Hopefully
> - this algorithm is so good that allowing gcc 4.x and above to make the
> - decision will become the default in the future. Until then this option
> - is there to test gcc for this.
> -
> - If unsure, say N.
> -
> config DEBUG_ENTRY
> bool "Debug low-level entry code"
> depends on DEBUG_KERNEL
> diff --git a/drivers/mtd/nand/raw/vf610_nfc.c b/drivers/mtd/nand/raw/vf610_nfc.c
> index a662ca1..19792d7 100644
> --- a/drivers/mtd/nand/raw/vf610_nfc.c
> +++ b/drivers/mtd/nand/raw/vf610_nfc.c
> @@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
> {
> const struct nand_op_instr *instr;
> struct vf610_nfc *nfc = chip_to_nfc(chip);
> - int op_id = -1, trfr_sz = 0, offset;
> + int op_id = -1, trfr_sz = 0, offset = 0;
> u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
> bool force8bit = false;
>
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index ba814f1..19e58b9 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -140,8 +140,7 @@ struct ftrace_likely_data {
> * Do not use __always_inline here, since currently it expands to inline again
> * (which would break users of __always_inline).
> */
> -#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
> - !defined(CONFIG_OPTIMIZE_INLINING)
> +#if !defined(CONFIG_OPTIMIZE_INLINING)
> #define inline inline __attribute__((__always_inline__)) __gnu_inline \
> __maybe_unused notrace
> #else
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 0d9e817..20f3dfc 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -310,6 +310,21 @@ config HEADERS_CHECK
> exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
> your build tree), to make sure they're suitable.
>
> +config OPTIMIZE_INLINING
> + bool "Allow compiler to uninline functions marked 'inline'"
> + depends on !MIPS # TODO: look into MIPS code
> + help
> + This option determines if the kernel forces gcc to inline the functions
> + developers have marked 'inline'. Doing so takes away freedom from gcc to
> + do what it thinks is best, which is desirable for the gcc 3.x series of
> + compilers. The gcc 4.x series have a rewritten inlining algorithm and
> + enabling this option will generate a smaller kernel there. Hopefully
> + this algorithm is so good that allowing gcc 4.x and above to make the
> + decision will become the default in the future. Until then this option
> + is there to test gcc for this.
> +
> + If unsure, say N.
> +
> config DEBUG_SECTION_MISMATCH
> bool "Enable full Section mismatch analysis"
> help
> --
> 2.7.4
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/



--
Best Regards
Masahiro Yamada

2019-03-20 09:41:06

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 7:41 AM Masahiro Yamada
<[email protected]> wrote:

> It is unclear to me how to fix it.
> That's why I ended up with "depends on !MIPS".
>
>
> MODPOST vmlinux.o
> arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined
> references to `mips_gcr_base'
>
>
> Perhaps, MIPS folks may know how to fix it.

I would guess like this:

diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
index 8bc5df49b0e1..a27483fedb7d 100644
--- a/arch/mips/include/asm/mips-cm.h
+++ b/arch/mips/include/asm/mips-cm.h
@@ -79,7 +79,7 @@ static inline int mips_cm_probe(void)
*
* Returns true if a CM is present in the system, else false.
*/
-static inline bool mips_cm_present(void)
+static __always_inline bool mips_cm_present(void)
{
#ifdef CONFIG_MIPS_CM
return mips_gcr_base != NULL;
@@ -93,7 +93,7 @@ static inline bool mips_cm_present(void)
*
* Returns true if the system implements an L2-only sync region, else false.
*/
-static inline bool mips_cm_has_l2sync(void)
+static __always_inline bool mips_cm_has_l2sync(void)
{
#ifdef CONFIG_MIPS_CM
return mips_cm_l2sync_base != NULL;

2019-03-20 09:45:13

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 7:21 AM Masahiro Yamada
<[email protected]> wrote:
>
> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
>
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
>
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.
>
> I tested this patch on my arm/arm64 boards.
>
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
>
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
>
> dec file
> 18983424 arch/arm64/boot/Image.before
> 18321920 arch/arm64/boot/Image.after
>
> This also slightly improves the "Kernel hacking" Kconfig menu.
> Commit e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen')
> mentioned this config option would be a good fit in the "compiler option"
> menu. I did so.

I think this is a good idea in general, but it is likely to cause a lot of
new warnings. Especially the -Wmaybe-uninitialized warnings get
new false positives every time we get substantially different inlining
decisions.

I've added your patch to my randconfig test setup and will let you
know if I see anything noticeable. I'm currently testing clang-arm32,
clang-arm64 and gcc-x86.

Arnd

2019-03-20 10:20:31

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Arnd,


On Wed, Mar 20, 2019 at 6:39 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 7:41 AM Masahiro Yamada
> <[email protected]> wrote:
>
> > It is unclear to me how to fix it.
> > That's why I ended up with "depends on !MIPS".
> >
> >
> > MODPOST vmlinux.o
> > arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> > sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> > sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> > sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> > sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> > sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> > arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined
> > references to `mips_gcr_base'
> >
> >
> > Perhaps, MIPS folks may know how to fix it.
>
> I would guess like this:
>
> diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
> index 8bc5df49b0e1..a27483fedb7d 100644
> --- a/arch/mips/include/asm/mips-cm.h
> +++ b/arch/mips/include/asm/mips-cm.h
> @@ -79,7 +79,7 @@ static inline int mips_cm_probe(void)
> *
> * Returns true if a CM is present in the system, else false.
> */
> -static inline bool mips_cm_present(void)
> +static __always_inline bool mips_cm_present(void)
> {
> #ifdef CONFIG_MIPS_CM
> return mips_gcr_base != NULL;
> @@ -93,7 +93,7 @@ static inline bool mips_cm_present(void)
> *
> * Returns true if the system implements an L2-only sync region, else false.
> */
> -static inline bool mips_cm_has_l2sync(void)
> +static __always_inline bool mips_cm_has_l2sync(void)
> {
> #ifdef CONFIG_MIPS_CM
> return mips_cm_l2sync_base != NULL;
>


Thanks, I applied the above, but I still see
undefined reference to `mips_gcr_base'


I attached .config to produce this error.

I use prebuilt mips-linux-gcc from
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/


--
Best Regards
Masahiro Yamada


Attachments:
config.gz (13.33 kB)

2019-03-20 13:06:40

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 11:19 AM Masahiro Yamada
<[email protected]> wrote:
> On Wed, Mar 20, 2019 at 6:39 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Wed, Mar 20, 2019 at 7:41 AM Masahiro Yamada
> > <[email protected]> wrote:
> >
> > > It is unclear to me how to fix it.
> > > That's why I ended up with "depends on !MIPS".
> > >
> > >
> > > MODPOST vmlinux.o
> > > arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> > > sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> > > sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> > > sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> > > sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> > > sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> > > arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined
> > > references to `mips_gcr_base'
> > >
> > >
> > > Perhaps, MIPS folks may know how to fix it.
> >
> > I would guess like this:
> >
> > diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
> > index 8bc5df49b0e1..a27483fedb7d 100644
> > --- a/arch/mips/include/asm/mips-cm.h
> > +++ b/arch/mips/include/asm/mips-cm.h
> > @@ -79,7 +79,7 @@ static inline int mips_cm_probe(void)
> > *
> > * Returns true if a CM is present in the system, else false.
> > */
> > -static inline bool mips_cm_present(void)
> > +static __always_inline bool mips_cm_present(void)
> > {
> > #ifdef CONFIG_MIPS_CM
> > return mips_gcr_base != NULL;
> > @@ -93,7 +93,7 @@ static inline bool mips_cm_present(void)
> > *
> > * Returns true if the system implements an L2-only sync region, else false.
> > */
> > -static inline bool mips_cm_has_l2sync(void)
> > +static __always_inline bool mips_cm_has_l2sync(void)
> > {
> > #ifdef CONFIG_MIPS_CM
> > return mips_cm_l2sync_base != NULL;
> >
>
>
> Thanks, I applied the above, but I still see
> undefined reference to `mips_gcr_base'
>
>
> I attached .config to produce this error.
>
> I use prebuilt mips-linux-gcc from
> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/

I got to this patch experimentally, it fixes the problem for me:

diff --git a/arch/mips/mm/sc-mips.c b/arch/mips/mm/sc-mips.c
index 394673991bab..d70d02da038b 100644
--- a/arch/mips/mm/sc-mips.c
+++ b/arch/mips/mm/sc-mips.c
@@ -181,7 +181,7 @@ static int __init mips_sc_probe_cm3(void)
return 0;
}

-static inline int __init mips_sc_probe(void)
+static __always_inline int __init mips_sc_probe(void)
{
struct cpuinfo_mips *c = &current_cpu_data;
unsigned int config1, config2;
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 830c93a010c3..186c28463bf3 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -548,7 +548,7 @@ static inline unsigned long __fls(unsigned long word)
* Returns 0..SZLONG-1
* Undefined if no bit exists, so code should check against 0 first.
*/
-static inline unsigned long __ffs(unsigned long word)
+static __always_inline unsigned long __ffs(unsigned long word)
{
return __fls(word & -word);
}


It does look like a gcc bug though, as at least some of the references
are from a function that got split out from an inlined function but that
has no remaining call sites.

Arnd

2019-03-20 13:36:01

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
>
> I've added your patch to my randconfig test setup and will let you
> know if I see anything noticeable. I'm currently testing clang-arm32,
> clang-arm64 and gcc-x86.

This is the only additional bug that has come up so far:

`.exit.text' referenced in section `.alt.smp.init' of
drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
`exit.text' of drivers/char/ipmi/ipmi_msghandler.o

diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
index 201100226301..84b12e33104d 100644
--- a/arch/arm/kernel/atags.h
+++ b/arch/arm/kernel/atags.h
@@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
unsigned int machine_nr);
#else
-static inline const struct machine_desc *
+static __always_inline const struct machine_desc *
setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
{
early_print("no ATAGS support: can't continue\n");

2019-03-21 08:02:35

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 03:20:27PM +0900, Masahiro Yamada wrote:
> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
>
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
>
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.
>
> I tested this patch on my arm/arm64 boards.
>
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
>
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
>
> dec file
> 18983424 arch/arm64/boot/Image.before
> 18321920 arch/arm64/boot/Image.after

Well, this will change, since now people (have to) start adding
__always_inline annotations on all architectures, most likely until
all have about the same amount of annotations like x86. This will
reduce the benefit.

Not sure if it's really a win that we get the inline vs
__always_inline discussion now on all architectures.


2019-03-23 08:27:35

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Arnd Bergmann <[email protected]> a écrit :

> On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
>>
>> I've added your patch to my randconfig test setup and will let you
>> know if I see anything noticeable. I'm currently testing clang-arm32,
>> clang-arm64 and gcc-x86.
>
> This is the only additional bug that has come up so far:
>
> `.exit.text' referenced in section `.alt.smp.init' of
> drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> `exit.text' of drivers/char/ipmi/ipmi_msghandler.o

Wouldn't it be useful to activate -Winline gcc warning to ease finding
out problematic usage of inline keyword ?

Christophe

>
> diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
> index 201100226301..84b12e33104d 100644
> --- a/arch/arm/kernel/atags.h
> +++ b/arch/arm/kernel/atags.h
> @@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
> const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
> unsigned int machine_nr);
> #else
> -static inline const struct machine_desc *
> +static __always_inline const struct machine_desc *
> setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
> {
> early_print("no ATAGS support: can't continue\n");



2019-03-23 08:40:11

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Masahiro Yamada <[email protected]> a écrit :

> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
>
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
>
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.
>
> I tested this patch on my arm/arm64 boards.
>
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
>
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
>
> dec file
> 18983424 arch/arm64/boot/Image.before
> 18321920 arch/arm64/boot/Image.after
>
> This also slightly improves the "Kernel hacking" Kconfig menu.
> Commit e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen')
> mentioned this config option would be a good fit in the "compiler option"
> menu. I did so.
>
> I fixed up some files to avoid build warnings/errors.
>
> [1] arch/arm64/include/asm/cpufeature.h
>
> In file included from ././include/linux/compiler_types.h:68,
> from <command-line>:
> ./arch/arm64/include/asm/jump_label.h: In function 'cpus_have_const_cap':
> ./include/linux/compiler-gcc.h:120:38: warning: asm operand 0
> probably doesn't match constraints
> #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> ^~~
> ./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of
> macro 'asm_volatile_goto'
> asm_volatile_goto(
> ^~~~~~~~~~~~~~~~~
> ./include/linux/compiler-gcc.h:120:38: error: impossible constraint in 'asm'
> #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
> ^~~
> ./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of
> macro 'asm_volatile_goto'
> asm_volatile_goto(
> ^~~~~~~~~~~~~~~~~
>
> [2] arch/mips/kernel/cpu-bugs64.c
>
> arch/mips/kernel/cpu-bugs64.c: In function 'mult_sh_align_mod.constprop':
> arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably
> doesn't match constraints [-Werror]
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably
> doesn't match constraints [-Werror]
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
> asm volatile(
> ^~~
> arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
> asm volatile(
> ^~~
>
> [3] arch/powerpc/mm/tlb-radix.c
>
> arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
> arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably
> doesn't match constraints [-Werror]
> asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
> ^~~
> arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'
> CC arch/powerpc/perf/hv-gpci.o
>
> [4] arch/s390/include/asm/cpacf.h
>
> In file included from arch/s390/crypto/des_s390.c:19:
> ./arch/s390/include/asm/cpacf.h: In function 'cpacf_query':
> ./arch/s390/include/asm/cpacf.h:170:2: warning: asm operand 3
> probably doesn't match constraints
> asm volatile(
> ^~~
> ./arch/s390/include/asm/cpacf.h:170:2: error: impossible constraint in 'asm'
>
> [5] arch/powerpc/kernel/prom_init.c
>
> WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in
> reference from the function .prom_getprop() to the function
> .init.text:.call_prom()
> The function .prom_getprop() references
> the function __init .call_prom().
> This is often because .prom_getprop lacks a __init
> annotation or the annotation of .call_prom is wrong.
>
> WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in
> reference from the function .prom_getproplen() to the function
> .init.text:.call_prom()
> The function .prom_getproplen() references
> the function __init .call_prom().
> This is often because .prom_getproplen lacks a __init
> annotation or the annotation of .call_prom is wrong.
>
> [6] drivers/mtd/nand/raw/vf610_nfc.c
>
> drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
> drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be
> used uninitialized in this function [-Wmaybe-uninitialized]
> vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> nfc->regs + NFC_MAIN_AREA(0) + offset,
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> trfr_sz, !nfc->data_access);
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> [7] arch/arm/kernel/smp.c
>
> arch/arm/kernel/smp.c: In function ‘raise_nmi’:
> arch/arm/kernel/smp.c:522:2: warning: array subscript is above array
> bounds [-Warray-bounds]
> trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The fixup is not included in this. The patch is available in ML:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409393.html
>
> Signed-off-by: Masahiro Yamada <[email protected]>
> ---
>
> arch/arm64/include/asm/cpufeature.h | 4 ++--
> arch/mips/kernel/cpu-bugs64.c | 4 ++--
> arch/powerpc/kernel/prom_init.c | 6 +++---
> arch/powerpc/mm/tlb-radix.c | 2 +-
> arch/s390/include/asm/cpacf.h | 2 +-
> arch/x86/Kconfig | 3 ---
> arch/x86/Kconfig.debug | 14 --------------
> drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
> include/linux/compiler_types.h | 3 +--
> lib/Kconfig.debug | 15 +++++++++++++++
> 10 files changed, 26 insertions(+), 29 deletions(-)

The arch fixups can be done regardless of the Kconfig change as far as
they are done before.

I guess it would then be easier to manage and review with a series of
small patches addressing each arch independently

Christophe

>
> diff --git a/arch/arm64/include/asm/cpufeature.h
> b/arch/arm64/include/asm/cpufeature.h
> index e505e1f..77d1aa5 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -406,7 +406,7 @@ static inline bool cpu_have_feature(unsigned int num)
> }
>
> /* System capability check for constant caps */
> -static inline bool __cpus_have_const_cap(int num)
> +static __always_inline bool __cpus_have_const_cap(int num)
> {
> if (num >= ARM64_NCAPS)
> return false;
> @@ -420,7 +420,7 @@ static inline bool cpus_have_cap(unsigned int num)
> return test_bit(num, cpu_hwcaps);
> }
>
> -static inline bool cpus_have_const_cap(int num)
> +static __always_inline bool cpus_have_const_cap(int num)
> {
> if (static_branch_likely(&arm64_const_caps_ready))
> return __cpus_have_const_cap(num);
> diff --git a/arch/mips/kernel/cpu-bugs64.c b/arch/mips/kernel/cpu-bugs64.c
> index bada74a..c04b97a 100644
> --- a/arch/mips/kernel/cpu-bugs64.c
> +++ b/arch/mips/kernel/cpu-bugs64.c
> @@ -42,8 +42,8 @@ static inline void align_mod(const int align,
> const int mod)
> : "n"(align), "n"(mod));
> }
>
> -static inline void mult_sh_align_mod(long *v1, long *v2, long *w,
> - const int align, const int mod)
> +static __always_inline void mult_sh_align_mod(long *v1, long *v2, long *w,
> + const int align, const int mod)
> {
> unsigned long flags;
> int m1, m2;
> diff --git a/arch/powerpc/kernel/prom_init.c
> b/arch/powerpc/kernel/prom_init.c
> index f33ff41..241fe6b 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -501,14 +501,14 @@ static int __init prom_next_node(phandle *nodep)
> }
> }
>
> -static inline int prom_getprop(phandle node, const char *pname,
> - void *value, size_t valuelen)
> +static inline int __init prom_getprop(phandle node, const char *pname,
> + void *value, size_t valuelen)
> {
> return call_prom("getprop", 4, 1, node, ADDR(pname),
> (u32)(unsigned long) value, (u32) valuelen);
> }
>
> -static inline int prom_getproplen(phandle node, const char *pname)
> +static inline int __init prom_getproplen(phandle node, const char *pname)
> {
> return call_prom("getproplen", 2, 1, node, ADDR(pname));
> }
> diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
> index 6a23b9e..a2b2848 100644
> --- a/arch/powerpc/mm/tlb-radix.c
> +++ b/arch/powerpc/mm/tlb-radix.c
> @@ -928,7 +928,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
> tlb->need_flush_all = 0;
> }
>
> -static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
> +static __always_inline void __radix__flush_tlb_range_psize(struct
> mm_struct *mm,
> unsigned long start, unsigned long end,
> int psize, bool also_pwc)
> {
> diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
> index 3cc52e3..f316de4 100644
> --- a/arch/s390/include/asm/cpacf.h
> +++ b/arch/s390/include/asm/cpacf.h
> @@ -202,7 +202,7 @@ static inline int __cpacf_check_opcode(unsigned
> int opcode)
> }
> }
>
> -static inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
> +static __always_inline int cpacf_query(unsigned int opcode,
> cpacf_mask_t *mask)
> {
> if (__cpacf_check_opcode(opcode)) {
> __cpacf_query(opcode, mask);
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c1f9b3c..1a3e2b5 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -310,9 +310,6 @@ config ZONE_DMA32
> config AUDIT_ARCH
> def_bool y if X86_64
>
> -config ARCH_SUPPORTS_OPTIMIZED_INLINING
> - def_bool y
> -
> config ARCH_SUPPORTS_DEBUG_PAGEALLOC
> def_bool y
>
> diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> index 15d0fbe..f730680 100644
> --- a/arch/x86/Kconfig.debug
> +++ b/arch/x86/Kconfig.debug
> @@ -266,20 +266,6 @@ config CPA_DEBUG
> ---help---
> Do change_page_attr() self-tests every 30 seconds.
>
> -config OPTIMIZE_INLINING
> - bool "Allow gcc to uninline functions marked 'inline'"
> - ---help---
> - This option determines if the kernel forces gcc to inline the functions
> - developers have marked 'inline'. Doing so takes away freedom from gcc to
> - do what it thinks is best, which is desirable for the gcc 3.x series of
> - compilers. The gcc 4.x series have a rewritten inlining algorithm and
> - enabling this option will generate a smaller kernel there. Hopefully
> - this algorithm is so good that allowing gcc 4.x and above to make the
> - decision will become the default in the future. Until then this option
> - is there to test gcc for this.
> -
> - If unsure, say N.
> -
> config DEBUG_ENTRY
> bool "Debug low-level entry code"
> depends on DEBUG_KERNEL
> diff --git a/drivers/mtd/nand/raw/vf610_nfc.c
> b/drivers/mtd/nand/raw/vf610_nfc.c
> index a662ca1..19792d7 100644
> --- a/drivers/mtd/nand/raw/vf610_nfc.c
> +++ b/drivers/mtd/nand/raw/vf610_nfc.c
> @@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
> {
> const struct nand_op_instr *instr;
> struct vf610_nfc *nfc = chip_to_nfc(chip);
> - int op_id = -1, trfr_sz = 0, offset;
> + int op_id = -1, trfr_sz = 0, offset = 0;
> u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
> bool force8bit = false;
>
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index ba814f1..19e58b9 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -140,8 +140,7 @@ struct ftrace_likely_data {
> * Do not use __always_inline here, since currently it expands to
> inline again
> * (which would break users of __always_inline).
> */
> -#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
> - !defined(CONFIG_OPTIMIZE_INLINING)
> +#if !defined(CONFIG_OPTIMIZE_INLINING)
> #define inline inline __attribute__((__always_inline__)) __gnu_inline \
> __maybe_unused notrace
> #else
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 0d9e817..20f3dfc 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -310,6 +310,21 @@ config HEADERS_CHECK
> exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
> your build tree), to make sure they're suitable.
>
> +config OPTIMIZE_INLINING
> + bool "Allow compiler to uninline functions marked 'inline'"
> + depends on !MIPS # TODO: look into MIPS code
> + help
> + This option determines if the kernel forces gcc to inline the functions
> + developers have marked 'inline'. Doing so takes away freedom from gcc to
> + do what it thinks is best, which is desirable for the gcc 3.x series of
> + compilers. The gcc 4.x series have a rewritten inlining algorithm and
> + enabling this option will generate a smaller kernel there. Hopefully
> + this algorithm is so good that allowing gcc 4.x and above to make the
> + decision will become the default in the future. Until then this option
> + is there to test gcc for this.
> +
> + If unsure, say N.
> +
> config DEBUG_SECTION_MISMATCH
> bool "Enable full Section mismatch analysis"
> help
> --
> 2.7.4



2019-03-25 06:05:33

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Arnd,




On Wed, Mar 20, 2019 at 10:05 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 11:19 AM Masahiro Yamada
> <[email protected]> wrote:
> > On Wed, Mar 20, 2019 at 6:39 PM Arnd Bergmann <[email protected]> wrote:
> > >
> > > On Wed, Mar 20, 2019 at 7:41 AM Masahiro Yamada
> > > <[email protected]> wrote:
> > >
> > > > It is unclear to me how to fix it.
> > > > That's why I ended up with "depends on !MIPS".
> > > >
> > > >
> > > > MODPOST vmlinux.o
> > > > arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> > > > sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> > > > sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> > > > sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> > > > sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> > > > sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> > > > arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined
> > > > references to `mips_gcr_base'
> > > >
> > > >
> > > > Perhaps, MIPS folks may know how to fix it.
> > >
> > > I would guess like this:
> > >
> > > diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
> > > index 8bc5df49b0e1..a27483fedb7d 100644
> > > --- a/arch/mips/include/asm/mips-cm.h
> > > +++ b/arch/mips/include/asm/mips-cm.h
> > > @@ -79,7 +79,7 @@ static inline int mips_cm_probe(void)
> > > *
> > > * Returns true if a CM is present in the system, else false.
> > > */
> > > -static inline bool mips_cm_present(void)
> > > +static __always_inline bool mips_cm_present(void)
> > > {
> > > #ifdef CONFIG_MIPS_CM
> > > return mips_gcr_base != NULL;
> > > @@ -93,7 +93,7 @@ static inline bool mips_cm_present(void)
> > > *
> > > * Returns true if the system implements an L2-only sync region, else false.
> > > */
> > > -static inline bool mips_cm_has_l2sync(void)
> > > +static __always_inline bool mips_cm_has_l2sync(void)
> > > {
> > > #ifdef CONFIG_MIPS_CM
> > > return mips_cm_l2sync_base != NULL;
> > >
> >
> >
> > Thanks, I applied the above, but I still see
> > undefined reference to `mips_gcr_base'
> >
> >
> > I attached .config to produce this error.
> >
> > I use prebuilt mips-linux-gcc from
> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/
>
> I got to this patch experimentally, it fixes the problem for me:
>
> diff --git a/arch/mips/mm/sc-mips.c b/arch/mips/mm/sc-mips.c
> index 394673991bab..d70d02da038b 100644
> --- a/arch/mips/mm/sc-mips.c
> +++ b/arch/mips/mm/sc-mips.c
> @@ -181,7 +181,7 @@ static int __init mips_sc_probe_cm3(void)
> return 0;
> }
>
> -static inline int __init mips_sc_probe(void)
> +static __always_inline int __init mips_sc_probe(void)
> {
> struct cpuinfo_mips *c = &current_cpu_data;
> unsigned int config1, config2;
> diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
> index 830c93a010c3..186c28463bf3 100644
> --- a/arch/mips/include/asm/bitops.h
> +++ b/arch/mips/include/asm/bitops.h
> @@ -548,7 +548,7 @@ static inline unsigned long __fls(unsigned long word)
> * Returns 0..SZLONG-1
> * Undefined if no bit exists, so code should check against 0 first.
> */
> -static inline unsigned long __ffs(unsigned long word)
> +static __always_inline unsigned long __ffs(unsigned long word)
> {
> return __fls(word & -word);
> }
>
>
> It does look like a gcc bug though, as at least some of the references
> are from a function that got split out from an inlined function but that
> has no remaining call sites.
>
> Arnd


I applied it, but
"undefined reference to `mips_gcr_base'" still remains.


Then, I got a solution. This is it:

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 830c93a..6a26ead 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -482,7 +482,7 @@ static inline void __clear_bit_unlock(unsigned
long nr, volatile unsigned long *
* Return the bit position (0..63) of the most significant 1 bit in a word
* Returns -1 if no 1 bit exists
*/
-static inline unsigned long __fls(unsigned long word)
+static __always_inline unsigned long __fls(unsigned long word)
{
int num;



LEROY Christophe provided me a tip
to find out the cause of the error.

-Winline pin-points which function
was not inlined despite of its inline marker.



$ make -j8 ARCH=mips CROSS_COMPILE=mips-linux- KCFLAGS=-Winline
arch/mips/mm/sc-mips.o
CC scripts/mod/devicetable-offsets.s
CC scripts/mod/empty.o
MKELF scripts/mod/elfconfig.h
HOSTCC scripts/mod/modpost.o
HOSTCC scripts/mod/sumversion.o
HOSTCC scripts/mod/file2alias.o
HOSTLD scripts/mod/modpost
CC kernel/bounds.s
CALL scripts/atomic/check-atomics.sh
CC arch/mips/kernel/asm-offsets.s
CALL scripts/checksyscalls.sh
<stdin>:1478:2: warning: #warning syscall pidfd_send_signal not
implemented [-Wcpp]
<stdin>:1481:2: warning: #warning syscall io_uring_setup not implemented [-Wcpp]
<stdin>:1484:2: warning: #warning syscall io_uring_enter not implemented [-Wcpp]
<stdin>:1487:2: warning: #warning syscall io_uring_register not
implemented [-Wcpp]
CC arch/mips/mm/sc-mips.o
In file included from ./include/linux/bitops.h:19,
from ./include/linux/kernel.h:12,
from arch/mips/mm/sc-mips.c:6:
./arch/mips/include/asm/bitops.h: In function 'mips_sc_init':
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~
./arch/mips/include/asm/bitops.h:485:29: warning: inlining failed in
call to '__fls': call is unlikely and code size would grow [-Winline]
static inline unsigned long __fls(unsigned long word)
^~~~~
./arch/mips/include/asm/bitops.h:553:9: note: called from here
return __fls(word & -word);
^~~~~~~~~~~~~~~~~~~



--
Best Regards
Masahiro Yamada

2019-03-25 06:12:06

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Arnd,




On Wed, Mar 20, 2019 at 10:34 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> >
> > I've added your patch to my randconfig test setup and will let you
> > know if I see anything noticeable. I'm currently testing clang-arm32,
> > clang-arm64 and gcc-x86.
>
> This is the only additional bug that has come up so far:
>
> `.exit.text' referenced in section `.alt.smp.init' of
> drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
>
> diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
> index 201100226301..84b12e33104d 100644
> --- a/arch/arm/kernel/atags.h
> +++ b/arch/arm/kernel/atags.h
> @@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
> const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
> unsigned int machine_nr);
> #else
> -static inline const struct machine_desc *
> +static __always_inline const struct machine_desc *
> setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
> {
> early_print("no ATAGS support: can't continue\n");
>


I do not know why to reproduce it,
but is "__init __noreturn" more sensible than
"__always_inline" here?


--
Best Regards
Masahiro Yamada

2019-03-25 06:15:53

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Mon, Mar 25, 2019 at 3:10 PM Masahiro Yamada
<[email protected]> wrote:
>
> Hi Arnd,
>
>
>
>
> On Wed, Mar 20, 2019 at 10:34 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> > >
> > > I've added your patch to my randconfig test setup and will let you
> > > know if I see anything noticeable. I'm currently testing clang-arm32,
> > > clang-arm64 and gcc-x86.
> >
> > This is the only additional bug that has come up so far:
> >
> > `.exit.text' referenced in section `.alt.smp.init' of
> > drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> > `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
> >
> > diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
> > index 201100226301..84b12e33104d 100644
> > --- a/arch/arm/kernel/atags.h
> > +++ b/arch/arm/kernel/atags.h
> > @@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
> > const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
> > unsigned int machine_nr);
> > #else
> > -static inline const struct machine_desc *
> > +static __always_inline const struct machine_desc *
> > setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
> > {
> > early_print("no ATAGS support: can't continue\n");
> >
>
>
> I do not know why to reproduce it,

"how to"


> but is "__init __noreturn" more sensible than
> "__always_inline" here?
>
>
> --
> Best Regards
> Masahiro Yamada



--
Best Regards
Masahiro Yamada

2019-03-25 06:43:13

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Heiko,


On Thu, Mar 21, 2019 at 5:02 PM Heiko Carstens
<[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 03:20:27PM +0900, Masahiro Yamada wrote:
> > Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> > CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
> >
> > The idea is obviously arch-agnostic although we need some code fixups.
> > This commit moves the config entry from arch/x86/Kconfig.debug to
> > lib/Kconfig.debug so that all architectures (except MIPS for now) can
> > benefit from it.
> >
> > At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> > for MIPS was complex to me.
> >
> > I tested this patch on my arm/arm64 boards.
> >
> > This can make a huge difference in kernel image size especially when
> > CONFIG_OPTIMIZE_FOR_SIZE is enabled.
> >
> > For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
> >
> > dec file
> > 18983424 arch/arm64/boot/Image.before
> > 18321920 arch/arm64/boot/Image.after
>
> Well, this will change, since now people (have to) start adding
> __always_inline annotations on all architectures, most likely until
> all have about the same amount of annotations like x86. This will
> reduce the benefit.


If people start to replace inline with __always_inline here and there,
yes, the difference will be reduced.

Perhaps, we might end up with fixing dozens of places or so,
but I guess we would still get benefit.


> Not sure if it's really a win that we get the inline vs
> __always_inline discussion now on all architectures.


This feature is not x86-specific.

I prefer "do it for all arches or don't do it at all"
instead of the half-baked state.

If we force inlining for the 'inline' marker
there is no point of having __always_inline.


--
Best Regards
Masahiro Yamada

2019-03-25 06:45:43

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Christophe,


On Sat, Mar 23, 2019 at 5:27 PM LEROY Christophe
<[email protected]> wrote:
>
> Arnd Bergmann <[email protected]> a écrit :
>
> > On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> >>
> >> I've added your patch to my randconfig test setup and will let you
> >> know if I see anything noticeable. I'm currently testing clang-arm32,
> >> clang-arm64 and gcc-x86.
> >
> > This is the only additional bug that has come up so far:
> >
> > `.exit.text' referenced in section `.alt.smp.init' of
> > drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> > `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
>
> Wouldn't it be useful to activate -Winline gcc warning to ease finding
> out problematic usage of inline keyword ?


Yes, it is useful to find out
which function is causing the error.
Thanks for the tip.




--
Best Regards
Masahiro Yamada

2019-03-25 07:33:57

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Mon, Mar 25, 2019 at 7:11 AM Masahiro Yamada
<[email protected]> wrote:
> On Wed, Mar 20, 2019 at 10:34 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> > >
> > > I've added your patch to my randconfig test setup and will let you
> > > know if I see anything noticeable. I'm currently testing clang-arm32,
> > > clang-arm64 and gcc-x86.
> >
> > This is the only additional bug that has come up so far:
> >
> > `.exit.text' referenced in section `.alt.smp.init' of
> > drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> > `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
> >
> > diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
> > index 201100226301..84b12e33104d 100644
> > --- a/arch/arm/kernel/atags.h
> > +++ b/arch/arm/kernel/atags.h
> > @@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
> > const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
> > unsigned int machine_nr);
> > #else
> > -static inline const struct machine_desc *
> > +static __always_inline const struct machine_desc *
> > setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
> > {
> > early_print("no ATAGS support: can't continue\n");
> >
>
>
> I do not know why to reproduce it,
> but is "__init __noreturn" more sensible than
> "__always_inline" here?

It's in a header file, so it has to be 'inline'. We could make it
static inline __init __noreturn, but I don't see an advantage over
__always_inline there.

Arnd

2019-03-25 07:56:49

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Mon, Mar 25, 2019 at 4:33 PM Arnd Bergmann <[email protected]> wrote:
>
> On Mon, Mar 25, 2019 at 7:11 AM Masahiro Yamada
> <[email protected]> wrote:
> > On Wed, Mar 20, 2019 at 10:34 PM Arnd Bergmann <[email protected]> wrote:
> > >
> > > On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> > > >
> > > > I've added your patch to my randconfig test setup and will let you
> > > > know if I see anything noticeable. I'm currently testing clang-arm32,
> > > > clang-arm64 and gcc-x86.
> > >
> > > This is the only additional bug that has come up so far:
> > >
> > > `.exit.text' referenced in section `.alt.smp.init' of
> > > drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> > > `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
> > >
> > > diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
> > > index 201100226301..84b12e33104d 100644
> > > --- a/arch/arm/kernel/atags.h
> > > +++ b/arch/arm/kernel/atags.h
> > > @@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
> > > const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
> > > unsigned int machine_nr);
> > > #else
> > > -static inline const struct machine_desc *
> > > +static __always_inline const struct machine_desc *
> > > setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
> > > {
> > > early_print("no ATAGS support: can't continue\n");
> > >
> >
> >
> > I do not know why to reproduce it,
> > but is "__init __noreturn" more sensible than
> > "__always_inline" here?
>
> It's in a header file, so it has to be 'inline'. We could make it
> static inline __init __noreturn,

Yes, I like 'static inline __init __noreturn'

> but I don't see an advantage over
> __always_inline there.

__always_inline takes away the compiler's freedom.

I'd like to leave it up to the compiler where possible.

The inlining decision may change
depending on -O2, -Os, -Og(which was rejected)
or whatever optimization strategy.


>
> Arnd
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/



--
Best Regards
Masahiro Yamada

2019-03-25 10:26:45

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Mon, Mar 25, 2019 at 8:55 AM Masahiro Yamada
<[email protected]> wrote:
>
> On Mon, Mar 25, 2019 at 4:33 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Mon, Mar 25, 2019 at 7:11 AM Masahiro Yamada
> > <[email protected]> wrote:
> > > I do not know why to reproduce it,
> > > but is "__init __noreturn" more sensible than
> > > "__always_inline" here?
> >
> > It's in a header file, so it has to be 'inline'. We could make it
> > static inline __init __noreturn,
>
> Yes, I like 'static inline __init __noreturn'
>
> > but I don't see an advantage over
> > __always_inline there.
>
> __always_inline takes away the compiler's freedom.
>
> I'd like to leave it up to the compiler where possible.

Ok, fair enough.

Arnd

2019-03-25 10:28:41

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

On Wed, Mar 20, 2019 at 2:34 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> >
> > I've added your patch to my randconfig test setup and will let you
> > know if I see anything noticeable. I'm currently testing clang-arm32,
> > clang-arm64 and gcc-x86.
>
> This is the only additional bug that has come up so far:
>
> `.exit.text' referenced in section `.alt.smp.init' of
> drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> `exit.text' of drivers/char/ipmi/ipmi_msghandler.o

FWIW, the message up there does not match the patch anyway,
that was an unrelated bug.

Arnd

2019-03-26 06:05:04

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Masahiro,

Le 25/03/2019 à 07:44, Masahiro Yamada a écrit :
> Hi Christophe,
>
>
> On Sat, Mar 23, 2019 at 5:27 PM LEROY Christophe
> <[email protected]> wrote:
>>
>> Arnd Bergmann <[email protected]> a écrit :
>>
>>> On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
>>>>
>>>> I've added your patch to my randconfig test setup and will let you
>>>> know if I see anything noticeable. I'm currently testing clang-arm32,
>>>> clang-arm64 and gcc-x86.
>>>
>>> This is the only additional bug that has come up so far:
>>>
>>> `.exit.text' referenced in section `.alt.smp.init' of
>>> drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
>>> `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
>>
>> Wouldn't it be useful to activate -Winline gcc warning to ease finding
>> out problematic usage of inline keyword ?
>
>
> Yes, it is useful to find out
> which function is causing the error.
> Thanks for the tip.
>

I did a mass build on kisskb. Almost ok, see results at
http://kisskb.ellerman.id.au/kisskb/head/203d912edf8dde291977c71aa64024065e197f79/

ps3_defconfig with GCC 5 fails
(http://kisskb.ellerman.id.au/kisskb/buildresult/13742711/) with:

arch/powerpc/mm/tlb-radix.c:148:2: error: asm operand 3 probably doesn't
match constraints [-Werror]
arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
arch/powerpc/mm/tlb-radix.c:118:2: error: asm operand 3 probably doesn't
match constraints [-Werror]
arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't
match constraints [-Werror]

ps3_defconfig with GCC 4.6 fails
(http://kisskb.ellerman.id.au/kisskb/buildresult/13742591/) with:

arch/powerpc/mm/tlb-radix.c:148:2: error: asm operand 3 probably doesn't
match constraints [-Werror]
arch/powerpc/mm/tlb-radix.c:118:2: error: asm operand 3 probably doesn't
match constraints [-Werror]
arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't
match constraints [-Werror]

randconfig with GCC 4.6 fails
(http://kisskb.ellerman.id.au/kisskb/buildresult/13742698/) with:

arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'

Christophe

2019-04-18 13:22:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING


* Masahiro Yamada <[email protected]> wrote:

> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
>
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
>
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.
>
> I tested this patch on my arm/arm64 boards.
>
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
>
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
>
> dec file
> 18983424 arch/arm64/boot/Image.before
> 18321920 arch/arm64/boot/Image.after
>
> This also slightly improves the "Kernel hacking" Kconfig menu.
> Commit e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen')
> mentioned this config option would be a good fit in the "compiler option"
> menu. I did so.

No objections against moving it from x86 code to generic code.

Thanks,

Ingo

2019-04-23 04:41:08

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Hi Christophe,

On Tue, Mar 26, 2019 at 3:03 PM Christophe Leroy
<[email protected]> wrote:
>
> Hi Masahiro,
>
> Le 25/03/2019 à 07:44, Masahiro Yamada a écrit :
> > Hi Christophe,
> >
> >
> > On Sat, Mar 23, 2019 at 5:27 PM LEROY Christophe
> > <[email protected]> wrote:
> >>
> >> Arnd Bergmann <[email protected]> a écrit :
> >>
> >>> On Wed, Mar 20, 2019 at 10:41 AM Arnd Bergmann <[email protected]> wrote:
> >>>>
> >>>> I've added your patch to my randconfig test setup and will let you
> >>>> know if I see anything noticeable. I'm currently testing clang-arm32,
> >>>> clang-arm64 and gcc-x86.
> >>>
> >>> This is the only additional bug that has come up so far:
> >>>
> >>> `.exit.text' referenced in section `.alt.smp.init' of
> >>> drivers/char/ipmi/ipmi_msghandler.o: defined in discarded section
> >>> `exit.text' of drivers/char/ipmi/ipmi_msghandler.o
> >>
> >> Wouldn't it be useful to activate -Winline gcc warning to ease finding
> >> out problematic usage of inline keyword ?
> >
> >
> > Yes, it is useful to find out
> > which function is causing the error.
> > Thanks for the tip.
> >
>
> I did a mass build on kisskb. Almost ok, see results at
> http://kisskb.ellerman.id.au/kisskb/head/203d912edf8dde291977c71aa64024065e197f79/
>
> ps3_defconfig with GCC 5 fails
> (http://kisskb.ellerman.id.au/kisskb/buildresult/13742711/) with:
>
> arch/powerpc/mm/tlb-radix.c:148:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
> arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
> arch/powerpc/mm/tlb-radix.c:118:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
> arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
>
> ps3_defconfig with GCC 4.6 fails
> (http://kisskb.ellerman.id.au/kisskb/buildresult/13742591/) with:
>
> arch/powerpc/mm/tlb-radix.c:148:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
> arch/powerpc/mm/tlb-radix.c:118:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
> arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't
> match constraints [-Werror]
>
> randconfig with GCC 4.6 fails
> (http://kisskb.ellerman.id.au/kisskb/buildresult/13742698/) with:
>
> arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'


Thanks.

I fixed these ppc errors,
and separate out arch-fixes.

Now, the latest version is v3.


--
Best Regards
Masahiro Yamada